
Interim Assessment Practices and Avenues for State Involvement TILSA SCASS Interim Assessment Subcommittee

August 12, 2008

THE COUNCIL OF CHIEF STATE SCHOOL OFFICERS The Council of Chief State School Officers (CCSSO) is a nonpartisan, nationwide, nonprofit organization of public officials who head departments of elementary and secondary education in the states, the District of Columbia, the Department of Defense Education Activity, and five U.S. extra-state jurisdictions. CCSSO provides leadership, advocacy, and technical assistance on major educational issues. The Council seeks member consensus on major educational issues and expresses their views to civic and professional organizations, federal agencies, Congress, and the public.

Technical Issues in Large Scale Assessment State Collaborative on Assessment and Student Standards (TILSA SCASS)

The Council's State Collaborative on Assessment and Student Standards strives to provide leadership, advocacy and service in creating and supporting effective collaborative partnerships through the collective experience and knowledge of state education personnel to develop and implement high standards and valid assessment systems that maximize educational achievement for all children.

COUNCIL OF CHIEF STATE SCHOOL OFFICERS
Rick Melmer (South Dakota), President
Elizabeth Burmaster (Wisconsin), Past President
T. Kenneth James (Arkansas), President-Elect
Gene Wilhoit, Executive Director
John Tanner, Director, Center for Innovative Measures
Douglas Rindone, Coordinator, TILSA SCASS
Duncan MacQuarrie, Assistant Coordinator, TILSA SCASS
Eric Crane, WestEd

Council of Chief State School Officers
One Massachusetts Avenue, NW, Suite 700
Washington, DC 20001-1431
Phone (202) 336-7000 Fax (202) 408-8072
www.ccsso.org

Copyright © 2008 by the Council of Chief State School Officers, Washington, DC All rights reserved.

Interim Assessment Practices and Avenues for State Involvement
TILSA SCASS Interim Assessment Subcommittee
August 12, 2008

Introduction

School districts are using a host of strategies to promote student learning and performance, with an eye toward ensuring that every student reaches proficiency by 2013-2014, as mandated under the No Child Left Behind Act (NCLB) of 2001. One growing practice is the use of more regular, formal monitoring of student performance, through ongoing assessment during the school year. These tests are called by several names, including diagnostic, periodic, predictive, interim, benchmark, and sometimes even formative assessments (and some clarification of terms is an important topic in itself, considered below). The assessments are often sold as commercial products, marketed to school districts by publishers of tests and other educational materials. They may also be homegrown by the school district or provided as one feature of a state-managed system that can include a bank of test items available to the local teacher.

I. Background

Goals of the report

Technical Issues in Large Scale Assessment (TILSA), a State Collaborative on Assessment and Student Standards (SCASS) of the Council of Chief State School Officers (CCSSO), has commissioned this study of interim assessments. Within TILSA, in discussions during 2006 and 2007, there was a sense that such a study was needed to better understand the assessments being used, their purposes, and their potential value. Representatives from member states identified elements of recent assessment practice that need additional clarity, including the:

- distinction between formative and interim assessment--often the label formative is used to describe interim assessments;
- uses and limitations of interim assessments--local educators need to have realistic expectations about what interim assessments can and cannot do; and
- importance of alignment of interim assessments and item banks to state or local standards and learning objectives--district consumers of commercial interim assessment products may sacrifice alignment for cost or may not view "homegrown" development of an aligned interim assessment as an option.

TILSA's Interim Assessment subcommittee was formed, and it planned this study with the following goals in mind:

- disseminate information about this type of student assessment and how it can be used
- clarify and elaborate on the differences between interim and formative assessments
- describe examples of state educational agencies' (SEAs') involvement with interim assessments and identify state roles regarding these assessments
- highlight educational agencies or jurisdictions where this type of assessment appears to be having benefits


To help describe the range of recent practice in the use of these medium-scale, medium-cycle assessments, the present investigation employed a case-study approach. This report profiles seven jurisdictions¹ that administer or facilitate the use of what is termed interim assessments, with a particular lens for policies and practices that appear to be effective--or, at least, innovative--for state and local educational agencies.

Interim assessments--defining what they are and are not

A definition

Perie, Marion, and Gong (2007) have defined interim assessment as follows:

    Assessments administered during instruction to evaluate students' knowledge and skills relative to a specific set of academic goals in order to inform policymaker or educator decisions at the classroom, school, or district level. The specific interim assessment designs are driven by the purpose and intended uses, but the results of any interim assessment must be reported in a manner allowing aggregation across students, occasions, or concepts. (p. 5)

The major features of this definition are on target, including the focus on the purposes and intended uses of the assessment, as well as the requirement that interim assessment results be reported so that aggregation is possible. However, this definition can be strengthened in three areas. First, it is not clear that interim assessment occurs "during instruction," and in fact, it may be more common that the administration of interim assessments marks a break in the flow of instruction. Acknowledging a break in the instructional flow may also help to differentiate interim assessment from formative assessment, the latter being seamless with the flow of instruction. Second, in subcommittee discussions about interim assessment, the frequency of administration repeatedly arose as a defining attribute of these tests, and so including that aspect in the definition seems essential. Third, interim assessments can inform decisions about individual students and what lessons or courses they need next. Including the student level in the definition is a helpful and appropriate modification.

Building on the work of Perie, Marion, and Gong (2007), with those three refinements, we offer the following alternative definition of interim assessments:

    Assessments administered multiple times during a school year, usually outside of instruction, to evaluate students' knowledge and skills relative to a specific set of academic goals in order to inform policymaker or educator decisions at the student, classroom, school, or district level. The specific interim assessment designs are driven by the purposes and intended uses, but the results of any interim assessment must be reported in a manner allowing aggregation across students, occasions, or concepts.

¹ Though local educational agencies (LEAs) or school districts are the most common unit for interim assessments, the range of case studies in this report includes districts, states, and even a country. Throughout this paper, the term "jurisdiction" is used to refer generically to the agency administering or supporting interim assessments. If a more specific term is used, then it is intended to refer to that particular kind of jurisdiction.


The components of this revised definition identify key features that characterize these assessments.

The components of the definition

In committee discussions about a definition of interim assessment, there was consensus that interim assessments fall between classroom-based and large-scale assessments, in frequency of administration as well as in scale. There was also consensus that the purposes or stakes of interim assessments typically fall between the goals of shaping learning and instruction (formative purposes) and measuring and documenting what has been learned (summative purposes). Defining interim assessments is complicated by the many legitimate ways that educators can and do use interim assessment information. Breaking down the parts of the definition brings interim assessment into sharper focus.

Interim assessments represent a kind of hybrid, in that they may have formative as well as summative purposes. Generally, they do not have the quick turnaround that facilitates fast-responding changes to instruction; such turnaround is a defining characteristic of formative assessment. On the other hand, well-implemented interim assessment does have the capacity to improve instruction, but in a less immediate fashion. In addition, interim assessments typically are not used to make high-stakes decisions about students and teachers. However, interim assessment information can contribute to accountability-related decision-making.

Interim assessment is a periodic practice: on the one hand, it is something done repeatedly each year and not as an annual event; on the other hand, it lacks the frequency and regularity of an ongoing process, distinguishing it from formative assessment. Typically, though perhaps not ideally, interim assessments interrupt the flow of classroom instruction and learning, as their administration is not a seamless activity with a classroom lesson. Furthermore, interim assessments in general do not provide immediate feedback; even with those that can be scored most easily, the teacher cannot use the results until at least the next lesson.

Administration occurs at multiple times during the school year, especially to monitor whether students are on track to meeting content standards. Interim assessments are not given only once during the school year. They provide information both on individual and group progress, and are often used to judge whether individual students and classrooms of students are making progress toward instructional targets and whether particular units or lessons need to be retaught.

In terms of scale and cycle, interim assessments fall between formative and summative assessments. In contrast to both formative assessment, where the scale is a single teacher, and summative assessment, which is often statewide, the scale of interim assessment is often the school district (or school building). Likewise, the interim assessment cycle, typically between two and six administrations per year, falls between the short cycle of formative assessment, which can have hundreds of instances during a school year, and the cycle of summative assessment, which is most commonly annual. As we will explore below, the purposes of interim assessments can be wide-ranging, and can include formative and summative aims.

However, the following chart provides a broad, generally applicable summary of where interim assessments fit along the spectrum from formative to summative assessment. The boundaries between interim assessments and those labeled either formative or summative are permeable.

Typical Use
- Formative: feedback to adjust ongoing teaching and learning
- Interim: student progress monitoring
- Summative: student placement; school and district accountability

Frequency of Administration
- Formative: continual; multiple times a day
- Interim: generally 2-6 times per school year
- Summative: usually once a school year

Scope of Administration
- Formative: student and classroom
- Interim: usually school or district (could be student)
- Summative: state
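Both the definition and the chart above turn on the requirement that interim results be reportable at more than one level. The fragment below is a purely illustrative sketch, not drawn from any system described in this report, of how student-level interim results might be rolled up across students, occasions, or concepts; all column names and values are invented.

```python
# Illustrative only: rolling up hypothetical student-level interim results
# at the levels named in the definition (student, classroom, school) and
# across occasions and concepts. Data and field names are invented.
import pandas as pd

results = pd.DataFrame({
    "student_id":  ["s1", "s1", "s2", "s2", "s3", "s3"],
    "classroom":   ["A", "A", "A", "A", "B", "B"],
    "school":      ["Elm", "Elm", "Elm", "Elm", "Oak", "Oak"],
    "occasion":    ["fall", "winter"] * 3,     # administration window
    "concept":     ["fractions"] * 6,          # content standard or concept
    "pct_correct": [55, 70, 80, 85, 60, 72],
})

# The same student-level records support decisions at each level.
by_student   = results.groupby("student_id")["pct_correct"].mean()
by_classroom = results.groupby(["classroom", "occasion"])["pct_correct"].mean()
by_school    = results.groupby(["school", "concept"])["pct_correct"].mean()

print(by_classroom)  # e.g., did classroom A improve from fall to winter?
```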

A note about terminology

In studying and describing interim assessments, one objective (and challenge) is ensuring that educators with a range of assessment-related experience speak the same language. In this paper, the term "interim assessments" refers to the periodic, progress-monitoring assessments that have become widespread in recent years. Other terms have been used for these tests, and this can be a source of confusion. A recent review of completed and in-process studies carried out under the aegis of the regional educational laboratory (REL) program of the federal Institute of Education Sciences (IES) referred to what this paper calls interim assessments twice as "benchmark assessments," once as "formative assessments," and once as "diagnostic/formative assessments" (IES, 2008). Indeed, as Perie, Marion, Gong, and Wurtzel (2007) observe:

    ... we believe that even assessment experts have been hobbled by the lack of clear definitions of assessment types and the abundant use of the term "formative assessment" to apply to assessments of very different design and purposes. This imprecision has led to a blurring of the differences between what we call "formative assessment" and what we call "interim assessment" (p. 2).

Consistent with both Perie and her colleagues and the CCSSO SCASS on Formative Assessment for Students and Teachers (FAST), we adopt the definition of formative assessment as "a process used by teachers and students during instruction that provides feedback to adjust ongoing teaching and learning to improve students' achievement of intended instructional outcomes" (Perie, Marion, and Gong, 2007; Formative Assessment for Students and Teachers Collaborative, 2008). This paper does not focus on the formative assessment process, but rather on periodic tests used for monitoring student progress.


Another commonly used term, benchmark, is in our view ambiguous for two reasons. First, the term has been used to refer to tests focusing on particular academic content standards, which in some jurisdictions are called benchmarks. Second, prior to NCLB, when large-scale, high-stakes assessments at select grades ("benchmark grades") were in place in many states, such tests commonly were called "benchmark assessments." Time will tell whether the assessment field will sort out the terms, and the label interim has its own problems--think of a testing program that is in transition--but distinguishing these medium-cycle tests (interim assessments) from the ongoing process of fine-tuning instruction through determining what students know (formative assessment) is critical to building a shared understanding of the promise and challenges of testing.

Administration of interim assessments

Interim assessments differ from other types of assessment in their administration. Unlike the mandatory, statewide tests used for accountability, interim assessments may or may not be mandatory, and their adoption is generally not statewide. The relationship between interim assessments and the high-stakes statewide tests varies across the states, but it is rarely well defined. Interim assessments appear to support the state-administered accountability assessments in a general way, by providing information about student achievement during the school year. But the connection is imperfect--while it is often claimed that the interim tests are aligned to the state standards, rigorous alignment work establishing such a connection is the exception rather than the rule (Brown and Coughlin, 2007). Certainly, the recent rise in the popularity of interim assessments coincides with NCLB accountability and with a trend toward increased testing. However, there is not evidence of a tight coupling of interim assessment practice and high-stakes statewide assessment.

Interim assessments are most commonly administered district-wide. There are several reasons for this. In contrast to individual school buildings, school districts, in their role as fiscal agencies, can spend money on interim assessment products. Even in a time of tight resources for school districts, the popularity of commercially available interim assessments suggests strongly that school districts find them to be a worthwhile purchase. At the classroom level, it is generally not cost effective for teachers to create interim assessments from scratch. However, as we will explore below, there are teachers using item-banking and item-authoring tools to create interim tests administered at the classroom level. Classroom teachers in some school districts have the flexibility to administer district-level interim assessments at times they choose. We could not find examples of statewide administration of interim assessments, though the state may facilitate interim testing, making available an item bank or requiring interim assessment for schools or districts in improvement.

II. Uses and goals of interim assessments

Even before conducting interviews with staff in case study jurisdictions, information on the range of uses and goals of interim assessments was available, primarily from three sources:

- Meetings of the Council of Chief State School Officers (CCSSO) collaborative on Technical Issues in Large-Scale Assessment (TILSA), specifically the Interim Assessment subcommittee. These meetings included discussions of interim assessment practices. The TILSA members are aware of and knowledgeable about practices in their home states and, in most cases and depending on the topic, about innovation in other locales.

- Internet-based research. Some jurisdictions discuss their interim assessment program, including its goals and uses, on their website.
- Prior writings and publications. The available literature on interim assessments, though small, discusses how these tests are used and was a third source of information.

In all, eight uses of interim assessment were identified: diagnosis, prediction, preparation, placement, student evaluation, school intervention, promotion/graduation, and local accountability.² As discussed earlier, the purposes of interim assessments can stretch from the borders of formative assessment to those of summative assessment. For example, interim assessments are used formatively when they are used for diagnosis. At the opposite boundary of interim assessments are those with more summative uses, such as promotion/graduation and local accountability. The remaining uses are suggestive of a more "middle ground" of purposes for these tests. The assignment of these uses along a continuum is intended as a rough guide; how tests are being used in real instances--the purposes to which the tests are actually put--should drive the thinking and any characterization about them.

At the formative margin

Where the aim is identification of weak areas of performance for subsequent remediation, the goal of the interim assessment is diagnosis.

In the interim assessment "middle ground"

Some jurisdictions (and many test publishers) claim that interim assessment provides a good forecast of student performance on the high-stakes test. In this case, the test is used for prediction. Where interim assessments are cited as helping students become familiar and comfortable with a test that may mirror the mandatory high-stakes test later in the school year, the jurisdiction's goals include preparation. Some jurisdictions use scores on interim tests for placement, to help inform what courses the student should be taking next. Perhaps the oldest use of interim assessment is for student evaluation, as traditional texts and curricula often have included chapter and unit tests that provide an evaluation of student understanding of recently completed material. Some jurisdictions require schools in need of improvement to use interim assessment. In this case, interim assessment serves as one initiative in a school intervention strategy.

At the summative margin

Where the progress shown across interim tests affects students' passage to the next grade, the tests are used for promotion. Likewise, state graduation requirements, such as the body-of-evidence-based approaches of Wyoming and Rhode Island, may include demonstrating learning over time through progress on district interim assessments.

² It should be noted that looking for validity evidence for these uses was beyond the scope of the study; however, it was a common viewpoint within the Interim Assessment subcommittee that interim assessments were often used for purposes for which there was little or no formal validity evidence.


Interim assessment's purposes may be directed at the school as well as at students, as school districts may include interim assessment data as an input in a system of local accountability (Crane, Rabinowitz, & Zimmerman, 2004).

III. Possible state actions and roles relating to these assessments

While interim assessments are not entirely a new phenomenon, their sudden increase and unclear ties to high-stakes testing have left some states searching for a role vis-à-vis these tests. When questions about these tests bubble up from school district staff to staff at state educational agencies (SEAs), the state officials may be unprepared. However, a range of possible actions of SEAs regarding interim assessments can be described. The list that follows is based on internet research, interviews, and TILSA discussions of what is taking place currently, as well as possible future developments. The list is ordered from least state involvement and prescription to most. Of course, a state may take more than one action on this list or may transition over time between actions.

Ignore interim assessments. Many SEAs do not take a position on interim assessments. The SEAs may view implementation of such tests as a district option and decision, with no state imperative to oversee or otherwise express an opinion or recommendation.

Provide or support professional development on interim assessment. The SEA may include training on interim assessment as part of teacher training. States without an item bank or interim assessment system generally do not have a policy on the topic; it is common to find state training on interim assessment only in those states that manage an item bank or system.

Disseminate nonbinding guidance or criteria for district selection. The SEA may issue a fact sheet or information guide on interim assessment, including suggested criteria for selection.

Publish a consumer's guide. The SEA may prepare a nonbinding consumer's guide, reflecting review work performed or coordinated by SEA staff. The guide produced by New Mexico's SEA clearly stated the SEA's role: "The Public Education Department is not endorsing any of the vendors but rather provides a guide to school districts as they are reviewing vendor materials" (New Mexico Public Education Department, 2006, p. 3).

Endorse a product. In their review of southwestern states' policies related to interim and formative assessment, Gallagher and Worth (2008) found that several states endorsed products or systems that support interim assessments or related data management needs.

    The Arkansas Department of Education endorses the Triand web-based system (Arkansas Department of Education, n.d.) for managing data associated with [interim] testing at the district level. The Oklahoma State Department of Education supports use of [an interim] assessment resource called Data Connections (Appalachia Education Laboratory and Edvantia, 2003). The Texas Education Agency, through its Student Success Initiative, encourages districts to use Pearson's Progress Assessment Series (Pearson, 2007b).
    -- Gallagher and Worth, 2008, p. 11


Produce an approved list. Similar to the consumer's guide, but with more investigation as to particular products, the approved list option sets forth the products that districts may choose and still receive state funding. A district that chooses an instrument that is not on the list may not use certain state funds to purchase that product. In South Carolina, districts must select and administer interim assessments from the statewide adoption list to be eligible to receive certain state funds related to implementing district improvement plans.

Facilitate consortia of districts. The SEA can act in an informal supporting role, bringing together--or supporting initiatives that bring together--school districts looking to purchase an interim assessment system or construct an item bank. Nebraska, through its School-based, Teacher-led Assessment and Reporting System (STARS), and Rhode Island, through its collaboration with the Rhode Island Skills Commission, have supported district consortia of item development.

Build an item bank as an optional resource for school districts. Another role for the SEA is to build a bank of items and tests that local educators can access to generate their own interim assessments. Arizona, Georgia, Louisiana, Mississippi, Texas, and Utah all support an item bank that local educators can tap into to construct interim assessments.

Require district and school use of interim assessment. Either through a state system or through local tests, the SEA may mandate the use of interim assessments. The state can require this of all schools or of high-priority schools/schools in improvement. Required use of interim assessment may be part of the state's accountability procedures, as these measures may be among multiple measures for schools and districts. South Carolina and Georgia are examples of states that require use of interim assessments as part of school and district improvement.

IV. Assessing interim assessments: criteria from recent practice and the literature

One of the motivating goals of this study is to inform what good interim assessment looks like. We posit that a well-designed interim assessment system will have tests and items that are technically sound, plus other qualities. But what are those other qualities? Clearly, how to evaluate interim assessments depends on the particular purposes to which they are put. Although there is not a large research literature specific to interim assessment, past assessment policy and that literature provide a starting list of criteria for judging the merits of particular interim assessments. These criteria may assist SEAs and local school districts in evaluating these tests. Links to relevant documents are provided in this section, as well as in the reference list.

Consumer's guide documents: the New Mexico example

In our review of the literature, we found only one consumer's guide. When building its consumer's guide, New Mexico focused on short, straightforward ratings on 15 criteria.


The New Mexico rating allowed for responses of "fully meets," "partially meets," and "does not address" to the following attributes of the interim assessments:

- Standardized
- Periodic Assessment
- Data Format (ease of use; allows for disaggregation)
- Accessibility
- Ease of Assessment
- Technical Assistance
- Cost/Benefit Analysis
- Time Needed for Administration
- Linked (to NM content standards)
- Content Areas
- Immediate Feedback
- K-12 Assessments
- Professional Training
- Instructional Strategies/Implications
- Flexibility of Administration

The state's guide also gave a yes/no rating to other questions deemed "not critical":

- Possibility of a pilot
- Include support resources
- Provide oral expression and listening comprehension

Finally, the guide includes open-ended responses to "strengths" and "weaknesses." A task force composed of staff from school districts and multiple bureaus of the SEA developed the criteria and rated the assessments. The New Mexico consumer's guide can be found at http://www.ped.state.nm.us/div/acc.assess/assess/dl/Formative_assessment_consumer_guide/Consumer%20Guide%20Final.pdf.

It should be noted that the Consortium for Policy Research in Education (CPRE) announced plans in 2007 for a consumer's guide to interim assessments. As of June 2008, CPRE has not released it, but interested readers are encouraged to visit the CPRE web site at http://www.cpre.org to track the release of the guide.

Approved lists: the South Carolina example

In 2006, the South Carolina legislature mandated the creation of a statewide adoption list of interim assessments. The enabling legislation requires that the assessments satisfy professional measurement standards and align with the South Carolina Academic Content Standards. The SEA implemented a two-stage process wherein submissions were evaluated for the professional measurement standards in the first stage and for alignment in the second stage. The state asked test publishers to propose assessments for the adoption list. Publishers needed to provide empirical evidence describing how using the product impacted student achievement on measures other than the product. A panel of measurement experts evaluated the submissions for adherence to professional measurement standards in accordance with criteria jointly determined by the Education Oversight Committee (EOC) and the State Board of Education (SBE). The evaluation criteria were as follows:

- study design is experimental or quasi-experimental


- the way(s) the assessment was used to inform instruction is (are) adequately described
- the sample and sampling or assignment plan are described and appropriate for the study
- the duration of the study is indicated
- the data analysis, including statistical techniques used, is adequately described
- the study's findings and their practical significance are described
- the study's sample size or repetitions are adequate
- the study's findings adequately indicate whether they are positive, negative, or show no demonstrable effect on student achievement
- the report format meets criteria for length and font size
- statistical and psychometric information are adequate

Products that met these minimum technical criteria advanced to a second stage, where their alignment with South Carolina content standards was thoroughly examined. Of the 11 vendors that submitted assessment information, only 2 met all of South Carolina's criteria and made the approved list. Information related to South Carolina's adoption of interim assessments can be found at http://ed.sc.gov/agency/offices/assessment/FormativeAssessment.html. (Note that the tests are referred to as "formative assessments" in many of the South Carolina documents.)

Perie, Marion, Gong, and Wurtzel (2007)

In their thorough look at the role of interim assessments in a comprehensive assessment system, Perie, Marion, Gong, and Wurtzel identify the general characteristics of any interim assessment that is to be used for instructional purposes. They include the following 11 characteristics:

- not all multiple-choice
- provision for qualitative insights about understandings and misconceptions, not just a numeric score
- immediate implications for what to do besides reteaching every missed item
- rich representation of the content standards students are expected to master
- high-quality test items that are directly linked to the content standards and specific teaching units
- a good fit within the curriculum so that the test is an extension of the learning rather than a time-out from learning
- a good fit with curriculum pacing so that students are not tested on content not yet taught
- clear reporting that provides actionable guidance on how to use the results
- validation of the uses of and information provided by the assessment
- administration features (speed, availability of normative information, customization, timing flexibility, adaptivity) that match the assessment purposes
- professional development for teachers


The policy paper by Perie, Marion, Gong, and Wurtzel (2007) can be found at http://nciea.org/publications/PolicyBriefFINAL.pdf.

There are few examples of education agencies or researchers developing and applying evaluation criteria to interim assessments specifically. Where assessment review criteria exist, they almost always apply to student assessments generally. High-quality interim assessments share many attributes with other high-quality assessments (e.g., linkage to content standards, technical soundness, clear reporting). However, the examples in this section highlight some important test attributes that are more salient for interim assessments, specifically about how the results from these tests can alter instruction and curriculum. That is, the assessment results:

- Point to next steps in instruction and curriculum.
- Connect to materials or teaching units, including material for reteaching and remediation.
- Indicate whether students are on track to proficiency on the content standards.
- Can change the pacing and order of lessons.

In turning to the case studies of jurisdictions using interim assessment, we will see that the extent to which these qualities are present can vary significantly.

V. Case studies of jurisdictions using or supporting interim assessments--methodology

Jurisdiction-focused methodology

One way to examine the range of interim assessment practice is to begin with the supply side. Such an examination would identify the test providers, whether their market shares are large or small. A complete look at the supply side of interim assessment would include the homegrown tests built by a state or school district. For several reasons, focusing on test publishers was not appropriate for this study. The study is intended to provide information to educators who are implementing interim assessment; looking at examples from the jurisdiction's, rather than the vendor's, perspective only makes sense. In this study, we focus on the goals, needs, and challenges for the jurisdictions implementing these tests and how individuals (i.e., the "culture"), tests, and resources changed (or did not) in response. Moreover, taking a vendor-based approach would risk bias (or the appearance of bias) regarding the test publishers. In trying to sample the interim assessments that are in use currently, it would be easy to leave out a vendor. In fact, even adopting our approach, namely, focusing on jurisdictions that were nominated by knowledgeable assessment practitioners, a test publisher could object that we overlooked the jurisdictions where their product is in use. However, by looking at nominations of jurisdictions that, in the professional judgment of state-level assessment leaders and other experts, have an interesting or successful interim assessment program, this study intends to be "vendor-neutral" and to examine the issues of implementing interim assessments from the jurisdiction's perspective.

Nomination/selection process

TILSA's Interim Assessment subcommittee sought nominations for jurisdictions using interim assessment, preferably in an interesting way or with interesting results. Nominations were requested from the full TILSA SCASS. In all, 20 jurisdictions were nominated prior to the February 2008 meeting of the Interim Assessment subcommittee. For each nominated jurisdiction, the author prepared a short summary of the interim assessment program. These summaries were mostly the product of internet research, supplemented by telephone calls. The subcommittee at its February 2008 meeting then reviewed each nominated jurisdiction. During the course of that meeting, seven other jurisdictions were nominated. Through a consensus discussion, the author and subcommittee agreed that a smaller set of jurisdictions should receive further scrutiny. The subcommittee members were asked to make a holistic judgment of interesting jurisdictions, guided by the following considerations:

- uniqueness of the jurisdiction's interim assessment program
- innovative practices or procedures
- choosing jurisdictions of different sizes and types (i.e., not all states, not all large districts)
- choosing jurisdictions that have some variety in length of time they have used interim assessments
- whether the program features multiple approaches to interim assessment
- availability of information about the interim assessment program and its impact
- intangibles such as favorable reports of working with the state/district

It is important to note that the set of jurisdictions was not restricted to districts in TILSA-member states. The members of the Interim Assessment subcommittee have the expertise and breadth of knowledge about national (and international) developments in assessment that could support an open process. Furthermore, the nature of the study did not require us to have deep access into schools, school districts, or state agencies, so there was no pressing need to restrict the study to TILSA-member states.

Data sources

Our understandings about the background, development, program details, and future plans for the various interim assessment programs came from three sources:

- publicly available documents
- other web-based information
- key informant interviews with the jurisdiction staff person responsible for the interim assessment program

Interviews typically lasted between 35 and 60 minutes. The set of interview questions is attached as Appendix A.

VI. Case studies of jurisdictions using or supporting interim assessments--results

Following the nomination process, TILSA's Interim Assessment subcommittee discussed each nominee and evaluated how well it would fit into a mix of jurisdictions for the study. The jurisdictions were then contacted to see if they would be willing to participate


in the study. This evaluation and screening resulted in seven case study jurisdictions: three school districts, three state agencies, and one national department of education.

School districts (3): Austin (TX) Independent School District (ISD), East Baton Rouge Parish (LA) School System, Natrona County School District #1 (Casper, WY)

State educational agencies (3): Georgia Department of Education, Rhode Island Department of Education, Texas Education Agency (TEA)

National department of education (1): New Zealand Ministry of Education

A couple of notes about these examples are in order. Seven jurisdictions as diverse as these provide a broad look at interim assessment practice. However, the content area diversity is not so great: the huge influence of NCLB has focused the majority of the interim assessment activity on reading and mathematics. To promote a broader sampling in this study, our examination of the Texas Education Agency focuses on the Texas Science Diagnostic System (TSDS). In addition, the Rhode Island example stands apart from the rest because the agency is not directly conducting any interim assessment; the state's proficiency-based graduation requirements allow interim assessment evidence to be part of the demonstration of student proficiency, so the SEA is in more of a facilitating role.

Study questions clustered around four topics, each explored with each of the case study jurisdictions. First, we sought background or general information about the interim assessment program, its origins, current status, and future plans and directions. Second, we investigated program details, including professional development and training practices and how the program is funded. Third, we asked about how results are reported and used. Finally, we asked about evaluation and evidence that the program is meeting its goals. (Interview summaries will be published as a separate document and placed on the TILSA publications website.) The contrasts and the similarities among the case study jurisdictions reveal a great deal about current interim assessment practice and issues that a district or SEA would be advised to consider.

Background/General

To establish the roots and goals of the programs, including plans for next steps, staff of the jurisdictions answered a series of background questions. The questions include how long the program has been in place; who developed the items, tests, and their delivery platform; what motivated the start of the program; whether the program is mandatory; and what are the plans or next steps for the program.

The jurisdictions studied all have interim assessment programs begun during the last decade. Natrona County and New Zealand are the two oldest programs in the sample, with Natrona launching its program in 1999 and New Zealand the following year. The newest program in the sample is the Texas Science Diagnostic System, TSDS, which launched at the start of the 2006-07 school year. The other four programs started between 2003 and 2005.


The jurisdictions exhibit a range in how the items and their delivery platforms were developed. Half of the item management and assessment systems were built by the jurisdiction, and half by a commercial vendor. (Rhode Island's system, which empowers local interim assessment but does not bank or distribute test items, requires that assessments be locally developed.) Who developed and who manages the system seems to have been driven by a combination of cost and capacity within the organization. In Austin, Georgia, and New Zealand, there was computer programming and assessment capacity to construct the system, and building the system internally was judged to be a cost-effective approach. East Baton Rouge, Natrona, and Texas all had funds to purchase products, and that was judged to be the better course.

A demand for closer monitoring of students' progress was an important impetus for the development of these systems. The three school districts we interviewed all identified progress monitoring in preparation for high-stakes testing as a motivator of their local system. Austin ISD, in addition, cited a desire to improve data-driven decision-making. East Baton Rouge saw an interim assessment system as a way to ensure a standardized level of rigor across its school sites. In Georgia, local educators asked the state agency for more resources to prepare for testing, in particular tools that could help students practice for the high-stakes assessments. Texas' TSDS followed the well-received diagnostic system in mathematics. Both systems seek to provide teachers with more resources, including a larger pool of high-quality items.

In the three school districts in our sample, the interim assessments are administered district-wide and are mandatory at the tested grades. In Rhode Island, the new graduation requirements, where students must demonstrate proficiency, are required; however, use of interim assessment is one option for local school districts. In both New Zealand and Texas, the interim assessments studied are viewed as a resource to schools and are not mandatory. Georgia's system began in the same way, but administration of interim assessments is now a required activity in contract-monitored schools and, depending on their program, may be mandatory for needs-improvement schools.

Local interim assessments in Rhode Island have a distinct history and flavor. The motivator in Rhode Island was the view that information from periodic local assessments provides one source of data that can contribute to an overall appraisal of student learning and knowledge. Local assessment is one strand of the state's larger implementation of proficiency-based graduation requirements (PBGR). With this history, Rhode Island stands apart from the five other U.S. jurisdictions in our study; NCLB seems at best an indirect motivator of the interim assessment system, and even that may be a stretch.

New Zealand's system and history also stand apart from the other jurisdictions. In national policy discussions during the late 1990s, national testing in New Zealand was given serious consideration. Ultimately, strong local accountability pushed national policy away from a pure outcomes focus driven by the education ministry to a process focus of setting a vision and building supports for teacher practices and student learning. The resultant assessment system, Assessment Tools for Teaching and Learning (asTTle), seeks nothing short of "a new vision of teacher practice."


The range of next steps identified by the jurisdictions provides a telling snapshot of where interim assessment systems are in 2008. Two of the jurisdictions cited moving away from paper-and-pencil testing to online administration as a next step. A third jurisdiction, a school district that has had online administration for the last several years, spoke to the importance of letting recent changes in district assessment practice take hold; in this district, no "next steps" were identified. New Zealand would like to build out its capacity to administer computer adaptive tests. To do this, it will need to expand its item bank. Along with New Zealand, Georgia identified building out its item bank as an important next step. The Texas interim assessment in science needs to build out more links to resources, according to the TEA staff person we interviewed. In addition, educators in Texas have requested that the state develop the TSDS with a Spanish language option.

A summary table of interview responses to background questions is below.

Background/General

Austin ISD
- History: 2003
- Description: District-wide testing program administered three times per year (beginning, middle, and end) to grades 3-10.
- Developer: Homegrown
- Motivation/Goals: Progress monitoring, communication, data-driven decision making
- Mandatory? Yes
- Next Steps: Increase item bank; move to online

E. Baton Rouge
- History: 2004
- Description: District-wide test bank for students in grades 2-8 and high school (certain courses) to take tests two to three times during the year.
- Developer: Edusoft
- Motivation/Goals: Increase rigor; prepare for state test
- Mandatory? Yes
- Next Steps: Move to online

Georgia
- History: 2005
- Description: State-wide item and test bank that allows teachers to create tests (or tests can be built by SEA staff).
- Developer: Homegrown
- Motivation/Goals: Practice; assistance to districts in improvement
- Mandatory? Yes, for certain schools in improvement
- Next Steps: More staff; more items; ongoing alignment review

Natrona County
- History: 1999
- Description: District-wide test bank for students in grades 2-8 and high school, generally offering tests twice a year.
- Developer: Northwest Evaluation Association
- Motivation/Goals: Progress monitoring
- Mandatory? Yes
- Next Steps: None

New Zealand
- History: 2000
- Description: National item and test bank that builds tests on demand in response to parameters entered by teacher.
- Developer: Homegrown
- Motivation/Goals: New vision of teacher practice
- Mandatory? No; considered a resource
- Next Steps: Expand to early grades; computer adaptive testing

Rhode Island
- History: 2003 regulations
- Description: State-wide high school graduation program including as one possible component the results of locally developed interim tests.
- Developer: Local
- Motivation/Goals: Part of graduation by proficiency
- Mandatory? No; the larger graduation program, not interim assessments specifically, is mandatory
- Next Steps: Science and other subjects

Texas (Science)
- History: 2006
- Description: State-wide item and test bank for grades 4-8 and course-dependent at high school.
- Developer: Vantage
- Motivation/Goals: Standardize item quality; more teacher resources
- Mandatory? No; considered a resource
- Next Steps: Links to resources; Spanish

Program Features

In addition to documenting the history, motivation, and other background about the interim assessment programs, interviewees were asked about the basic workings of the program: annual number of administrations of the interim assessment, nature and extent of professional development, and flexibility of the system to allow teachers to select items and generate tests.

In most of the jurisdictions we studied, there was not a fixed number of administrations per school year. In East Baton Rouge and Natrona County, where the districts contract with vendors, the district appears to be on a more fixed cycle, with interim assessments administered once in the fall and once in the spring. Austin and East Baton Rouge use curriculum pacing guides, and the assessments are scheduled accordingly. However, even in those districts, teachers can and do use the interim assessments more frequently than scheduled. In the other jurisdictions, there is some range of practice, with interim tests generally offered two to six times annually. In several jurisdictions, the system is always available to teachers, who can download tests as they see fit.

Staff in the study jurisdictions view professional development as a critical piece of the interim assessment initiative. The professional development activities focus on two kinds of training: navigating and operating within the system, and using data to make educational decisions and improve student performance. Georgia's training focuses on the more practical aspects of using its Online Assessment System (OAS). Both East Baton Rouge and Natrona County identified the value in having onsite staff with specialized training and knowledge in the assessment system or data-driven decision making. In Natrona's case, interim assessment does not get the highest priority, as the district assessment staff have a limited amount of time to address federal accountability requirements, state testing, and the interim assessment. In Austin, the district provides training in using data to make instructional decisions, and district staff believe this is a key piece of the Austin system. In Texas, training in the TSDS is not carried out by the TEA, but rather by a network of regional collaboratives, the Texas Regional Collaboratives for Excellence in Mathematics and Science Teaching. While offering training in the TSDS is ancillary to their main professional development, they assist teachers in understanding and making full use of this tool. Professional development in Rhode Island has not addressed interim assessment specifically, but rather has explained the larger program of proficiency-based graduation requirements.

New Zealand's delivery of professional development (PD) about the asTTle interim assessment system features trainers who are experts in the nation's Assessment to Learn initiative.

In one sense, the New Zealand and Texas PD contexts are similar, in that the interim assessment training is part of a broader focus. However, the New Zealand approach to assessment, where formative applications are embraced and summative roles downplayed, is worlds apart from Texas and the other U.S. jurisdictions we examined.

A common thread in the interim assessment programs we examined is their degree of flexibility, which outpaces what most teachers can use. In the systems of Austin, East Baton Rouge, and Georgia, as well as in the science diagnostic system in Texas, teachers have the flexibility to create tailored tests. However, the teachers prefer to use system-generated tests. Austin, for example, reports minimal use of teacher-created tests. This may not be a bad thing. The interviewees report that teachers in their jurisdictions are generally satisfied with the tests created by the system. In New Zealand, teachers input the content standards to be assessed, and the system builds the test.

Program Features

Austin ISD
- Administrations per Year: 3-6 times
- Professional Development: Campus-level trainings; data-driven decision making
- Flexibility to Create Tests: Limited; classes follow pacing

E. Baton Rouge
- Administrations per Year: Usually 2 times, but system always available
- Professional Development: Onsite coordinators
- Flexibility to Create Tests: System- and teacher-created tests; item authoring

Georgia
- Administrations per Year: 3-4 times
- Professional Development: Training in using and navigating the system
- Flexibility to Create Tests: Teachers can create; state can build tests

Natrona County
- Administrations per Year: Usually 2 times, sometimes 3 times
- Professional Development: Uneven; coaches a positive addition
- Flexibility to Create Tests: Adaptive, but not teacher-created

New Zealand
- Administrations per Year: 2-5 times; system always available
- Professional Development: Extensive
- Flexibility to Create Tests: Tests created based on teacher input

Rhode Island
- Administrations per Year: Collection of evidence, so it varies
- Professional Development: General technical guidance on graduation requirements
- Flexibility to Create Tests: Total; local assessment

Texas (Science)
- Administrations per Year: Determined locally; system always available
- Professional Development: Regional centers, not state
- Flexibility to Create Tests: Teachers can create
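To make the "flexibility to create tests" feature concrete, the following is a minimal, hypothetical sketch of the kind of on-demand test assembly these item banks support (for example, New Zealand's system builds a test from parameters a teacher enters). None of the field names, items, or the build_test helper below come from any actual system; they are invented for illustration only.

```python
# Hypothetical item-bank sketch: assemble a short interim test from parameters
# a teacher enters (content standard, grade, number of items). All items,
# fields, and the build_test helper are invented for illustration.
import random
from dataclasses import dataclass

@dataclass
class Item:
    item_id: str
    standard: str      # content standard the item is aligned to
    grade: int
    difficulty: float  # proportion of a reference group answering correctly

BANK = [
    Item("M-001", "number.fractions", 4, 0.35),
    Item("M-002", "number.fractions", 4, 0.55),
    Item("M-003", "number.decimals",  4, 0.60),
    Item("M-004", "number.fractions", 5, 0.70),
]

def build_test(standard: str, grade: int, n_items: int, seed: int = 0) -> list[Item]:
    """Return up to n_items bank items aligned to the requested standard and grade."""
    pool = [item for item in BANK if item.standard == standard and item.grade == grade]
    random.Random(seed).shuffle(pool)
    return pool[:n_items]

test = build_test("number.fractions", grade=4, n_items=2)
print([item.item_id for item in test])  # two grade 4 fraction items
```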

Reporting

Most of the case study jurisdictions have performed at least some comparison of interim test results to performance on the high-stakes assessment used for NCLB. Austin conducted a two-year study using 2004-05 and 2005-06 data that focuses on correlating midyear interim assessment performance with performance on the Texas Assessment of Knowledge and Skills (TAKS). Although the Austin study apparently shows a link between the midyear interim test and TAKS, as of June 2008 the report had not been made publicly available, nor was it made available to this author. The Georgia study has a different twist, as it looked at usage levels of the Online Assessment System (OAS) and correlated them with spring operational performance on the Criterion-Referenced Competency Tests (CRCTs). It should be noted that the Georgia exploration of a correlation is far from perfect. Many districts cannot administer the test online, so they print it. The recording system views this as one use of the test, but since that use leads to printing and copying for a large number of students, the recorded usage may understate the number of students affected. In an unpublished study, Natrona County found that correlations between NWEA MAP and the Wyoming statewide test were over 0.75. New Zealand carried out extensive studies as it was designing and refining its system and later studied the relationship between use of asTTle and more advanced teacher practices.
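The comparisons described above are, at their core, simple correlational analyses. As a hedged illustration only (these are not the jurisdictions' actual data or methods), the sketch below computes a Pearson correlation between simulated midyear interim scores and simulated end-of-year state test scores; a strong correlation supports predictive uses but, as noted throughout this report, says nothing by itself about instructional impact.

```python
# Illustrative only: correlating simulated midyear interim scores with
# simulated end-of-year state test scores. No real jurisdiction data are used.
import numpy as np

rng = np.random.default_rng(0)
interim_midyear = rng.normal(200, 15, size=500)                   # interim scale scores
state_test = 0.8 * interim_midyear + rng.normal(0, 10, size=500)  # year-end scores

r = np.corrcoef(interim_midyear, state_test)[0, 1]
print(f"Pearson r between interim and state test scores: {r:.2f}")
```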


Reporting

Austin ISD
- Results compared to NCLB test? Yes; to TAKS
- Results compared to other outcomes? No

E. Baton Rouge
- Results compared to NCLB test? No
- Results compared to other outcomes? No

Georgia
- Results compared to NCLB test? No
- Results compared to other outcomes? Yes; interim results correlated with usage of system

Natrona County
- Results compared to NCLB test? Yes
- Results compared to other outcomes? No

New Zealand
- Results compared to NCLB test? NA
- Results compared to other outcomes? Yes; interim usage correlated with teaching mastery and practice

Rhode Island
- Results compared to NCLB test? NA
- Results compared to other outcomes? No; though by design, interim results are part of graduation

Texas (Science)
- Results compared to NCLB test? No
- Results compared to other outcomes? No

Feedback and Evaluation

Though the present study is not an evaluation of the interim assessment programs in the case study jurisdictions, we asked the jurisdictions what evidence they had that interim assessment has brought about changes in teacher practice. Where there was evidence, it was nearly always anecdotal. Georgia, New Zealand, and Texas cited the demand from the field for more items and tests as a sign of success. New Zealand also cited seven years of national funding as evidence that the assessment system is meeting its goals and those of the Ministry of Education. The jurisdictions reported that interim assessment had been responsible, at least in part, for a number of notable benefits in the classroom, including:

- greater teacher understanding of grade level expectations (E. Baton Rouge)
- expanded discussion and collaboration around data use among teachers (Austin, Georgia)
- more thinking about the alignment between teaching and standards (Natrona, Rhode Island)
- more effective communication about teaching goals (E. Baton Rouge), and
- evidence-based student placement decisions (Natrona)

These are potentially powerful outcomes, and their more rigorous study, including better documentation of the conditions that lead to them, could bolster the case for interim assessments.

New Zealand stands out for the volume of studies of its assessment system. The asTTle program, through its administrators at the University of Auckland, as well as via other researchers and entities, is the subject of literally dozens of technical reports that examine the relationships between interim assessment and teacher practice. Notably, asTTle is part of a national assessment system that positions teachers as change agents who are well-equipped "with a self-managed, curriculum-based set of tools calibrated to the curricula and to appropriate performance norms" (Brown and Hattie, 2003, p. 9).

New Zealand does not have a high-stakes national test to which interim assessment results can be compared. In fact, relative to the United States, there is little high-stakes testing.


Therefore, the vast array of impact studies of the assessment system, asTTle, relates to changes in teacher practice. In one of the many technical reports on asTTle:

    "[t]he respondents claimed that asTTle would have major effects on their teaching and planning, on the way they assessed their students, and on communicating with students, management, and parents. They noted the power and value of new information now available on performance relative to national norms, student learning needs (strengths, weaknesses, etc.), and appropriate teaching tools .... The strengths of asTTle were the quality and type of information, the ease of use, the flexibility of the tool, and the content of the tests. The weaknesses mostly related to time needed, the costs of photocopying the tests, data transfer, and some technical concerns."
    --Ward et al., 2003, p. 36

The evidence base for the effectiveness of interim assessments rests in large part on anecdotes. There are not definitive studies showing gains in student performance that can be attributed to use of interim assessments. However, the New Zealand example does provide a set of reports that link use of the assessment system to changes in teacher behavior.

Feedback and Evaluation

Austin ISD
- Outcomes, including anecdotal: More discussion of data
- Documented evidence of changing teachers' practice? None

E. Baton Rouge
- Outcomes, including anecdotal: More classroom rigor; informs whether students understood unit; greater teacher understanding of grade level expectations
- Documented evidence of changing teachers' practice? None

Georgia
- Outcomes, including anecdotal: High demand from field; more teacher collaboration on data use
- Documented evidence of changing teachers' practice? System usage levels and patterns

Natrona County
- Outcomes, including anecdotal: Success stories regarding placement, communication; more teacher reflection
- Documented evidence of changing teachers' practice? None

New Zealand
- Outcomes, including anecdotal: Multiple renewals of national support
- Documented evidence of changing teachers' practice? Extensive (60 reports on asTTle)

Rhode Island
- Outcomes, including anecdotal: Promoting thinking regarding alignment
- Documented evidence of changing teachers' practice? None

Texas (Science)
- Outcomes, including anecdotal: Demand for new items
- Documented evidence of changing teachers' practice? None

VII. Further Research

There are many aspects of interim assessment that are ripe for further study. Although there have been a number of positive reports about how interim assessments have changed teacher behavior, hard evidence is scarce. In the present study, district and state staff of nominated jurisdictions spoke about teacher practice. In light of the TILSA SCASS focus and resources, such an approach has appeal. On the other hand, it should not be mistaken for a validation or demonstration of interim assessment's consequences for instruction. Other investigations could be designed and carried out--including data analysis, surveys of principals and teachers, teacher focus groups, and observation--that could support or disconfirm the claims that jurisdiction staffs have made on behalf of interim assessments.


This report gives much-needed insight into how newly adopted interim assessment programs have been received, which is in keeping with the mission, scope, and purview of TILSA. Ultimately, however, the implementation of these programs in the classroom, the use of their data, and above all their effectiveness are of greater importance. Because we were unable to talk systematically with teachers, we are missing their critical insight into the implementation, use of data, and effectiveness of interim assessments.

VIII. Conclusion: Possibilities and Challenges

Interim assessments are being used in many districts and states, and, as we have seen, even national education agencies are facilitating their use. Jurisdictions put these instruments to a variety of uses, but the periodic monitoring of student progress is the essential requirement of interim assessments. There is a range in the degree to which the tests are linked to tools for remediation and reteaching.

State roles vis-à-vis these assessments vary dramatically. Many states do not actively assume any role regarding interim assessments, viewing the question of their use as a district decision. Some states provide general information without conducting a systematic review of interim assessment products. Other states solicit submissions for an approved list or consumer's guide process that can assist districts with purchasing decisions. In a growing number of states, the state agency itself is constructing an item bank as a resource for schools and school districts; use of this "resource" may be mandatory for schools or districts that are not meeting accountability requirements.

Where agencies have established criteria to rate the quality of these instruments, some common themes emerge. Like effective assessments generally, effective interim assessments have technical quality, are aligned with content standards, and produce results in reports that are easy to interpret and understand. However, interim assessments must also meet criteria that are more uniquely theirs: results indicate next steps in instruction by connecting to materials or teaching units that develop or solidify understandings or that offer opportunities for remediation or re-teaching. Effective interim assessments also indicate whether students are on track to proficiency.

Educators have received these tests with a mix of skepticism, curiosity, and enthusiasm. In the jurisdictions profiled in this report in particular, and among the stakeholders interviewed in this study, the interim assessments are viewed favorably. There is a common belief, among this small sample, that these assessments have benefits such as improving teacher practice, promoting discussions about student performance data, and identifying earlier the resources that can help bring performance up to a proficient level. They are viewed as an important tool at the disposal of a school district or state agency.

However, interim assessments are not without their challenges. In pulling together items for an interim assessment bank, jurisdictions need to pay close attention to items' alignment with content standards.


In addition, interim assessments are most effective when there is strong teacher professional development that leverages the tests to motivate teachers to adjust their classroom practices. To the extent that interim assessment data are used to identify and focus on students who are near the cut point for proficiency, the possible benefit of the tests is co-opted in the quest for a higher accountability rating. Finally, interim assessment results do not always align with what teachers see in the classroom, and when that happens, the mixed signals to teachers and parents can be frustrating.

A cautionary ending is in order. This study focused on jurisdictions that were identified as innovative, successful, interesting, or thought-provoking implementers of interim assessments. Even in these examples, there were struggles and challenges. Teacher buy-in was in some cases slow in coming. Jurisdictions found large subsets of the item bank not to be aligned with applicable content standards. Logistical difficulties and costs, such as photocopying and computer system downtime/failure, caused frustration among staff at school sites and in the central office. Interim assessments are a promising source of information for teachers, administrators, parents, and students; however, they are not by themselves a panacea that can ensure growth in student performance.

Acknowledgments

Special thanks to a working group of the Interim Assessment subcommittee of TILSA (Fen Chou, Duncan MacQuarrie, Scott Norton, Marianne Perie, Doug Rindone, and Carole White), as well as the following individuals: John Cronin, Janet Haas, Gage Kingsbury, Amy Marsman, and Dale Russell.

References

Brown, G. T., & Hattie, J. A. (2003). A national teacher-managed, curriculum-based assessment system: Assessment Tools for Teaching & Learning (asTTle). Project asTTle Technical Report 41. University of Auckland/Ministry of Education. Retrieved April 24, 2008, from http://www.tki.org.nz/r/asttle/pdf/technical-reports/techreport41.pdf.

Brown, R. S., & Coughlin, E. (2007). The predictive validity of selected benchmark assessments used in the Mid-Atlantic Region (Issues & Answers Report, REL 2007-No. 017). Washington, DC: U.S. Department of Education, Institute of Education Sciences, National Center for Education Evaluation and Regional Assistance, Regional Educational Laboratory Mid-Atlantic. Retrieved November 27, 2007, from http://ies.ed.gov/ncee/edlabs.

Crane, E. W., Rabinowitz, S., & Zimmerman, J. (2004). Locally tailored accountability: Building on your state system in the era of NCLB (Knowledge Brief). San Francisco: WestEd. Retrieved March 11, 2008, from http://www.wested.org/online_pubs/KN-0401.pdf.


Council of Chief State School Officers. (2008). Attributes of effective formative assessment: A work product coordinated by Sarah McManus, NC Department of Public Instruction. Washington, DC. http://www.ccsso.org/publications/details.cfm?PublicationID=362.

Gallagher, C., & Worth, P. (2008). Formative assessment policies, programs, and practices in the Southwest Region (Issues & Answers Report, REL 2008-No. 041). Washington, DC: U.S. Department of Education, Institute of Education Sciences, National Center for Education Evaluation and Regional Assistance, Regional Educational Laboratory Southwest. Retrieved February 27, 2008, from http://ies.ed.gov/ncee/edlabs.

Georgia Department of Education. (2008). Standards, Instruction and Assessment: Testing. Atlanta, GA: Author. Retrieved March 17, 2008, from http://www.doe.k12.ga.us/ci_testing.aspx?PageReq=CI_TESTING_CRCT.

Henderson, S., Petrosino, A., Guckenburg, S., & Hamilton, S. (2007). Measuring how benchmark assessments affect student achievement (Issues & Answers Report, REL 2007-No. 039). Washington, DC: U.S. Department of Education, Institute of Education Sciences, National Center for Education Evaluation and Regional Assistance, Regional Educational Laboratory Northeast and Islands. Retrieved January 7, 2008, from http://ies.ed.gov/ncee/edlabs.

Institute of Education Sciences (IES). (2008). Welcome to REL. Washington, DC: Author. Retrieved January 7, 2008, from http://ies.ed.gov/ncee/edlabs.

New Mexico Public Education Department. (2006). New Mexico Public Education Department Division of Assessment and Accountability consumer guide to formative assessments. Retrieved November 13, 2007, from http://www.ped.state.nm.us/div/acc.assess/assess/dl/Formative_assessment_consumer_guide/Consumer%20Guide%20Final.pdf.

Perie, M., Marion, S., & Gong, B. (2007). A framework for considering interim assessments. Dover, NH: National Center for the Improvement of Educational Assessment. Retrieved November 13, 2007, from http://www.nciea.org/publications/ConsideringInterimAssess_MAP07.pdf.

Perie, M., Marion, S., Gong, B., & Wurtzel, J. (2007). The role of interim assessments in a comprehensive assessment system: A policy brief. Washington, DC: The Aspen Institute. Also available at http://nciea.org/publications/PolicyBriefFINAL.pdf and http://www.aspeninstitute.org/atf/cf/%7BDEB6F227-659B-4EC8-8F848DF23CA704F5%7D/ed_PolicyBriefFINAL.pdf.


Stiggins, R. (2004). New assessment beliefs for a new school mission. Phi Delta Kappan, 86(1), 22-27. Retrieved January 4, 2008, from http://www.assessmentinst.com/forms/NewBeliefs.pdf.

Ward, L., Hattie, J. A., & Brown, G. T. (2003, June). The evaluation of asTTle in schools: The power of professional development. asTTle Technical Report 35. University of Auckland/Ministry of Education.

Wiliam, D. (2006). Formative assessment: Getting the focus right. Educational Assessment, 11(3-4), 283-289.


Appendix A: Interview Questions

Background/General
B1. What is your title or role?
B2. How long has the program been in place? What motivated the start of the interim assessment program?
B3. What are the agency's purposes and goals for the assessment program?
B4. How has the system evolved, and what would the agency like to do next?
B5. What's the timeline for new phases of the interim assessment program, if any?
B6. Is the interim assessment program mandatory?
B7. What are the participation rates in using interim assessment in your district? Are rates rising?

Program Details
P1. What professional development is offered related to the interim assessment?
P2. Are teachers given planning time specifically for activities related to the interim assessment?
P3. What are the agency's training, administration, and reporting practices?
P4. Are the specifics on implementation documented? Where?/Can you send?
P5. How is the IA program paid for?
P6. If you purchased a test, whose did you buy, and what is the cost per pupil? What is the budget for interim assessment activities?
P7. Where do the items come from? Who writes the items?
P8. Are there tools or ancillary materials, or is it a test or item bank?
P9. How often are interim assessments administered?

Reporting
R1. How are results reported?
R2. How do teachers, building administrators, and agency administrators use the results?
R3. What is feedback to students like?
R4. Have you attempted to link the IA to other outcomes (the statewide test)?

Feedback and Evaluation
F1. What is the evidence that the program is meeting agency goals?
F2. How is instruction affected? How do you know?
F3. How is student performance affected? How do you know?
F4. What do parents think about the interim assessment? Are they aware of it? How do you know?
F5. Has there been resistance from teachers? From anyone else? How do you know?
F6. How do you judge whether the assessment program has been effective? Do you have hard data? Any reports available?
F7. Any other anecdotes?


Appendix B: Interview Summaries

NOTE: Interviews with staff of the various jurisdictions we examined were not recorded. We took notes during the calls and compiled those notes into interview summaries, which have been reviewed for accuracy by the interviewees. After pilot testing the interview questions with a school district that was not selected for case study, we rejected a structured interview approach in favor of a conversational interview that would address all the questions, but not in a predetermined order. As a result, we exercised judgment in placing interview responses that could arguably go in more than one place.

The interview summaries are in a separate companion document, "Appendices: Interview Summaries," for the "Interim Assessment Practices and Avenues for State Involvement" paper. The appendices are organized as follows:

Appendix B1: Austin Independent School District
Appendix B2: East Baton Rouge Parish School System
Appendix B3: Georgia Department of Education
Appendix B4: Natrona County School District #1
Appendix B5: New Zealand Ministry of Education
Appendix B6: Rhode Island Department of Education
Appendix B7: Texas Education Agency

