Embedding Collaborative Common Assessments in a Balanced Assessment System
If you want to make beautiful music, you must play the black and the white notes together.
—Richard M. Nixon
The concept of collaborating sounds pleasant, but it takes considerable effort and commitment on the part of the participants, and it can only happen when the assessments used to monitor results are carefully embedded in a healthy context and balanced assessment system. Engaging in collaborative common assessments requires systems thinking. When embedded and aligned to the greater context of classroom and district assessments, the common assessment process is guaranteed to intersect with and impact classroom instruction, teamwork, school culture, and school improvement initiatives in parallel and positive ways. To maximize the potential of the common assessment process, educators must begin to think like architects with a deep understanding of all the systems involved.
Assessment Architects
Assessment is so much more than writing, employing, and then scoring a traditional test. There is a structure to the overarching system of individual assessment events or experiences. Educators must work as assessment architects—sometimes individually and sometimes in teams—as they structure learning progressions; select, modify, or create assessments; design accompanying tools and resources (rubrics, proficiency scales, protocols, templates, and so on); deliver assessments; score with accuracy and consistency; provide productive feedback; respond in instructionally agile ways; report results; and ultimately build a culture to promote continued and sustained learning over time. Each task is an entire system in and of itself, and a change in one system will likely have an impact on the other systems.
Even though literature about the need to design backward has been plentiful since the 1990s, assessment is still treated as an afterthought (designed the night before the test or two weeks before the final exam) in too many classrooms and schools. This practice is akin to building a house and then deciding that using architectural blueprints might have been helpful to the construction. Constructing a house without blueprints is ludicrous; teaching to standards without knowing the final result in advance is equally ludicrous. Assessment can never be an afterthought. Instead, assessment must lead the work of curriculum selection and instructional planning. In the house metaphor, assessment, then, becomes the architectural blueprint from which the entire house is built. The standards serve as the specifications that inform the design, the assessment map of formative and summative assessments becomes the architectural blueprint to lead the design work, the curriculum becomes the brick and mortar as it makes the standards a reality, and the instruction—last in the design list—becomes the artistry that makes each house unique: the colors, the textures, the décor. This picture of where assessment belongs in the sequence (standards—assessment—curriculum—instruction) is not new; it has been around since 1998 with Grant Wiggins and Jay McTighe’s book, Understanding by Design. Education has simply been slow to change practice.
Why does it matter? If educators do not become assessment literate, functioning as architects who put the assessment process in its proper place with attention to detail, they run the risk of any—or all—of the following happening.
• Inaccurate assessments
• Invalid results
• Distrust of the system and the individuals who work in it
• And, worst of all, disengagement on the part of learners
The costs are grave.
Assessment is teaching. To teach without engaging in profound and accurate assessment processes, day by day and moment by moment, is to engage in curriculum coverage. The measure of teaching must be based on whether or not the learning happened. The only way to ensure learning happens is to design the architecture of assessments and assessment processes, from preplanned formal assessments to in-the-moment unobtrusive assessment processes, that scaffold a teacher’s way to success. The expression “Begin with the end in mind” is insufficient; educators at all levels of the organization must always begin with ideas on how to measure the end they need to have in mind.
It is so important to stop thinking of assessment as a test or a single experience. Likewise, it is equally important to consider teachers as assessment architects rather than parcel out the individual assessment roles, such as test writer or data analyst, on an as-needed basis. Architects are highly trained individuals who must engage in systems thinking with large constructions and intricate details. To create a single freestanding and safe structure, they must adhere to rigorous standards, follow the principles of design, plan for functionality and creativity, and then monitor progress along the way as the building eventually takes form and stands independently.
Assessment architects must create and navigate an entire system of assessments. As contributing experts to the SAGE Handbook of Research on Classroom Assessment, Christina Schneider, Karla Egan, and Marc Julian (2013) write, “In a balanced assessment system, teachers use classroom assessment, interim assessment, and year end assessments to monitor and enhance student learning in relation to the state standards and to the state’s goals for student proficiency” (p. 61). As assessment architects, teachers must understand how the entire system stands together, supports learning, and verifies achievement.
The System of Assessments
It could be said that collaborative common assessments were born out of a need to better prepare learners for upcoming interim assessments, which then strive to better prepare learners for external large-scale assessments. However, it’s time to turn the tables. In a far better approach, teams would generate rigorous and engaging collaborative common assessments that capture the heart and soul of their vision of success for learners. Teams would then use internal and external large-scale assessment data for validation that their local assessment efforts are aligned and equitable in regard to a shared set of standards and criteria for quality. In this model, the paradigm is inverted so that collaborative common assessments become the ceiling, while large-scale assessments become the floor or foundation to ensure quality from system to system. The nuance seems slight—like the difference between teaching mathematics to students and teaching students about mathematics—yet the change in focus is dramatic. The energy shifts from reactive to proactive, allowing for hope and passion to be rekindled for educators.
Still, the process of engaging in collaborative common assessments must involve balancing the entire system of assessments, from formative to summative and from classroom-level to external large-scale assessments. It requires teams to develop assessment literacy as they work together to explore learning throughout the instructional journey in all of the following ways.
• Exploring standards to identify specific learning expectations
• Creating an assessment pathway, rich with formative and summative assessments, and identifying which ones will be common along the way
• Writing assessments or reviewing and endorsing assessments in advance of instruction
• Aligning, modifying, and enhancing the curriculum resources to support students in acquiring the standards
• Providing targeted, responsive instruction aimed at helping learners develop the necessary skills and knowledge for success on the preplanned and preapproved assessments
• Exploring data—formative and summative, qualitative and quantitative—to understand the impact of their craft, identify nuances in results, and problem solve any detected gaps early on
• Examining student work to conduct error analysis and inform immediate next steps
• Examining student work to collaboratively score work and calibrate expectations to be exact and consistent
• Responding instructionally and in a timely manner with meaningful enrichments and targeted instructional strategies to re-engage learners in the learning expectations
• Monitoring for progress and celebrating successes along the way
The entire journey is collaborative and requires the full attention of all members of a teaching team in partnership with the greater context of the school, the district, and even the state. If teaching is less about coverage and more about learning, then the entire process requires that all eyes examine current practices in light of specific results, and all team members contribute to developing craft knowledge on how to accomplish such a complex, demanding task. No part of the team’s journey can be usurped by another part of the organization, such as the district, substituted by a ready-made assessment, or left to the machinations of a handy number-crunching algorithm.
External Large-Scale Assessments
Large-scale assessments, sometimes known as end-of-year assessments, should provide the necessary components of measurement, confirmation, and results. Organizations—especially publicly funded organizations with a captive clientele—have an obligation to monitor or measure their effectiveness, to share their findings with their stakeholders, and to address any identified needs as they emerge. Data in isolation shape opinions. A teacher, a school, or an entire district can generate internal evidence that learners are achieving at high levels because they are all earning superior marks, but how do the criteria employed fare against the standards measured on a larger scale? Data in comparison create information. Organizations cannot make quality program improvements without such information.
Large-scale assessments are typically offered annually or bi-annually to help educational organizations identify the proportion of students mastering a given set of standards and then evaluate and address the institutional impact on student learning for the purposes of improving learning for all (Schneider et al., 2013). When large-scale assessments are employed in a criterion-referenced manner against a shared set of standards, educators can use the data formatively to monitor their current reality, explore areas that require attention, and ultimately ensure equity and success for all. Schools cannot improve if they do not have evidence and data regarding what and how well students learn.
The question isn’t “Do we need large-scale assessments?” Rather, the question should be “Are large-scale assessments working in a manner that is accurate, supportive, and valuable to the schools and learners they impact?” A single test cannot cover everything, and so tests are used to gather random samples of domains of interest. Consequently, the summary of the findings can help schools and districts monitor for success. Internationally recognized assessment expert Dylan Wiliam (1998) states:
It has become increasingly clear over the past twenty years that the contents of standardised tests and examinations are not a random sample from the domain of interests. In particular, these timed written assessments can assess only limited forms of competence, and teachers are quite able to predict which aspects of competence will be assessed. Especially in high-stakes assessments, therefore, there is an incentive for teachers and students to concentrate only on those aspects of competence that are likely to be assessed. Put crudely, we start out with the intention of making the important measurable, and end up making the measurable important. The effect of this has been to weaken the correlation between standardised test scores and the wider domains for which they are claiming to be an adequate proxy. (p. 1)
Assessment is a tool, and any tool can be used to either build something up or tear something down. While large-scale assessments can be used to help build better educational systems, they have sometimes been used in destructive ways. When the tests are shallow, the results are norm referenced, or the stakes are high, the costs can be significant.
Shallow Testing
The items on any test provide, at best, a representative sampling of what a student knows at any given moment in time. How valid and reliable are the assessments themselves?
Several studies, using several different methodologies, have shown that the state tests do not measure the higher-order thinking, problem-solving, and creativity needed for students to succeed in the 21st century. These tests, with only a few exceptions, systematically over-represent basic skills and knowledge and omit the complex knowledge and reasoning we are seeking for college and career readiness. (Resnick & Berger, 2010, p. 4)
Assessments are shallow when they simply test knowledge or basic application through algorithms or procedural knowledge. If the answer to the test questions or performance prompts could be googled, it probably shouldn’t be on a summative assessment. Knowledge is necessary for reasoning, and it’s helpful to check for understanding in the formative phases. But by the time learners are immersed in a summative experience, they should be applying the knowledge and reasoning in meaningful ways—to solve a problem or create something new. Summative assessments need to be robust, engaging learners in provocative tasks that require deep thinking, the application of skills, and practice with 21st century-like experiences.
Norm-Referenced Assessments
Referenced tests are employed to draw comparisons. When results are measured against criteria, they are called criterion-referenced assessments, and such assessments work in a standards-based system. However, when tests compare the rank order of individual performance and generate the well-known bell curve, they are called norm-referenced (or cohort-referenced) assessments. These assessments do not work in a standards-based system because they measure learners against each other instead of measuring learners against a set of standards. Imagine that all of the learners are getting As, but the rules state that not everyone can get an A; now the A is sliced into whose A is higher versus whose A is lower, and the results are reported in a manner that shows who is at the top and who is not. Now the A learner who is at the bottom of the As is recognized as less than his or her peers and is not acknowledged for mastery of the expectations—which is the intended message of the A itself.
The results of schools are often normed as well, creating winners and losers in an accountability system that demands everyone be winners. The practice of norming something (customer service, marketing, scheduling, policies, and so on) is best kept as an internal decision-making strategy; in other words, an organization might norm a common practice within the industry to separate “better” from “best” and then make appropriate decisions about what it can do to improve. But norming is not appropriate as an assessment strategy, especially in a standards-based system, because it labels and sorts people.
High-Stakes Assessments
An assessment is considered to have high stakes when it generates significant consequences—positive or negative—for stakeholders. High-stakes assessments are not appropriate in a system of compulsory education: a system in which education is imposed by law, thus making participation mandatory. The concept of high stakes is patterned after credential programs in which individuals or organizations must meet certain criteria to earn or maintain licensure (such as getting a driver’s license or becoming a doctor, lawyer, pilot, teacher, or any other licensed professional). Unfortunately, the fundamental difference between credential systems and compulsory education is choice.
Credentialing works in a system of choice because the risks are lower. First, the individual chose to participate, so he or she is highly engaged during the learning, motivated to succeed, and willing to persist in multiple attempts if needed. Second, the risks are significantly reduced: should the learner fail, he or she is often offered multiple re-testing opportunities. Like the license to drive, the license to practice in a profession is open to prospective candidates on a repeat basis. And, because those studying to pass such exams often have a college degree already behind them, the absence of the licensure may cause initial unhappiness or discomfort, but it will not be detrimental to their future opportunities or overall success. There are other career pathways—often within the same field of interest—available. However, in a compulsory system where choice is removed, the difference can be crippling: the risks are too high, and failure eliminates future pathways entirely. The learner who does not pass high school is severely limited in future career pathways, and the stigma is too debilitating for many. Motivation and efficacy—the ingredients for success in life—can be irrevocably impaired when high stakes are applied to compulsory experiences.
All in all, when large-scale assessments are low in quality or are issued in norm-referenced or high-stakes environments, the only changes they inspire are superficial and short lived. Worse, teachers engage in teaching to the test based on specific content rather than increasing rigor or engaging in sustainable, quality instructional practices. Such a system creates visible winners and losers for both educators and their learners. What’s left behind is the invisible, residual, but palpable impact of a fixed mindset for both the losers and the winners.
Internal or Medium-Scale Assessments
Though collaborative common assessment work is exclusive to individual teams, the practice of common assessments is not. Schools and districts have an obligation to make certain their learners are ready to perform well on outside measures and are receiving equitable educational experiences and background from school to school within a larger district. Educational leaders at the district level often strive to create common assessments that will monitor progress along the way in all of the tested areas: reading, writing, mathematics, and sometimes science and social studies. Interim assessments, often known as progress monitoring assessments or benchmark assessments, are assessments given over time and with equidistant spacing (every six weeks, end of every quarter, each trimester, and so on) to monitor student progress in achieving standard expectations on a districtwide basis. The primary function of such assessments when developed within a school system is to offer teachers and administrators information about student readiness for upcoming large-scale assessments. The big picture of when the various assessments take place in the system can be found in figure 2.1.
Figure 2.1 is merely an example of a typical system—it does not provide a recommendation regarding what should be happening with specific assessments. The majority of large-scale assessments are given two-thirds to three-quarters of the way through the school year, but there are exceptions. Some states or clusters of states will conduct large-scale testing in the fall as an indicator of student readiness for the upcoming year. The most popularly tested areas include reading, writing, mathematics, science, and social studies, but again, not all states or provinces test the same areas. Figure 2.1 shows districts or boards using similar testing patterns on a more frequent basis than the annual state or provincial testing systems. Building- and team-based assessments are conducted on a much more frequent basis and should certainly include both formative and summative options.
Note: The model depicted in this example is not suggesting that there should be two formative assessments and one summative assessment for each unit; rather, it is suggesting that in a balanced assessment system, there are far more formative assessments than summative assessments.
Figure 2.1: An assessment system example.
When it comes to designing healthy and balanced assessment systems, schools and teams should avoid adopting rules and patterns that oversimplify matters; for example, there is no such rule as needing to use two formative assessments and one summative assessment in every cycle of learning. There is, however, an expectation that teams use far more formative assessments than they do summative assessments in their team and individual classroom practices. At the classroom level of the diagram, Xs denote all kinds of ongoing assessments because teaching is assessing. The assessments at this level can range from the very informal, overheard student frustration, to the data following a formal exam. Classroom assessments cover all topics, all purposes, and all ranges of methods.
Essentially, figure 2.1 illustrates how assessment systems must be aligned. If the entire system is to work, there must be an aligned flow—from top to bottom and bottom to top, between classroom assessment and outside assessment monitoring systems. Unfortunately, many systems get stuck when outside, large-scale assessments and the metrics they employ to measure success drive the entire system.
District testing, sometimes referenced as benchmark or interim testing, is an important part of the assessment system. It is the only way to ensure there is a guaranteed and viable curriculum firmly in place across the organization. When used, however, district assessments should support teaching teams in their work with collaborative common assessments by providing systemwide validation and by identifying target areas that further inform a team’s grade- or department-level assessment work. Unfortunately, because many benchmark or interim assessments are patterned after the high-stakes assessments they feed into, they also fall short of helping teaching teams respond to their data in instructionally agile ways.
Depending on how they are developed and used, interim assessments can make a positive difference in student achievement. Former teacher and administrator, current leadership coach, and author Kim Marshall (2008) highlights the reasons that data from such assessments add value to the organization that employs them:
• Interim assessments can monitor growth from the beginning of a term to the end
• Interim assessments can be more encompassing and require learners to put knowledge and skills together in rigorous and diverse ways
• The results of interim assessments can be generalizable and visible, helping all stakeholder groups engage in analysis and discussion
• Cumulative interim assessments can help track student progress over time
• Results provide opportunities for support systems to be introduced to help both students and teachers
• Results help administrators understand the full picture of how things are going in their building. (p. 68)
It is a given that such assessments can be helpful for program data, especially in larger districts with multiple schools. And it is clear that the use of common assessments as interim or benchmark assessments can have a positive impact. The operative word, however, is can. More research is needed regarding the effectiveness of interim assessments. Studies have not yet made clear what makes some interim assessments work better than others, which types of assessments work best, whether or not it matters who creates the assessments, and how the test design information is shared or not shared. For example, benchmark or interim assessments are developed and implemented in a variety of different ways.
• Schools, districts, or boards purchase predeveloped testing tools from outside testing companies, such as test item banks, online testing systems, or packaged curriculum-based assessments.
• Districts or boards—often those large enough to house their own assessment division—create their own assessments and strive to adhere to the strictest of standards regarding test validity and reliability.
• Districts, consortiums of districts, or boards bring highly respected teachers together to represent their peers and develop end-of-course or end-of-year assessments. In this case, the selected teachers are advised to return to their schools with generalities but not specifics about the assessments for fear that teachers will teach to the test.
• Districts, consortiums of districts, or boards invite teaching teams to write their own common assessments and then forward those assessments to the department leads or chairs who bring them to the district level where blending and integration processes begin to happen so all of the schools have input, but a shared set of common assessments emerges.
• Districts, consortiums of districts, or boards invite teaching teams to write their own common assessments and submit them for review and approval. In this case, the administrators generally monitor the consistency of the delivery system.
The assessment purpose (formative or summative) is often misaligned as well. In most cases, administrators will tell teachers that their benchmark, interim, or progress monitoring results are meant to be formative in nature, and teachers are advised to respond to the data accordingly to alter student success rates over time. More often than not, such assessments end up being summative in nature because of how the results are managed at the classroom level. Even if instruction is altered or additional support is provided, students are sometimes held accountable to all of the scores they generated along the way. Data are used for decision making, but teachers are often marginalized in how much freedom they have to interact with and respond to the results in a timely and effective manner. Worse, students, the primary decision makers when it comes to determining their own success, are often handicapped with pass/fail data that highlight areas of deficit, distort the reality of the specific gaps in understanding or skill, and minimize the assets they bring to the re-engagement process.
Moreover, the use of outside vendors’ ready-made tests rarely matches the specific demands of the standards. Speed, ease, finances, and an undeniable urge to pattern local interim assessments after large-scale national or state assessments have dictated that such assessments be measured via bubble sheets, an immediate misfire when it comes to measuring what matters. While previous and new versions of educational standards have been performance based, the selected assessments have not assessed at the levels of mastery required by the standards themselves. The things that the majority of colleges, businesses, parents, and entire countries value most—multidimensional problem solving, ethical decision making, and inventing or creating—cannot be measured in bubble sheets. The data that are gathered through such assessments have been predominantly based in content knowledge.
District- or board-level testing is important in designing and supporting an aligned assessment system. Such assessment systems set the standard for internal expectations and guarantee the readiness of their learners for the greater national expectations. As it stands, however, district testing has followed a pattern that educators themselves distrust and dislike. Done well, district testing can make a difference in leading the way to a better testing system.
Assessments at the Building or Team Levels
In a far better model than simply relying on ready-made external options, districts engage teams in courageous conversations about reaching higher and then support them in developing the assessment literacy required to make their vision a reality. With focus, commitment, and drive, educators can create a better testing system. But it will take everyone—from all levels of the organization, to all states or provinces participating—to support that effort. Recognized for her work in leading assessment literacy, dean and distinguished professor at the University of Colorado Boulder, Lorrie Shepard (2013) writes:
The hope, too, is that next-generation assessments will more faithfully represent the new standards than has been the case for large-scale assessments in the past. While the knowledge exists to make it possible to construct much more inventive assessments, this could more easily be done in the context of small-scale curriculum projects than for large-scale, high-stakes accountability tests. (p. xxi)
Developing such high-quality assessments will not happen overnight. Current large-scale assessment designs have established a pattern of testing that makes teachers leery of taking the risks involved with designing alternative assessments. In addition, collectively, teachers have not had the necessary training or experience with designing accurate assessments at rigorous levels and then extrapolating meaningful learning from the results (Stiggins & Herrick, 2007). The good news is that engaging teachers in learning teams to function as assessment architects can build the necessary assessment literacy faster than any other professional development alternative. However, teams must engage in the full range of the assessment process with regularity and in collaboration: “Groups of teachers jointly analyzing what’s on the test, what’s not, and how to stay true to more complete learning goals creates both greater awareness and a shared commitment to avoid narrow teaching to the test” (Shepard, 2013, p. xxi).
Shepard (2013) states, “Teachers need access to better tools, not disconnected item banks but rather curriculum tasks that have been carefully designed to elicit student thinking and for which colleagues and curriculum experts have identified and tested out follow-up strategies” (p. xxi). In the absence of practicing skills, developing a clear rationale, and accessing better tools for developing assessment literacy, teachers will default to testing designs and teaching practices that aim solely at the specific test questions in a manner that elicits recall-based responses.
Leaders oversimplify the complexities of assessment design and use when they buy ready-made solutions. This process opts teachers out of truly understanding the what, why, and how of assessment design and use. Schneider et al. (2013) state:
To maximize student achievement, teachers and large-scale assessment developers need to (1) have the same interpretations of the standards, (2) identify the same types of student achievement as evidence of mastery of the standards, and (3) collect evidence using the same types of robust practices in building assessments. (p. 55)
This type of learning cannot be managed through shared documents outlining expectations. Instead, teachers must learn by doing.
In all of their writings, the Professional Learning Community at Work architects DuFour, DuFour, and Eaker have advocated for teachers engaging in the work of common assessments to improve practice at the classroom level (DuFour et al., 2006, 2008; DuFour, DuFour, Eaker, & Many, 2010).
They challenge the premise that outside testing would ever suffice to support teachers in classroom practice:
The challenge for schools then is to provide each teacher with the most powerful and authentic information in a timely manner so that it can impact his or her professional practices in ways that enhance student learning…. State and provincial assessments fail to provide such feedback. Classroom assessments, on the other hand, can offer the timely feedback teachers need, and when those assessments are developed by a collaborative team of teachers, they also offer a basis of comparison that is essential for informing professional practice. (DuFour et al., 2006, p. 147)
Common assessments are integral to the work of professional learning teams. Highly effective collaborative teams focus their energies on addressing the instructional concerns for their classrooms.
Whether functioning as professional learning communities or not, effective teams address the four corollary questions outlined by PLC experts DuFour et al. (2010).
1. What do students need to know and be able to do?
2. How will we know when they have learned it and can do it?
3. How will we respond when students don’t learn it?
4. How will we respond when they already know it?
DuFour, DuFour, and Eaker consistently assert that unless teams are doing the work of common assessments, they are not truly functioning as a PLC (DuFour et al., 2006, 2008, 2010). Effective teams use their own data and evidence to adjust, improve, and inform their practice. All four corollary questions link directly to the work of collaborative common assessments; table 2.1 shows the direct connection between each question and that work.
Table 2.1: Corollary Questions and Collaborative Common Assessments
Each corollary question of effective teaching teams appears below with its connection to the practice of common assessments.

1. What do students need to know and be able to do?
Connection: Effective teams identify the essential knowledge and skill expectations for their learners based on required standards and in advance of any instruction. Teams backmap their assessment plans to align with their standard expectations (see figures 1.3 and 1.4 in chapter 1 as an example). Valid and reliable common assessments are contingent upon a team's ability to develop congruence with the required expectations addressed by corollary question 1.

2. How will we know when they have learned it and can do it?
Connection: Teaching teams can only answer this question through the work of common assessments. When teachers review their data in isolation, they interpret results through their own experiences and opinions, and the variables that lead to their results cannot be compared in a manner that yields information about what does and does not work instructionally. Data can only provide information when reviewed comparatively against a valid benchmark; otherwise, they are simply random data points. Common assessments provide the evidence teams need to answer corollary question 2. Collaborative common assessments are the engine of a PLC because they can drive teams to make more informed decisions regarding their practice.

3. How will we respond when students don't learn it?
Connection: Teams require the data and evidence generated from common assessments to answer corollary question 3. Reflection and analysis regarding their individual and collective results, combined with collaborative problem solving, provide the only means to help teams target exact learning needs and demystify complex learning issues.

4. How will we respond when they already know it?
Connection: Enrichment, extension, and advancement are proving harder to address than interventions. In all of these activities, educators must help learners who have mastered content and skills to extend their learning. Enrichment does not mean doing more work, helping others learn something they have not yet mastered, or moving to the next chapter. When teams design their common assessment products and processes, they plan for what true enrichment might look like: enrichment that is engaging and fun while building on newly mastered learning targets in challenging ways. When teams design enrichments in advance of instruction, they can increase motivation and understanding in the following ways.
• They clarify even further their own understanding (and that of their learners) of what mastery will need to look like.
• They pique interest in advance of instruction by showing learners the possibilities that lie before them if they master the expectations in a timely manner.
Visit go.solution-tree.com/assessment for a reproducible version of this table.
Team-based common assessments are a critical component of the assessment system. They provide the medium for rich discussion and a pathway into building assessment literacy in a manner that enhances teaching and learning experiences for everyone involved.
Assessments at the Classroom Level
In figure 2.1 (page 25), classroom assessments are pictured at the bottom of the image. This does not mean they are the least important; on the contrary, classroom assessments provide some of the most significant tools, data, and evidence that schools have at their disposal to positively impact student achievement. In the SAGE Handbook of Research on Classroom Assessment, McMillan (2013b) states, “Our collective assertion is that CA [classroom assessment] is the most powerful type of measurement in education that influences student learning” (p. 4). Classroom assessments provide the bedrock of the entire assessment system. An expert in special education and disability policy, Professor Yaoying Xu (2013) observes that “CA can be defined as a process of collecting, evaluating, and using information gathered before, during, or after instruction to help the classroom teacher make decisions on improving student learning” (p. 431). Collaborative common assessments, then, should be designed at the classroom level, leading the team to collective and individual success with clarity in focus, consistency in application, accuracy in interpretation, and equity in responses.
Experts in the field of formative assessment have shared research showing that classroom assessments, those closest to the learners and the learning, provide the best vehicle for supporting learning progressions and certifying mastery (Black & Wiliam, 1998; Chappuis, 2009; Chappuis et al., 2012; Hattie, 2009, 2012; Hattie & Timperley, 2007; Heritage, 2010; Wiliam, 2011; Wiliam & Thompson, 2007). Assessment researchers and authors Rick Stiggins and Mike Herrick (2007) state:
Average score gains of a full standard deviation and more have been attributed to the effective use of classroom assessment to support day-to-day student learning, with a major portion of such gains attributable to the continuous delivery to students of accurate descriptive feedback arising from high-quality classroom assessments. (p. 1)
Only at the classroom level can teachers tap into the top strategies that support student learning: clarity of learning expectations, clarity of criteria for quality, and descriptive feedback.
However, classroom assessment can only be as accurate and powerful as the knowledge and skill base of the individual teacher running the classroom. To date, international experts who contributed to the SAGE anthology of research on classroom assessment (McMillan, 2013b) consistently claim that current practices with classroom assessments have missed the mark on the following critical features.
• Using formative processes and tools to promote success on summative indicators
• Constructing assessments that accurately measure what matters
• Gathering relevant data
• Drawing accurate and meaningful inferences with data and evidence
• Diagnosing learning strengths and weaknesses for instructional implications
• Providing feedback that reduces discrepancies based on errors
• Building a sense of hope and efficacy in learners based on results and future opportunities
• Engaging learners as partners in the journey
It has become commonplace for nationally recognized experts to call for a redefinition and better understanding of the practice of classroom assessment. McMillan (2013b) defines classroom assessment as “a broad and evolving conceptualization” (p. 4) that involves both teachers and learners taking an active role in gathering and using data as a means to diagnose strengths and weaknesses, to set goals, to monitor proficiency levels, and to communicate about performance. As a decision-making tool, assessment must gather relevant data that can lead to healthy and accurate inferences regarding what students know and can do relative to the standards at hand. He notes that the emphasis behind classroom assessment must change so that it becomes “a vehicle through which student learning and motivation are enhanced” (p. 4).
Teams immersed in the work of exploring accurate assessment design and effective assessment use together can better develop their individual and collective assessment literacy (Chappuis, Chappuis, & Stiggins, 2009; Shepard, 2013). Learning by doing is powerful. Shepard (2013) states: