Aurora Institute

Threshold Concept: Assessment Literacy

CompetencyWorks Blog

Author(s): Susan Patrick, Maria Worthen, Natalie Truong, Dale Frost

Issue(s): Issues in Practice, Rethink Instruction

Courtesy of Allison Shelley/The Verbatim Agency for American Education: Images of Teachers and Students in Action

This is the nineteenth article in a series leading up to the National Summit on K-12 Competency-Based Education. We are focusing on four key areas: equity, quality, meeting students where they are, and policy. (Learn more about the Summit here.) We released a series of draft papers in early June to begin addressing these issues. This article is adapted from Fit for Purpose: Taking the Long View on Systems Change and Policy to Support Competency Education. It is important to remember that all of these ideas can be further developed, revised, or combined – the papers are only a starting point for introducing these key issues and driving discussions at the Summit. We would love to hear your comments on which ideas are strong, which are wrong, and what might be missing.

“Student assessment is essential to measure the progress and performance of individual students, plan further steps for the improvement of teaching and learning, and share information with relevant stakeholders.” – OECD, 2013

Assessment literacy is important for practitioners, but policymakers and stakeholders throughout the system also need to understand the roles that different types of assessment play in student learning, how assessment and moderation are used to comparatively and fairly judge student mastery, and how the information generated by assessments can feed a cycle of continuous improvement in teaching and learning. The lack of assessment literacy across the system is a major blind spot. Building significant capacity for assessment literacy is therefore needed to advance new competency-based approaches and address tough issues in our current system.

An important concept in assessment today is comparability. Comparability is defined as the degree to which assessments intended to measure the same learning targets produce the same or similar results. This involves documenting the reliability of judgments and not assuming that comparability is stable over time or invariant across multiple subgroups such as English language learners and special education students.1

There are unique circumstances in the U.S. education system that have driven the need for much greater degrees of comparability than in most other nations. When the federal government became involved in K-12 education with the Elementary and Secondary Education Act of 1965, it was in direct response to deep inequities that remained even after school segregation was outlawed. Because of the history of inequities in education offerings among student groups, concerns for equity are much greater than in many other countries, which drives, to a significant extent, the degree to which we need to take greater care that measures are fair and have common meaning among students, schools, and districts.2 This drives the prevalence of standardized tests in our country, causing the concept of assessment to often be conflated with the end-of-year, statewide, summative accountability tests.

Practitioners working deeply in competency-based learning models quickly realize that our K-12 education systems lack mechanisms for calibrating the quality of student work so that we can know there is meaningful consistency across schools and systems. As daunting a systems challenge as this appears across the states in the U.S. today, building professional educator capacity and policymakers’ understanding of assessment literacy is fundamental to shifting to personalized, competency-based systems at scale and focusing on equity.

A common misconception about assessment literacy is that it is only about how to interpret standardized test results. In contrast, assessment literacy is a much broader and more significant concept. The New Zealand Ministry of Education defines assessment literacy as:

“the possession of knowledge about the basic principles of sound assessment practice, including its terminology, the development and use of assessment methodologies and techniques, and familiarity with standards of quality in assessment. The primary purpose of assessment is to improve students’ learning, as both student and teacher respond to the information that it provides. Information is needed about what knowledge, understanding, or skills students need. By finding out what students currently know, understand, and can do, any gap between the two can be made apparent. Assessment is the process of gaining information about the gap, and learning is about attempts to reduce the gap.”

Personalized, competency-based learning requires us to reorganize systems around doing what it takes to ensure every student is attaining mastery, rather than ranking and sorting students into high achievers and low achievers, as variable A-F grading practices do. Redesigned systems will need to build capacity for clear evaluation criteria to make valid and reliable comparisons of students’ progress against commonly understood outcomes using evidence and common rubrics.

Thus, progress isn’t measured by ranking and sorting kids against each other, or through grading “curves,” but by having each student measure their evidence against articulated, high-level, common expectations of success, with clear depictions of what success looks like. This process of developing clear expectations for common proficiency levels is a key part of “calibration.” Calibration is a process that allows two or more things to be compared via a common standard (e.g., a weight in the physical sciences or commonly scored papers in an education system). The purpose of common performance tasks given to students by different schools and districts is to serve as a “calibration weight”: a way to compare how one school or district scores students on the common task with how other schools and districts score those same students’ work. In order to use the common performance tasks as calibration weights, districts need to re-score other districts’ common performance tasks. Calibrating expectations, as well as grading and scoring processes for learning goals, is very important in competency-based learning systems. Calibration may involve groups of educators who collaborate and develop consensus around rubrics for scoring student work. The calibration process makes scoring of student work more consistent and more aligned to the standards upon which rubrics and scoring criteria are based, and it creates reflective processes focused on improving student learning.
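To make the “calibration weight” idea concrete, here is a minimal sketch of how a district might check its scoring against a peer district’s re-scoring of the same common performance tasks. The rubric scale (1 to 4), the sample scores, and the function names are all hypothetical and chosen only for illustration; the agreement measures (exact agreement, adjacent agreement, and Cohen’s kappa) are standard statistics, not a method prescribed by the paper.

```python
from collections import Counter


def exact_agreement(scores_a, scores_b):
    """Share of work samples given the identical rubric score by both scorers."""
    if len(scores_a) != len(scores_b) or not scores_a:
        raise ValueError("score lists must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(scores_a, scores_b))
    return matches / len(scores_a)


def adjacent_agreement(scores_a, scores_b, tolerance=1):
    """Share of work samples whose two scores differ by at most `tolerance` points."""
    if len(scores_a) != len(scores_b) or not scores_a:
        raise ValueError("score lists must be non-empty and of equal length")
    close = sum(abs(a - b) <= tolerance for a, b in zip(scores_a, scores_b))
    return close / len(scores_a)


def cohen_kappa(scores_a, scores_b):
    """Agreement beyond what chance alone would produce (1.0 = perfect agreement)."""
    n = len(scores_a)
    observed = exact_agreement(scores_a, scores_b)
    counts_a = Counter(scores_a)
    counts_b = Counter(scores_b)
    # Chance agreement: probability both scorers pick the same level independently.
    expected = sum(counts_a[level] * counts_b[level] for level in counts_a) / (n * n)
    if expected == 1:
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical data: eight student papers on one common task, scored on a
# 1-4 rubric by the home district and re-scored by a peer district.
home_scores = [3, 2, 4, 3, 1, 2, 3, 4]
peer_scores = [3, 2, 3, 3, 1, 2, 2, 4]

print("exact agreement:", exact_agreement(home_scores, peer_scores))
print("adjacent agreement:", adjacent_agreement(home_scores, peer_scores))
print("Cohen's kappa:", round(cohen_kappa(home_scores, peer_scores), 3))
```

With these made-up scores, the two districts agree exactly on six of eight papers (0.75) and never differ by more than one rubric level, with a kappa of about 0.65. What counts as acceptable agreement is a local decision for the educators doing the calibration; the numbers only give them something specific to discuss.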

In addition to calibration processes for consistently and accurately evaluating student work, assessment literacy also includes knowing which assessments are appropriate for what purpose (e.g., formative, progress monitoring, or summative). This idea of common expectations, and of evaluating evidence against common standards and rubrics to build and evaluate comparability across schools and systems, requires careful moderation of assessment practices across the system and perhaps at the state level. Professional development that prepares educators to assess student evidence using calibration processes, and to develop rubrics with scales for evaluating performance tasks against criteria, is central to building the capacity needed in a competency-based education system. A competency-based learning system that offers personalized pathways for students to meet learning goals and learning targets must rely on multiple forms of evidence against common standards and expectations.

Tackling Assessment Literacy in Policy: Balanced Systems of Assessments

Assessment is integral to the process of teaching and learning. Teachers should be constantly checking for their students’ understanding in formal and informal ways. They are checking for understanding with formative assessment, tracking progress with interim assessment, and checking mastery of standards with summative assessment. And yet, “assessment” today in the United States is often used as shorthand for, or conflated with, “statewide accountability test.”

To be clear, though intricately linked to each other in today’s policy context, accountability and assessment are two separate concepts. We should examine our approach to policy regarding assessment. We can be very clear about the need to measure student learning and growth in valid, reliable, and comparable ways, while also opening up new approaches to assessment that support, rather than disrupt, the learning process. Reflecting on where we are in the United States, if assessment is conflated with accountability today, it is because our policies have been structured to do just that. Counter to some of the narratives that are dominating policy conversations today, assessment and learning need not be at odds with each other. Policy can and should help to drive coherence of K-12 education systems by ensuring that assessment, teaching, and learning are complementary and supportive of one another.

Differentiating Between Assessment and Accountability

It is common today in U.S. education policy to see the terms assessment and accountability used interchangeably. This conflates a broad set of tools that generate information about student learning (assessments) with policy initiatives designed to encourage desired behaviors, or discourage undesired behaviors, in order to reach specific goals (accountability). Of course, accountability and assessment are linked concepts, to the extent that assessment provides data that can be used for accountability. However, problems arise when the goals in the accountability system are too narrowly defined and the incentives or disincentives are too limiting or too punitive. NCLB tied a single assessment (end-of-year summative state tests) to multiple high stakes (identifying schools for intervention, diverting their federal funds into prescribed uses, and, with the changes brought about under the ESEA waivers and Race to the Top, teacher evaluations sometimes used to make human resources decisions). So, it makes sense that accountability and assessment get confused with each other. A critical shift in thinking needs to happen around accountability and assessment, starting with accountability systems that move the focus from performance on a single test to multiple measures aligned with the profile of a graduate, and that balance incentives and disincentives with supports.

To start, policymakers could begin to think about assessment in terms of systems of assessments that serve multiple purposes for multiple stakeholders, rather than in terms of a single assessment that is designed to be used solely for accountability and has the end result of driving teaching and learning toward limited outcomes.

Chattergoon and Marion (2016)3 argue that as states redesign their approaches to assessment, they should pursue balanced systems of assessments that meet the following three criteria:

  • Coherent systems: “The assessments in a system must be compatible with the models of how students learn content and skills over time;” and “curriculum, instruction, and assessment must be aligned to ensure that the entire system is working toward a common set of learning goals;”
  • A well-articulated theory of action that explains how each part of the system relates to the others. In other words, what purpose does the system as a whole serve, what different needs does it meet for different stakeholders, and how does it meet them? “A set of assessments, even if they cohere, will not fulfill the intended purposes if the information never reaches the intended user;” and,
  • Assessment efficiency, which means that systems are providing stakeholders with the full range of information that they are intended to provide. “For example, if a state wants to give educators information to help them adjust instruction, its assessments must be tied to the curriculum that is being used. These assessments should in turn yield timely, detailed information about the knowledge and skills being assessed at the local level.”

What does this look like in practice? The policy constraints under No Child Left Behind, the federal K-12 education law that predated the Every Student Succeeds Act of 2015 (ESSA), as well as the local control of curriculum in most states, have made it challenging for states to produce comprehensive statewide models for balanced systems of assessments. States could take a leadership role working with districts and schools to set conditions for more balanced systems of assessments, with multiple measures, aligned to student-centered learning, and to identify what specific data the state needs for accountability. Some states are beginning to move in this direction, and others are further along with new systems of assessments, such as those in the Assessment for Learning project. These states provide examples of pathways in state policy and systems “advancing our understanding of assessments’ essential roles in the learning process, as learning models become more personalized, less cohort-restricted, more competency-based, and student-centered.”4

Perhaps the best example of a statewide approach is New Hampshire, whose work on competency-based systems has been underway for more than two decades. In New Hampshire, the Performance Assessment of Competency Education (PACE) system is currently being piloted in a subset of districts across the state and offers a more comprehensive state system of assessments that the New Hampshire Department of Education describes as:

a learning system designed to capitalize on the latest advances in understanding of how people learn. The goal is to structure learning opportunities that allow students to grapple with gaining meaningful knowledge and skills at a depth of understanding that they can transfer to new real-world situations. As a coherent system, NH PACE is designed to foster positive organizational learning and change by supporting the internally-driven motivation of educators instead of the all-too-common top-down accountability approaches where the goals and methods of the accountability system are defined at the state or federal levels and districts are simply expected to comply.5

As this description shows, New Hampshire is taking a future-focused approach to assessment, thinking about it as an integral support for teaching, learning, and building local and teacher capacity.

The Every Student Succeeds Act opens up some significant new opportunities for states to rethink assessment. Section 1204, the Innovative Assessment and Accountability Demonstration Authority, allows states to ask for permission to pilot innovative systems of assessments in a subset of districts, with the aim of eventually scaling them statewide. States and districts participating in the pilot would be able to use determinations from these new systems of assessments for accountability purposes. This pilot offers an opportunity for states to intentionally focus on building next generation systems of assessments. States participating in the demonstration authority could pilot performance assessments, developing educator capacity for assessment literacy and moderation practices. Consortia of districts could work together to catalyze state leadership to move forward with innovative models of assessments within their state and nationwide.


How do we balance quality assurance, accountability, transparency, validity and equity with responsive learner-centered designs? How do we make significant changes for continuous improvement over time that will build capacity at all levels? Are we building educator capacity and professionalism? How do we manage local and community needs? How do we think about designing for every student’s success and build consensus among all stakeholders? How do we improve transparency of outcomes with holistic approaches?

An additional major core concept is the fundamental need to build capacity, trust and professionalism toward a powerful idea of “reciprocal accountability.”  

In Bridging the Gap Between Standards and Achievement, Harvard Professor Richard Elmore explains:

Accountability must be a reciprocal process. For every increment of performance I demand from you, I have an equal responsibility to provide you with the capacity to meet that expectation. Likewise, for every investment you make in my skill and knowledge, I have a reciprocal responsibility to demonstrate some new increment in performance. This is the principle of “reciprocity of accountability for capacity.”6

A major concept missing in our current approaches to accountability is to consider what it would look like if we had a system designed to build trust. Typical state accountability systems are put in place by rigidly grouping students by age cohorts at each grade level to better ensure data quality and comparability against the same test. But the unintended retrograde consequences resulting from this time-based model of accountability may inhibit educators from evidence-based practices for meeting students where they are. If we are not constantly assessing where students are, meeting them where they are, and addressing gaps to provide supports and accelerate learning at high levels, will we ever begin to advance true equity across the system or be able to provide responsive pedagogical approaches?

Tackling Accountability as Continuous Improvement in Policy: Next Generation Accountability Models

As we mentioned, the idea of accountability in American education has become synonymous with end-of-year, statewide, summative tests that are tied to high-stakes outcomes for teachers and schools. NCLB’s intent was to increase equity. The strategy NCLB employed was to require states to test all students in grades 3 through 8 and once in high school in math and reading/language arts, and to report the percentage of students who were proficient on grade-level standards. These data were the main focus of NCLB’s accountability model. Schools were required to make “Adequate Yearly Progress” (AYP) toward a goal of 100 percent proficiency (in every subject and subgroup) by 2014. Schools were subject to increasingly punitive sanctions for each year that they did not make AYP. The effect of NCLB’s strategy to increase equity in education through a singular focus on grade-level proficiency tests has been a conflation of equity with giving same-age students the same test to measure grade-level proficiency in reading and math. In the process of developing the new federal education policy, the Every Student Succeeds Act, conversations about “guard rails” in accountability for equity were centered on ensuring all students were being tested using the same end-of-year, single, summative test containing only items from a student’s assigned grade level based on age.

We have discussed the need for better assessments that more accurately show student proficiency, growth, and application of knowledge and skills. Now, we need to unpack the concept of accountability, completely reimagining it as a tool for transparency to support and empower rapid and constant improvement in learning toward a more comprehensive definition of success.

Policymakers should consider engaging stakeholders to think through what communities and the state need in terms of a “Profile of a Graduate” and what that means for next generation accountability systems to ensure students are being prepared for success in postsecondary education, the workforce, and civic life.

We need to ensure that accountability systems are “fit for purpose” and support student-centered, competency-based learning. How could we know in real time whether students are making progress on developing the skills they need? How do we know how much progress is being made against the graduation goals and what resources are needed to support their learning? What would it take to create a policy environment that actively encourages a growth mindset?

Education systems should reflect families’ and communities’ hopes for student success in school, work, life, and society. We imagine an accountability system that empowers stakeholders with the information they need to help students succeed. Systems should provide a complete picture of students’ successes and challenges, providing the right information to the right stakeholders at the right time. We need to bolster communities in and around schools to have more input on student learning and shared ownership of student outcomes. Policy could catalyze the creation of accountability systems built around ensuring all educators and schools can give students the supports they need to master the knowledge and skills necessary for success. State and local education systems need to focus on supporting an accountability system that iterates toward constant improvement and innovates over time to meet the needs of a changing society, economy, and student population. Information from systems of assessments should help to inform and improve school and educator practice and capacity and help to move students toward their next learning goals and beyond.

There are a number of important considerations that policymakers might keep in mind as they think long term about accountability redesign. These include:

  • Engaging diverse, local, and state stakeholders to redefine success and ensure that the goals, measures, and systems are all working together to support each student’s success;
  • Identifying how each level of the system can keep “skin in the game” so that accountability is shared and does not fall disproportionately on the shoulders of any one stakeholder group, and so that it encourages collaboration;
  • Thinking about school quality reviews and interventions as part of a process of continuous improvement;
  • Thinking about systems as dynamic and responsive to stakeholders. Under ESSA, states can request to amend their accountability plans at any time. As states learn what works, or doesn’t work, they may make changes in the spirit of innovations for equity and continuous improvement;
  • Providing timely information to the right stakeholders, at the right levels, at the right time, and recognizing that the same data can be aggregated or disaggregated to meet different needs;
  • Considering how to present multiple measures of student learning and school quality with advanced data visualization, to provide families with rich, easy to understand information;
  • Embedding professional learning into quality improvement processes;
  • Considering the inputs, processes, and outcomes that reflect a relentless and multi-faceted pursuit of equity for students;
  • In considering student learning outcomes, thinking differently about the concepts of “proficiency” and “growth” and how we can monitor student learning in real-time, so that educators can intervene quickly to fill in gaps or meet other needs as they arise; and,
  • Investing in the requisite educator and leader capacity.

We need to move from thinking about measuring one point of proficiency at one point in time to making data on student proficiency transparent every day, along with each student’s growth over time. We need more advanced quality assurance, evaluation, and assessment approaches to provide ongoing transparency of student progress. With better data, data literacy, and the requisite investments in educator capacity, we could evaluate proficiency, achievement gaps, and rate of progress, and also understand each individual student’s growth over time; we could also look across cohorts of students and disaggregate data by subgroup to ensure equity and transparency with a depth not possible today.
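As a rough illustration of looking at proficiency and growth together, and of disaggregating both by subgroup, consider the following minimal sketch. The record fields, subgroup labels, scores, and cut score are all invented for this example; a real system would draw on far richer evidence than two scores per student.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class StudentRecord:
    # All fields are hypothetical and exist only for this illustration.
    student_id: str
    subgroup: str
    fall_score: float
    spring_score: float


def summarize_by_subgroup(records, proficiency_cut):
    """Return proficiency rate and mean growth for each subgroup."""
    grouped = defaultdict(list)
    for record in records:
        grouped[record.subgroup].append(record)

    summary = {}
    for subgroup, members in grouped.items():
        n = len(members)
        # Proficiency: a single point in time (here, the spring score).
        proficient = sum(r.spring_score >= proficiency_cut for r in members)
        # Growth: change for each student between the two measurement points.
        mean_growth = sum(r.spring_score - r.fall_score for r in members) / n
        summary[subgroup] = {
            "n": n,
            "proficiency_rate": proficient / n,
            "mean_growth": mean_growth,
        }
    return summary


# Invented records for four students in two subgroups.
records = [
    StudentRecord("s1", "English learner", 40, 58),
    StudentRecord("s2", "English learner", 55, 72),
    StudentRecord("s3", "Not English learner", 68, 74),
    StudentRecord("s4", "Not English learner", 80, 82),
]

summary = summarize_by_subgroup(records, proficiency_cut=70)
for subgroup, stats in summary.items():
    print(subgroup, stats)
```

In this made-up data, the English learner group has the lower proficiency rate (0.5 versus 1.0) but much larger average growth (17.5 points versus 4.0). A single proficiency snapshot would hide that difference, which is the reason for reporting both measures side by side.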

Under ESSA, states are no longer required to “rank and punish” schools on a single, end-of-year summative determination of grade-level proficiency. Rather, states may now use multiple measures of academic achievement, graduation, and performance of individual student subgroups, as well as a measure of school quality, to identify schools for improvement. States may use student growth, extended-year cohort graduation rates and additional metrics of their choosing. With multiple measures, the opportunity is there for states to redesign accountability around a broader definition of student success. After all, while the abilities to read and do math are important, students will need to be equipped with other skills, such as critical thinking, communication and collaboration, to be successful after high school.

Policymakers in other states could consider the example of Vermont, which has created an accountability system designed to foster continuous improvement, both for school systems and for student learning. What most distinguishes Vermont is that every school has been identified for improvement. This removes the “black eye” of the improvement label, and puts each school and district in a mindset of continuous improvement. The state’s Education Quality Standards require schools to submit Continuous Improvement Plans that outline the school’s accomplishments, progress, goals and strategies for improvement. All Continuous Improvement Plans are reviewed by the Vermont Agency of Education staff. However, the in-person monitoring visits are carried out by Vermont educators in a peer-review process. With peers providing feedback to the schools on their performance, the recommendations for improvement become more meaningful and feel lower-stakes. Interventions are required for schools not meeting quality standards; since all schools are reviewed and continuous improvement is the goal for every school, interventions become more differentiated based on each one’s characteristics and capacity needs.


1Correspondence with Scott Marion, Center for Assessment, May 5, 2017.

2Randy E. Bennett. Opt out: An Examination of Issues (Research Report No. RR-16-13). Princeton, NJ: Educational Testing Service, 2016.

3Rajendra Chattergoon & Scott Marion. Not as Easy as It Sounds: Designing a Balanced Assessment System. National Association of State Boards of Education, January 2016.

4Assessment for Learning Project. Next Generation Learning Challenges, March 2013.

5New Hampshire Department of Education. Moving from Good to Great in New Hampshire: Performance Assessment of Competency Education (PACE), January 2016.

6Richard Elmore. Bridging the Gap Between Standards and Achievement. Albert Shanker Institute. 2002.