Unit 3 - Assessment Methodology

In Unit 2, we looked at the different kinds of assessment in schools.  The focus in this unit is mainly on summative assessment.  However, if formative assessment is to be effective, some of the issues below need to be considered for it too.

Remember, summative assessment or assessment of learning (AoL):

  • Takes place after the learning
  • Focuses on pupils’ achievements
  • Is used to provide feedback to parents/carers based on performance evidence

Our Assessment: Keeping Learning on Track publication contains some useful information on this.

We have also looked at how summative assessment may be:

  • periodic (internal to the school, at intervals defined by the school) or
  • national statutory summative assessment and transition (e.g. national testing and reported teacher assessment, end of Key Stage assessment)

Activity 1 - Problems with assessment

Are the following statements heard in your school?

  1. We don’t teach that as it never comes up in the tests…. Summative assessment only covers a small part of all the things pupils have learned
  2. I had two children come up to my class this year, who were noted as being on track for their age, but this turned out to be because of all the help the teaching assistant had given them.
  3. Teachers never agree on what the work shows. Tests are more accurate than teacher assessment.

For each statement, reflect on how you would respond if it came up in a staffroom discussion.  Do you agree with the statements?  What other factors are at play?

The issues raised in the statements relate to two important concepts: validity and reliability.  We will explore them and return to the questions later.

Activity 2 - Understanding Validity and Reliability

The following examples, which illustrate validity and reliability, come from an assessment website developed in partnership between the Pinellas School District and the Florida Center for Instructional Technology at USF:


Reliability refers to the extent to which assessments are consistent… If you weigh five pounds of potatoes in the morning, and the scale is reliable, the same scale should register five pounds for the potatoes an hour later (unless, of course, you peeled and cooked them).


Validity refers to the accuracy of an assessment -- whether or not it measures what it is supposed to measure.  Even if a test is reliable, it may not provide a valid measure.  Let’s imagine a bathroom scale that consistently tells you that you weigh 130 pounds.  The reliability (consistency) of this scale is very good, but it is not accurate (valid) because you actually weigh 145 pounds…
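
The bathroom scale example above can be turned into a small calculation.  The Python sketch below is an illustration only: the function name describe_scale and the list of readings are assumptions made for this example and do not come from the quoted website.  It simplifies matters by treating reliability as the spread of repeated readings and validity as the gap between the average reading and the true weight.

```python
from statistics import mean, pstdev


def describe_scale(readings, true_weight):
    """Return (spread, bias) for repeated readings of one known weight.

    spread: standard deviation of the readings (small = consistent, i.e. reliable)
    bias:   average reading minus the true weight (near zero = accurate, i.e. valid)
    """
    spread = pstdev(readings)
    bias = mean(readings) - true_weight
    return spread, bias


# Hypothetical readings: the scale shows 130 pounds every time,
# but the person actually weighs 145 pounds.
bathroom_readings = [130, 130, 130, 130]
spread, bias = describe_scale(bathroom_readings, true_weight=145)
print(f"spread = {spread}, bias = {bias}")
```

With these made-up readings the spread is zero, so the scale is perfectly consistent, yet every reading is 15 pounds too low.  A consistent result is not necessarily an accurate one.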

In the context of education, the Assessment Reform Group in The role of teachers in the assessment of learning states that:

In order to be valid and to measure what it is supposed to measure, summative assessment “must cover all aspects, and only those aspects, of pupils’ achievement relevant to a particular purpose”.  If it is measuring only part of this, it may give consistent answers but they will not be a true reflection of pupils’ achievements.

In order to be reliable, “it should be designed so that users can have confidence that the results are sufficiently accurate and consistent for their purpose”.
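
The phrase “all aspects, and only those aspects” in the validity statement above can be pictured as a comparison between two sets.  The Python sketch below is an illustration under stated assumptions: the function name coverage_gaps and the aspect names are hypothetical and are not taken from the Assessment Reform Group.

```python
def coverage_gaps(relevant, assessed):
    """Compare what an assessment covers with what it should cover.

    missing:    relevant aspects the assessment leaves out
    extraneous: assessed aspects that are not relevant to the purpose
    """
    missing = relevant - assessed
    extraneous = assessed - relevant
    return missing, extraneous


# Hypothetical aspects of achievement for a mathematics assessment.
relevant = {"number", "measurement", "geometry", "statistics"}
assessed = {"number", "measurement", "handwriting"}

missing, extraneous = coverage_gaps(relevant, assessed)
print("Left out:", sorted(missing))
print("Should not be included:", sorted(extraneous))
```

In this made-up case the assessment leaves out geometry and statistics and also rewards handwriting, which is not part of its purpose.  Either kind of gap weakens validity, even if the results are consistent.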

So, can you explain the difference between reliability and validity in your own words?  How would you explain them to a colleague using examples from your own experience?

Compare your explanations and understanding with Dylan Wiliam's thoughts on the subject shared on page 4 of the publication Reliability, Validity and All That Jazz.

There are a number of factors that affect validity.  Some of these are noted in this video from a course in Validity in Assessments from Study.com:

"…it is important to understand how external and internal factors impact validity.

A student's reading ability can have an impact on the validity of an assessment.  For example, if a student has a hard time comprehending what a question is asking, a test will not be an accurate assessment of what the student truly knows about a subject.  Educators should ensure that an assessment is at the correct reading level of the student.

Student self-efficacy can also impact validity of an assessment.  If students have low self-efficacy, or beliefs about their abilities in the particular area they are being tested in, they will typically perform lower.  Their own doubts hinder their ability to accurately demonstrate knowledge and comprehension.

Student test anxiety level is also a factor to be aware of.  Students with high test anxiety will underperform due to emotional and physiological factors, such as upset stomach, sweating, and increased heart rate, which leads to a misrepresentation of student knowledge."

Activity 3 - What do we think now?

Remember those three statements from Activity 1?  Re-read them to remind yourself of your thinking and then listen to what we thought.

    • We don’t teach that as it never comes up in the tests…. Summative assessment only covers a small part of all the things pupils have learned


The teacher's role in assessment of learning - Assessment Reform Group

    • I had two children come up to my class this year, who were noted as being on track for their age, but this turned out to be because of all the help the teaching assistant had given them.


    • Teachers never agree on what the work shows.  Tests are more accurate than teacher assessment.



Has what you have learnt about validity and reliability changed or enhanced your thinking?

Activity 4 - How does this relate to Formative Assessment?

Consider this observation schedule for Assessment for Learning:

  • Which of the key features in this schedule include aspects of validity?
  • What might be the pitfalls if validity is not considered?

For the first question, anything that helps to clarify learning objectives, success criteria and the expected standard will help to make the work more reflective of the full range of the task.  For the second, the main pitfall of not considering validity is that the assessment might not measure what it claims to measure.

Many teachers want to lessen the impact of summative assessment on teaching and learning.  Ideally with a partner, look at this list, which is taken from Testing, motivation and learning from the Assessment Reform Group.  For your school, could there be a focus on further developing the “do more of this” items and cutting down on the “do less of this” items?  From the information in the unit so far, how do these relate to issues of validity?  Make a note of your thoughts.

