Do Your Assessment Scores Mean What You Think They Mean?

The question this answers:

 

What does a “3 out of 5” assessment score actually mean?

 

What the problem looks like without plain-language rating-scale descriptors

 

Your assessors score applications on a 1-5 scale. One assessor thinks 3 means “adequate.” Another thinks 3 means “good.” A third never gives 5s because “there’s always room for improvement.”

The same application, assessed by different people, gets wildly different scores. Moderation becomes a negotiation about what numbers mean rather than a discussion about application quality.

The scale exists. Shared understanding of the scale doesn’t.

 

What I deliver

 

A scoring scale with plain-language descriptors for each level, specific to each criterion. That means:

  • What each number means: Described in terms of the evidence an application would show

  • Anchored to observable features: Not “good” or “excellent” but specific indicators

  • Consistent across criteria: The same rigour applies whether you’re scoring community benefit or value for money

Delivered as a reference document assessors can use during scoring, and included in assessor training materials.

What good looks like vs what bad looks like

 

Bad:

Score | Meaning
1 | Poor
2 | Below average
3 | Average
4 | Good
5 | Excellent

 

This tells assessors nothing. “Good” according to whom? “Average” compared to what?

 

Good:

Criterion: Community benefit

Score | Descriptor | Evidence you’d expect to see
1 | No demonstrated benefit | No evidence of community need; unclear who benefits; benefits are speculative or assumed
2 | Limited benefit | Some evidence of need but not specific; beneficiaries vaguely defined; benefits not clearly linked to activities
3 | Adequate benefit | Need is identified with some evidence; beneficiaries are defined; benefits are plausible but not strongly demonstrated
4 | Strong benefit | Clear evidence of need (data, consultation, letters of support); specific beneficiaries identified; benefits clearly linked to project activities
5 | Exceptional benefit | Compelling evidence of significant need; well-defined beneficiaries with demonstrated engagement; benefits are substantial, clearly articulated, and highly aligned with program objectives

 

Now assessors can look at an application and match it to a descriptor, not guess at a number.

 

Why it matters

 

Consistency depends on shared understanding. If two assessors have different mental models of what “3” means, calibration is impossible.

Plain-language descriptors anchor the scale to observable evidence. They reduce subjectivity. They make moderation conversations about applications rather than about numbers. And they make it possible to explain to an unsuccessful applicant exactly what their application was missing.

Other Assessment Design Deliverables

 

Does Your Assessment Framework Pick the Right Applications? → Assessment criteria engineered backwards from program intent. Every criterion exists because a funding decision depends on it. Weightings and decision logic are structural, not advisory. The framework makes the decision architecture visible so assessors execute program logic rather than substitute their own.

 

Are Your Panel Processes Protecting the Program Or Exposing It? → A decision architecture for panels. Who decides what, on what basis, with what constraints, and what gets recorded. Designed so the process produces defensible outcomes by structure, not by relying on experienced panellists to compensate for missing design.

More Deliverables