Method Validation and Verification under ISO 15189:2022: A Practical Guide for Biomedical Scientists
Before any new analyser, assay or reagent kit produces a single patient result, the laboratory must prove the method works as intended in its own hands. Getting this wrong is a patient-safety problem, not a paperwork one, because every downstream clinical decision assumes the result is correct. This guide explains the difference between validation and verification, the performance characteristics you must study, and how to design and document each study so it satisfies ISO 15189:2022 and survives a UKAS assessment.
Validation versus Verification: Why the Distinction Matters
The two words are often used loosely, but under ISO 15189:2022 they describe two different obligations with different evidence requirements. Choosing the wrong one is one of the most common findings raised at assessment.
- Verification (clause 7.3.2) applies when you introduce a method already validated by someone else — almost always the manufacturer of a CE-marked or UKCA-marked in vitro diagnostic (IVD), whose performance claims are published in the instructions for use (IFU). Your job is to confirm, with your own limited data, that you can reproduce those claims in your laboratory, with your staff, on your samples, under your conditions.
- Validation (clause 7.3.3) applies when there is no validated claim to lean on, or when you have changed the method so the manufacturer's claim no longer holds. This is a fuller, more demanding evaluation in which the laboratory generates evidence for every relevant performance characteristic from scratch.
| Scenario | Obligation | Why |
|----------|-----------|-----|
| CE/UKCA-marked IVD used exactly per the IFU | Verification | Manufacturer has validated the intended use |
| CE/UKCA-marked IVD used off-label (outside the IFU) | Validation | Claim no longer covers your use |
| Laboratory-developed test (LDT / "in-house") | Validation | No manufacturer claim exists |
| Modified commercial method (matrix, sample type, dilution) | Validation of the change | The change invalidates the original claim |
| CE/UKCA assay with no published performance claim | Validation | Nothing to verify against |
A useful rule of thumb: if you cannot point to a documented, relevant performance claim, you are validating. In either case ISO 15189:2022 requires the laboratory to record the procedure and the results obtained, and to have appropriately authorised personnel review them before the method goes live.
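The rule of thumb can be written down as a short decision function. This is only an illustrative sketch: the `MethodContext` fields and the `obligation` function are hypothetical names, and a real decision still needs a documented justification rather than three booleans.

```python
from dataclasses import dataclass


@dataclass
class MethodContext:
    """Hypothetical summary of how a method will be used."""
    has_documented_claim: bool  # a relevant performance claim exists (e.g. in the IFU)
    used_per_ifu: bool          # the method is run exactly as the IFU describes
    modified: bool = False      # matrix, sample type, dilution or similar change


def obligation(ctx: MethodContext) -> str:
    """Return 'verification' or 'validation' following the rule of thumb."""
    if not ctx.has_documented_claim:
        return "validation"     # nothing to verify against
    if ctx.modified or not ctx.used_per_ifu:
        return "validation"     # the claim no longer covers this use
    return "verification"


if __name__ == "__main__":
    print(obligation(MethodContext(has_documented_claim=True, used_per_ifu=True)))
    print(obligation(MethodContext(has_documented_claim=True, used_per_ifu=False)))
```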
Setting Acceptance Criteria Before You Start
The single biggest mistake in method evaluation is collecting data first and deciding what is acceptable afterwards. Acceptance criteria must be defined before the study runs, or you are simply rationalising whatever numbers you happen to get. ISO 15189:2022 expects performance to be judged against requirements appropriate to the intended clinical use.
In current UK practice, analytical performance specifications (APS) are most often derived using the hierarchy agreed at the 2014 Milan Strategic Conference and maintained by the European Federation of Clinical Chemistry and Laboratory Medicine (EFLM):
1. Model 1 (outcome-based): the effect of analytical performance on clinical outcomes (the ideal, but rarely available).
2. Model 2 (biological variation): specifications derived from within-subject and between-subject biological variation, using the EFLM Biological Variation Database. This is the most widely used model in clinical chemistry.
3. Model 3 (state of the art): the highest performance technically achievable, judged from External Quality Assessment (EQA) data. It is used when the higher models do not apply.
For each analyte, document the source of your specification (for example a biological-variation-derived allowable imprecision, bias and total allowable error, or a relevant EQA scheme limit). Where no formal APS exists — as for many qualitative or specialist assays — state your criteria qualitatively and justify them clinically.
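As an illustration of Model 2, the sketch below applies the conventional "desirable" biological-variation formulas: allowable imprecision of half the within-subject CV, allowable bias of a quarter of the combined within- and between-subject variation, and total allowable error as 1.65 times the imprecision plus the bias. The function name and the input values are illustrative assumptions; take real CVs from the EFLM Biological Variation Database and record whether you are using the desirable, minimum or optimum tier.

```python
import math


def desirable_specs(cv_i: float, cv_g: float) -> dict:
    """Desirable APS (all in %) from within-subject (cv_i) and
    between-subject (cv_g) biological variation, both given in %."""
    imprecision = 0.5 * cv_i
    bias = 0.25 * math.sqrt(cv_i ** 2 + cv_g ** 2)
    total_error = 1.65 * imprecision + bias
    return {
        "imprecision_pct": imprecision,
        "bias_pct": bias,
        "total_error_pct": total_error,
    }


if __name__ == "__main__":
    # Illustrative inputs only, not values for any real analyte.
    for name, value in desirable_specs(cv_i=6.0, cv_g=8.0).items():
        print(f"{name}: {value:.2f}")
```

With these made-up inputs the allowable imprecision is 3.0%, the allowable bias 2.5% and the total allowable error about 7.45%.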
Precision: Repeatability and Within-Laboratory Imprecision
Precision describes how close repeated measurements are to one another. What you actually report is its converse, imprecision, expressed as a standard deviation or a coefficient of variation (CV%). Two components matter:
- Repeatability (within-run): replicates measured under identical conditions in a single run.
- Within-laboratory imprecision (intermediate precision): variation captured across days, calibrations, operators and reagent lots — the precision the method will actually deliver in service.
Practical points:
- Use stable QC material or pooled patient samples spanning the range, especially around critical thresholds (for example near a troponin cut-off or an HbA1c diagnostic value).
- Calculate repeatability and within-laboratory CV separately; do not report only the easier within-run figure.
- Compare results to your pre-defined imprecision specification, not merely to "looks reasonable".
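One way to separate the two components is a one-way analysis of variance with the run (or day) as the grouping factor. The sketch below assumes a balanced design (the same number of replicates in every run) and treats the run as the only source of between-run variation; the function name and data layout are illustrative, and a full protocol may use a more elaborate nested design.

```python
import math
from statistics import mean


def precision_cv(runs):
    """Estimate repeatability and within-laboratory CV% from replicate data.

    `runs` is a list of runs (or days), each a list of replicate results
    on the same material. Assumes a balanced design.
    """
    d = len(runs)
    n = len(runs[0])
    if d < 2 or n < 2 or any(len(r) != n for r in runs):
        raise ValueError("need at least 2 runs with the same number (>= 2) of replicates")

    run_means = [mean(r) for r in runs]
    grand_mean = mean(run_means)

    # Mean squares from one-way ANOVA.
    ms_within = sum(
        (x - m) ** 2 for r, m in zip(runs, run_means) for x in r
    ) / (d * (n - 1))
    ms_between = n * sum((m - grand_mean) ** 2 for m in run_means) / (d - 1)

    # Between-run variance component, floored at zero.
    var_between = max(0.0, (ms_between - ms_within) / n)

    sd_repeatability = math.sqrt(ms_within)
    sd_within_lab = math.sqrt(ms_within + var_between)
    return {
        "mean": grand_mean,
        "repeatability_cv_pct": 100.0 * sd_repeatability / grand_mean,
        "within_lab_cv_pct": 100.0 * sd_within_lab / grand_mean,
    }


if __name__ == "__main__":
    # Made-up data: 3 days, 2 replicates per day.
    print(precision_cv([[9.0, 11.0], [11.0, 13.0], [8.0, 10.0]]))
```

Because the within-laboratory figure adds the between-run component to the repeatability variance, it can never be smaller than the repeatability CV, which is why reporting only the within-run figure flatters the method.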
Trueness and Bias: Are Your Results Correct?
Trueness is the closeness of agreement between the average of many measurements and a true or accepted reference value; the quantitative expression of the gap is bias. Verifying trueness establishes that your method does not systematically read high or low. Acceptable approaches, in descending order of strength, include:
1. Analysis of a certified reference material with a metrologically traceable assigned value.
2. Analysis of an EQA / proficiency testing sample or a commutable trueness control with a consensus or reference-method target.
3. Method comparison against an established, traceable method (CLSI EP09 principles), using patient samples across the range and assessing agreement with Passing–Bablok or Deming regression and a Bland–Altman difference plot.
For verification, demonstrating that your bias falls within both the manufacturer's claim and your APS is usually sufficient. Record the comparator, the number of samples, the regression statistics and the conclusion. ISO 15189:2022 expects metrological traceability of calibration wherever a higher-order reference exists; a bias study is the practical check that this traceability translates into correct results in your hands.
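The numbers behind a Bland–Altman difference plot are simple to compute. The sketch below returns the mean difference (the bias estimate) and the 95% limits of agreement for paired patient results. The function name is illustrative, expressing bias as a percentage of the comparator mean is a simplification that assumes a roughly constant bias across the range, and the regression step (Passing–Bablok or Deming) is not shown.

```python
from statistics import mean, stdev


def bland_altman(candidate, comparator):
    """Mean difference and 95% limits of agreement for paired results."""
    if len(candidate) != len(comparator) or len(candidate) < 3:
        raise ValueError("need at least 3 paired results of equal length")

    diffs = [c - r for c, r in zip(candidate, comparator)]
    bias = mean(diffs)
    sd_diff = stdev(diffs)  # sample standard deviation of the differences
    return {
        "bias": bias,
        "bias_pct": 100.0 * bias / mean(comparator),
        "loa_low": bias - 1.96 * sd_diff,
        "loa_high": bias + 1.96 * sd_diff,
    }


if __name__ == "__main__":
    # Made-up paired results: new method vs established method.
    new = [4.1, 6.3, 8.0, 10.4, 12.2]
    old = [4.0, 6.0, 8.1, 10.0, 12.0]
    print(bland_altman(new, old))
```

Compare the bias figure with the allowable bias you set before the study, and inspect the plot itself for concentration-dependent trends that a single mean difference would hide.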
Linearity and the Analytical Measuring Range
Linearity confirms that reported results are directly proportional to analyte concentration across the measuring interval, so that a result reported at, say, 40 means twice as much analyte as a result of 20.
- Prepare a series of samples — typically a minimum of five levels, and often more, spanning the claimed range — usually by admixing a high and a low pool in known proportions or using a linearity verification kit, following CLSI EP06 principles. The current (second) edition of EP06 no longer requires the levels to be exactly equally spaced, provided they cover the interval adequately.
- Run each level in replicate, plot measured against expected concentration, and assess for clinically significant non-linearity.
- The analytical measuring interval (AMI), also called the analytical measurement range, is the span over which the method gives reliable results without dilution. The clinically reportable range extends this using validated dilution protocols; any dilution scheme must itself be verified, including recovery on dilution.
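For an admixture series, the expected value at each level follows from the proportions mixed, and each measured mean can then be compared with it. The sketch below is a simplified check, not the full EP06 statistical procedure: the function names are illustrative, the low and high pool values are assumed to be known, and the allowable deviation is whatever you defined in advance.

```python
def admixture_expected(low_value, high_value, fractions_high):
    """Expected concentration for each mix, given the fraction of high pool."""
    return [low_value + f * (high_value - low_value) for f in fractions_high]


def linearity_check(expected, measured, allowable_dev_pct):
    """Percentage deviation of each measured level mean from its expected value."""
    if len(expected) != len(measured):
        raise ValueError("expected and measured must have the same length")
    rows = []
    for exp, meas in zip(expected, measured):
        dev_pct = 100.0 * (meas - exp) / exp
        rows.append({
            "expected": exp,
            "measured": meas,
            "deviation_pct": dev_pct,
            "within_limit": abs(dev_pct) <= allowable_dev_pct,
        })
    return rows


if __name__ == "__main__":
    # Made-up five-level series from a low pool (10) and a high pool (110).
    expected = admixture_expected(10.0, 110.0, [0.0, 0.25, 0.5, 0.75, 1.0])
    measured = [10.0, 36.0, 60.0, 80.0, 110.0]  # mean of replicates per level
    for row in linearity_check(expected, measured, allowable_dev_pct=5.0):
        print(row)
```

The same deviation calculation can be reused for recovery on dilution, with the expected value taken as the neat result divided by the dilution factor.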
Limit of Blank, Detection and Quantitation
For analytes where low concentrations carry clinical weight — therapeutic drugs, tumour markers, troponin, viral loads — the laboratory must establish the lowest concentrations it can reliably detect and report. Following CLSI EP17 (Evaluation of Detection Capability), three terms are distinguished:
- Limit of Blank (LoB): the highest result likely from a blank (analyte-free) sample.
- Limit of Detection (LoD): the lowest concentration reliably distinguished from the LoB — the method can tell "present" from "absent" but cannot yet quantify accurately.
- Limit of Quantitation (LoQ): the lowest concentration that can be measured with stated, acceptable imprecision and bias — the true lower end of the reportable range.
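The relationship between the three limits is easiest to see in the classical parametric form, sketched below. It assumes roughly Gaussian results, uses the one-sided 95% multiplier of 1.645 and omits the small-sample correction and non-parametric options described in EP17. The LoQ helper simply picks the lowest tested level that meets an imprecision goal; it ignores bias and does not check that higher levels also pass, so treat it as a starting point. All function names are illustrative.

```python
from statistics import mean, stdev

Z_95 = 1.645  # one-sided 95% point of the standard normal distribution


def limit_of_blank(blank_results):
    """LoB = mean of blank results + 1.645 x SD of blank results."""
    return mean(blank_results) + Z_95 * stdev(blank_results)


def limit_of_detection(lob, low_sample_results):
    """LoD = LoB + 1.645 x SD of results from a low-concentration sample."""
    return lob + Z_95 * stdev(low_sample_results)


def limit_of_quantitation(cv_by_level, cv_goal_pct):
    """Lowest tested concentration whose CV% meets the goal, or None."""
    passing = [conc for conc, cv in cv_by_level.items() if cv <= cv_goal_pct]
    return min(passing) if passing else None


if __name__ == "__main__":
    # Made-up data for illustration.
    lob = limit_of_blank([0.0, 1.0, 2.0, 1.0, 0.5])
    lod = limit_of_detection(lob, [4.0, 5.0, 6.0, 5.5, 4.5])
    loq = limit_of_quantitation({5.0: 25.0, 10.0: 12.0, 20.0: 6.0}, cv_goal_pct=10.0)
    print(lob, lod, loq)
```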
Reference Intervals: Establish, Transfer or Verify
A method is not ready for service until the result can be interpreted. Reference intervals (and, for some tests, clinical decision limits) make that possible. ISO 15189:2022 requires the laboratory to define biological reference intervals, periodically review them, and inform users of any change. CLSI guideline EP28 describes three routes:
- Establish a de novo interval — collect samples from a large reference population (the classic guidance is at least 120 reference individuals per partition) and calculate the central 95%. This is resource-intensive and reserved for cases where no suitable interval exists.
- Transfer a published or manufacturer interval after confirming the analytical methods and reference populations are comparable.
- Verify a transferred interval using a small reference sample — commonly 20 healthy reference individuals. If no more than 2 of the 20 results (10%) fall outside the proposed limits, the interval may be adopted; if more than 2 fall outside, investigate and consider re-establishing.
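The 20-sample verification rule reduces to a count, sketched below. The function name is illustrative, results exactly on a limit are treated as inside the interval (an assumption you should state in your own procedure), and the selection of suitable reference individuals is the hard part that code cannot do for you.

```python
def verify_reference_interval(results, lower, upper, max_outside=2):
    """Apply the 20-sample check: adopt if no more than 2 results fall outside."""
    if len(results) != 20:
        raise ValueError("this check is defined for exactly 20 reference results")
    outside = sum(1 for x in results if x < lower or x > upper)
    return {"outside": outside, "adopt": outside <= max_outside}


if __name__ == "__main__":
    # Made-up results against a proposed interval of 3.5 to 5.3.
    results = [4.0] * 18 + [3.2, 5.6]
    print(verify_reference_interval(results, lower=3.5, upper=5.3))
```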
Measurement Uncertainty: The Number Behind the Number
Clause 7.3.4 of ISO 15189:2022 requires laboratories to evaluate measurement uncertainty (MU) for quantitative measurands and to make estimates available to users on request. MU characterises the dispersion of values that could reasonably be attributed to the measurand — in plain terms, how much the reported result might vary if you measured the same sample again.
The pragmatic, widely adopted "top-down" approach builds MU from data you already generate:
1. Take the within-laboratory imprecision from your internal quality control over a representative period as the principal random component.
2. Add the uncertainty of the bias correction / calibrator value, drawn from EQA performance and the calibrator certificate where available.
3. Combine the components, each expressed as a standard uncertainty, by root-sum-of-squares to give a combined standard uncertainty, then multiply by a coverage factor (commonly k = 2) to give the expanded uncertainty at approximately 95% confidence.
Compare your MU against an allowable MU target derived from the same APS hierarchy used for your acceptance criteria. MU is not a one-off exercise: it must be reviewed as QC data accumulate and is increasingly scrutinised at UKAS assessment, so keep the calculation transparent and traceable.
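The top-down calculation itself is a root-sum-of-squares followed by a multiplication. The sketch below assumes both components are already expressed as relative standard uncertainties in percent and are independent; the function and parameter names are illustrative.

```python
import math


def expanded_uncertainty(u_imprecision_pct, u_bias_pct, k=2.0):
    """Combine two relative standard uncertainties (%) and expand by k."""
    u_combined = math.hypot(u_imprecision_pct, u_bias_pct)  # sqrt(a^2 + b^2)
    return {
        "combined_pct": u_combined,
        "expanded_pct": k * u_combined,
        "k": k,
    }


if __name__ == "__main__":
    # Made-up inputs: 3% within-laboratory CV, 4% bias/calibrator uncertainty.
    print(expanded_uncertainty(3.0, 4.0))
```

With these made-up inputs the combined standard uncertainty is 5% and the expanded uncertainty 10%. Keeping the inputs, their sources and the date range of the QC data alongside the result is what makes the estimate traceable at assessment.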
Documenting and Authorising the Method for Go-Live
Evidence is only useful if organised and authorised. Before go-live, assemble a single, retrievable validation/verification file containing:
- the intended use and the validate-or-verify decision with its justification;
- the pre-defined acceptance criteria and their source (APS, EQA, IFU claim);
- the design and raw data for each performance characteristic studied;
- the outcome against each criterion, plus any limitations or off-label caveats;
- a dated authorisation by appropriately authorised personnel confirming the method is fit for clinical use.
Frequently Asked Questions
What is the difference between validation and verification under ISO 15189:2022?
Verification (clause 7.3.2) confirms you can reproduce a method's already-validated performance claims — typically the manufacturer's claims for a CE-marked or UKCA-marked IVD used exactly as instructed. Validation (clause 7.3.3) is the fuller generation of performance evidence from scratch, required for laboratory-developed tests, off-label use, or modified methods where no relevant claim exists. The simplest test: if there is no documented, relevant performance claim to check against, you are validating.
How many samples do I need to verify a reference interval?
The widely used CLSI EP28 verification route uses 20 healthy reference individuals from your target population. If no more than 2 of the 20 results (10%) fall outside the proposed interval, it can be adopted; if more than 2 fall outside, investigate and consider transferring a different interval or establishing your own. Establishing a brand-new interval is far more demanding, classically needing at least 120 reference individuals per partition.
Do I have to calculate measurement uncertainty for every test?
ISO 15189:2022 (clause 7.3.4) requires measurement uncertainty for quantitative measurands and that estimates be available to users on request; it is not expected in the same way for purely qualitative results. Most UK laboratories use the pragmatic top-down approach, combining within-laboratory imprecision from internal quality control with the uncertainty of the calibrator and bias correction, then reporting an expanded uncertainty at about 95% confidence. It must be kept under review as quality control data accumulate.
What are LoB, LoD and LoQ, and why does the distinction matter?
The Limit of Blank is the highest result expected from an analyte-free sample; the Limit of Detection is the lowest concentration reliably distinguished from blank; and the Limit of Quantitation is the lowest concentration measurable with acceptable imprecision and bias. They matter because reporting a numerical result between the LoD and LoQ implies an accuracy the method does not possess. The LoQ, not the LoD, defines the genuine lower end of the reportable range.
Where do I get acceptance criteria from if there is no published specification?
Use the Milan / EFLM hierarchy of analytical performance specifications: outcome-based first, then biological-variation-derived (from the EFLM Biological Variation Database), then state-of-the-art based on EQA performance. Where none of these apply — as with many qualitative or niche assays — define your criteria qualitatively, justify them against the clinical use of the result, and record that reasoning so an assessor can follow it.
Does a UKCA or CE mark mean I can skip evaluation?
No. A UKCA or CE mark means the manufacturer has validated the device for its stated intended use, but ISO 15189:2022 still requires your laboratory to verify you can reproduce the relevant performance claims in your own setting before clinical use. If you use the assay outside its instructions for use — a different sample type, matrix or dilution, for example — the claim no longer applies and you must perform a full validation of that use.
Further training
Method evaluation is one strand of the wider quality framework covered across the NHS Laboratory Training hub. To build on this article, explore these closely related guides:
- UKAS and ISO 15189 Accreditation: A Biomedical Scientist's Guide — how the standard is assessed and maintained.
- Quality Control in the NHS Lab: IQC, IQA and EQA Explained — the monitoring that feeds your uncertainty and precision data.
- Root Cause Analysis and CAPA in Pathology — what to do when a verified method drifts or fails.
- Laboratory Analyser Troubleshooting and Maintenance — keeping a validated method within specification.