Method Validation and Verification under ISO 15189:2022: A Practical Guide for Biomedical Scientists

Before any new analyser, assay or reagent kit produces a single patient result, the laboratory must prove the method works as intended in its own hands. Getting this wrong is a patient-safety problem, not a paperwork one, because every downstream clinical decision assumes the result is correct. This guide explains the difference between validation and verification, the performance characteristics you must study, and how to design and document each study so it satisfies ISO 15189:2022 and survives a UKAS assessment.

Validation versus Verification: Why the Distinction Matters

The two words are often used loosely, but under ISO 15189:2022 they describe two different obligations with different evidence requirements. Choosing the wrong one is one of the most common findings raised at assessment.

The decision tree is straightforward. Use the table below to determine which obligation applies before you design any study.

| Scenario | Obligation | Why |
|----------|-----------|-----|
| CE/UKCA-marked IVD used exactly per the IFU | Verification | Manufacturer has validated the intended use |
| CE/UKCA-marked IVD used off-label (outside the IFU) | Validation | Claim no longer covers your use |
| Laboratory-developed test (LDT / "in-house") | Validation | No manufacturer claim exists |
| Modified commercial method (matrix, sample type, dilution) | Validation of the change | The change invalidates the original claim |
| CE/UKCA assay with no published performance claim | Validation | Nothing to verify against |

A useful rule of thumb: if you cannot point to a documented, relevant performance claim, you are validating. In either case ISO 15189:2022 requires the laboratory to record the procedure and the results obtained, and to have appropriately authorised personnel review them before the method goes live.
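The logic of the table and the rule of thumb can be written down as a short decision function. This is only a sketch: the `MethodContext` class and its field names are hypothetical, invented here for illustration, and a real decision should still be recorded and justified in the validation/verification plan.

```python
from dataclasses import dataclass


@dataclass
class MethodContext:
    """Hypothetical description of a method; field names are illustrative."""
    has_conformity_mark: bool      # CE- or UKCA-marked IVD
    used_per_ifu: bool             # used exactly as the IFU specifies
    modified: bool                 # matrix, sample type or dilution changed
    has_performance_claim: bool    # a documented, relevant claim exists


def obligation(ctx: MethodContext) -> str:
    """Return 'verification' or 'validation' following the table above."""
    if not ctx.has_conformity_mark:
        return "validation"        # laboratory-developed test
    if ctx.modified or not ctx.used_per_ifu:
        return "validation"        # the claim no longer covers this use
    if not ctx.has_performance_claim:
        return "validation"        # nothing to verify against
    return "verification"


if __name__ == "__main__":
    print(obligation(MethodContext(True, True, False, True)))
    print(obligation(MethodContext(True, False, False, True)))
```

Only one path leads to verification; every other branch ends in validation, which mirrors the rule of thumb.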

Setting Acceptance Criteria Before You Start

The single biggest mistake in method evaluation is collecting data first and deciding what is acceptable afterwards. Acceptance criteria must be defined before the study runs, or you are simply rationalising whatever numbers you happen to get. ISO 15189:2022 expects performance to be judged against requirements appropriate to the intended clinical use.

In current UK practice, analytical performance specifications (APS) are most often derived using the hierarchy agreed at the 2014 Milan Strategic Conference and maintained by the European Federation of Clinical Chemistry and Laboratory Medicine (EFLM):

1. Model 1 (outcome-based): the effect of analytical performance on clinical outcomes (the ideal, but rarely available).
2. Model 2 (biological variation): specifications derived from within-subject and between-subject biological variation, using the EFLM Biological Variation Database; this is the most widely used model in clinical chemistry.
3. Model 3 (state of the art): the highest performance technically achievable, judged from External Quality Assessment (EQA) data; used when the higher models do not apply.

For each analyte, document the source of your specification (for example a biological-variation-derived allowable imprecision, bias and total allowable error, or a relevant EQA scheme limit). Where no formal APS exists — as for many qualitative or specialist assays — state your criteria qualitatively and justify them clinically.
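As an illustration of Model 2, the sketch below applies the commonly cited "desirable" biological-variation formulae: allowable imprecision of half the within-subject CV, allowable bias of a quarter of the combined within- and between-subject CV, and total allowable error as 1.65 times the allowable imprecision plus the allowable bias. The function name and the example CVs are hypothetical; take real values from the EFLM Biological Variation Database and confirm that this tier of specification suits your measurand.

```python
import math


def desirable_aps(cv_within: float, cv_between: float) -> dict[str, float]:
    """Desirable specifications (all in %) from biological variation CVs."""
    imprecision = 0.5 * cv_within
    bias = 0.25 * math.sqrt(cv_within ** 2 + cv_between ** 2)
    total_error = 1.65 * imprecision + bias
    return {
        "allowable_imprecision": imprecision,
        "allowable_bias": bias,
        "total_allowable_error": total_error,
    }


if __name__ == "__main__":
    # Hypothetical CVs chosen only to make the arithmetic easy to follow.
    for name, value in desirable_aps(cv_within=5.0, cv_between=12.0).items():
        print(f"{name}: {value:.2f}%")
```

Recording the inputs alongside the outputs makes the source of each acceptance criterion easy for an assessor to trace.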

Precision: Repeatability and Within-Laboratory Imprecision

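Precision describes how close repeated measurements are to one another; the term you report is its inverse, imprecision, expressed as a coefficient of variation (CV%). Two components matter: repeatability, the variation between replicates within a single run under unchanged conditions, and within-laboratory imprecision, the variation seen across days, runs, operators and calibrations in the same laboratory.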

A typical verification design follows CLSI guideline EP15 (User Verification of Precision and Estimation of Bias): measure two or more concentrations near clinically important decision points, in replicate over five days, then compare the observed imprecision against the manufacturer's claim. A full validation uses a more rigorous design such as CLSI EP05 (Evaluation of Precision), commonly a 20-day, two-runs-per-day, two-replicates-per-run protocol.
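For a balanced replicate-over-days design of the kind described above, repeatability and within-laboratory imprecision can be separated with a one-way analysis of variance. The sketch below is a simplified illustration rather than the full CLSI procedure: the function name and data layout are assumptions, and the formal comparison against the manufacturer's claim (including any verification limit the guideline defines) is not reproduced.

```python
import math
import statistics


def precision_components(days: list[list[float]]) -> dict[str, float]:
    """One-way ANOVA on a balanced design: one list of replicates per day."""
    k = len(days)                       # number of days
    n = len(days[0]) if days else 0     # replicates per day
    if k < 2 or n < 2 or any(len(d) != n for d in days):
        raise ValueError("need a balanced design, >= 2 days and >= 2 replicates")

    day_means = [statistics.mean(d) for d in days]
    grand_mean = statistics.mean(day_means)

    ss_within = sum((x - m) ** 2 for d, m in zip(days, day_means) for x in d)
    ms_within = ss_within / (k * (n - 1))
    ms_between = n * sum((m - grand_mean) ** 2 for m in day_means) / (k - 1)

    var_repeat = ms_within
    # A negative between-day estimate is truncated to zero.
    var_between = max(0.0, (ms_between - ms_within) / n)

    sd_repeat = math.sqrt(var_repeat)
    sd_within_lab = math.sqrt(var_repeat + var_between)
    return {
        "mean": grand_mean,
        "sd_repeatability": sd_repeat,
        "sd_within_lab": sd_within_lab,
        "cv_repeatability": 100.0 * sd_repeat / grand_mean,
        "cv_within_lab": 100.0 * sd_within_lab / grand_mean,
    }
```

Run it once per concentration level, then compare each CV against both the manufacturer's claim and the acceptance criterion you set in advance.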

Practical points:

Trueness and Bias: Are Your Results Correct?

Trueness is the closeness of agreement between the average of many measurements and a true or accepted reference value; the quantitative expression of the gap is bias. Verifying trueness establishes that your method does not systematically read high or low. Acceptable approaches, in descending order of strength, include:

1. Analysis of a certified reference material with a metrologically traceable assigned value.
2. Analysis of an EQA / proficiency testing sample or a commutable trueness control with a consensus or reference-method target.
3. Method comparison against an established, traceable method (CLSI EP09 principles), using patient samples across the range and assessing agreement with Passing–Bablok or Deming regression and a Bland–Altman difference plot.

For verification, demonstrating that your bias falls within the manufacturer's claim and your APS is usually sufficient. Record the comparator, the number of samples, the regression statistics and the conclusion. ISO 15189:2022 expects metrological traceability of calibration wherever a higher-order reference exists — bias studies are how you demonstrate it in practice.
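The numerical summary behind a Bland–Altman difference plot is simple to compute. The sketch below returns the mean difference (the bias estimate) and the conventional 95% limits of agreement at 1.96 standard deviations; the function name and the paired results are hypothetical, and regression analysis (Passing–Bablok or Deming) would need a dedicated statistics package that is not shown here.

```python
import statistics


def bland_altman(candidate: list[float], comparator: list[float]):
    """Return (mean difference, lower limit, upper limit) for paired results."""
    if len(candidate) != len(comparator) or len(candidate) < 2:
        raise ValueError("need at least two paired results of equal length")
    diffs = [c - r for c, r in zip(candidate, comparator)]
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)
    return mean_diff, mean_diff - 1.96 * sd_diff, mean_diff + 1.96 * sd_diff


if __name__ == "__main__":
    # Hypothetical paired patient results (candidate vs established method).
    new_method = [10.5, 20.0, 31.0, 40.5]
    old_method = [10.0, 20.0, 30.0, 40.0]
    bias, lower, upper = bland_altman(new_method, old_method)
    print(f"bias {bias:.2f}, limits of agreement {lower:.2f} to {upper:.2f}")
```

A real comparison needs far more than four pairs and should span the measuring range; the tiny data set here only keeps the arithmetic visible.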

Linearity and the Analytical Measuring Range

Linearity confirms that the relationship between analyte concentration and measured signal is proportional across the range, so that a result reported at, say, 40 means twice as much analyte as a result of 20.

Document where the method becomes non-linear at the extremes, as that defines the limits within which numerical results may be authorised. Results outside the analytical measuring range must be reported with the appropriate convention (for example "<" or ">", or with a verified dilution).
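One simple way to look for non-linearity is to fit a straight line to a dilution series and inspect how far each level sits from the fitted line. The sketch below does this with ordinary least squares. It is a simplified check under assumed inputs (the function name and data layout are hypothetical), not a substitute for a formal linearity protocol, and the allowable deviation must come from your own pre-defined acceptance criteria.

```python
import statistics


def linearity_check(expected: list[float], measured: list[float]):
    """Fit measured = intercept + slope * expected; return percent deviations."""
    if len(expected) != len(measured) or len(expected) < 3:
        raise ValueError("need at least three paired levels")
    x_bar = statistics.mean(expected)
    y_bar = statistics.mean(measured)
    sxx = sum((x - x_bar) ** 2 for x in expected)
    if sxx == 0:
        raise ValueError("expected values must not all be identical")
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(expected, measured))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar

    deviations = []
    for x, y in zip(expected, measured):
        fitted = intercept + slope * x
        # Percent deviation is undefined where the fitted value is zero.
        deviations.append(100.0 * (y - fitted) / fitted if fitted else float("nan"))
    return slope, intercept, deviations
```

Levels whose deviation exceeds the allowable limit, typically at the top or bottom of the series, mark where the usable range should be truncated.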

Limit of Blank, Detection and Quantitation

For analytes where low concentrations carry clinical weight (therapeutic drugs, tumour markers, troponin, viral loads), the laboratory must establish the lowest concentrations it can reliably detect and report. Following CLSI EP17 (Evaluation of Detection Capability), three terms are distinguished: the Limit of Blank (LoB), the Limit of Detection (LoD) and the Limit of Quantitation (LoQ).

For verification, replicate measurement of blank and low-level samples to confirm the manufacturer's claimed LoD/LoQ is typically adequate. For validation, the full EP17 estimation is required. Always make clear that LoD and LoQ are not interchangeable: reporting a number at a concentration between LoD and LoQ implies a precision the method does not have.
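The classical parametric estimates of LoB and LoD can be sketched in a few lines: LoB as the blank mean plus 1.645 standard deviations, and LoD as the LoB plus 1.645 standard deviations of a low-level sample. This assumes roughly Gaussian results and is a simplification of the full EP17 procedure, which includes corrections and non-parametric options not shown here. The function name and replicate values are hypothetical.

```python
import statistics


def detection_limits(blank_results: list[float], low_results: list[float]):
    """Return (LoB, LoD) from replicate blank and low-level measurements."""
    if len(blank_results) < 2 or len(low_results) < 2:
        raise ValueError("need at least two replicates of each sample type")
    lob = statistics.mean(blank_results) + 1.645 * statistics.stdev(blank_results)
    lod = lob + 1.645 * statistics.stdev(low_results)
    return lob, lod


if __name__ == "__main__":
    # Hypothetical replicates; real studies use many more measurements.
    blanks = [0.0, 0.2, 0.1, 0.1]
    lows = [0.5, 0.7, 0.6, 0.6]
    lob, lod = detection_limits(blanks, lows)
    print(f"LoB {lob:.3f}, LoD {lod:.3f}")
```

LoQ is not derived from these formulae: it is the lowest concentration at which your own pre-defined imprecision and bias goals are met, so it needs a separate precision study at low levels.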

Reference Intervals: Establish, Transfer or Verify

A method is not ready for service until the result can be interpreted. Reference intervals (and, for some tests, clinical decision limits) make that possible. ISO 15189:2022 requires the laboratory to define biological reference intervals, periodically review them, and inform users of any change. CLSI guideline EP28 describes three routes: establishing a new interval, transferring an existing interval, or verifying a proposed interval.

Document the source of the interval, the population partitions (age, sex, and where relevant ethnicity or pregnancy), and the verification outcome. Intervals from a different analytical platform are not automatically transferable, and some measurands use harmonised, guideline-driven decision limits rather than population-derived intervals.
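The widely used verification check (20 reference individuals, with the interval accepted if no more than 2 results fall outside it) is easy to automate. The sketch below assumes the interval limits are inclusive; the function name, parameter names and example values are hypothetical.

```python
def verify_reference_interval(
    results: list[float],
    lower: float,
    upper: float,
    required_n: int = 20,
    max_outside: int = 2,
) -> tuple[int, bool]:
    """Return (number of results outside the interval, whether it is verified)."""
    if len(results) < required_n:
        raise ValueError(f"need at least {required_n} reference individuals")
    outside = sum(1 for r in results if r < lower or r > upper)
    return outside, outside <= max_outside


if __name__ == "__main__":
    # Hypothetical results from 20 healthy reference individuals.
    values = [4.1, 4.5, 5.0, 5.2, 4.8, 4.9, 5.5, 6.1, 4.3, 4.7,
              5.1, 5.3, 4.6, 5.8, 3.4, 4.4, 5.0, 5.6, 4.2, 4.9]
    count, verified = verify_reference_interval(values, lower=3.5, upper=6.0)
    print(f"{count} outside, verified: {verified}")
```

Run the check separately for each population partition, since an interval can verify for one group and fail for another.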

Measurement Uncertainty: The Number Behind the Number

Clause 7.3.4 of ISO 15189:2022 requires laboratories to evaluate measurement uncertainty (MU) for quantitative measurands and to make estimates available to users on request. MU characterises the dispersion of values that could reasonably be attributed to the measurand — in plain terms, how much the reported result might vary if you measured the same sample again.

The pragmatic, widely adopted "top-down" approach builds MU from data you already generate:

1. Take the within-laboratory imprecision from your internal quality control over a representative period as the principal random component.
2. Add the uncertainty of the bias correction / calibrator value, drawn from EQA performance and the calibrator certificate where available.
3. Combine these to give a combined standard uncertainty, then multiply (commonly by a coverage factor k = 2) to give the expanded uncertainty at approximately 95% confidence.

Compare your MU against an allowable MU target derived from the same APS hierarchy used for your acceptance criteria. MU is not a one-off exercise: it must be reviewed as QC data accumulate and is increasingly scrutinised at UKAS assessment, so keep the calculation transparent and traceable.
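The three top-down steps reduce to a root-sum-of-squares combination followed by multiplication by the coverage factor. The sketch below assumes both components are already expressed as standard uncertainties in the same units (for example both as CV%) and are independent; the function and parameter names are hypothetical.

```python
import math


def expanded_uncertainty(u_imprecision: float, u_bias: float, k: float = 2.0):
    """Return (combined standard uncertainty, expanded uncertainty)."""
    if u_imprecision < 0 or u_bias < 0 or k <= 0:
        raise ValueError("uncertainties must be non-negative and k positive")
    u_combined = math.sqrt(u_imprecision ** 2 + u_bias ** 2)
    return u_combined, k * u_combined


if __name__ == "__main__":
    # Hypothetical components, both in CV%.
    u_c, big_u = expanded_uncertainty(u_imprecision=3.0, u_bias=4.0)
    print(f"combined {u_c:.1f}%, expanded (k=2) {big_u:.1f}%")
```

Keeping the inputs, the period of QC data used and the coverage factor alongside the result makes the calculation easy to re-run at each review.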

Documenting and Authorising the Method for Go-Live

Evidence is only useful if organised and authorised. Before go-live, assemble a single, retrievable validation/verification file containing:

This file underpins the controlled examination procedure (the SOP), feeds the UKAS schedule of accreditation, and is the first thing an assessor asks for when a method is questioned. A useful current reference is the joint Validation Guidance for Medical Laboratories produced by the Institute of Biomedical Science (IBMS), the Association for Laboratory Medicine (LabMed) and the Medicines and Healthcare products Regulatory Agency (MHRA), with input from the United Kingdom Accreditation Service (UKAS), which sets out a principles-based, risk-proportionate framework for how UK laboratories should approach IVD verification and validation in line with ISO 15189.

Frequently Asked Questions

What is the difference between validation and verification under ISO 15189:2022?

Verification (clause 7.3.2) confirms you can reproduce a method's already-validated performance claims — typically the manufacturer's claims for a CE-marked or UKCA-marked IVD used exactly as instructed. Validation (clause 7.3.3) is the fuller generation of performance evidence from scratch, required for laboratory-developed tests, off-label use, or modified methods where no relevant claim exists. The simplest test: if there is no documented, relevant performance claim to check against, you are validating.

How many samples do I need to verify a reference interval?

The widely used CLSI EP28 verification route uses 20 healthy reference individuals from your target population. If no more than 2 of the 20 results (10%) fall outside the proposed interval, it can be adopted; if more than 2 fall outside, investigate and consider transferring a different interval or establishing your own. Establishing a brand-new interval is far more demanding, classically needing at least 120 reference individuals per partition.

Do I have to calculate measurement uncertainty for every test?

ISO 15189:2022 (clause 7.3.4) requires measurement uncertainty for quantitative measurands and that estimates be available to users on request; it is not expected in the same way for purely qualitative results. Most UK laboratories use the pragmatic top-down approach, combining within-laboratory imprecision from internal quality control with the uncertainty of the calibrator and bias correction, then reporting an expanded uncertainty at about 95% confidence. It must be kept under review as quality control data accumulate.

What are LoB, LoD and LoQ, and why does the distinction matter?

The Limit of Blank is the highest result expected from an analyte-free sample; the Limit of Detection is the lowest concentration reliably distinguished from blank; and the Limit of Quantitation is the lowest concentration measurable with acceptable precision and bias. They matter because reporting a numerical result between the LoD and LoQ implies an accuracy the method does not possess. The LoQ, not the LoD, defines the genuine lower end of the reportable range.

Where do I get acceptance criteria from if there is no published specification?

Use the Milan / EFLM hierarchy of analytical performance specifications: outcome-based first, then biological-variation-derived (from the EFLM Biological Variation Database), then state-of-the-art based on EQA performance. Where none of these apply — as with many qualitative or niche assays — define your criteria qualitatively, justify them against the clinical use of the result, and record that reasoning so an assessor can follow it.

Does a UKCA or CE mark mean I can skip evaluation?

No. A UKCA or CE mark means the manufacturer has validated the device for its stated intended use, but ISO 15189:2022 still requires your laboratory to verify you can reproduce the relevant performance claims in your own setting before clinical use. If you use the assay outside its instructions for use — a different sample type, matrix or dilution, for example — the claim no longer applies and you must perform a full validation of that use.
