Tags: acceptance criteria, validation protocol, audit readiness, IQ OQ PQ, regulated manufacturing, traceability, GAMP 5, data integrity

How to Write Acceptance Criteria That Won't Get Flagged in an Audit

Manufacturing Automation Engineer | April 27, 2026 | 9 min read

You have run the test. The results look good. The operator signed off. Then an auditor sits down with your protocol and asks: "What does 'within acceptable range' mean?"

You don't have a good answer. Neither does the protocol.

That is the moment vague acceptance criteria become a finding. Not because the process failed. Not because the product was defective. Because the documentation could not demonstrate, on its own, that anything was actually controlled.

Acceptance criteria are consistently among the most heavily reviewed elements in a validation protocol. They are the line where "we tested it" becomes "we proved it." Everything upstream (the URS, the design qualification, the risk assessment) is context. The acceptance criteria are the verdict. If they are soft, subjective, or untraceable, an auditor has no choice but to question the entire test.

This post covers exactly what makes criteria auditable, the four failure modes that generate findings, and a practical structure you can use to write criteria that hold up to a typical audit review.

---

What Makes Acceptance Criteria Auditable

An auditable acceptance criterion has three properties: it is measurable, it was defined before testing, and it traces back to a documented source.

Measurable means a trained reviewer, without any additional context, can determine whether a result passes or fails by reading the criterion alone. Numbers, units, and a stated measurement method are the minimum. If interpretation is required, the criterion is not measurable.

Pre-defined means the limit was written before anyone looked at a result. Pre-defining criteria is not just good practice. Under EU Annex 15 (Qualification and Validation), 21 CFR Part 820.75 (Process Validation), the FDA Process Validation Guidance, EU Annex 11, and GAMP 5, the expectation is that acceptance criteria are established and approved before protocol execution. A criterion written or adjusted after a result is known, without a formal deviation and re-approval, is at minimum a documentation control failure and can rise to a data integrity finding.

Traceable means there is a documented path from the criterion back to a specification, standard, or regulatory requirement. A temperature range does not exist in isolation. It comes from a process parameter in a batch record, a manufacturer specification in a manual, a pharmacopeial standard, or a risk-based limit defined in a PFMEA. If an auditor cannot follow that path, the criterion has no foundation.

If a criterion satisfies all three, it will survive review. If it fails any one of them, it is a finding waiting to happen.

---

The Four Failure Modes Auditors Flag

These are the patterns that generate observations in many validation audits. If you recognize your own protocols in this list, fix them before the next review.

1. Vague Language

"Within acceptable range." "Appears normal." "As expected." "No anomalies observed." These phrases communicate nothing. They describe a feeling, not a result. An auditor cannot verify that a result passed a criterion that has no defined boundary. They will flag this as a missing or inadequate acceptance criterion, which typically requires a deviation, a retrospective justification, and sometimes a requalification.

The fix is always the same: replace the subjective phrase with a numeric limit, a defined classification scale, or a binary pass/fail condition that is itself defined in a referenced procedure.
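As a quick authoring aid, the most common subjective phrases can be caught with a simple lint pass before peer review. This is an illustrative sketch, not a substitute for review; the phrase list is an assumption drawn from the examples above and should be extended per your own QMS.

```python
import re

# Subjective phrases that tend to draw audit observations
# (illustrative list -- extend with your own flagged wording)
VAGUE_PHRASES = [
    "within acceptable range",
    "appears normal",
    "as expected",
    "no anomalies observed",
    "approximately",
]

def flag_vague_language(criterion: str) -> list[str]:
    """Return the subjective phrases found in an acceptance criterion."""
    return [p for p in VAGUE_PHRASES
            if re.search(p, criterion, re.IGNORECASE)]

hits = flag_vague_language(
    "Chamber temperature shall remain within acceptable range.")
# A non-empty result means the criterion needs a numeric limit or a
# defined pass/fail condition before the protocol is released.
```

A check like this belongs in authoring, not execution: by the time a protocol is released, the list should return nothing.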

2. Missing Units

"Temperature shall not exceed 25" means nothing without a unit. Celsius and Fahrenheit are not interchangeable. A criterion missing its unit is incomplete. Auditors will flag it. More importantly, an operator executing the protocol has no way to determine compliance without making an assumption, which introduces risk.

Every numeric criterion needs a unit. Every time. No exceptions.

3. Criteria Set After Results Are Known

This is the documentation integrity and change-control failure mode, and depending on what happened to the records, it can rise to a data integrity finding under ALCOA+ expectations. If a protocol was executed, results were recorded, and then the acceptance criteria section was completed (or amended without a formal deviation and re-approval process), the documentation record is compromised. The appearance of backdating or retrofitting criteria to match results is treated as seriously as actual backdating.

Even if the intent was innocent (say, a template was used and the criteria column was accidentally left blank during authoring), the execution record with blank criteria represents a protocol that should never have been released for execution.

4. No Traceability to Source Specification

A criterion that appeared from nowhere is not a criterion. It is a guess. If a protocol states that a temperature excursion limit is 2 to 8 degrees Celsius but there is no reference to the approved storage specification, the validated operating range, or a regulatory standard that establishes that limit, an auditor will ask where it came from. If you cannot answer that question from the document itself, it is a finding.

Traceability does not require repeating the entire specification in the criterion. It requires a reference: a document number, a section, a version. That reference is the audit trail.

---

How to Write Criteria That Pass

The structure that works is this:

[Parameter] shall [verb] [numeric limit] [units] as measured by [method], per [source document, section].

That is the template. Use it. Adapt it. But keep all five components.
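The five components can be treated as required fields, which makes completeness mechanically checkable during authoring. A minimal sketch (the field names are my own, not from any standard):

```python
from dataclasses import dataclass, fields

@dataclass
class AcceptanceCriterion:
    parameter: str  # e.g. "Chamber temperature at each mapped location"
    limit: str      # e.g. "between 2.0 and 8.0"
    units: str      # e.g. "degrees Celsius"
    method: str     # e.g. "calibrated data logger per SOP-CALIB-014"
    source: str     # e.g. "DS-0042 Rev. C, Section 4.2"

    def missing_components(self) -> list[str]:
        """List any of the five components left blank."""
        return [f.name for f in fields(self)
                if not getattr(self, f.name).strip()]
```

A criterion whose `missing_components()` list is non-empty should never reach the approval step, let alone execution.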

Here are before-and-after examples across common test types.

---

Example 1: Temperature Mapping (OQ)

Before:

> Chamber temperature shall remain within acceptable range during the hold period.

After:

> Chamber temperature at each mapped location shall be between 2.0 and 8.0 degrees Celsius throughout the 24-hour hold period, as measured by calibrated thermocouple data logger (per SOP-CALIB-014), per the Approved Storage Specification DS-0042 Rev. C, Section 4.2.

What changed: the subjective phrase is gone. The limit is numeric. The unit is explicit. The measurement method is named. The source is cited.
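A side benefit of the rewrite: the criterion is now machine-checkable against the logger export. A sketch, assuming the export reduces to a list of (location, reading) pairs over the hold period:

```python
def out_of_range(readings, low=2.0, high=8.0):
    """Return (location, temperature) pairs outside the 2.0-8.0 degC
    limits. `readings` is an iterable of (location_id, temp_degC)
    tuples from the mapped data loggers; every reading over the full
    hold period must fall within the limits for the test to pass.
    Bounds are inclusive: 2.0 and 8.0 both pass."""
    return [(loc, t) for loc, t in readings if not (low <= t <= high)]

data = [("TC-01", 4.1), ("TC-02", 8.3), ("TC-03", 2.0)]
failures = out_of_range(data)  # TC-02 exceeds the upper limit
passed = not failures
```

Note that whether the boundary values themselves pass is a decision the criterion must state explicitly ("between 2.0 and 8.0" is usually read as inclusive, but say so if there is any doubt).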

---

Example 2: Fill Volume (PQ)

Before:

> Fill volume shall be approximately correct for each filled unit sampled.

After:

> Fill volume for each sampled unit shall be between 9.8 mL and 10.2 mL, as measured by gravimetric analysis per SOP-QC-031, per Product Specification PS-1105 Rev. B, Section 3.1 (nominal fill 10.0 mL plus or minus 2 percent).

What changed: "approximately correct" is replaced with a defined range derived from the product specification. The measurement method is cited. The relationship between the tolerance and the source spec is transparent.

---

Example 3: Cleaning Validation Final Rinse Conductivity (OQ)

Before:

> Rinse water conductivity shall meet the expected value for purified water.

After:

> Final rinse water conductivity shall be no greater than 1.3 µS/cm at 25 degrees Celsius, as measured by calibrated inline conductivity meter (per SOP-CALIB-007), per USP <645> Water Conductivity, Stage 1 limit. (This verifies that the final rinse water itself meets purified water specification. Residue removal from product contact surfaces is verified separately by a process-specific limit derived from a MACO calculation, swab and rinse recovery studies, and the cleaning validation strategy.)

What changed: "expected value" is replaced with the specific limit from a cited pharmacopeial standard, and the criterion is properly scoped to water quality verification rather than residue removal.

---

Example 4: Software Alarm Response (CSV / OQ)

Before:

> The system shall generate an alarm when temperature exceeds the set point.

After:

> When chamber temperature exceeds 8.5 degrees Celsius, the system shall generate an audible alarm and a visible alert on the operator interface within 30 seconds of the exceedance, as verified by review of the system event log timestamps for the temperature exceedance event and the alarm event, per Functional Specification FS-0088 Rev. A, Section 6.3.

What changed: "generates an alarm" tells an auditor nothing about what the alarm looks like, how it is triggered, or how fast it must respond. The revised criterion defines all three, and uses the system's own event log for the timing verification rather than human reaction time.
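The event-log verification reduces to a timestamp subtraction. A sketch, assuming the exported log provides ISO 8601 timestamps for the two events:

```python
from datetime import datetime

def alarm_latency_ok(exceedance_ts: str, alarm_ts: str,
                     max_seconds: float = 30.0) -> bool:
    """Check that the alarm event followed the exceedance event within
    the allowed window, using the system's own log timestamps rather
    than human observation. Rejects alarms logged before the
    exceedance (a sign of clock or sequencing problems)."""
    t0 = datetime.fromisoformat(exceedance_ts)
    t1 = datetime.fromisoformat(alarm_ts)
    delta = (t1 - t0).total_seconds()
    return 0 <= delta <= max_seconds

alarm_latency_ok("2026-04-27T10:15:02", "2026-04-27T10:15:19")  # 17 s -> True
```

The timestamp resolution of the event log becomes part of the measurement method here, so the protocol should confirm the log resolution is adequate for a 30-second limit.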

---

Handling Ranges vs. Single-Point Limits

The type of limit you use matters as much as the limit itself.

Use a two-sided range (min and max) when both directions of deviation represent a real risk. Fill volume, pH, temperature hold ranges, and humidity controls almost always require two-sided limits. The range should be derived from the specification, not guessed.

Use a one-sided limit (no greater than, no less than) when only one direction of deviation is a risk. Bioburden limits, residual cleaning agent concentrations, and particle counts are typically one-sided. Using a two-sided limit here introduces an artificial lower bound that has no scientific or regulatory basis and creates unnecessary failures.

Use an exact value only when the test is binary by nature: a relay either closes at the specified voltage or it does not. For most process parameters, an exact value is inappropriate because it cannot account for measurement system variation.

State the statistical basis when using a limit derived from process capability or sampling plans. If a criterion applies to an AQL-based sample rather than 100 percent inspection, the protocol must state the AQL level, the sampling plan reference (ANSI/ASQ Z1.4 for attributes inspection or Z1.9 for variables inspection, for example), and whether the limit is applied to the individual result or the lot disposition.
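The three limit types evaluate differently, and conflating them is a common authoring error. A sketch of each evaluation (the numeric values are illustrative, taken from the examples earlier in this post):

```python
def two_sided(value, low, high):
    """Both directions of deviation are a risk (e.g. fill volume)."""
    return low <= value <= high

def one_sided_max(value, limit):
    """Only the upper direction is a risk (e.g. bioburden,
    conductivity). No artificial lower bound."""
    return value <= limit

def exact(value, target):
    """Binary by nature (e.g. relay state); no tolerance band."""
    return value == target

two_sided(10.1, 9.8, 10.2)  # fill volume within spec -> True
one_sided_max(0.9, 1.3)     # conductivity under the Stage 1 limit -> True
exact(1, 1)                 # relay closed as specified -> True
```

Writing the limit type into the criterion text ("between", "no greater than", "shall equal") is what lets an executor apply the right evaluation without guessing.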

---

Traceability: Linking Criteria Back to Specifications

Traceability is the chain that connects a test result to a regulatory requirement. It is what allows an auditor to answer the question: "How do you know this limit is correct?"

The chain typically looks like this:

Regulatory requirement or standard defines the general control. User Requirement Specification (URS) captures what the system must do to meet that requirement. Design qualification (DQ) or functional specification translates the URS requirement into engineering parameters. Manufacturer specification or pharmacopeial standard provides the numeric basis for the limit. Acceptance criterion in the protocol cites the specification and reflects the approved limit.

Every criterion in a protocol should be traceable to at least one link in that chain. What auditors look for is that the reference is documented, the version is identified, and the criterion in the protocol matches the cited source.

A mismatch between the criterion and the cited source, even a rounding difference, is an observation. Pull the source document and confirm the numbers match exactly before the protocol goes to review.
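That exact-match check can itself be automated at review time. A sketch, assuming the limits are captured as decimal strings from the protocol and the cited source document:

```python
from decimal import Decimal

def limits_match(protocol_limit: str, source_limit: str) -> bool:
    """True only if the protocol limit equals the cited source limit
    as a number. '8.0' vs '8.00' is the same value in different
    notation and compares equal; a rounding difference like
    '8.0' vs '8.05' does not."""
    return Decimal(protocol_limit) == Decimal(source_limit)

limits_match("8.0", "8.00")  # same value, different notation -> True
limits_match("8.0", "8.05")  # rounding mismatch -> False
```

Using `Decimal` rather than `float` avoids binary floating-point artifacts turning a correct transcription into a false mismatch.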

For a deeper look at how auditors evaluate the overall documentation package, see What Auditors Actually Look for in Validation Documentation.

If you are working through a full OQ protocol and need the broader structure around acceptance criteria, How to Write an OQ Protocol From Scratch covers the protocol architecture that makes criteria traceable from the start.

---

Common Edge Cases

Visual Inspections

Visual inspection criteria are the most commonly miswritten criteria in validation documentation. "No visible contamination" is not an acceptance criterion. It is a hope.

Auditable visual inspection criteria do one of two things: they define the defect classification system being applied, or they specify the reference standard being used for comparison. For cosmetic or appearance inspections, reference a written defect classification procedure. For visible particles in injectables, reference USP <790>. For ophthalmic preparations, USP <771> applies. For other contexts, reference the relevant compendial method or a validated inspection procedure with a documented acceptance standard. The standard you cite must match the product type.

Software Behavior

Software acceptance criteria are frequently written as functional descriptions rather than verifiable tests. "The system shall log all events" is not an acceptance criterion. "The system shall record a timestamped audit trail entry for each of the following user actions" followed by an enumerated list is an acceptance criterion.

For each software test, the criterion should define the specific output, the expected state, and the method of verification. Criteria for CSV testing are also expected to trace to the functional specification or software requirements specification.

Environmental Monitoring

For a Grade B cleanroom, the particle count criterion must cite EU GMP Annex 1, which defines the acceptance limits for Grade A, B, C, and D in operation and at rest. ISO 14644-1 may also be referenced for the cleanroom classification methodology, but the acceptance limits themselves come from Annex 1 for GMP areas.

---

Getting Criteria Right Before Execution

Acceptance criteria cannot be an afterthought. They are not the last box to fill in before the protocol goes for signatures. They are the specification against which the entire test will be evaluated.

The time to get criteria right is during protocol authoring, during the peer review, and during the formal approval step. By the time a protocol is released for execution, every criterion should be measurable, pre-defined, and traceable to a source document that has been verified to exist and to contain the stated limit.

If you are unsure which tests in a given qualification phase require formal acceptance criteria versus operational checks, How to Determine if Equipment Needs IQ Only or Full IQ/OQ/PQ covers the scoping decision that determines your documentation requirements.

Valiqa generates acceptance criteria and regulatory mappings as part of protocol authoring, structured around the parameter, method, numeric limit, unit, and source reference, with a built-in traceability matrix.

---

Valiqa is an AI-powered validation lifecycle platform for regulated manufacturing. Learn more at valiqa.io
