Tags: acceptance criteria, validation protocol, audit readiness, IQ OQ PQ, regulated manufacturing, traceability, GAMP 5, data integrity

How to Write Acceptance Criteria That Won't Get Flagged in an Audit

Manufacturing Automation Engineer | April 27, 2026 | 9 min read

You have run the test. The results look good. The operator signed off. Then an auditor sits down with your protocol and asks: "What does 'within acceptable range' mean?"

You don't have a good answer. Neither does the protocol.

That is the moment vague acceptance criteria become a finding. Not because the process failed. Not because the product was defective. Because the documentation could not demonstrate, on its own, that anything was actually controlled.

Acceptance criteria are consistently among the most heavily reviewed elements in a validation protocol. They are the line where "we tested it" becomes "we proved it." Everything upstream (the URS, the design qualification, the risk assessment) is context. The acceptance criteria are the verdict. If they are soft, subjective, or untraceable, an auditor has no choice but to question the entire test.

This post covers exactly what makes criteria auditable, the four failure modes that generate findings, and a practical structure you can use to write criteria that hold up to a typical audit review.

---

What Makes Acceptance Criteria Auditable

An auditable acceptance criterion has three properties: it is measurable, it was defined before testing, and it traces back to a documented source.

Measurable means a trained reviewer, without any additional context, can determine whether a result passes or fails by reading the criterion alone. Numbers, units, and a stated measurement method are the minimum. If interpretation is required, the criterion is not measurable.

Pre-defined means the limit was written before anyone looked at a result. Pre-defining criteria is not just good practice. Under EU Annex 15 (Qualification and Validation), 21 CFR Part 820.75 (Process Validation), the FDA Process Validation Guidance, EU Annex 11, and GAMP 5, the expectation is that acceptance criteria are established and approved before protocol execution. A criterion written or adjusted after a result is known, without a formal deviation and re-approval, is at minimum a documentation control failure and can rise to a data integrity finding.

Traceable means there is a documented path from the criterion back to a specification, standard, or regulatory requirement. A temperature range does not exist in isolation. It comes from a process parameter in a batch record, a manufacturer specification in a manual, a pharmacopeial standard, or a risk-based limit defined in a PFMEA. If an auditor cannot follow that path, the criterion has no foundation.

If a criterion satisfies all three, it will survive review. If it fails any one of them, it is a finding waiting to happen.

---

The Four Failure Modes Auditors Flag

These are the patterns that generate observations in many validation audits. If you recognize your own protocols in this list, fix them before the next review.

1. Vague Language

"Within acceptable range." "Appears normal." "As expected." "No anomalies observed." These phrases communicate nothing. They describe a feeling, not a result. An auditor cannot verify that a result passed a criterion that has no defined boundary. They will flag this as a missing or inadequate acceptance criterion, which typically requires a deviation, a retrospective justification, and sometimes a requalification.

The fix is always the same: replace the subjective phrase with a numeric limit, a defined classification scale, or a binary pass/fail condition that is itself defined in a referenced procedure.
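As a quick authoring aid, the most common subjective phrases can be caught with a simple lint pass before peer review. This is an illustrative sketch, not a substitute for review; the phrase list is an assumption drawn from the examples above and should be extended per your own QMS.

```python
import re

# Subjective phrases that tend to draw audit observations
# (illustrative list -- extend with your own flagged wording)
VAGUE_PHRASES = [
    "within acceptable range",
    "appears normal",
    "as expected",
    "no anomalies observed",
    "approximately",
]

def flag_vague_language(criterion: str) -> list[str]:
    """Return the subjective phrases found in an acceptance criterion."""
    return [p for p in VAGUE_PHRASES
            if re.search(p, criterion, re.IGNORECASE)]

hits = flag_vague_language(
    "Chamber temperature shall remain within acceptable range.")
# A non-empty result means the criterion needs a numeric limit or a
# defined pass/fail condition before the protocol is released.
```

A check like this belongs in authoring, not execution: by the time a protocol is released, the list should return nothing.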

2. Missing Units

"Temperature shall not exceed 25" means nothing without a unit. Celsius and Fahrenheit are not interchangeable. A criterion missing its unit is incomplete. Auditors will flag it. More importantly, an operator executing the protocol has no way to determine compliance without making an assumption, which introduces risk.

Every numeric criterion needs a unit. Every time. No exceptions.

3. Criteria Set After Results Are Known

This is the documentation integrity and change-control failure mode, and depending on what happened to the records, it can rise to a data integrity finding under ALCOA+ expectations. If a protocol was executed, results were recorded, and then the acceptance criteria section was completed (or amended without a formal deviation and re-approval process), the documentation record is compromised. The appearance of backdating or retrofitting criteria to match results is treated as seriously as actual backdating.

Even if the intent was innocent (say, a template was used and the criteria column was accidentally left blank during authoring), the execution record with blank criteria represents a protocol that should never have been released for execution.

4. No Traceability to Source Specification

A criterion that appeared from nowhere is not a criterion. It is a guess. If a protocol states that a temperature excursion limit is 2 to 8 degrees Celsius but there is no reference to the approved storage specification, the validated operating range, or a regulatory standard that establishes that limit, an auditor will ask where it came from. If you cannot answer that question from the document itself, it is a finding.

Traceability does not require repeating the entire specification in the criterion. It requires a reference: a document number, a section, a version. That reference is the audit trail.

---

How to Write Criteria That Pass

The structure that works is this:

[Parameter] shall [verb] [numeric limit] [units] as measured by [method], per [source document, section].

That is the template. Use it. Adapt it. But keep all five components.
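The five components can be treated as required fields, which makes completeness mechanically checkable during authoring. A minimal sketch (the field names are my own, not from any standard):

```python
from dataclasses import dataclass, fields

@dataclass
class AcceptanceCriterion:
    parameter: str  # e.g. "Chamber temperature at each mapped location"
    limit: str      # e.g. "between 2.0 and 8.0"
    units: str      # e.g. "degrees Celsius"
    method: str     # e.g. "calibrated data logger per SOP-CALIB-014"
    source: str     # e.g. "DS-0042 Rev. C, Section 4.2"

    def missing_components(self) -> list[str]:
        """List any of the five components left blank."""
        return [f.name for f in fields(self)
                if not getattr(self, f.name).strip()]
```

A criterion whose `missing_components()` list is non-empty should never reach the approval step, let alone execution.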

Here are before-and-after examples across common test types.

---

Example 1: Temperature Mapping (OQ)

Before:

> Chamber temperature shall remain within acceptable range during the hold period.

After:

> Chamber temperature at each mapped location shall be between 2.0 and 8.0 degrees Celsius throughout the 24-hour hold period, as measured by calibrated thermocouple data logger (per SOP-CALIB-014), per the Approved Storage Specification DS-0042 Rev. C, Section 4.2.

What changed: the subjective phrase is gone. The limit is numeric. The unit is explicit. The measurement method is named. The source is cited.
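A side benefit of the rewrite: the criterion is now machine-checkable against the logger export. A sketch, assuming the export reduces to a list of (location, reading) pairs over the hold period:

```python
def out_of_range(readings, low=2.0, high=8.0):
    """Return (location, temperature) pairs outside the 2.0-8.0 degC
    limits. `readings` is an iterable of (location_id, temp_degC)
    tuples from the mapped data loggers; every reading over the full
    hold period must fall within the limits for the test to pass.
    Bounds are inclusive: 2.0 and 8.0 both pass."""
    return [(loc, t) for loc, t in readings if not (low <= t <= high)]

data = [("TC-01", 4.1), ("TC-02", 8.3), ("TC-03", 2.0)]
failures = out_of_range(data)  # TC-02 exceeds the upper limit
passed = not failures
```

Note that whether the boundary values themselves pass is a decision the criterion must state explicitly ("between 2.0 and 8.0" is usually read as inclusive, but say so if there is any doubt).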

---

Example 2: Fill Volume (PQ)

Before:

> Fill volume shall be approximately correct for each filled unit sampled.

After:

> Fill volume for each sampled unit shall be between 9.8 mL and 10.2 mL, as measured by gravimetric analysis per SOP-QC-031, per Product Specification PS-1105 Rev. B, Section 3.1 (nominal fill 10.0 mL plus or minus 2 percent).

What changed: "approximately correct" is replaced with a defined range derived from the product specification. The measurement method is cited. The relationship between the tolerance and the source spec is transparent.

---

Example 3: Cleaning Validation Final Rinse Conductivity (OQ)

Before:

> Rinse water conductivity shall meet the expected value for purified water.

After:

> Final rinse water conductivity shall be no greater than 1.3 µS/cm at 25 degrees Celsius, as measured by calibrated inline conductivity meter (per SOP-CALIB-007), per USP <645> Water Conductivity, Stage 1 limit. (This verifies that the final rinse water itself meets purified water specification. Residue removal from product contact surfaces is verified separately by a process-specific limit derived from a MACO calculation, swab and rinse recovery studies, and the cleaning validation strategy.)

What changed: "expected value" is replaced with the specific limit from a cited pharmacopeial standard, and the criterion is properly scoped to water quality verification rather than residue removal.

---

Example 4: Software Alarm Response (CSV / OQ)

Before:

> The system shall generate an alarm when temperature exceeds the set point.

After:

> When chamber temperature exceeds 8.5 degrees Celsius, the system shall generate an audible alarm and a visible alert on the operator interface within 30 seconds of the exceedance, as verified by review of the system event log timestamps for the temperature exceedance event and the alarm event, per Functional Specification FS-0088 Rev. A, Section 6.3.

What changed: "generates an alarm" tells an auditor nothing about what the alarm looks like, how it is triggered, or how fast it must respond. The revised criterion defines all three, and uses the system's own event log for the timing verification rather than human reaction time.
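The event-log verification reduces to a timestamp subtraction. A sketch, assuming the exported log provides ISO 8601 timestamps for the two events:

```python
from datetime import datetime

def alarm_latency_ok(exceedance_ts: str, alarm_ts: str,
                     max_seconds: float = 30.0) -> bool:
    """Check that the alarm event followed the exceedance event within
    the allowed window, using the system's own log timestamps rather
    than human observation. Rejects alarms logged before the
    exceedance (a sign of clock or sequencing problems)."""
    t0 = datetime.fromisoformat(exceedance_ts)
    t1 = datetime.fromisoformat(alarm_ts)
    delta = (t1 - t0).total_seconds()
    return 0 <= delta <= max_seconds

alarm_latency_ok("2026-04-27T10:15:02", "2026-04-27T10:15:19")  # 17 s -> True
```

The timestamp resolution of the event log becomes part of the measurement method here, so the protocol should confirm the log resolution is adequate for a 30-second limit.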

---

Handling Ranges vs. Single-Point Limits

The type of limit you use matters as much as the limit itself.

Use a two-sided range (min and max) when both directions of deviation represent a real risk. Fill volume, pH, temperature hold ranges, and humidity controls almost always require two-sided limits. The range should be derived from the specification, not guessed.

Use a one-sided limit (no greater than, no less than) when only one direction of deviation is a risk. Bioburden limits, residual cleaning agent concentrations, and particle counts are typically one-sided. Using a two-sided limit here introduces an artificial lower bound that has no scientific or regulatory basis and creates unnecessary failures.

Use an exact value only when the test is binary by nature: a relay either closes at the specified voltage or it does not. For most process parameters, an exact value is inappropriate because it cannot account for measurement system variation.

State the statistical basis when using a limit derived from process capability or sampling plans. If a criterion applies to an AQL-based sample rather than 100 percent inspection, the protocol must state the AQL level, the sampling plan reference (ANSI/ASQ Z1.4 for attributes inspection or Z1.9 for variables inspection, for example), and whether the limit is applied to the individual result or the lot disposition.
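The three limit types evaluate differently, and conflating them is a common authoring error. A sketch of each evaluation (the numeric values are illustrative, taken from the examples earlier in this post):

```python
def two_sided(value, low, high):
    """Both directions of deviation are a risk (e.g. fill volume)."""
    return low <= value <= high

def one_sided_max(value, limit):
    """Only the upper direction is a risk (e.g. bioburden,
    conductivity). No artificial lower bound."""
    return value <= limit

def exact(value, target):
    """Binary by nature (e.g. relay state); no tolerance band."""
    return value == target

two_sided(10.1, 9.8, 10.2)  # fill volume within spec -> True
one_sided_max(0.9, 1.3)     # conductivity under the Stage 1 limit -> True
exact(1, 1)                 # relay closed as specified -> True
```

Writing the limit type into the criterion text ("between", "no greater than", "shall equal") is what lets an executor apply the right evaluation without guessing.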

---

Traceability: Linking Criteria Back to Specifications

Traceability is the chain that connects a test result to a regulatory requirement. It is what allows an auditor to answer the question: "How do you know this limit is correct?"

The chain typically looks like this:

Regulatory requirement or standard defines the general control. User Requirement Specification (URS) captures what the system must do to meet that requirement. Design qualification (DQ) or functional specification translates the URS requirement into engineering parameters. Manufacturer specification or pharmacopeial standard provides the numeric basis for the limit. Acceptance criterion in the protocol cites the specification and reflects the approved limit.

Every criterion in a protocol should be traceable to at least one link in that chain. What auditors look for is that the reference is documented, the version is identified, and the criterion in the protocol matches the cited source.

A mismatch between the criterion and the cited source, even a rounding difference, is an observation. Pull the source document and confirm the numbers match exactly before the protocol goes to review.
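That exact-match check can itself be automated at review time. A sketch, assuming the limits are captured as decimal strings from the protocol and the cited source document:

```python
from decimal import Decimal

def limits_match(protocol_limit: str, source_limit: str) -> bool:
    """True only if the protocol limit equals the cited source limit
    as a number. '8.0' vs '8.00' is the same value in different
    notation and compares equal; a rounding difference like
    '8.0' vs '8.05' does not."""
    return Decimal(protocol_limit) == Decimal(source_limit)

limits_match("8.0", "8.00")  # same value, different notation -> True
limits_match("8.0", "8.05")  # rounding mismatch -> False
```

Using `Decimal` rather than `float` avoids binary floating-point artifacts turning a correct transcription into a false mismatch.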

For a deeper look at how auditors evaluate the overall documentation package, see What Auditors Actually Look for in Validation Documentation.

If you are working through a full OQ protocol and need the broader structure around acceptance criteria, How to Write an OQ Protocol From Scratch covers the protocol architecture that makes criteria traceable from the start.

---

Common Edge Cases

Visual Inspections

Visual inspection criteria are the most commonly miswritten criteria in validation documentation. "No visible contamination" is not an acceptance criterion. It is a hope.

Auditable visual inspection criteria do one of two things: they define the defect classification system being applied, or they specify the reference standard being used for comparison. For cosmetic or appearance inspections, reference a written defect classification procedure. For visible particles in injectables, reference USP <790>. For ophthalmic preparations, USP <771> applies. For other contexts, reference the relevant compendial method or a validated inspection procedure with a documented acceptance standard. The standard you cite must match the product type.

Software Behavior

Software acceptance criteria are frequently written as functional descriptions rather than verifiable tests. "The system shall log all events" is not an acceptance criterion. "The system shall record a timestamped audit trail entry for each of the following user actions" followed by an enumerated list is an acceptance criterion.

For each software test, the criterion should define the specific output, the expected state, and the method of verification. Criteria for CSV testing are also expected to trace to the functional specification or software requirements specification.

Environmental Monitoring

For a Grade B cleanroom, the particle count criterion must cite EU GMP Annex 1, which defines the acceptance limits for Grade A, B, C, and D in operation and at rest. ISO 14644-1 may also be referenced for the cleanroom classification methodology, but the acceptance limits themselves come from Annex 1 for GMP areas.

---

Getting Criteria Right Before Execution

Acceptance criteria cannot be an afterthought. They are not the last box to fill in before the protocol goes for signatures. They are the specification against which the entire test will be evaluated.

The time to get criteria right is during protocol authoring, during the peer review, and during the formal approval step. By the time a protocol is released for execution, every criterion should be measurable, pre-defined, and traceable to a source document that has been verified to exist and to contain the stated limit.

If you are unsure which tests in a given qualification phase require formal acceptance criteria versus operational checks, How to Determine if Equipment Needs IQ Only or Full IQ/OQ/PQ covers the scoping decision that determines your documentation requirements.

Valiqa generates acceptance criteria and regulatory mappings as part of protocol authoring, structured around the parameter, method, numeric limit, unit, and source reference, with a built-in traceability matrix.

---

Valiqa is an AI-powered validation lifecycle platform for regulated manufacturing. Learn more at valiqa.io
