Methods for assessing whether a model or system meets requirements and behaves safely and reliably (e.g., unit tests, red teaming, evals, regression tests). Testing results are often used in acceptance, governance, and incident response.
See: Acceptance criteria; Change control; Evaluation (evals); Red teaming