Informative Appendix (non-normative)
This appendix provides an illustrative template for documenting model changes, behavioral comparisons, drift detections, human review, and rollback decisions for autonomous penetration testing platforms. It is intended to help platform operators, customers, and reviewers collect evidence for existing APTS requirements. It does not create or modify any APTS requirement.
APTS requires operators to pin model versions, track AI/ML model changes, validate behavior after changes, detect behavioral drift, and re-attest after material foundation model changes. Those activities often involve several teams and artifacts, so customers and reviewers benefit from one compact record that connects the change-management decision to the evidence that supports it.
This appendix provides:
Consider using this record when documenting:
A model change and drift record should:
Use stable identifiers so the record can be linked to change tickets, conformance claims, drift alerts, and customer notifications.
Recommended fields:
- record_id
- change_ticket_id
- record_type
- status
- created_at
- last_updated_at
- owner
- reviewers

Suggested record_type values:

- planned_model_change
- emergency_model_change
- provider_side_drift
- rollback
- post_change_review

Record both the prior and candidate model configurations. The exact version format may vary by provider, but the record should be specific enough for review and rollback.
Recommended fields:
- provider
- model_family
- model_name
- model_version
- deployment_environment
- region_or_endpoint
- inference_route
- fallback_route
- system_policy_reference
- tool_policy_reference
- behavioral_fingerprint_reference

Explain what changed and whether the change is material under the operator's APTS-TP-022 policy.
Recommended fields:
- change_type
- change_summary
- reason_for_change
- materiality_decision
- materiality_basis
- affected_domains
- requires_re_attestation
- customer_notification_required

Suggested change_type values:

- provider_change
- model_family_change
- model_version_update
- fine_tune_or_adapter_change
- system_prompt_or_policy_change
- tool_use_or_action_space_change
- fallback_provider_change
- routing_policy_change
- detected_drift_without_operator_change

Capture the comparison between the previous model behavior and the candidate or observed model behavior, especially on safety-critical decisions.
Recommended fields:
- test_set_reference
- baseline_run_id
- candidate_run_id
- scope_decision_delta
- escalation_decision_delta
- impact_classification_delta
- manipulation_resistance_delta
- safety_control_delta
- summary_result
- evidence_references

Use this section when the operator detects model behavior changes that were not introduced by an operator-controlled deployment.
Recommended fields:
- drift_alert_id
- detected_at
- detection_source
- baseline_reference
- observed_deviation
- affected_decision_paths
- tolerance_threshold
- threshold_exceeded
- operator_alerted_at
- blocked_or_limited_paths
- acknowledged_by
- acknowledged_at

Document who reviewed the change, what evidence they inspected, and whether re-attestation or customer notification was completed.
Recommended fields:
- review_required
- reviewer_name_or_role
- review_started_at
- review_completed_at
- review_decision
- review_notes
- re_attestation_scope
- conformance_claim_updated
- foundation_model_disclosure_updated
- customer_notification_status

Suggested review_decision values:

- approved_for_deployment
- approved_with_limits
- requires_more_testing
- rejected
- rollback_required

Preserve the operational path back to the prior model configuration and make superseded records traceable.
Recommended fields:
- rollback_plan_reference
- rollback_test_run_id
- rollback_test_result
- previous_record_id
- supersedes_record_id
- superseded_by_record_id
- rollback_completed_at

Attach or reference the artifacts customers and reviewers may request.
Recommended evidence:
```yaml
record_id: mcdr-2026-0042
change_ticket_id: change-2026-117
record_type: planned_model_change
status: approved_for_deployment
created_at: 2026-05-01T09:00:00Z
last_updated_at: 2026-05-01T15:30:00Z
owner: platform-governance
reviewers:
  - safety-reviewer-01
  - customer-assurance-01
previous_model:
  provider: example-ai-provider
  model_family: example-secure-model
  model_name: example-secure-model-4
  model_version: 4.2.1
  deployment_environment: production
  region_or_endpoint: us.example.provider
  inference_route: primary-agent-route
  fallback_route: fallback-agent-route-v1
  system_policy_reference: policy/sp-2026-03
  tool_policy_reference: tools/tp-2026-03
  behavioral_fingerprint_reference: bf-2026-03-15
candidate_model:
  provider: example-ai-provider
  model_family: example-secure-model
  model_name: example-secure-model-4
  model_version: 4.3.0
  deployment_environment: production
  region_or_endpoint: us.example.provider
  inference_route: primary-agent-route
  fallback_route: fallback-agent-route-v1
  system_policy_reference: policy/sp-2026-03
  tool_policy_reference: tools/tp-2026-03
  behavioral_fingerprint_reference: bf-2026-05-01
change_summary:
  change_type: model_version_update
  reason_for_change: provider security and reliability update
  materiality_decision: not_material
  materiality_basis: same provider and model family; no action-space or refusal-behavior delta observed on safety-critical tests
  affected_domains:
    - APTS-AR-019
    - APTS-TP-002
  requires_re_attestation: false
  customer_notification_required: false
behavioral_comparison:
  test_set_reference: safety-critical-baseline-2026-04
  baseline_run_id: eval-run-2026-04-30-a
  candidate_run_id: eval-run-2026-05-01-b
  scope_decision_delta: none_observed
  escalation_decision_delta: none_observed
  impact_classification_delta: minor_non_material_wording_change
  manipulation_resistance_delta: none_observed
  safety_control_delta: none_observed
  summary_result: passed
  evidence_references:
    - evidence/model-change/eval-run-2026-05-01-b.json
    - evidence/model-change/reviewer-notes-2026-05-01.md
drift_detection:
  drift_alert_id: null
  detected_at: null
  detection_source: scheduled_pre_engagement_baseline
  baseline_reference: bf-2026-03-15
  observed_deviation: no_threshold_exceedance
  affected_decision_paths: []
  tolerance_threshold: no_safety_critical_decision_changes
  threshold_exceeded: false
  operator_alerted_at: null
  blocked_or_limited_paths: []
  acknowledged_by: safety-reviewer-01
  acknowledged_at: 2026-05-01T14:20:00Z
human_review:
  review_required: true
  reviewer_name_or_role: safety-reviewer-01
  review_started_at: 2026-05-01T14:00:00Z
  review_completed_at: 2026-05-01T14:25:00Z
  review_decision: approved_for_deployment
  review_notes: Evaluation output showed no safety-critical decision changes beyond tolerance
  re_attestation_scope: not_required
  conformance_claim_updated: false
  foundation_model_disclosure_updated: false
  customer_notification_status: not_required
rollback:
  rollback_plan_reference: runbooks/model-rollback-v2.md
  rollback_test_run_id: rollback-test-2026-05-01
  rollback_test_result: passed
  previous_record_id: mcdr-2026-0038
  supersedes_record_id: mcdr-2026-0038
  superseded_by_record_id: null
  rollback_completed_at: null
```
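
A record shaped like the example above can be checked mechanically before it is filed. The following is a minimal sketch, not part of APTS: the field names and suggested values are taken from this template, while the function name `find_record_problems` and the choice to treat the identifier fields as required are assumptions made for illustration. It operates on an already-parsed record (a plain dictionary), so it does not depend on any particular YAML or JSON library.

```python
# Illustrative sketch only. Field names and suggested values come from this
# template; the function name and the "required" treatment are assumptions.
REQUIRED_FIELDS = {
    "record_id", "change_ticket_id", "record_type", "status",
    "created_at", "last_updated_at", "owner", "reviewers",
}
RECORD_TYPES = {
    "planned_model_change", "emergency_model_change",
    "provider_side_drift", "rollback", "post_change_review",
}
REVIEW_DECISIONS = {
    "approved_for_deployment", "approved_with_limits",
    "requires_more_testing", "rejected", "rollback_required",
}


def find_record_problems(record: dict) -> list[str]:
    """Return readable problems; an empty list means none were found."""
    # Report missing identifier fields in a stable (sorted) order.
    problems = [
        f"missing field: {name}"
        for name in sorted(REQUIRED_FIELDS - set(record))
    ]
    # Flag values outside the suggested record_type list.
    record_type = record.get("record_type")
    if record_type is not None and record_type not in RECORD_TYPES:
        problems.append(f"unknown record_type: {record_type}")
    # Flag values outside the suggested review_decision list.
    decision = record.get("human_review", {}).get("review_decision")
    if decision is not None and decision not in REVIEW_DECISIONS:
        problems.append(f"unknown review_decision: {decision}")
    return problems
```

Operators that extend the suggested value lists would extend the corresponding sets; the check is only as strict as the lists it is given.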
```json
{
  "record_id": "mcdr-2026-0042",
  "record_type": "planned_model_change",
  "status": "approved_for_deployment",
  "previous_model": {
    "provider": "example-ai-provider",
    "model_name": "example-secure-model-4",
    "model_version": "4.2.1",
    "behavioral_fingerprint_reference": "bf-2026-03-15"
  },
  "candidate_model": {
    "provider": "example-ai-provider",
    "model_name": "example-secure-model-4",
    "model_version": "4.3.0",
    "behavioral_fingerprint_reference": "bf-2026-05-01"
  },
  "change_summary": {
    "change_type": "model_version_update",
    "materiality_decision": "not_material",
    "requires_re_attestation": false
  },
  "behavioral_comparison": {
    "test_set_reference": "safety-critical-baseline-2026-04",
    "summary_result": "passed"
  },
  "human_review": {
    "review_required": true,
    "review_decision": "approved_for_deployment"
  }
}
```
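
The compact JSON form lends itself to simple cross-field consistency checks. The sketch below is one hedged illustration: it parses a record with the standard `json` module and flags two combinations that a reviewer would likely question. Both rules are assumptions made for this example rather than APTS requirements, as is the value `material` for `materiality_decision` (the template only shows `not_material`) and the function name `find_inconsistencies`.

```python
import json

# Decisions that allow the candidate model to be deployed (from this template).
APPROVING_DECISIONS = {"approved_for_deployment", "approved_with_limits"}


def find_inconsistencies(record_json: str) -> list[str]:
    """Flag field combinations that deserve a second look (illustrative rules)."""
    record = json.loads(record_json)
    change = record.get("change_summary", {})
    comparison = record.get("behavioral_comparison", {})
    review = record.get("human_review", {})
    findings = []

    # Assumed rule 1: an approval should rest on a passing comparison.
    approved = review.get("review_decision") in APPROVING_DECISIONS
    if approved and comparison.get("summary_result") != "passed":
        findings.append("approved without a passing behavioral comparison")

    # Assumed rule 2: a change judged material should trigger re-attestation.
    # The value "material" is hypothetical; the template shows "not_material".
    is_material = change.get("materiality_decision") == "material"
    if is_material and not change.get("requires_re_attestation"):
        findings.append("material change without re-attestation")

    return findings
```

Applied to the example record above, neither rule fires, because the comparison passed and the change was judged not material.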
| Record area | Primary requirements |
|---|---|
| Pinned model identity and rollback data | APTS-TP-002 |
| Behavioral fingerprint and comparison results | APTS-AR-019, APTS-AR-017 |
| Provider-side drift detection and blocking decisions | APTS-AR-019 |
| Foundation model disclosure updates | APTS-TP-021, APTS-TP-022 |
| Re-attestation scope and customer notification | APTS-TP-022, APTS-AR-018 |
| Human review of safety-critical behavior changes | APTS-AR-019, APTS-MR-020 |
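
The mapping in the table can also be inverted, so that a reviewer starting from a requirement ID can see which record areas carry its evidence. The sketch below copies the table into a dictionary; the names `REQUIREMENTS_BY_AREA` and `areas_for_requirement` are hypothetical and used only for illustration.

```python
# Record areas and requirement IDs copied from the table above.
REQUIREMENTS_BY_AREA = {
    "Pinned model identity and rollback data": ["APTS-TP-002"],
    "Behavioral fingerprint and comparison results": ["APTS-AR-019", "APTS-AR-017"],
    "Provider-side drift detection and blocking decisions": ["APTS-AR-019"],
    "Foundation model disclosure updates": ["APTS-TP-021", "APTS-TP-022"],
    "Re-attestation scope and customer notification": ["APTS-TP-022", "APTS-AR-018"],
    "Human review of safety-critical behavior changes": ["APTS-AR-019", "APTS-MR-020"],
}


def areas_for_requirement(requirement_id: str) -> list[str]:
    """List the record areas that support a requirement, in table order."""
    return [
        area
        for area, requirement_ids in REQUIREMENTS_BY_AREA.items()
        if requirement_id in requirement_ids
    ]


# Example: print every record area that supports APTS-AR-019.
for area in areas_for_requirement("APTS-AR-019"):
    print(area)
```

A lookup of this kind only indexes the record; it does not show that the referenced evidence is complete.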
When reviewing a model change and drift record, consider asking:
This template is intentionally illustrative. Operators may keep equivalent records in change-management systems, model registries, ticketing systems, or governance platforms as long as the evidence is complete, reviewable, and available to customers when required by APTS.