This appendix provides an illustrative template for documenting model changes, behavioral comparisons, drift detections, human review, and rollback decisions for autonomous penetration testing platforms. It is intended to help platform operators, customers, and reviewers collect evidence for existing APTS requirements. It does not create or modify any APTS requirement.

Purpose

APTS requires operators to pin model versions, track AI/ML model changes, validate behavior after changes, detect behavioral drift, and re-attest after material foundation model changes. Those activities often involve several teams and artifacts, so customers and reviewers benefit from one compact record that connects the change-management decision to the evidence that supports it.

Primary Use Cases

Design Principles

Recommended Template Sections

1. Record metadata

Use stable identifiers so the record can be linked to change tickets, conformance claims, drift alerts, and customer notifications.

2. Model configuration

Record both the prior and candidate model configurations. The exact version format may vary by provider, but the record should be specific enough to identify the exact model in use during review and to restore the prior configuration on rollback.
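
One way to make the prior and candidate configurations easy to review is to compute which fields differ between them. The sketch below is illustrative only: the `ModelConfig` class and `diff_configs` helper are hypothetical names, and the fields are a subset of those used in the example record in this appendix.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class ModelConfig:
    """Subset of the model configuration fields from the appendix example."""
    provider: str
    model_name: str
    model_version: str
    inference_route: str
    system_policy_reference: str
    behavioral_fingerprint_reference: str


def diff_configs(previous: ModelConfig, candidate: ModelConfig) -> dict:
    """Return {field_name: (previous_value, candidate_value)} for fields that differ."""
    changed = {}
    for f in fields(ModelConfig):
        old, new = getattr(previous, f.name), getattr(candidate, f.name)
        if old != new:
            changed[f.name] = (old, new)
    return changed


previous = ModelConfig("example-ai-provider", "example-secure-model-4", "4.2.1",
                       "primary-agent-route", "policy/sp-2026-03", "bf-2026-03-15")
candidate = ModelConfig("example-ai-provider", "example-secure-model-4", "4.3.0",
                        "primary-agent-route", "policy/sp-2026-03", "bf-2026-05-01")

config_delta = diff_configs(previous, candidate)
print(config_delta)
```

In this example only the model version and the behavioral fingerprint reference change, which is the kind of narrow delta a reviewer would expect for a version update within one model family.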

3. Change summary and materiality

Explain what changed and whether the change is material under the operator's APTS-TP-022 policy.
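
The materiality decision can be checked for internal consistency before a record is accepted. The following is a minimal sketch, not an APTS-defined rule set: the helper name `materiality_problems` and the value `material` are assumptions (the example record in this appendix shows only `not_material`), and the rule that a material change needs re-attestation paraphrases the Purpose section.

```python
def materiality_problems(change_summary: dict) -> list:
    """Return human-readable inconsistencies; an empty list means none were found."""
    problems = []
    decision = change_summary.get("materiality_decision")

    # Assumed vocabulary: only "not_material" appears in the appendix example.
    if decision not in ("material", "not_material"):
        problems.append(f"unrecognized materiality_decision: {decision!r}")

    # A decision without a stated basis cannot be reviewed.
    if not change_summary.get("materiality_basis"):
        problems.append("materiality_basis is missing or empty")

    # Material changes are expected to trigger re-attestation.
    if decision == "material" and change_summary.get("requires_re_attestation") is not True:
        problems.append("material change recorded without requires_re_attestation: true")

    return problems


example = {
    "materiality_decision": "not_material",
    "materiality_basis": "same provider and model family",
    "requires_re_attestation": False,
}
# Hypothetical inconsistent record, for illustration only.
inconsistent = {
    "materiality_decision": "material",
    "materiality_basis": "",
    "requires_re_attestation": False,
}

print(materiality_problems(example))
print(materiality_problems(inconsistent))
```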

4. Behavioral comparison

Capture the comparison between the previous model behavior and the candidate or observed model behavior, especially on safety-critical decisions.
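
A reviewer may want every non-empty delta surfaced automatically. The sketch below assumes the `*_delta` field names and the `none_observed` value used in the example record in this appendix; the helper `deltas_needing_review` is a hypothetical name, and a flagged delta is a prompt for human judgment, not a failure.

```python
# Delta fields copied from the example record in this appendix.
behavioral_comparison = {
    "scope_decision_delta": "none_observed",
    "escalation_decision_delta": "none_observed",
    "impact_classification_delta": "minor_non_material_wording_change",
    "manipulation_resistance_delta": "none_observed",
    "safety_control_delta": "none_observed",
    "summary_result": "passed",
}


def deltas_needing_review(comparison: dict) -> dict:
    """Return every *_delta field whose value is anything other than 'none_observed'."""
    return {
        name: value
        for name, value in comparison.items()
        if name.endswith("_delta") and value != "none_observed"
    }


flagged = deltas_needing_review(behavioral_comparison)
print(flagged)
```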

5. Drift detection record

Use this section when the operator detects a change in model behavior that was not introduced by an operator-controlled deployment, such as a provider-side change to a hosted model.
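
A drift record is most useful when an exceeded threshold always comes with the alerting details needed to trace it. The sketch below illustrates one possible consistency check; the rule itself and the helper name `drift_record_problems` are assumptions, while the field names follow the example record in this appendix.

```python
def drift_record_problems(drift: dict) -> list:
    """Check that an exceeded threshold is accompanied by alerting details."""
    problems = []
    if drift.get("threshold_exceeded"):
        for name in ("drift_alert_id", "detected_at", "operator_alerted_at"):
            if drift.get(name) is None:
                problems.append(f"{name} is null although threshold_exceeded is true")
        if not drift.get("affected_decision_paths"):
            problems.append(
                "affected_decision_paths is empty although threshold_exceeded is true"
            )
    return problems


# Values from the appendix example: no threshold exceedance was observed.
no_drift = {
    "drift_alert_id": None,
    "detected_at": None,
    "threshold_exceeded": False,
    "operator_alerted_at": None,
    "affected_decision_paths": [],
}

# Hypothetical record in which drift was detected but no alert ID was stored.
incomplete_alert = {
    "drift_alert_id": None,
    "detected_at": "2026-05-02T08:00:00Z",
    "threshold_exceeded": True,
    "operator_alerted_at": "2026-05-02T08:01:00Z",
    "affected_decision_paths": ["scope_decision"],
}

print(drift_record_problems(no_drift))
print(drift_record_problems(incomplete_alert))
```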

6. Human review and re-attestation

Document who reviewed the change, what evidence they inspected, and whether re-attestation or customer notification was completed.
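
Basic completeness of the review section can be checked mechanically. The following sketch assumes the field names from the example record in this appendix; the helper names `parse_utc` and `review_problems` and the specific checks are hypothetical and would need to reflect the operator's own review policy.

```python
from datetime import datetime


def parse_utc(timestamp: str) -> datetime:
    """Parse an ISO 8601 timestamp; a trailing 'Z' is rewritten as '+00:00'."""
    return datetime.fromisoformat(timestamp.replace("Z", "+00:00"))


def review_problems(review: dict) -> list:
    """Return human-readable gaps in a human_review section."""
    problems = []
    if review.get("review_required"):
        for name in ("reviewer_name_or_role", "review_decision", "review_completed_at"):
            if not review.get(name):
                problems.append(f"{name} is missing")
    started = review.get("review_started_at")
    completed = review.get("review_completed_at")
    if started and completed and parse_utc(completed) < parse_utc(started):
        problems.append("review_completed_at is earlier than review_started_at")
    return problems


# Values from the appendix example.
human_review = {
    "review_required": True,
    "reviewer_name_or_role": "safety-reviewer-01",
    "review_started_at": "2026-05-01T14:00:00Z",
    "review_completed_at": "2026-05-01T14:25:00Z",
    "review_decision": "approved_for_deployment",
}

# Hypothetical variant with the start time after the completion time.
reversed_times = dict(human_review, review_started_at="2026-05-01T15:00:00Z")

print(review_problems(human_review))
print(review_problems(reversed_times))
```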

7. Rollback and supersession

Preserve the operational path back to the prior model configuration and make superseded records traceable.
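
Traceability of superseded records can be illustrated by following `supersedes_record_id` links back through earlier records. This is a sketch under stated assumptions: the in-memory `records` dictionary stands in for whatever record store the operator uses, the helper `supersession_chain` is a hypothetical name, and the `None` link on `mcdr-2026-0038` is invented for the example.

```python
def supersession_chain(records: dict, record_id: str) -> list:
    """Follow supersedes_record_id links from record_id back to the oldest record."""
    chain = []
    seen = set()
    current = record_id
    while current is not None:
        if current in seen:
            raise ValueError(f"cycle in supersession links at {current}")
        if current not in records:
            raise KeyError(f"record {current} is referenced but not stored")
        seen.add(current)
        chain.append(current)
        current = records[current].get("supersedes_record_id")
    return chain


records = {
    "mcdr-2026-0042": {"supersedes_record_id": "mcdr-2026-0038"},
    # Assumed for illustration: the earlier record supersedes nothing.
    "mcdr-2026-0038": {"supersedes_record_id": None},
}

chain = supersession_chain(records, "mcdr-2026-0042")
print(chain)
```

Raising an error on a missing or cyclic link, instead of silently stopping, makes a broken audit trail visible at the moment it is queried.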

8. Evidence checklist

Example YAML Template

record_id: mcdr-2026-0042
change_ticket_id: change-2026-117
record_type: planned_model_change
status: approved_for_deployment
created_at: 2026-05-01T09:00:00Z
last_updated_at: 2026-05-01T15:30:00Z
owner: platform-governance
reviewers:
  - safety-reviewer-01
  - customer-assurance-01

previous_model:
  provider: example-ai-provider
  model_family: example-secure-model
  model_name: example-secure-model-4
  model_version: 4.2.1
  deployment_environment: production
  region_or_endpoint: us.example.provider
  inference_route: primary-agent-route
  fallback_route: fallback-agent-route-v1
  system_policy_reference: policy/sp-2026-03
  tool_policy_reference: tools/tp-2026-03
  behavioral_fingerprint_reference: bf-2026-03-15

candidate_model:
  provider: example-ai-provider
  model_family: example-secure-model
  model_name: example-secure-model-4
  model_version: 4.3.0
  deployment_environment: production
  region_or_endpoint: us.example.provider
  inference_route: primary-agent-route
  fallback_route: fallback-agent-route-v1
  system_policy_reference: policy/sp-2026-03
  tool_policy_reference: tools/tp-2026-03
  behavioral_fingerprint_reference: bf-2026-05-01

change_summary:
  change_type: model_version_update
  reason_for_change: provider security and reliability update
  materiality_decision: not_material
  materiality_basis: same provider and model family; no action-space or refusal-behavior delta observed on safety-critical tests
  affected_domains:
    - APTS-AR-019
    - APTS-TP-002
  requires_re_attestation: false
  customer_notification_required: false

behavioral_comparison:
  test_set_reference: safety-critical-baseline-2026-04
  baseline_run_id: eval-run-2026-04-30-a
  candidate_run_id: eval-run-2026-05-01-b
  scope_decision_delta: none_observed
  escalation_decision_delta: none_observed
  impact_classification_delta: minor_non_material_wording_change
  manipulation_resistance_delta: none_observed
  safety_control_delta: none_observed
  summary_result: passed
  evidence_references:
    - evidence/model-change/eval-run-2026-05-01-b.json
    - evidence/model-change/reviewer-notes-2026-05-01.md

drift_detection:
  drift_alert_id: null
  detected_at: null
  detection_source: scheduled_pre_engagement_baseline
  baseline_reference: bf-2026-03-15
  observed_deviation: no_threshold_exceedance
  affected_decision_paths: []
  tolerance_threshold: no_safety_critical_decision_changes
  threshold_exceeded: false
  operator_alerted_at: null
  blocked_or_limited_paths: []
  acknowledged_by: safety-reviewer-01
  acknowledged_at: 2026-05-01T14:20:00Z

human_review:
  review_required: true
  reviewer_name_or_role: safety-reviewer-01
  review_started_at: 2026-05-01T14:00:00Z
  review_completed_at: 2026-05-01T14:25:00Z
  review_decision: approved_for_deployment
  review_notes: Evaluation output showed no safety-critical decision changes beyond tolerance
  re_attestation_scope: not_required
  conformance_claim_updated: false
  foundation_model_disclosure_updated: false
  customer_notification_status: not_required

rollback:
  rollback_plan_reference: runbooks/model-rollback-v2.md
  rollback_test_run_id: rollback-test-2026-05-01
  rollback_test_result: passed
  previous_record_id: mcdr-2026-0038
  supersedes_record_id: mcdr-2026-0038
  superseded_by_record_id: null
  rollback_completed_at: null

JSON-Equivalent Structure

The JSON below is an abbreviated rendering of the YAML example: it keeps the same field names but omits the `drift_detection` and `rollback` sections and several fields within the remaining sections.

{
  "record_id": "mcdr-2026-0042",
  "record_type": "planned_model_change",
  "status": "approved_for_deployment",
  "previous_model": {
    "provider": "example-ai-provider",
    "model_name": "example-secure-model-4",
    "model_version": "4.2.1",
    "behavioral_fingerprint_reference": "bf-2026-03-15"
  },
  "candidate_model": {
    "provider": "example-ai-provider",
    "model_name": "example-secure-model-4",
    "model_version": "4.3.0",
    "behavioral_fingerprint_reference": "bf-2026-05-01"
  },
  "change_summary": {
    "change_type": "model_version_update",
    "materiality_decision": "not_material",
    "requires_re_attestation": false
  },
  "behavioral_comparison": {
    "test_set_reference": "safety-critical-baseline-2026-04",
    "summary_result": "passed"
  },
  "human_review": {
    "review_required": true,
    "review_decision": "approved_for_deployment"
  }
}
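
A record stored as JSON can be loaded and checked for the presence of its main sections with the standard library alone. The sketch below is illustrative: `REQUIRED_SECTIONS` is an assumed minimum set drawn from the abbreviated JSON example, `missing_sections` is a hypothetical helper, and the embedded record is shortened further for brevity.

```python
import json

# Assumed minimum set of sections, taken from the abbreviated JSON example.
REQUIRED_SECTIONS = (
    "previous_model",
    "candidate_model",
    "change_summary",
    "behavioral_comparison",
    "human_review",
)

record_text = """
{
  "record_id": "mcdr-2026-0042",
  "status": "approved_for_deployment",
  "previous_model": {"model_version": "4.2.1"},
  "candidate_model": {"model_version": "4.3.0"},
  "change_summary": {"materiality_decision": "not_material"},
  "behavioral_comparison": {"summary_result": "passed"},
  "human_review": {"review_required": true, "review_decision": "approved_for_deployment"}
}
"""


def missing_sections(record: dict) -> list:
    """Return the required sections that are absent or are not JSON objects."""
    return [name for name in REQUIRED_SECTIONS if not isinstance(record.get(name), dict)]


record = json.loads(record_text)
print(missing_sections(record))
print(missing_sections({"record_id": "mcdr-2026-0042"}))
```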

Field Mapping to APTS Requirements

Validation Guidance for Customers and Reviewers

Usage Notes

This template is intentionally illustrative. Operators may keep equivalent records in change-management systems, model registries, ticketing systems, or governance platforms as long as the evidence is complete, reviewable, and available to customers when required by APTS.

| Record area | Primary requirements |
| --- | --- |
| Pinned model identity and rollback data | `APTS-TP-002` |
| Behavioral fingerprint and comparison results | `APTS-AR-019`, `APTS-AR-017` |
| Provider-side drift detection and blocking decisions | `APTS-AR-019` |
| Foundation model disclosure updates | `APTS-TP-021`, `APTS-TP-022` |
| Re-attestation scope and customer notification | `APTS-TP-022`, `APTS-AR-018` |
| Human review of safety-critical behavior changes | `APTS-AR-019`, `APTS-MR-020` |
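
The field mapping table in this appendix can also be kept in machine-readable form so that a reviewer can ask which record areas provide evidence for a given requirement. The sketch below copies the table rows as they appear here; the names `FIELD_MAPPING` and `areas_for_requirement` are hypothetical.

```python
# Record areas and requirement IDs copied from the field mapping table.
FIELD_MAPPING = {
    "Pinned model identity and rollback data": ["APTS-TP-002"],
    "Behavioral fingerprint and comparison results": ["APTS-AR-019", "APTS-AR-017"],
    "Provider-side drift detection and blocking decisions": ["APTS-AR-019"],
    "Foundation model disclosure updates": ["APTS-TP-021", "APTS-TP-022"],
    "Re-attestation scope and customer notification": ["APTS-TP-022", "APTS-AR-018"],
    "Human review of safety-critical behavior changes": ["APTS-AR-019", "APTS-MR-020"],
}


def areas_for_requirement(requirement_id: str) -> list:
    """Return the record areas that list requirement_id, in table order."""
    return [
        area
        for area, requirement_ids in FIELD_MAPPING.items()
        if requirement_id in requirement_ids
    ]


print(areas_for_requirement("APTS-AR-019"))
print(areas_for_requirement("APTS-TP-021"))
```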

Model Change and Drift Record Template