Domain Prefix: APTS-TP | Requirements: 22
This domain defines how an autonomous penetration testing platform selects, vets, and governs the third parties and supply-chain components it depends on: AI and LLM providers, cloud services, vulnerability intelligence feeds, SaaS tools, AI model provenance, and the handling, isolation, and eventual destruction of client data that flows through those dependencies. Supply Chain Trust exists because an autonomous pentest platform is almost never a self-contained artifact. Its decisions are shaped by upstream model updates, its evidence is often stored in third-party infrastructure, and its blast radius extends to whatever data those dependencies can see. A failure in an upstream provider (compromise, silent model change, retention violation, tenant bleed) can undo every other safety control the platform implements. Requirements in this domain govern provider vetting, model version pinning, API security, Service Level Agreement (SLA) and failover, breach notification, dependency inventory, residency and sovereignty, cloud hardening, vulnerability-feed governance, data classification and encryption, retention and certified destruction, multi-tenant isolation, tenant-breach notification, AI model provenance, and persistent memory and retrieval state.
This domain covers the trustworthiness and governance of the platform's upstream dependencies. Target-facing scope enforcement belongs to Scope Enforcement (SE), blast-radius classification to Safety Controls (SC), and the audit trail of provider events to Auditability (AR).
For implementation guidance, see the Implementation Guide.
The 22 requirements in this domain fall into seven thematic groups:
| Group | Requirements | Purpose |
|---|---|---|
| Provider vetting and governance | APTS-TP-001, APTS-TP-002, APTS-TP-003, APTS-TP-004, APTS-TP-005 | Provider selection and vetting, model version pinning, API security, availability and SLA, provider incident response |
| Dependency inventory and cloud security | APTS-TP-006, APTS-TP-007, APTS-TP-008, APTS-TP-009 | Dependency inventory and supply-chain verification, data residency and sovereignty, cloud configuration and hardening, incident response and service continuity |
| Vulnerability intelligence feeds | APTS-TP-010, APTS-TP-011 | Feed selection and management, feed quality assurance and incident response |
| Client data classification, encryption, and retention | APTS-TP-012, APTS-TP-013, APTS-TP-014, APTS-TP-015, APTS-TP-016 | Client data classification, sensitive data discovery and handling, encryption and cryptographic controls, retention and secure deletion, destruction proof and certification |
| Multi-tenant isolation and tenant-breach notification | APTS-TP-017, APTS-TP-018 | Multi-tenant and engagement isolation, tenant breach notification |
| AI model provenance and persistent state | APTS-TP-019, APTS-TP-020 | Model provenance and training-data governance, persistent memory and retrieval-state governance |
| Foundation model disclosure and re-attestation | APTS-TP-021, APTS-TP-022 | Foundation model disclosure and capability baseline, re-attestation on material foundation model change |
| ID | Title | Classification |
|---|---|---|
| APTS-TP-001 | Third-Party Provider Selection and Vetting | MUST | Tier 1 |
| APTS-TP-002 | Model Version Pinning and Change Management | MUST | Tier 2 |
| APTS-TP-003 | API Security and Authentication | MUST | Tier 1 |
| APTS-TP-004 | Provider Availability, SLA Management, and Failover | MUST | Tier 2 |
| APTS-TP-005 | Provider Incident Response, Breach Notification, and Mid-Engagement Compromise | MUST | Tier 1 |
| APTS-TP-006 | Dependency Inventory, Risk Assessment, and Supply Chain Verification | MUST | Tier 1 |
| APTS-TP-007 | Data Residency and Sovereignty Requirements | SHOULD | Tier 2 |
| APTS-TP-008 | Cloud Security Configuration and Hardening | MUST | Tier 1 |
| APTS-TP-009 | Incident Response and Service Continuity Planning | MUST | Tier 2 |
| APTS-TP-010 | Vulnerability Feed Selection and Management | MUST | Tier 2 |
| APTS-TP-011 | Feed Quality Assurance and Incident Response | SHOULD | Tier 2 |
| APTS-TP-012 | Client Data Classification Framework | MUST | Tier 1 |
| APTS-TP-013 | Sensitive Data Discovery and Handling | MUST | Tier 1 |
| APTS-TP-014 | Data Encryption and Cryptographic Controls | MUST | Tier 1 |
| APTS-TP-015 | Data Retention and Secure Deletion | MUST | Tier 2 |
| APTS-TP-016 | Data Destruction Proof and Certification | MUST | Tier 3 |
| APTS-TP-017 | Multi-Tenant and Engagement Isolation | MUST | Tier 2 |
| APTS-TP-018 | Tenant Breach Notification | MUST | Tier 1 |
| APTS-TP-019 | AI Model Provenance and Training Data Governance | MUST | Tier 2 |
| APTS-TP-020 | Persistent Memory and Retrieval State Governance | SHOULD | Tier 2 |
| APTS-TP-021 | Foundation Model Disclosure and Capability Baseline | MUST | Tier 1 |
| APTS-TP-022 | Re-attestation on Material Foundation Model Change | MUST | Tier 2 |
A platform claims conformance with this domain by satisfying all MUST requirements at the compliance tier it targets. APTS defines three cumulative compliance tiers (Tier 1 Foundation, Tier 2 Verified, Tier 3 Comprehensive) in the Introduction; a Tier 2 platform satisfies every Tier 1 TP requirement plus every Tier 2 TP requirement, and a Tier 3 platform satisfies all three tiers. SHOULD-level requirements are interpreted per RFC 2119.
Three appendix-only advisory requirements for this domain (APTS-TP-A01 Breach Notification and Regulatory Reporting, APTS-TP-A02 Privacy Regulation Compliance, APTS-TP-A03 Professional Liability and Engagement Agreements) are documented in the Advisory Requirements appendix. They are not required for conformance at any tier.
Every requirement in this domain includes a Verification subsection listing the verification procedures a reviewer uses to confirm implementation.
Classification: MUST | Tier 1
Applicability: All platforms using external service providers (AI/LLM providers, SaaS platforms, infrastructure services).
Organizations MUST establish and document a vetting process for all external service providers before their use in autonomous pentest operations. The vetting process MUST assess security posture (SOC 2 Type II, recent penetration tests, incident history), data protection practices (encryption standards, retention policies, residency options, multi-tenancy controls), service reliability (Service Level Agreements (SLAs), disaster recovery, geographic redundancy), contractual terms (DPA, liability, breach notification per applicable regulatory timelines, audit rights, termination procedures), and compliance certifications (ISO 27001, SOC 2 Type II, GDPR, industry-specific). All deployed third-party services MUST have documented evaluation and approval.
AI/LLM Provider-Specific Vetting:
AI/LLM service providers MUST receive additional vetting covering model training data governance, data isolation between customers, prompt/response logging policies, and model versioning practices.
Non-AI SaaS and Infrastructure Services:
Non-AI SaaS platforms used in pentest operations (log storage, dashboards, analytics, collaboration tools) MUST be evaluated against the same core criteria. Compliance certifications MUST be verified for each platform, and data handling practices MUST be assessed before use.
Classification: MUST | Tier 2
Organizations MUST implement version pinning for all AI models to enable change tracking, testing, and rollback. All models MUST have explicit version identifiers (semantic versioning). Specific versions MUST be specified in deployment requirements rather than using "latest." Model upgrades MUST follow a formal change management process with testing before production deployment. Rollback procedures MUST enable reversion to previous versions if issues are detected.
See also: APTS-AR-019 (AI/ML model change tracking and drift detection), APTS-TP-019 (AI model provenance and training data governance)
Classification: MUST | Tier 1
All API calls to external providers MUST use strong authentication and encryption. Authentication MUST include secure credential storage (never in code or configuration), API key rotation on defined schedule, and monitoring for unusual API usage patterns. All communications MUST enforce TLS 1.2 minimum (TLS 1.3 preferred) with certificate validation and no downgrades. Mutual TLS MUST be used for service-to-service authentication where supported.
See also: APTS-TP-014 (secrets/credential management for provider API keys), APTS-SE-023 (credential and secret lifecycle governance for provider API keys).
Classification: MUST | Tier 2
Organizations MUST establish availability requirements for critical providers and implement SLA management with failover procedures to ensure operations continue during provider outages. Provider selection MUST include SLA verification (minimum 99.5% uptime). SLA requirements MUST be established and documented for cloud services, specifying uptime percentages, maximum latency, and response times for support. Service contracts MUST include explicit SLA clauses with penalties for breaches. Monitoring systems MUST continuously check provider availability with automated alerts for failures, with metrics tracked for uptime percentage and latency. SLA breaches MUST be recorded and remediation actions documented. Failover procedures MUST be documented and tested, with fallback providers identified for critical services.
See also: APTS-TP-009 (business continuity and provider exit strategy), APTS-SC-018 (incident containment and recovery procedures that activate during provider failover).
Classification: MUST | Tier 1
Organizations MUST establish incident response procedures for provider-related security incidents, including procedures specific to compromises discovered during active engagements. Provider contracts MUST specify breach notification obligations with timelines aligned to applicable regulatory requirements. Organizations MUST maintain incident response procedures documenting investigation steps, data impact assessment, recovery procedures, and communication protocols with affected customers. Procedures MUST enable rapid assessment of which organization data was accessed or compromised.
During Active Engagements: When a provider compromise is discovered during an active engagement, response procedures MUST enable rapid containment of customer data exposure, assessment of engagement continuation feasibility, and prompt customer notification per the platform's incident response plan. For each engagement, the organization MUST document whether it can continue with the compromised provider or MUST cease operations pending remediation. Procedures MUST include evidence preservation for forensic investigation.
See also: APTS-TP-018 (tenant breach notification obligations triggered by provider compromise), APTS-TP-009 (business continuity planning for mid-engagement provider outage).
Classification: MUST | Tier 1
Organizations MUST maintain complete inventory of all software dependencies and third-party components used in autonomous pentest operations (Software Bill of Materials (SBOM)). Inventory MUST include version numbers, licensing information, and known vulnerabilities. Risk assessment MUST be performed for each dependency, evaluated against criticality of its use in operations. Vulnerability monitoring MUST be continuous, with processes to identify and track new vulnerabilities in dependencies as well as abandoned or unmaintained packages. Update strategies MUST balance security patching with testing requirements.
Supply Chain Integrity Verification:
The platform MUST verify the integrity of all software dependencies by verifying package signatures and checksums against known-good sources, and detecting potential typosquatting or dependency confusion attacks. Dependency updates MUST be validated in a non-production environment before deployment.
See also: APTS-AR-016 (platform integrity and supply chain attestation)
Classification: SHOULD | Tier 2
Applicability: Platforms deployed in cloud or multi-jurisdiction environments.
Platforms SHOULD document data residency requirements and verify that provider configurations enforce jurisdictional constraints. The specific technical controls for residency enforcement are platform-determined.
See also: APTS-TP-012 (data classification that determines residency obligations), APTS-AR-015 (evidence classification that drives jurisdictional handling).
Classification: MUST | Tier 1
Applicability: Platforms deployed on cloud infrastructure.
Organizations MUST implement cloud security hardening for all cloud infrastructure supporting autonomous pentest operations. Configuration MUST follow principle of least privilege for identity and access management (IAM) policies. All data MUST be encrypted at rest using strong algorithms (AES-256 minimum). Network configuration MUST enforce security groups and network ACLs to restrict traffic to necessary flows only. Regular compliance scanning MUST identify deviations from hardening standards and trigger remediation.
See also: APTS-SE-024 (cloud-native scope governance that this hardening baseline supports), APTS-TP-014 (secrets management inside the hardened cloud environment).
Classification: MUST | Tier 2
Organizations MUST establish incident response procedures and service continuity plans for cloud infrastructure supporting pentest operations. Plans MUST define Recovery Time Objective (RTO) and Recovery Point Objective (RPO) targets. Regular backups MUST be performed with documented schedule and verification that backups can be restored. Disaster recovery procedures MUST be tested regularly with results documented.
See also: APTS-TP-004 (provider availability/SLA planning that feeds RTO/RPO targets), APTS-TP-005 (provider incident response procedures integrated with continuity plans).
Classification: MUST | Tier 2
Organizations MUST establish criteria for selecting vulnerability feeds and manage feed updates to ensure vulnerability data is current, accurate, and applicable to active scanning operations. Feed selection MUST assess Common Vulnerabilities and Exposures (CVE) coverage, update frequency, data quality (Common Vulnerability Scoring System (CVSS) accuracy, Common Weakness Enumeration (CWE) mappings, version identification), vendor longevity, and availability/SLA commitments. Feeds MUST have cryptographic verification. Feed testing MUST validate data quality in test environments before production deployment. Update frequency MUST be proportional to severity, with critical vulnerabilities prioritized for fastest integration. Updates SHOULD be applied during maintenance windows and validated in a non-production environment before deployment. Change management MUST track significant feed modifications and assess impact on finding accuracy.
See also: APTS-TP-011 (feed quality assurance and correlation that validates feed integrity), APTS-TP-006 (dependency/SBOM verification that shares cryptographic-trust processes).
Classification: SHOULD | Tier 2
Organizations SHOULD implement correlation processes to validate vulnerability feeds against discovered services and establish metrics to measure and improve feed accuracy. Correlation MUST query feed(s) for matching CVEs per discovered service, filter by applicability (version range, OS, architecture), assign confidence scores based on version exactness, and document applicable CVEs. Multi-feed correlation MUST compare results across sources and identify/resolve discrepancies in CVSS scoring and affected versions. Accuracy validation MUST maintain test datasets of known vulnerable systems with documented false negative measurement methodology with tracked improvement targets. Organizations SHOULD establish measurable quality targets (for example, false positive rate targets, version match accuracy targets, CVE coverage targets) and track them over time. Absolute false negative rates are difficult to measure with precision. The requirement is for establishing and documenting a measurement methodology, not for achieving a specific false negative rate. Organizations MUST document known limitations of their measurement approach. Incident procedures MUST include detection methods (manual review, automated quality checks, client feedback, comparative analysis, spot-checking, vendor advisories), issue classification with documented response timelines per severity level, investigation/remediation process with vendor coordination, and client communication for high-impact issues (notification per the platform's documented SLA for feed quality incidents).
See also: APTS-TP-010 (feed selection and management whose data this QA process validates), APTS-RP-006 (false positive rate disclosure fed by feed-quality metrics).
Classification: MUST | Tier 1
Organizations MUST establish a data classification framework to categorize client data by sensitivity and apply protection controls based on classification level as defined below. Framework MUST define four levels: RESTRICTED (credentials, encryption keys, PII, payment card data, health information, trade secrets - never transmit externally, minimum AES-256-GCM encryption), CONFIDENTIAL (findings, diagrams, source code, contact information, financial records, policies - limit external transmission, require DPA, TLS 1.2+ encryption), INTERNAL (engagement scope, business context, infrastructure details, status information, methodology - limit to authorized personnel, encryption at rest recommended), PUBLIC (process information, de-identified descriptions, aggregate statistics, published guidance - no special protection required). Classification MUST occur before data creation or receipt, be documented in engagement files, and be reviewed at engagement closure. Handling rules MUST specify external transmission permissions, encryption requirements in transit and at rest, access control levels, retention periods, and deletion methods for each classification. Organizations MUST implement data minimization before transmitting data to external AI providers: strip unnecessary metadata, remove/redact discovered credentials and API keys, anonymize IP addresses and domain names where possible, exclude non-essential client context, and use hash values instead of full data when feasible. Client engagement letters MUST disclose specific AI providers used, data categories sent to each provider, and obtain explicit client consent before transmitting classified data.
Clarification: 'Never transmit externally' for RESTRICTED data refers to raw credentials, encryption keys, and unredacted PII. Processed and minimized data (with credentials redacted, PII anonymized, keys removed) may be transmitted to approved AI providers under a Data Processing Agreement.
See also: APTS-AR-015 (data/evidence classification), APTS-TP-013 (sensitive data discovery and handling using this classification framework), APTS-RP-015 (downstream finding pipeline integrity and data redaction)
Classification: MUST | Tier 1
Organizations MUST implement procedures to identify, protect, and report discovered sensitive data (credentials, PII, financial data, health information) found during autonomous pentest operations. Sensitive data types include credentials (passwords, API keys, tokens, SSH keys), PII (SSN, passport numbers, driver's licenses, email addresses, phone numbers), payment card information (PAN, cardholder names, CVV, bank accounts), and health information (medical records, diagnoses). Discovery MUST use automated pattern matching for known formats (credit card numbers, SSNs) and manual analysis by security analysts. Upon discovery, sensitive data MUST be isolated, location documented, exposure risk assessed, and access controls reviewed. Storage MUST use secure vaults (encrypted, access-restricted) separate from general findings with audit logging of access. Reporting MUST create RESTRICTED findings without disclosing actual values (for example, "CC # ending in 4532" instead of full number), provide description of exposure impact, recommend remediation, assess risk level, and document reasoning. Client notification MUST occur immediately (phone call recommended) with guidance on secure deletion and verification of client remediation. Regulatory compliance MUST assess GDPR, CCPA, HIPAA, PIPEDA, PCI-DSS, and state law requirements, and MUST issue the notifications required by any applicable regulation within the timelines those regulations specify.
Build a seeded corpus containing a known set of synthetic sensitive items and run the discovery procedure against it in a staging environment. The corpus MUST contain at minimum:
Verify every one of the following:
See also: APTS-MR-019 (discovered credential protection with immediate encryption), APTS-TP-012 (client data classification framework defining the classification levels), APTS-TP-014 (encryption controls), APTS-TP-015 (retention boundaries)
Classification: MUST | Tier 1
Organizations MUST implement cryptographic controls to protect client data in transit and at rest using modern encryption standards. Encryption in transit MUST enforce minimum TLS 1.2 (TLS 1.3 preferred) with strong cipher suites only (no weak/export ciphers), certificate validation (no self-signed in production), certificate pinning for critical systems (recommended), and mutual TLS for service-to-service authentication. All API calls, client-server communication, and internal service communication MUST use TLS 1.2+, VPN for cloud infrastructure access, and TLS encryption for email (STARTTLS or implicit). Encryption at rest MUST use minimum AES-256-GCM with centralized key vault (not embedded in code), key rotation annually minimum (90 days for highly sensitive data), and optional multi-layer encryption for ultra-sensitive data. Key management MUST never embed keys in code or configuration, implement key escrow for recovery scenarios, maintain key inventory, restrict access via role-based controls, and audit all key operations. Approved algorithms include symmetric AES (128-bit minimum, 256-bit preferred), asymmetric RSA-2048+ or ECDSA-256+, hashing SHA-256 minimum, and key derivation PBKDF2/bcrypt/scrypt. Deprecated algorithms (DES, 3DES, RC4, MD5, SHA-1, RSA-1024) MUST NOT be used.
Classification: MUST | Tier 2
Organizations MUST establish and document a data retention policy that specifies retention periods per data classification, with justification based on operational need, contractual obligation, and applicable regulatory requirements. Example baseline periods (findings: 12 months, raw scan data: 6 months, credentials: 90 days maximum) may serve as starting points but MUST be adjusted for engagement-specific and jurisdictional requirements. Retention policy MUST address: findings and reports (CONFIDENTIAL) with documented retention period and justification; raw scan data (CONFIDENTIAL) with documented retention period; credentials (RESTRICTED) with maximum retention of 90 days; logs and audit trails (INTERNAL) with documented retention reflecting compliance requirements; backup data retention matching source data; data subject deletion requests (GDPR/CCPA) with documented compliance procedures. Retention tracking MUST maintain inventory with expiration dates, automated reminders 30 days before deletion, regular compliance audits, documented exceptions, and client approval for extensions. Secure deletion MUST use the method specified for the storage medium: for magnetic media, multi-pass overwriting (at least three passes) or NIST SP 800-88 Rev. 1 Clear/Purge; for solid-state storage, cryptographic erasure of the encryption key or manufacturer-certified secure erase (NIST SP 800-88 Purge); for cloud storage, the provider's documented secure deletion API combined with key destruction for any customer-managed keys. The specific deletion method MUST be documented. Deletion verification MUST confirm primary storage deletion, backup deletion, archival deletion, test recovery (expected to fail), and document deletion with evidence.
See also: APTS-TP-016 (data destruction proof and certification)
Classification: MUST | Tier 3
Upon engagement completion, the platform MUST generate a data destruction certificate documenting all data destroyed, destruction methods used, timestamps, and cryptographic verification of destruction. Destruction schedule MUST specify retention periods per data classification level, documented in the platform's data retention policy. Example classifications include: engagement logs (retain per policy, secure wipe or crypto-shred), vulnerability findings (until customer confirms receipt and remediation, per contractual agreement), raw evidence (per engagement agreement, using cryptographic erasure), and customer credentials (destroyed promptly after engagement end, crypto-shred required). Destruction certificates MUST include, at minimum: data identifiers (record IDs or scope descriptors for each data set destroyed), deletion method used (for example, crypto-shred, secure wipe, physical destruction), timestamp of destruction, responsible party, and a cryptographic hash of the destruction log. The specific certificate format is platform-determined, but all listed elements MUST be present. Customer-initiated destruction MUST support a request mechanism (API or portal), acknowledgment within the timeframe defined in the platform's data retention policy, processing within the documented destruction SLA, and a verification certificate upon completion. Independent verification SHOULD be performed using a separate system from the destruction process, attempting retrieval for each category: file system access (expected to return "file not found"), database queries (expected to return zero rows), decryption attempts (expected to fail with cryptographic error). When performed, verification SHOULD document success/failure per category with timestamp, include recovery test (expected to fail), and append results to certificate. Where applicable data protection regulations provide data subject rights to erasure or deletion (for example, GDPR, CCPA, PIPEDA, HIPAA), the platform MUST support those rights within the timeframes required by each applicable regulation.
See also: APTS-TP-015 (data retention and secure deletion procedures)
Classification: MUST | Tier 2
Applicability: Multi-tenant platforms or platforms serving multiple concurrent engagements.
Each customer engagement MUST run in an isolated execution environment with separation of memory, storage, network resources, and credentials. Data from different clients MUST be segregated with no cross-contamination possible through accidental access patterns. Client findings MUST never be visible to other clients, and credentials MUST NOT be accessible across engagements.
Isolation controls MUST include: (1) Execution environment isolation using separate containers, virtual machines, or namespaces per engagement. (2) Database-level isolation using separate schemas per client (not just tables) with row-level security policies. (3) Storage isolation using separate encrypted volumes or directories per engagement with unique encryption keys. (4) Network isolation using policies that prevent inter-engagement communication. (5) Credential segregation using per-engagement encrypted vaults with access control bound to engagement ID. (6) Access control enforcing role-based access by client with logging of all access attempts. (7) Log and evidence segregation in separate access-controlled repositories per engagement.
AI model context isolation (Tier 3): new session per engagement with cleared context window, separate conversation threads, and fine-tuning prohibited without explicit written consent.
Testing and verification MUST include periodic isolation testing (at minimum annually, or after significant infrastructure changes) covering cross-client access testing, cross-engagement access attempts, credential vault access tests, and data leak testing.
Tiered Implementation: Application-layer isolation (separate database schemas, row-level security, application-enforced access controls) is expected at Tier 2. Infrastructure-layer isolation (dedicated VMs/containers per engagement, separate KMS policies, network namespace isolation) is expected at Tier 3 maturity or when required by regulatory obligations.
See also: APTS-SE-011 (tenant isolation in the target environment), APTS-MR-021 (adversarial testing of these isolation controls).
Classification: MUST | Tier 1
If the platform operator discovers that engagement isolation was breached (data leaked between tenants), all affected customers MUST be notified promptly per the platform's incident response plan and applicable regulatory requirements. Notification MUST include what data was exposed, the time window of exposure, and remediation steps taken. This requirement addresses cross-tenant data exposure specifically. For general breach notification and regulatory reporting, see APTS-TP-A01 (Advisory).
Breach Definition: Cross-engagement data leakage (Engagement A data accessed by Engagement B), isolation bypass (infrastructure penetration or misconfiguration), unintended operator access, or exposure window (from breach occurrence until detection).
Detection: Procedures MUST include automated monitoring for cross-engagement access attempts, periodic audit log analysis for suspicious patterns, network TAP analysis for isolation verification, periodic isolation penetration testing, and customer escalation.
Breach Assessment: The operator MUST promptly confirm breach occurrence and isolate affected systems, identify scope (engagements involved, data categories exposed, record/byte quantities, customer count), establish timeline (occurrence date/time, isolation mechanism deployment, exposure window duration), determine preliminary root cause (configuration error, software vulnerability, operator error, hardware failure), and assign severity classification:
Immediate Containment: The operator MUST isolate affected systems (disconnect if necessary), stop cross-engagement access, preserve evidence (logs, audit trail, configuration), and not delete evidence.
Customer Notification: The operator MUST identify all affected customers, notify using appropriate channels based on severity (for example, phone for CRITICAL, email for others), and include: brief description, data exposed, timeline, affected engagement IDs, impact assessment, remediation taken, recommended customer actions, point of contact, and next steps.
Forensic Investigation: The operator MUST conduct a full technical investigation, determine exactly what data was exposed and when, identify all involved parties and access method, document all evidence, and determine if other systems were affected.
Post-Incident Actions: The operator MUST conduct a post-mortem with engineering/leadership, implement preventive measures, re-architect isolation if necessary, audit past engagements for similar exposure, and plan customer compensation.
Regulatory Reporting: The operator MUST assess applicable regulations (for example, GDPR, CCPA) for applicability, notify data protection authorities if required, and document the notification timeline.
Run a simulated cross-tenant exposure exercise in a staging environment and verify every one of the following artifacts is produced and satisfies the criteria listed:
See also: APTS-TP-017 (multi-tenant isolation requirement whose failure triggers this notification duty), APTS-AR-018 (customer notification for behavior-affecting updates in the Auditability domain).
Classification: MUST | Tier 2
Applicability: Platforms using AI/ML models for decision-making.
The platform MUST maintain documentation of AI model provenance including: model source and supplier, training data categories (not the data itself, but descriptions of data types and sources), and fine-tuning history including whether customer engagement data was used. For third-party AI providers, the platform MUST request and retain provider attestations regarding data handling practices, including whether customer data is used for model training. Platforms MUST NOT use findings or data from one customer's engagement to train or fine-tune models used in another customer's engagement without documented consent.
Where direct verification of training data provenance is infeasible (for example, third-party LLM providers), the platform MUST obtain and retain written attestations from the provider covering: (a) training data sources and filtering methodology, (b) whether customer data is used for training, (c) data retention and deletion policies, and (d) model update notification procedures. Attestations MUST be refreshed annually or upon model version change, whichever is sooner.
See also: APTS-AR-019 (AI/ML model change tracking and drift detection), APTS-TP-002 (model version pinning and change management)
Classification: SHOULD | Tier 2
Platforms that maintain persistent state across engagements (including but not limited to embedding stores, retrieval-augmented generation (RAG) indexes, learned heuristics, rejection memory, cached target context, or planner state) SHOULD enforce governance controls over this persistent state:
See also: APTS-TP-017 (multi-tenant isolation extended to persistent state stores), APTS-MR-019 (discovered credential protection applied to persistent memory that may contain secrets).
Classification: MUST | Tier 1
Applicability: All platforms whose autonomous decision-making is powered by one or more foundation models (LLMs, vision-language models, or other large pre-trained models), whether hosted by a third-party provider or run by the operator.
The platform operator MUST disclose, in the platform's conformance documentation and to customers on request, the foundation model or models powering the agent's decision-making. The disclosure MUST include at minimum: (1) the model provider, (2) the model family name, (3) the specific model version or identifier the platform is pinned to at the time of the disclosure, (4) the model's publication or release date as stated by the provider, and (5) any fine-tuning, post-training, adapter, system-prompt pre-load, or tool-use configuration the operator applies on top of the base model.
The operator MUST maintain documentation of the model's known offensive capability baseline at the time of adoption. This baseline MUST draw on at least one of: (a) publicly available cyber capability benchmarks published by the provider or an independent evaluator, (b) the provider's own safety or dangerous-capability evaluations where published, or (c) the operator's own evaluation conducted during vetting under APTS-TP-001. The baseline is not a guarantee of capability bounds; it is a reference point so that customers, reviewers, and the operator can reason about how the agent's capability compares to the platform's safety-control envelope.
Disclosure MUST be refreshed whenever any of the five disclosure elements above changes. Material changes trigger the re-attestation requirement in APTS-TP-022. The platform's conformance claim MUST include the foundation model disclosure.
Foundation model choice is the single largest determinant of an autonomous pentest platform's capability surface. Two platforms that score identically against APTS safety-control requirements but run on very different foundation models can produce very different outcomes in the field, because the safety-control envelope is sized relative to assumptions about what the agent will attempt. Customers cannot reason about residual risk without knowing what model is inside the platform. Providers update and deprecate models frequently, and a platform that was vetted against Model A may behave differently when silently migrated to Model B. Explicit disclosure and a pinned version make this visible.
See also: APTS-TP-001 (provider vetting), APTS-TP-002 (model version pinning), APTS-TP-019 (model provenance and training data governance), APTS-TP-022 (re-attestation on material foundation model change)
Classification: MUST | Tier 2
Applicability: All platforms subject to APTS-TP-021.
When the operator makes a material change to the foundation model powering the agent, the operator MUST re-run the APTS conformance assessment covering at minimum the Scope Enforcement (SE), Safety Controls (SC), Manipulation Resistance (MR), and Graduated Autonomy (AL) domains, and MUST update the conformance claim and the APTS-TP-021 disclosure, before the changed platform handles customer engagements.
A change is material if any of the following apply: (1) the provider is different, (2) the model family is different, (3) the version change crosses a generation boundary as defined by the provider (for example, from a 3.x family to a 4.x family), (4) fine-tuning, adapters, or post-training configuration change in a way that affects the agent's available action space, offensive capability, or refusal behavior, or (5) the operator's own evaluation, a provider advisory, or a published benchmark indicates a meaningful shift in the model's offensive capability relative to the previously documented baseline.
The re-assessment MUST document what was re-tested, what the results were, and which controls, thresholds, or allowlists (if any) were adjusted as a result. Operators MUST notify customers with active engagements of the material change and the outcome of the re-assessment within the timeframe documented in the operator's change notification policy, and MUST preserve the prior configuration and disclosure for rollback under APTS-TP-002.
Foundation model changes are not routine dependency updates. A model upgrade can widen or narrow the agent's effective action space, change the distribution of actions it chooses from a fixed allowlist, or shift its behavior under adversarial prompts in ways that APTS safety controls were not previously sized against. Treating a model upgrade as equivalent to a library bump is the failure mode this requirement exists to prevent. Re-attestation forces the operator to confirm that the controls still hold against the new capability surface before customers are exposed to it.
See also: APTS-TP-001 (provider vetting), APTS-TP-002 (model version pinning and change management), APTS-TP-021 (foundation model disclosure and capability baseline), APTS-AR-019 (continuous model change tracking and drift detection).