OWASP HACTU8
Preparing for SkyNet: Tools for Responsbile AI
OWASP hactu8 builds on OWASP’s foundational IoT and LLM security projects to create an active security testing platform for robotics, IoT, and consumer electronics. It leverages existing resources like the OWASP IoT Top 10, OWASP IoT Security Testing Guide, and OWASP LLM projects to provide practical, hands-on tools for ethical hackers and security professionals.
The integration of generative AI (LLMs) transforms traditional security workflows by offering automated vulnerability detection, test script generation, and AI-powered exploitation guidance. Furthermore, the platform actively mitigates risks unique to LLMs (e.g., adversarial prompts, data exposure) by aligning with OWASP LLM security principles.
OWASP hactu8 serves as both a learning environment and a practical testing tool, enabling the community to collaboratively secure the future of robotics and IoT in the age of AI.
Road Map
Phase 1: Foundation and Integration
- Evaluate OWASP IoT and OWASP LLM projects to align methodologies and tools.
- Develop an initial proof-of-concept platform integrating OWASP IoT Security Testing Guide and OWASP LLM Security Project.
- Create a basic AI-powered vulnerability testing module.
Phase 2: Platform Development
- Build the full platform with: • Modular tools for fuzzing, firmware analysis, and API security testing. • Generative AI-driven testing and reporting capabilities.
- Add real-world test scenarios for robotics and IoT, with specific cases for cloud-hosted AI.
- Integrate OWASP LLM Security recommendations to address AI-specific risks.
Phase 3: Community Engagement and Expansion
- Release the platform as an open-source tool for community testing and contributions.
- Host workshops, webinars, and collaborative hackathons focused on securing LLM-integrated systems and IoT.
- Add API integrations to expand the platform’s extensibility with external tools.
Roadmap (As Of July 2025)
This roadmap outlines the development plan for the OWASP HACTU8 reference platform. Phase 1 focuses on building a foundational system to support AI assurance testing through UI scaffolding, test agent integration, registry design, and a lightweight scanner.
Project Phases
Phase | Name | Objective |
---|---|---|
1 | Foundation and Integration | Establish a running UI, test orchestration, extension architecture, and registry/discovery foundation |
2 | Platform Development | Expand into a fully functional MVP with integrated models, test automation, and extension marketplace |
3 | Community Engagement | Promote contribution, support external tools, and build adoption with documentation and collaboration models |
Phase 1: Foundation and Integration
M1: Mockup
Scaffold the UI, engine shell, scanner, and API endpoints for initial visualization.
- Scaffold Streamlit UI Shell
- Build FastAPI Interface Skeleton
- Create Engine Module Skeleton (agents/orchestrator)
- Create Scanner CLI Skeleton
M2: Requirements Documentation
Define all core schemas, specifications, and architectural references.
- Define Initial Architecture Document
- Create Test Spec Template (JSON/YAML)
- Document OWASP Top 10 Test Concepts
- Describe Signature Format for Scanners
- Define Registry API Contract
M3: Running Demo
Deliver a working vertical slice using mock data to demonstrate test execution flow.
- Wire UI to Dummy API
- Run Simulated Prompt Injection Agent
- Display Result in Assurance Viewer
- Integrate Registry View from API
M4: Extension Model
Implement a plugin system for test extensions, aligned to the OWASP LLM Top 10.
- Define Extension Plugin Base Class
- Implement Prompt Injection Extension Stub
- Define Extension Metadata Format
- Render Extension Output in UI
M5: Discovery Architecture
Define and implement how LLMs, tools, and endpoints are discovered or registered.
- Design Discovery Interface (API + check-in)
- Create Signature Loader Logic
- Connect CLI Scanner to Registry
- Simulate Local Ollama/Foundry Detection
- Document Discovery Modes and Architecture
Phase Closeout: Retrospective
Evaluate Phase 1 and scope Phase 2 deliverables.
- Conduct Phase 1 Retrospective
- Define Milestones for Phase 2: Platform Development
Diagrams
Component Diagram
graph TD
subgraph UI Layer
A[Streamlit UI]
A1[Workbench]
A2[Assurance Viewer]
A3[Registry Viewer]
A4[Extension Panel]
end
subgraph Services
S1[API Gateway / FastAPI]
S2[Registry Service]
S3[Identity / Auth]
end
subgraph Engine
E1[Orchestrator]
E2[Extension Loader]
E3[Agent Runner]
end
subgraph Extensions
X1[Prompt Injection Ext]
X2[RAG Poisoning Ext]
X3[Custom Test Ext]
end
subgraph Scanner CLI
C1[AI Port Scanner]
C2[Signature Loader]
end
%% UI to Services
A -->|REST| S1
A1 -->|Run Remote Test| S1
A2 -->|Fetch Results| S1
A3 -->|Get Registry| S2
A4 -->|Run Local Extension| X1
A4 -->|Run Local Extension| X2
A4 -->|Run Local Extension| X3
%% Services to Engine
S1 -->|Invoke| E1
E1 --> E2
E2 -->|Load| X1
E2 -->|Load| X2
E2 -->|Load| X3
E1 --> E3
%% Extensions to Agent Runner
X1 --> E3
X2 --> E3
X3 --> E3
%% Registry and Scanner
S2 --> E1
S2 --> C1
C1 --> C2
C2 --> S2
%% Identity
A --> S3
S3 --> S1