OWASP Machine Learning Security Top Ten

📌 Important Information

The current version of this work is in draft and is being modified frequently. Please refer to the project wiki for information on how to contribute and project release timelines.

Overview

Welcome to the repository for the OWASP Machine Learning Security Top 10 project! The primary aim of the project is to deliver an overview of the top 10 security issues of machine learning systems. More information on the project scope and target audience is available in our project working group charter.

Top 10 Machine Learning Security Risks

Communication

Contribution

The initial version of the Machine Learning Security Top 10 list was contributed by Sagar Bhure and Shain Singh. The project encourages community contribution and aims to produce a high-quality deliverable reviewed by industry peers.

All contributors will need to adhere to the project’s code of conduct. Please use the following form for any feedback, suggestions, issues or questions.

Getting Started

The project wiki provides information to help you get started with contributing.

Licensing

The OWASP Machine Learning Security Project is licensed under the Creative Commons Attribution-ShareAlike 4.0 license. You can copy, distribute and transmit the work, adapt it, and use it commercially, provided that you attribute the work. If you alter, transform, or build upon this work, you may distribute the resulting work only under the same or a similar license.

Purpose

The primary aim of the OWASP Machine Learning Security Top 10 project is to deliver an overview of the top 10 security issues of machine learning systems. As such, a major goal of this project is to develop a high-quality deliverable, reviewed by industry peers.

Target Audience

The primary audience for the deliverables in this project is developers, machine learning engineering and operational practitioners, and application security experts. While each of these roles builds, operates and secures machine learning systems, the content is not aimed exclusively at them. Where appropriate, the content will specify the level of understanding required for specific technology domains.

Scope

This project will provide an overview of the top 10 security issues of machine learning systems. Due to the rapid adoption of machine learning systems, there are related projects within OWASP and other organisations that may have a narrower or broader scope than this project. As an example, while adversarial attacks are a category of threats, this project will also cover non-adversarial scenarios, such as security hygiene of machine learning operational and engineering workflows.

Governance

The project will:

Adhere to the OWASP Project Policy

Project Leaders will:

Follow and adhere to all OWASP Foundation policies and procedures
Lead the project as per the Project Leader Handbook

Project Contributors will:

Follow and adhere to the code of conduct

Top 10 lists related to ML and AI:

Top 10 lists similar to the well-known OWASP Top 10 for web applications, but for AI:

Vulnerability databases:

Catalogued vulnerabilities and risks that were present in real-world AI and ML systems:

AI/ML security guidelines:

Various guidelines on ML and AI Security and Safety

Playbooks

Interactive playbooks useful in threat modelling and securing AI.

Other

All the other resources related to ML Security - threat modelling resources, risk assessments framework, “Awesome Lists” etc.

A

Adversarial attack
Type of attack which seeks to trick machine learning models into misclassifying inputs by maliciously tampering with input data

B

C

Classification
Process of arranging things in groups which are distinct from each other, and are separated by clearly determined lines of demarcation

D

Data labeling
Process of assigning tags or categories to each data point in a dataset

Data poisoning
Type of attack that injects poisoned samples into the training data

Deep learning
Family of machine learning methods based on artificial neural networks with long chains of learnable causal links between actions and effects

E

Ensemble
See: Model Ensemble

F

G

H

I

Input Validation
Input validation is a technique for checking potentially dangerous inputs in order to ensure that the inputs are safe for processing within the code, or when communicating with other components
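
As an illustration only, the check below sketches input validation for a numeric feature vector before it is passed to a model. The function name `validate_features`, the expected length, and the allowed range are hypothetical choices made for this example and are not part of the project.

```python
import math


def validate_features(features, expected_len, low, high):
    """Return the features as floats, or raise ValueError if any check fails.

    Hypothetical helper: the length and range limits are assumptions that a
    real system would derive from its own training data and threat model.
    """
    if not isinstance(features, (list, tuple)):
        raise ValueError("features must be a list or tuple")
    if len(features) != expected_len:
        raise ValueError(f"expected {expected_len} features, got {len(features)}")

    cleaned = []
    for i, value in enumerate(features):
        # bool is a subclass of int, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            raise ValueError(f"feature {i} is not numeric")
        # NaN and infinity pass a type check but are not usable values.
        if isinstance(value, float) and not math.isfinite(value):
            raise ValueError(f"feature {i} is not finite")
        if not low <= value <= high:
            raise ValueError(f"feature {i} is outside [{low}, {high}]")
        cleaned.append(float(value))
    return cleaned
```

Rejecting an input outright, rather than silently clipping it, keeps malformed or hostile inputs visible to monitoring. Which behaviour is appropriate depends on the system.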

Intrusion Detection Systems (IDS)
Security service that monitors and analyzes network or system events for the purpose of finding, and providing real-time or near real-time warning of, attempts to access system resources in an unauthorized manner

Intrusion Prevention System (IPS)
System that can detect an intrusive activity and can also attempt to stop the activity, ideally before it reaches its targets

J

K

L

M

MLOps
The selection, application, interpretation, deployment, and maintenance of machine learning models within an AI-enabled system

Model
Detailed description or scaled representation of one component of a larger system that can be created, operated, and analyzed to predict actual operational characteristics of the final produced component

Model ensemble
Art of combining a diverse set of learners (individual models) together to improve the stability and predictive power of the model
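
One simple way to combine learners is majority voting, sketched below. The three threshold rules are hypothetical stand-ins for trained models, and `majority_vote` is a name chosen for this example.

```python
from collections import Counter


def majority_vote(models, sample):
    """Return the label predicted by the largest number of models.

    Each model is assumed to be a callable that maps a sample to a label.
    If several labels tie, the one predicted first in model order wins.
    """
    predictions = [model(sample) for model in models]
    label, _count = Counter(predictions).most_common(1)[0]
    return label


# Hypothetical learners: each classifies a score with a different threshold.
models = [
    lambda x: int(x > 0.3),
    lambda x: int(x > 0.5),
    lambda x: int(x > 0.7),
]

print(majority_vote(models, 0.6))  # two of the three learners predict 1
print(majority_vote(models, 0.4))  # two of the three learners predict 0
```

Because the final label depends on several learners, changing the output of a single learner is not enough to change the result whenever the other learners agree.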

N

O

Obfuscation
Defense mechanism in which details of the model or training data are kept secret by adding a large amount of valid but useless information to a data store

Overfitting
Overfitting is when a statistical model begins to describe the random error in the data rather than the relationships between variables. This occurs when the model is too complex

P

Perturbation
Noise added to an input sample
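
The sketch below adds bounded random noise to a sample to show what a perturbation looks like numerically. It is an assumption-laden illustration: the function name `perturb`, the bound `epsilon`, and the valid range of 0.0 to 1.0 are hypothetical, and an adversarial attack would choose the noise deliberately instead of sampling it at random.

```python
import random


def perturb(sample, epsilon, rng, low=0.0, high=1.0):
    """Return a copy of sample with noise in [-epsilon, epsilon] added to each value.

    Values are clipped to [low, high] so the result stays a valid input.
    """
    perturbed = []
    for value in sample:
        noise = rng.uniform(-epsilon, epsilon)
        perturbed.append(min(high, max(low, value + noise)))
    return perturbed


rng = random.Random(0)  # fixed seed so the example is repeatable
original = [0.2, 0.5, 0.99]
noisy = perturb(original, epsilon=0.05, rng=rng)
print(noisy)
```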

Q

R

Regularisation
Controlling model complexity by adding information in order to solve ill-posed problems or to prevent overfitting
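
As a minimal sketch of this idea, the example below fits a one-parameter linear model y = w * x with an L2 penalty (ridge regression). The function name `fit_ridge_1d` and the data are hypothetical. The closed form follows from minimising sum((y - w*x)^2) + lam * w^2, which gives w = sum(x*y) / (sum(x*x) + lam).

```python
def fit_ridge_1d(xs, ys, lam):
    """Return the weight w that minimises sum((y - w*x)**2) + lam * w**2."""
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)
    return sum_xy / (sum_xx + lam)


xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]

print(fit_ridge_1d(xs, ys, lam=0.0))   # no penalty: the ordinary least squares weight
print(fit_ridge_1d(xs, ys, lam=14.0))  # a larger penalty shrinks the weight toward zero
```

A larger `lam` pulls the weight toward zero, trading some fit on the training data for a simpler model.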

S

Spam
The abuse of electronic messaging systems to indiscriminately send unsolicited bulk messages

T

U

Underfitting
Underfitting is when a data model is unable to capture the relationship between the input and output variables accurately, generating a high error rate on both the training set and unseen data
