OWASP Machine Learning Security Top Ten

OWASP Incubator License: CC BY-SA 4.0

📌 Important Information

The current version of this work is in draft and is being modified frequently. Please refer to the project wiki for information on how to contribute and project release timelines.

Overview

Welcome to the repository for the OWASP Machine Learning Security Top 10 project! The primary aim of the project is to deliver an overview of the top 10 security issues of machine learning systems. More information on the project scope and target audience is available in our project working group charter.

Top 10 Machine Learning Security Risks

Communication

Contribution

The initial version of the Machine Learning Security Top 10 list was contributed by Sagar Bhure and Shain Singh. The project encourages community contribution and aims to produce a high-quality deliverable reviewed by industry peers.

All contributors will need to adhere to the project's code of conduct. Please use the following form for any feedback, suggestions, issues or questions.

Getting Started

The project has a wiki which provides information to help you get started on how to contribute.

Licensing

The OWASP Machine Learning Security Project is licensed under the Creative Commons Attribution-ShareAlike 4.0 license. You may copy, distribute and transmit the work, adapt it, and use it commercially, provided that you attribute the work. If you alter, transform, or build upon this work, you may distribute the resulting work only under the same or a similar license to this one.


Purpose

The primary aim of the OWASP Machine Learning Security Top 10 project is to deliver an overview of the top 10 security issues of machine learning systems. As such, a major goal of this project is to develop a high-quality deliverable, reviewed by industry peers.

Target Audience

The primary audience for the deliverables in this project is developers, machine learning engineering and operational practitioners, and application security experts. While each of these roles builds, operates and secures machine learning systems, the content is not aimed exclusively at them. Where appropriate, the content will specify the level of understanding required for specific technology domains.

Scope

This project will provide an overview of the top 10 security issues of machine learning systems. Due to the rapid adoption of machine learning systems, there are related projects within OWASP and other organisations that may have a narrower or broader scope than this project. As an example, while adversarial attacks are one category of threats, this project will also cover non-adversarial scenarios, such as the security hygiene of machine learning operational and engineering workflows.

Governance

The project will:

Project Leaders will:

Project Contributors will:




0

1

2

3

4

5

6

7

8

9

A

Adversarial attack
Type of attack which seeks to trick machine learning models into misclassifying inputs by maliciously tampering with input data
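As a rough illustration of the definition above, the sketch below perturbs an input to a toy linear classifier so that its predicted class flips. The model, its weights, the input, and the `epsilon` budget are all hypothetical values chosen for the example; the sign-based step is one simple way to build such a perturbation for a linear score, not a description of any specific attack covered by this project.

```python
def score(weights, bias, x):
    """Linear decision score: positive means class 1, otherwise class 0."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias


def predict(weights, bias, x):
    return 1 if score(weights, bias, x) > 0 else 0


def sign(value):
    return (value > 0) - (value < 0)


def adversarial_example(weights, x, epsilon, target_class):
    """Shift each feature by at most epsilon so the score moves toward target_class."""
    direction = 1 if target_class == 1 else -1
    return [xi + direction * epsilon * sign(w) for w, xi in zip(weights, x)]


# Hypothetical model parameters and input, chosen only for illustration.
weights, bias = [2.0, -1.0], -0.5
x = [0.5, 0.2]
x_adv = adversarial_example(weights, x, epsilon=0.2, target_class=0)

print(predict(weights, bias, x))      # 1: the original input is class 1
print(predict(weights, bias, x_adv))  # 0: the perturbed input is class 0
```

No feature moves by more than `epsilon`, yet the prediction changes, which is the behaviour the definition describes.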

B

C

Classification
Process of arranging things in groups which are distinct from each other, and are separated by clearly determined lines of demarcation

D

Data labeling
Process of assigning tags or categories to each data point in a dataset

Data poisoning
Type of attack that injects poisoned samples into the training data
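A minimal sketch of the idea, assuming a deliberately simple one-dimensional classifier whose decision threshold is the midpoint between the two class means. The data points, the injected samples, and the function names are hypothetical and exist only to show how a few mislabelled samples can move a decision boundary.

```python
def mean(values):
    return sum(values) / len(values)


def train_threshold(samples):
    """Fit a 1-D classifier: the threshold is the midpoint of the two class means."""
    class0 = [x for x, label in samples if label == 0]
    class1 = [x for x, label in samples if label == 1]
    return (mean(class0) + mean(class1)) / 2


def predict(threshold, x):
    return 1 if x > threshold else 0


# Hypothetical training data as (feature, label) pairs.
clean = [(1.0, 0), (2.0, 0), (3.0, 0), (7.0, 1), (8.0, 1), (9.0, 1)]
# Attacker-injected samples: class-1-looking values mislabelled as class 0.
poison = [(9.0, 0), (9.0, 0), (9.0, 0)]

clean_threshold = train_threshold(clean)              # 5.0
poisoned_threshold = train_threshold(clean + poison)  # 6.75

print(predict(clean_threshold, 6.0))     # 1 with clean training data
print(predict(poisoned_threshold, 6.0))  # 0 after poisoning
```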

Deep learning
Family of machine learning methods based on artificial neural networks with long chains of learnable causal links between actions and effects

E

Ensemble
See: Model ensemble

F

G

H

I

Input Validation
Input validation is a technique for checking potentially dangerous inputs in order to ensure that the inputs are safe for processing within the code, or when communicating with other components
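For a machine learning service, this could look like the sketch below, which checks a feature vector before it is passed to a model. The function name, the expected length, and the allowed range of 0.0 to 1.0 are assumptions made for the example; real limits depend on the model and its training data.

```python
import math


def validate_features(features, expected_len, low=0.0, high=1.0):
    """Return the features as floats, or raise ValueError if any check fails."""
    if not isinstance(features, (list, tuple)):
        raise ValueError("features must be a list or tuple")
    if len(features) != expected_len:
        raise ValueError(f"expected {expected_len} features, got {len(features)}")
    cleaned = []
    for index, value in enumerate(features):
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            raise ValueError(f"feature {index} is not a number")
        if not math.isfinite(value):
            raise ValueError(f"feature {index} is NaN or infinite")
        if not low <= value <= high:
            raise ValueError(f"feature {index} is outside [{low}, {high}]")
        cleaned.append(float(value))
    return cleaned
```

A caller would run `validate_features(request_features, expected_len=2)` and only forward the returned list to the model, rejecting the request if a `ValueError` is raised.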

Intrusion Detection System (IDS)
Security service that monitors and analyzes network or system events for the purpose of finding, and providing real-time or near real-time warning of, attempts to access system resources in an unauthorized manner

Intrusion Prevention System (IPS)
System that can detect an intrusive activity and can also attempt to stop the activity, ideally before it reaches its targets

J

K

L

M

MLOps
The selection, application, interpretation, deployment, and maintenance of machine learning models within an AI-enabled system

Model
Detailed description or scaled representation of one component of a larger system that can be created, operated, and analyzed to predict actual operational characteristics of the final produced component

Model ensemble
Art of combining a diverse set of learners (individual models) together to improve the stability and predictive power of the model
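One common way to combine learners is majority voting, sketched below. The three threshold "models" are hypothetical stand-ins for independently trained classifiers; any callable that returns a class label would work in their place.

```python
from collections import Counter


def majority_vote(models, x):
    """Return the class label predicted by the largest number of models."""
    votes = [model(x) for model in models]
    return Counter(votes).most_common(1)[0][0]


# Hypothetical individual learners: each is a simple threshold classifier.
models = [
    lambda x: int(x > 0.4),
    lambda x: int(x > 0.5),
    lambda x: int(x > 0.6),
]

print(majority_vote(models, 0.55))  # 1: two of three models vote for class 1
print(majority_vote(models, 0.45))  # 0: two of three models vote for class 0
```

Using an odd number of binary classifiers avoids tied votes in this sketch.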

N

O

Obfuscation
Defense mechanism in which details of the model or training data are kept secret by adding a large amount of valid but useless information to a data store

Overfitting
Overfitting is when a statistical model begins to describe the random error in the data rather than the relationships between variables. This occurs when the model is too complex

P

Perturbation
Noise added to an input sample

Q

R

Regularisation
Controlling model complexity by adding information in order to solve ill-posed problems or to prevent overfitting
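As a small worked example, the sketch below fits a one-parameter linear model `y ≈ w * x` with an L2 (ridge) penalty. Minimising `sum((y - w*x)**2) + penalty * w**2` gives the closed form `w = sum(x*y) / (sum(x*x) + penalty)`. The data and the penalty values are hypothetical; L2 is only one of several regularisation techniques.

```python
def ridge_weight(xs, ys, penalty):
    """Closed-form weight for y ≈ w * x with an L2 penalty on w."""
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)
    return sum_xy / (sum_xx + penalty)


# Hypothetical data that lies exactly on y = 2 * x.
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]

print(ridge_weight(xs, ys, penalty=0.0))   # 2.0: no regularisation
print(ridge_weight(xs, ys, penalty=14.0))  # 1.0: the penalty shrinks the weight
```

A larger penalty pulls the weight toward zero, trading some fit on the training data for a simpler model.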

S

Spam
The abuse of electronic messaging systems to indiscriminately send unsolicited bulk messages

T

U

Underfitting
Underfitting is when a data model is unable to capture the relationship between the input and output variables accurately, generating a high error rate on both the training set and unseen data

V

W

X

Y

Z