OWASP DeepSecrets

Introduction

OWASP DeepSecrets - a better tool for secret scanning

Yet another tool - why?

Existing tools don’t really “understand” code. Instead, they mostly treat it as plain text.

DeepSecrets expands classic regex-search approaches with semantic analysis, dangerous variable detection, and more efficient usage of entropy analysis. Code understanding supports 500+ languages and formats and is achieved by lexing and parsing - techniques commonly used in SAST tools.
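To make the idea of combining variable detection with entropy analysis more concrete, here is a minimal sketch. It is not DeepSecrets' actual implementation: the regex "lexer", the list of suspicious name fragments, and the `min_entropy` threshold are all assumptions chosen for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical name fragments suggesting a variable may hold a secret.
SUSPICIOUS_NAME = re.compile(r"(pass(word|wd)?|secret|token|api_?key)", re.IGNORECASE)

# A deliberately naive "lexer": matches `name = "value"` or `name: 'value'`.
ASSIGNMENT = re.compile(
    r"""(?P<name>[A-Za-z_]\w*)\s*[:=]\s*(?P<q>["'])(?P<value>.*?)(?P=q)"""
)


def shannon_entropy(text: str) -> float:
    """Return the Shannon entropy of `text` in bits per character."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def find_candidates(source: str, min_entropy: float = 3.0):
    """Yield (name, value, entropy) for suspicious, high-entropy assignments."""
    for match in ASSIGNMENT.finditer(source):
        name, value = match["name"], match["value"]
        entropy = shannon_entropy(value)
        # Entropy is only checked where the variable name already looks risky,
        # which avoids flagging every random-looking string in the code base.
        if SUSPICIOUS_NAME.search(name) and entropy >= min_entropy:
            yield name, value, entropy
```

Running `find_candidates` over a file's text reports only assignments whose name looks dangerous and whose value looks random. Restricting entropy checks to these positions is what keeps the number of false positives manageable in this sketch.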

DeepSecrets also introduces a new way to find secrets: supply hashed values of your known secrets, and the tool finds their plaintext occurrences in your code.
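The principle behind hashed checks can be sketched as follows. This is an illustration only: the choice of SHA-256, the tokenization regex, and the function names are assumptions, and the hash algorithm and ruleset format DeepSecrets actually expects are defined by the example ruleset in the repository.

```python
import hashlib
import re
from typing import List, Set

# Candidate tokens: runs of at least 8 characters that are not whitespace,
# quotes, or common separators. Purely illustrative.
TOKEN = re.compile(r"""[^\s"'=:,;]{8,}""")


def sha256_hex(value: str) -> str:
    """Return the hex-encoded SHA-256 digest of `value`."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


def find_known_secrets(source: str, known_hashes: Set[str]) -> List[str]:
    """Return tokens from `source` whose hash is in `known_hashes`."""
    return [tok for tok in TOKEN.findall(source) if sha256_hex(tok) in known_hashes]
```

The scanner only ever sees the hashes, so the list of known secrets never has to be stored in plaintext next to the code being scanned.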

How it Works

The under-the-hood story is covered in a series of articles, starting here: https://hackernoon.com/modernizing-secrets-scanning-part-1-the-problem

FAQ

Mini-FAQ

Pff, is it still regex-based?

Yes and no. Of course, it uses regexes and finds typed secrets like any other tool. But language understanding (the lexing stage) and variable detection also use regexes under the hood. So regexes are a tool, not a problem.

Why don’t you build true abstract syntax trees? It’s academically more correct!

DeepSecrets tries to keep a balance between complexity and effectiveness. Building a true AST is quite complex and simply overkill for this specific task. So the tool still follows the generic SAST way of code analysis but optimizes the AST part using a different approach.

I’d like to build my own semantic rules. How do I do that?

For now, only by changing the code. Formalizing the rules and moving them into a flexible, user-controlled ruleset is planned.

But what about Semgrep Secrets? Looks like you’re cloning their thing.

DeepSecrets was released in April 2023, half a year before the Semgrep Secrets release, and I’m very glad to see the approach being followed. We share the same ideas and principles under the hood, but:

  • DeepSecrets is free; Semgrep Secrets is a commercial product
  • Code analysis in DeepSecrets is broader and not limited to a specific set of languages, as it is in Semgrep

I still have a question

Feel free to contact the maintainer.


Getting Started

Installation

From GitHub via pip

$ pip install git+https://github.com/ntoskernel/deepsecrets.git

From PyPI

$ pip install deepsecrets

Scanning

The easiest way:

$ deepsecrets --target-dir /path/to/your/code --outformat dojo-sarif --outfile report.json

This will run a scan against /path/to/your/code using the default configuration:

  • Regex checks by a small built-in ruleset
  • Semantic checks (variable detection, entropy checks)

The report, in SARIF format (DefectDojo-compatible), will be saved to report.json. If you run into any problems with the SARIF format, you can fall back to the internal format via --outformat json
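Once the report exists, it can be post-processed with a few lines of Python. The sketch below assumes the file follows the standard SARIF 2.1.0 layout (`runs`, `results`, `locations`, `physicalLocation`); the exact fields DeepSecrets emits in its DefectDojo-flavoured output may differ, and the function names are made up for this example.

```python
import json
from pathlib import Path
from typing import List, Optional, Tuple

Finding = Tuple[str, str, Optional[int]]


def summarize_sarif(report: dict) -> List[Finding]:
    """Flatten a SARIF document into (rule_id, file_uri, start_line) tuples."""
    findings: List[Finding] = []
    for run in report.get("runs", []):
        for result in run.get("results", []):
            rule_id = result.get("ruleId", "unknown")
            for location in result.get("locations", []):
                physical = location.get("physicalLocation", {})
                uri = physical.get("artifactLocation", {}).get("uri", "unknown")
                line = physical.get("region", {}).get("startLine")
                findings.append((rule_id, uri, line))
    return findings


def load_report(path: str) -> dict:
    """Read a JSON report from disk."""
    return json.loads(Path(path).read_text(encoding="utf-8"))
```

For example, `summarize_sarif(load_report("report.json"))` returns one tuple per reported location, which is convenient for failing a CI job when the list is non-empty.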

Masking secrets inside a report

As of version 1.3.0, all potential secrets inside reports are masked by default. You can turn this feature off via the --disable-masking flag.

[!Caution]
If you decide to integrate DeepSecrets into your CI pipeline with masking disabled, you will likely re-leak your secrets inside your CI artefacts.

Fine-tuning

Run deepsecrets --help for details.

You can (and should) use your own regex ruleset by specifying --regex-rules. Building rulesets is described in the next section.

Paths to be excluded from scanning can be set via --excluded-paths. The default set of excluded paths is in /deepsecrets/rules/excluded_paths.json. You can write your own file following the same format.

Building rulesets

Regex

The built-in ruleset for regex checks is located in /deepsecrets/rules/regexes.json. You’re free to follow the format and create a custom ruleset.

HashedSecret

An example ruleset for hashed checks is located in /tests/fixtures/hashed_secrets.json. You’re free to follow the format and create a custom ruleset.