Testing for CSV Injection
| ID |
|---|
| WSTG-INPV-21 |
Summary
CSV Injection (also known as Formula Injection) occurs when an application embeds untrusted, user-controlled input into CSV (or other spreadsheet-compatible) exports and the resulting file is opened in a spreadsheet program (e.g., Microsoft Excel, LibreOffice Calc). Spreadsheet applications may interpret certain cell values as formulas, which can lead to security issues such as user deception (phishing-style workflows), manipulation of spreadsheet output, or data exfiltration. In some environments, formula injection can be escalated via spreadsheet “gadgets” and legacy features such as Dynamic Data Exchange (DDE), potentially reaching command execution on the workstation that opens the file. Such escalation typically depends on client configuration and/or user interaction.
A key characteristic of this issue is that the vulnerability often manifests only when the exported file is opened by a user (e.g., an administrator or a member of the finance or support team) in a spreadsheet application.
Test Objectives
- Identify CSV/spreadsheet export features that include untrusted input.
- Verify whether attacker-controlled values are interpreted as formulas when the export is opened in common spreadsheet applications.
- Check whether separator/quote injection can move a dangerous prefix to the start of a cell.
- Validate whether mitigations remain effective in Microsoft Excel after saving and re-opening the CSV.
- Assess practical impact based on who opens the export and how it is used.
How to Test
Formula-Triggering Prefixes
Cells beginning with the following characters may be interpreted as formulas by spreadsheet software:
- Equals (=)
- Plus (+)
- Minus (-)
- At (@)
- Tab (0x09)
- Carriage return (0x0D)
- Line feed (0x0A)
- Full-width (double-byte) variants such as ＝, ＋, －, ＠ (depending on locale/application behavior)
Important (Excel behavior): Microsoft Excel may remove quotes or escape characters from CSV cells when a file is saved and re-opened. As a result, some commonly suggested mitigations can fail after save/reopen and previously escaped formulas may become active again.
Also note that it is not sufficient to ensure the overall untrusted input does not start with a dangerous character. Attackers may inject separators and quoting to start a new cell, placing the dangerous character at the beginning of a cell.
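The prefix list above can be expressed as a small per-cell check. The following is a minimal sketch, not a definitive implementation: the name `starts_with_formula_prefix` is hypothetical, and the inclusion of the four full-width code points is an assumption about which Unicode variants matter for the spreadsheet application in scope.

```python
# Prefixes taken from the list above. The full-width code points
# (U+FF1D, U+FF0B, U+FF0D, U+FF20) are an assumption; adjust per application.
FORMULA_PREFIXES = (
    "=", "+", "-", "@",
    "\t", "\r", "\n",
    "\uff1d", "\uff0b", "\uff0d", "\uff20",
)


def starts_with_formula_prefix(cell: str) -> bool:
    """Return True if the first character of a single cell is a listed prefix."""
    # str.startswith accepts a tuple and returns False for an empty string.
    return cell.startswith(FORMULA_PREFIXES)


print(starts_with_formula_prefix("=1+1"))
print(starts_with_formula_prefix("Alice"))
```

The check must be applied to each cell after CSV parsing, not to the raw input field, because an injected separator can move a prefix to the start of a new cell.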
Identify CSV Export Functionality and Data Sources
Locate features that generate CSV/TSV or “export to spreadsheet” content:
- Reports (users, transactions, audit logs, tickets)
- Admin dashboards exporting lists
- Email attachments generated by the application
- Scheduled exports / integrations
Identify untrusted data sources that can end up in the export:
- User profiles (name, email, company)
- Free-text fields (comments, ticket subjects, notes)
- Imported/integrated external data (webhooks, CRM sync, partner feeds)
Document which roles can trigger the export and which roles are likely to open it.
Place Benign, Detectable Formula-Like Values into Candidate Fields
Use harmless payloads to detect formula evaluation (avoid payloads that execute commands or perform uncontrolled network access). Test values that begin with each formula-triggering character:
    =1+1
    +1+1
    -1+1
    @SUM(1,1)
    =HYPERLINK("http://example.invalid/leak?test=1", "Click Me")
Notes:
- The HYPERLINK() case is useful to demonstrate realistic impact (deception/phishing-style flows, or potential metadata exposure when a link is clicked/opened). Use a controlled endpoint during testing (e.g., a local listener or an internal test host).
- Do not use external “real attacker” infrastructure in validation.
Also test control-character and Unicode variants (where input handling allows it):
- A value that begins with a tab character followed by =1+1 (TAB + =1+1)
- Full-width prefix variants (e.g., ＝1+1)
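Control characters are hard to type into a form or proxy by hand, so it can help to generate the probe values programmatically. The sketch below is illustrative only: `build_probe_values` and the labels it returns are hypothetical names, and the set of probes simply mirrors the benign values described in this section.

```python
def build_probe_values(marker: str = "1+1") -> dict[str, str]:
    """Build one benign probe per formula-triggering prefix discussed above."""
    return {
        "equals": "=" + marker,
        "plus": "+" + marker,
        "minus": "-" + marker,
        "at": "@SUM(1,1)",
        "tab": "\t=" + marker,                  # TAB followed by =1+1
        "fullwidth_equals": "\uff1d" + marker,  # U+FF1D FULLWIDTH EQUALS SIGN
    }


for label, value in build_probe_values().items():
    # repr() makes the tab and the non-ASCII prefix visible in the output.
    print(f"{label}: {value!r}")
```

Each value can then be submitted into a candidate field (for example through an intercepting proxy) and looked up by its label when reviewing the export.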
Test Separator and Quote “Cell Breakout” Scenarios
Because CSV is cell-based, test whether you can inject content that starts a new cell and then begins with a dangerous character. This depends on:
- Field separator (commonly , or ;)
- Quoting rules and escaping
- Application-side CSV generation and encoding
Example benign test patterns (adjust separator to the actual export format):
- A value containing a quote and separator intended to create a new cell, then =1+1
- A value containing a separator directly (if not quoted by the exporter), then =1+1
Your objective is to see whether the resulting CSV contains any cell whose first character is one of the formula-triggering prefixes. Verify this by inspecting the raw CSV output in a text editor.
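Inspection of the raw CSV can also be scripted. The following sketch parses an export with Python's `csv` module and reports every cell that starts with a dangerous prefix. It assumes the spreadsheet splits cells the same way `csv.reader` does, which is an approximation, and the "naive exporter" shown is a hypothetical example of unquoted string joining, not the behavior of any specific product.

```python
import csv
import io

DANGEROUS = ("=", "+", "-", "@", "\t", "\r", "\n")


def find_formula_cells(raw_csv: str, delimiter: str = ","):
    """Yield (row, column, value) for each parsed cell starting with a dangerous prefix."""
    reader = csv.reader(io.StringIO(raw_csv, newline=""), delimiter=delimiter)
    for row_no, row in enumerate(reader, start=1):
        for col_no, cell in enumerate(row, start=1):
            # Note: legitimate negative numbers such as -5 are flagged too.
            if cell.startswith(DANGEROUS):
                yield row_no, col_no, cell


# The untrusted value does not start with a dangerous character itself.
untrusted = "Alice,=1+1"

# Hypothetical naive exporter: joins fields without quoting.
naive_line = ",".join(["42", untrusted, "active"]) + "\r\n"

# Same row written by csv.writer, which quotes the embedded separator.
buf = io.StringIO(newline="")
csv.writer(buf).writerow(["42", untrusted, "active"])

print(list(find_formula_cells(naive_line)))      # [(1, 3, '=1+1')]
print(list(find_formula_cells(buf.getvalue())))  # []
```

In the naive case the injected comma creates a third cell that begins with =, even though the submitted value began with a letter.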
Export and Verify in Spreadsheet Applications
- Export/download the CSV.
- Open it in at least one spreadsheet application used in the target environment (Excel and/or LibreOffice, etc.).
- Confirm whether the cell is interpreted as a formula:
  - The spreadsheet displays the computed result (e.g., 2) instead of the literal string (e.g., =1+1), and/or
  - The formula bar shows a formula rather than plain text.
Record:
- Application name/version (e.g., Microsoft Excel)
- How the file was opened (double-click vs import wizard)
- Whether locale settings influenced parsing
Excel Save/Re-open Regression Test (Mitigation Reliability)
If the application claims to escape/quote values:
- Open the exported CSV in Microsoft Excel.
- Save the file (e.g., as CSV).
- Close and re-open it.
- Re-check whether previously “neutralized” values became active formulas.
This step is critical because Excel may normalize/remove certain escaping/quoting behaviors after save/reopen.
Assess Impact
Assess practical risk based on:
- Who opens the file (privileged roles vs end users)
- Whether exports are shared externally
- Whether spreadsheet values are relied on for decisions (e.g., totals, flags, statuses)
- Whether formulas could mislead users or trigger spreadsheet behaviors in your environment
Provide a clear reproduction path:
- Where input is entered
- How export is generated
- Which program interprets it as a formula
- Evidence (screenshot or description) that evaluation occurs
Higher-Impact Validation (Optional, Strictly Controlled)
Some real-world cases show formula injection escalating beyond simple calculation/deception (e.g., via legacy spreadsheet features such as DDE). This behavior is highly environment-dependent.
Remediation
There is no universal CSV sanitization strategy safe for all spreadsheet applications and downstream consumers. For CSVs intended for human viewing in spreadsheet software, apply a defense-in-depth approach:
- Ensure no cell begins with formula-triggering characters (=, +, -, @) and relevant control/Unicode variants.
- Consider per-field sanitization (commonly suggested):
  - Wrap each cell in double quotes
  - Prepend each cell with a single quote
  - Escape double quotes by doubling them
  Note: This may not remain reliable in Microsoft Excel after saving and re-opening.
- Excel-resistant mitigation (observed behavior):
  - If a cell starts with =, +, -, or @, prefix the content with a tab character (0x09) inside a quoted field (e.g., "\t=1+1").
  Trade-off: The tab becomes part of the underlying data and may affect downstream programmatic imports.
Always validate the chosen mitigation against the spreadsheet applications and workflows in scope.
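As an illustration of the tab-prefix approach combined with quoting, the sketch below uses Python's `csv` module. It is one possible shape under stated assumptions, not a guaranteed-safe implementation: `neutralize_cell` and `write_rows` are hypothetical names, only the four ASCII prefixes are handled, and the result still has to be validated in the target spreadsheet applications.

```python
import csv
import io

FORMULA_PREFIXES = ("=", "+", "-", "@")


def neutralize_cell(value: str) -> str:
    """Prefix a tab (0x09) when the value starts with a formula-triggering character."""
    if value.startswith(FORMULA_PREFIXES):
        return "\t" + value
    return value


def write_rows(rows) -> str:
    """Serialize rows with every field quoted and embedded double quotes doubled."""
    buf = io.StringIO(newline="")
    writer = csv.writer(buf, quoting=csv.QUOTE_ALL, doublequote=True)
    for row in rows:
        writer.writerow([neutralize_cell(str(field)) for field in row])
    return buf.getvalue()


# repr() shows the tab inside the quoted field and the doubled quotes.
print(repr(write_rows([["42", "=1+1", 'say "hi"']])))
```

Because the tab becomes part of the stored value, numeric fields such as -5 are altered as well. An exporter that also feeds programmatic consumers may need a separate, unsanitized output for them.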
Tools
- Intercepting proxy (e.g., Burp Suite, ZAP) to inject controlled values and reproduce the exact export workflow.
- Spreadsheet software for validation in an isolated environment (e.g., Microsoft Excel, LibreOffice Calc).
- A safe HTTP listener (local or internal test host) to observe clicks/requests when validating HYPERLINK() behavior.
- A text editor to inspect the raw CSV and confirm whether any cell begins with a formula-triggering prefix.
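A safe listener does not require dedicated tooling. The sketch below uses only the Python standard library, binds to the loopback interface so nothing is exposed beyond the test workstation, and records each requested path. The class and function names are hypothetical, and the final request merely simulates what a clicked HYPERLINK() cell pointing at the listener would send.

```python
import http.server
import threading
import urllib.request


class LoggingHandler(http.server.BaseHTTPRequestHandler):
    """Record each requested path and answer with an empty 204 response."""

    seen_paths = []  # shared list; adequate for a single-tester sketch

    def do_GET(self):
        LoggingHandler.seen_paths.append(self.path)
        self.send_response(204)
        self.end_headers()

    def log_message(self, format, *args):
        pass  # silence default stderr logging; paths are kept in seen_paths


def start_listener(port: int = 0) -> http.server.HTTPServer:
    """Bind to loopback only; port 0 lets the OS choose a free port."""
    server = http.server.HTTPServer(("127.0.0.1", port), LoggingHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


server = start_listener()
port = server.server_address[1]

# Simulate the request a clicked link would cause, bypassing any configured proxy.
opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
opener.open(f"http://127.0.0.1:{port}/leak?test=1").close()

print(LoggingHandler.seen_paths)
server.shutdown()
server.server_close()
```

During a real test, the HYPERLINK() payload would point at this listener's address and port, and any entry in `seen_paths` serves as evidence that the link was followed.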