
CSV Files Explained: Delimiters, Quoting Rules, and Encoding Pitfalls

CSV is the most universal data exchange format — and the most inconsistently implemented. Here's the complete guide to how CSV actually works, why it breaks, and how to handle edge cases correctly.


What CSV Is (and Isn't)

CSV (Comma-Separated Values) is a plain-text format for tabular data. Each line is a row; columns are separated by commas. Simple concept, endless variation in practice.

RFC 4180 (published in 2005) documents the format, but it is an informational memo rather than a binding standard, and many programs produce CSV that deviates from it. Excel, Google Sheets, and various databases all interpret the format slightly differently.

The Basic Rules

  • One record per line
  • Fields separated by commas
  • First row is usually (but not always) a header row
  • No data type information — everything is a string at the format level

name,age,city
Alice,30,London
Bob,25,New York
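As a minimal sketch of these rules, the snippet below reads the three-line example with Python's standard csv module. The in-memory string stands in for a real file, and the variable names are only illustrative.

```python
import csv
import io

# The three-line example, held in memory instead of a file.
text = "name,age,city\nAlice,30,London\nBob,25,New York\n"

rows = list(csv.reader(io.StringIO(text)))
header, records = rows[0], rows[1:]

# Every field arrives as a string; "30" is not a number until converted.
ages = [int(record[1]) for record in records]

print(header)   # ['name', 'age', 'city']
print(records)  # [['Alice', '30', 'London'], ['Bob', '25', 'New York']]
print(ages)     # [30, 25]
```

Any type conversion is the caller's job, which is why the same file can load differently in two applications.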

Quoting: When Fields Must Be Wrapped in Quotes

A field must be wrapped in double quotes if it contains any of the following:

  • The delimiter character (, in standard CSV)
  • A line break (CR, LF, or CRLF)
  • A double-quote character

name,bio
Alice,"Developer, writer, speaker"
Bob,"Line one
Line two"

The comma inside "Developer, writer, speaker" must be inside quotes, otherwise parsers treat it as a column separator.
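To see the difference quoting makes, this sketch compares a naive split with Python's csv.reader on the example data. It assumes the data is already in a string; the names are hypothetical.

```python
import csv
import io

text = 'name,bio\nAlice,"Developer, writer, speaker"\nBob,"Line one\nLine two"\n'

# Naive approach: split the second physical line on commas.
naive = text.splitlines()[1].split(",")
print(naive)  # ['Alice', '"Developer', ' writer', ' speaker"']

# A real parser keeps the quoted comma and the quoted line break inside the field.
rows = list(csv.reader(io.StringIO(text)))
print(rows[1])  # ['Alice', 'Developer, writer, speaker']
print(rows[2])  # ['Bob', 'Line one\nLine two']

# Four physical lines, but only three records.
print(len(text.splitlines()), len(rows))  # 4 3
```

Note that the record count and the line count differ, so counting lines is not a reliable way to count rows.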

The Double-Quote Escaping Rule

To include a literal double-quote inside a quoted field, double it:

quote,"He said ""hello"" to me"

This is the RFC 4180 rule. Some systems use backslash escaping instead (\"), which creates incompatibilities between parsers.
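The sketch below shows both conventions with Python's csv module: the writer doubles quotes by default, and the reader can be told to expect backslash escapes through its doublequote and escapechar options. The sample strings are made up for illustration.

```python
import csv
import io

value = 'He said "hello" to me'

# Writing: the default dialect doubles embedded quotes (the RFC 4180 rule).
buffer = io.StringIO()
csv.writer(buffer, lineterminator="\n").writerow(["quote", value])
written = buffer.getvalue()
print(written)  # quote,"He said ""hello"" to me"

# Reading it back restores the original value.
round_trip = next(csv.reader(io.StringIO(written)))

# Reading a backslash-escaped file requires saying so explicitly.
backslash_text = 'quote,"He said \\"hello\\" to me"\n'
backslash_row = next(
    csv.reader(io.StringIO(backslash_text), doublequote=False, escapechar="\\")
)
print(backslash_row)  # ['quote', 'He said "hello" to me']
```

A parser configured for one convention will generally misread files written with the other, so the escaping style is worth confirming with whoever produces the file.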

The Delimiter Isn't Always a Comma

Despite the name, several delimiters are in common use:

  • **CSV**: comma (,)
  • **TSV**: tab (\t), common for data with commas in values
  • **SSV**: semicolon (;), the default in Excel on systems where the comma is the decimal separator (Germany, France, much of Europe)
  • **Pipe-delimited** (|): used when values commonly contain both commas and tabs

Excel in a German locale exports CSV with semicolons. An importer that assumes commas then reads each row as a single field or splits it in the wrong places. This is one of the most common CSV compatibility issues.
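Here is a small sketch of that failure and its fix, using a made-up semicolon-delimited export in which the comma is the decimal separator. The data and variable names are hypothetical.

```python
import csv
import io

# Hypothetical export from a German-locale spreadsheet.
text = "name;price\nAlice;1,50\nBob;2,75\n"

# Assuming commas splits the rows in the wrong places.
wrong = list(csv.reader(io.StringIO(text)))
print(wrong[1])  # ['Alice;1', '50']

# Passing the delimiter explicitly gives the intended columns.
right = list(csv.reader(io.StringIO(text), delimiter=";"))
print(right[1])  # ['Alice', '1,50']

# The decimal comma still has to be converted by hand.
prices = [float(row[1].replace(",", ".")) for row in right[1:]]
print(prices)  # [1.5, 2.75]
```

The parser raises no error in the wrong case, which is part of why this problem tends to surface late.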

Line Endings

RFC 4180 specifies CRLF (\r\n) as the line ending. In practice, Linux and macOS tools usually write LF (\n), while Windows programs often produce CRLF. Well-written parsers handle both.

A common bug: when a file with CRLF line endings is split on LF only, as naive code on Linux often does, a stray \r is left at the end of every value in the last column. The value "London" becomes "London\r", which silently breaks equality checks and other string operations.
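The following sketch reproduces the stray carriage return with a hand-rolled split and then shows Python's csv.reader handling the same CRLF data. The sample string is invented for the example.

```python
import csv
import io

raw = "name,city\r\nAlice,London\r\n"

# Splitting on LF only leaves the CR attached to the last column.
naive = [line.split(",") for line in raw.split("\n") if line]
print(repr(naive[1][1]))       # 'London\r'
print(naive[1][1] == "London")  # False

# csv.reader accepts CRLF and LF endings alike. newline="" passes the
# line endings through untouched so the parser can deal with them itself.
rows = list(csv.reader(io.StringIO(raw, newline="")))
print(rows[1])  # ['Alice', 'London']
```

When reading from disk, the csv documentation likewise recommends opening the file with newline="" for the same reason.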

Encoding Pitfalls

CSV files have no encoding declaration. The reader has to guess, and a wrong guess corrupts non-ASCII characters.

**UTF-8 BOM**: some programs (notably Excel) write a byte order mark (the bytes EF BB BF) at the start of UTF-8 CSV files. This helps Excel detect the encoding but can break other parsers that treat the BOM as part of the first field name.

**Latin-1 vs UTF-8**: Excel historically defaults to saving in the system's local encoding (Windows-1252 on English Windows), not UTF-8. Accented characters (é, ü, ñ) written in Windows-1252 and read as UTF-8 produce decoding errors or garbled output. Always specify the encoding explicitly when opening CSV files programmatically.
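Both pitfalls can be reproduced in a few lines of Python. This is a sketch on hand-built byte strings rather than real Excel output; "utf-8-sig" and "cp1252" are Python's codec names for UTF-8 with an optional BOM and Windows-1252.

```python
import csv
import io

# UTF-8 data with a leading BOM (EF BB BF), built by hand for the example.
bom_bytes = b"\xef\xbb\xbfname,city\r\nZo\xc3\xab,Z\xc3\xbcrich\r\n"

# Plain "utf-8" keeps the BOM as an invisible character in the first header.
header_plain = next(csv.reader(io.StringIO(bom_bytes.decode("utf-8"), newline="")))
print(repr(header_plain[0]))  # '\ufeffname'

# "utf-8-sig" strips the BOM if present and is harmless if it is absent.
header_sig = next(csv.reader(io.StringIO(bom_bytes.decode("utf-8-sig"), newline="")))
print(repr(header_sig[0]))  # 'name'

# Windows-1252 bytes read as UTF-8: the accented letter is lost.
cp1252_bytes = "Café".encode("cp1252")
lossy = cp1252_bytes.decode("utf-8", errors="replace")
print(repr(lossy))  # 'Caf\ufffd'

# UTF-8 bytes read as Windows-1252: classic mojibake.
mojibake = "Café".encode("utf-8").decode("cp1252")
print(mojibake)  # CafÃ©

# Decoding with the encoding the file was actually written in works.
correct = cp1252_bytes.decode("cp1252")
```

Without errors="replace", the Windows-1252 bytes would raise UnicodeDecodeError instead, which is arguably the better outcome because the failure is visible.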

NULL and Empty Values

CSV has no null value. An empty field — two consecutive delimiters — could mean null, an empty string, or zero, depending on the application.

Alice,,London

The middle field is empty. Whether this is null or "" depends on how the importer handles it. Conventions vary by application.
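In Python's csv module, for instance, an empty field and an explicitly quoted empty string both come back as "" under the default settings. The sketch below shows that, then applies one possible convention (empty means missing) through a hypothetical helper named empty_to_none.

```python
import csv
import io

# The second record quotes the empty field explicitly; the first does not.
text = 'name,nickname,city\nAlice,,London\nBob,"",Paris\n'

rows = list(csv.DictReader(io.StringIO(text)))
print(repr(rows[0]["nickname"]))  # ''
print(repr(rows[1]["nickname"]))  # ''


def empty_to_none(record):
    """Apply one convention: treat every empty string as a missing value."""
    return {key: (value if value != "" else None) for key, value in record.items()}


cleaned = [empty_to_none(row) for row in rows]
print(cleaned[0])  # {'name': 'Alice', 'nickname': None, 'city': 'London'}
```

Whether empty should map to None, "", or 0 is an application decision; the point is to make it once, explicitly, rather than inherit whatever the importer does.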

Practical Parsing Rules

  • Never parse CSV manually with split(","): it fails on quoted fields containing commas
  • Use a CSV parsing library in your language (Python's csv module, JavaScript's Papa Parse, Go's encoding/csv)
  • Always specify the delimiter explicitly rather than assuming comma
  • Always specify encoding explicitly (prefer UTF-8)
  • Handle the header row separately — don't assume it's always present or always in the first row
  • Test with values containing commas, quotes, and line breaks
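These rules can be combined into a small reader. The sketch below is one possible shape, not a definitive implementation: read_table and its parameters are hypothetical names, and the round trip through a temporary file only serves as a quick test with awkward values.

```python
import csv
import os
import tempfile


def read_table(path, delimiter=",", encoding="utf-8-sig", has_header=True):
    """Read a delimited file and return (header, records).

    The delimiter and encoding are explicit parameters, and the header
    is returned separately (None when has_header is False).
    """
    # newline="" lets the csv module handle CRLF, LF, and quoted line breaks.
    with open(path, newline="", encoding=encoding) as handle:
        rows = list(csv.reader(handle, delimiter=delimiter))
    if has_header and rows:
        return rows[0], rows[1:]
    return None, rows


# Test data containing a comma, embedded quotes, a line break, and a non-ASCII name.
tricky = [
    ["name", "bio"],
    ["Alice", "Developer, writer"],
    ["Bob", 'Said "hi"'],
    ["Zoë", "Line one\nLine two"],
]

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "sample.csv")
    with open(path, "w", newline="", encoding="utf-8") as handle:
        csv.writer(handle).writerows(tricky)
    header, records = read_table(path, delimiter=",")

print(header == tricky[0], records == tricky[1:])  # True True
```

Defaulting to "utf-8-sig" is a design choice for this sketch: it reads plain UTF-8 unchanged and silently drops a BOM if one is present.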

NoxaKit's CSV Delimiter Converter, CSV Column Trimmer, and CSV Duplicate Row Remover handle common CSV transformation tasks in the browser — no upload, no server processing.
