What is CSV?

CSV stands for “Comma-Separated Values”. It is one of the oldest and simplest ways to store a table of data, and despite being very plain, it refuses to die. Nearly every spreadsheet tool, database, and data science library can read and write it.

How it looks

The first line is usually the column names. Every line after that is one row, with values separated by commas:

name,city,role
Shravan,Cambridge,Research Assistant
Asha,Delhi,Engineer
Ravi,Bengaluru,Designer

No rulebook, on purpose

That is the whole format. There is no single binding standard (RFC 4180 documents a common form, but it is informational only), which is both its greatest strength (everything supports it) and its biggest headache (everyone supports it slightly differently).

The small problems

CSV looks easy until your data contains a comma. Then things get interesting:

When values contain commas or quotes
  • If a value has a comma inside it, wrap it in double quotes: "Goswami, Shravan".
  • If a value has a double quote inside it, you double the quote: "He said ""hi""".
  • In locales where the comma is the decimal separator, files often use a semicolon as the delimiter instead of a comma. Fun.
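The three rules above can be sketched with Python's built-in csv module. This is a minimal illustration, not the only way to do it: the helper name to_csv and the sample rows are made up for this example.

```python
import csv
import io


def to_csv(rows, delimiter=","):
    """Serialize rows to CSV text (hypothetical helper for this example)."""
    buffer = io.StringIO()
    # The default quoting mode only quotes fields that actually need it.
    writer = csv.writer(buffer, delimiter=delimiter, lineterminator="\n")
    writer.writerows(rows)
    return buffer.getvalue()


rows = [
    ["name", "quote"],
    ["Goswami, Shravan", 'He said "hi"'],
]

# Comma-delimited: the name is wrapped in quotes, the inner quotes are doubled.
print(to_csv(rows))

# Semicolon-delimited: the comma in the name is no longer special.
print(to_csv(rows, delimiter=";"))
```

With a semicolon as the delimiter, the name no longer needs quoting, but the value with embedded double quotes still does.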

So here is a quick gut-check. How many columns does this single row have?

"Goswami, Shravan",Cambridge,Research Assistant

Three. The comma inside the quotes is part of the name, not a separator — that is exactly why quoting matters.
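To check that answer in code, here is a small sketch that compares a naive split on commas with Python's csv module. The variable names are only for illustration.

```python
import csv

line = '"Goswami, Shravan",Cambridge,Research Assistant'

# Naive approach: split on every comma, ignoring the quotes.
naive = line.split(",")

# csv.reader accepts any iterable of lines and applies the quoting rules.
parsed = next(csv.reader([line]))

print(len(naive), naive)
print(len(parsed), parsed)
```

The naive split sees four pieces because it cuts the name in half. The csv module sees three, with the quotes stripped and the comma kept inside the name.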

Reading it in code

You should almost never parse CSV by hand. Let a library handle the quoting rules for you.
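As one possible sketch, here is the sample table from earlier read with Python's csv.DictReader. The text is held in a string so the example runs on its own; the variable names are assumptions for this example.

```python
import csv
import io

# The sample table from earlier, kept in memory instead of a file.
text = """name,city,role
Shravan,Cambridge,Research Assistant
Asha,Delhi,Engineer
Ravi,Bengaluru,Designer
"""

# DictReader uses the first line as keys and yields one dict per row.
with io.StringIO(text) as handle:
    people = list(csv.DictReader(handle))

for person in people:
    print(person["name"], "works in", person["city"])
```

For a real file, the same code works with open(path, newline="", encoding="utf-8") in place of io.StringIO, where path is whatever your file is called.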

Flat vs nested

CSV is a flat grid — rows and columns, nothing more. The moment your data grows branches, it is the wrong tool:

CSV is perfect for this

A simple table: people, with a city and a role each. Rows and columns, done.

CSV struggles with this

A person who has many addresses, each with many phone numbers. Nesting like that wants JSON.
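To make the difference concrete, here is a rough sketch using an invented record (the names, cities, and phone numbers are hypothetical). JSON stores the nesting directly, while squeezing the same data into CSV means one row per phone number.

```python
import csv
import io
import json

# Hypothetical record, invented for illustration.
person = {
    "name": "Asha",
    "addresses": [
        {"city": "Delhi", "phones": ["555-0100", "555-0101"]},
        {"city": "Pune", "phones": ["555-0102"]},
    ],
}

# JSON keeps the branches exactly as they are.
print(json.dumps(person, indent=2))

# Flattening for CSV: one row per phone, repeating the name and city.
flat_rows = [
    [person["name"], address["city"], phone]
    for address in person["addresses"]
    for phone in address["phones"]
]

buffer = io.StringIO()
writer = csv.writer(buffer, lineterminator="\n")
writer.writerow(["name", "city", "phone"])
writer.writerows(flat_rows)
print(buffer.getvalue())
```

The flattened version works, but the name and city are repeated on every row, and the reader has to regroup the rows to rebuild the original shape.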

Why it is still everywhere

  • Any tool can open it, even Notepad.
  • Files stay small: column names appear once in the header, not in every row.
  • It is plain text, so it works nicely with version control and scripts.

When to use it

Use CSV for simple tables — rows and columns of data you want to move between Excel, a database, or a Python script. The moment your data becomes nested (lists inside lists), switch to JSON instead.