What is YAML?

YAML is a way to write data that humans can actually read without crying. You will see it everywhere: GitHub Actions, Docker Compose, Kubernetes, and plenty of website configs. The name stands for “YAML Ain’t Markup Language” (see footnote 1), which is a bit of a nerd joke, but let us not worry about that.

The one-sentence version

YAML is JSON for humans. In fact, since YAML 1.2 every valid JSON file is also valid YAML, so YAML is basically JSON after a relaxing holiday.

Why people like it

JSON is good for computers, but for humans it is full of brackets, quotes and commas. YAML keeps the same idea but uses simple indentation instead. So the file reads like a neat list you would write in a notebook. The difference is easiest to feel side by side:

JSON — for machines
{
  "name": "Shravan",
  "languages": ["C++", "Python", "Julia"]
}
YAML — for humans
name: Shravan
languages:
  - C++
  - Python
  - Julia
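
Because JSON-style brackets are also accepted inside YAML, the two styles can be mixed in one file. A small sketch, reusing the same data:

```yaml
# Block style for the outer mapping, JSON-like "flow" style for the list.
name: Shravan
languages: [C++, Python, Julia]
```

This should load to the same data as either version above; which style to use is mostly a matter of taste.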

The basic pieces

There are really only three shapes to learn: scalars (single values), sequences (lists), and mappings (key-value pairs).
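
The spec calls the three shapes scalars, sequences, and mappings. A minimal sketch of each, with key names and values made up purely for illustration:

```yaml
# Mapping: key-value pairs, nested by indentation
author:
  name: Shravan
  city: Berlin        # hypothetical value

# Sequence: an ordered list, one "- " per item
languages:
  - C++
  - Python

# Scalars: single values such as strings, numbers, and booleans
greeting: hello
port: 8080
debug: true
```

Everything else in a YAML file is some nesting of these three.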

Two handy features

Comments: JSON does not have these, which is a big reason so many projects keep their config files in YAML:

# this is a comment
port: 8080
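
One detail worth knowing: a # only starts a comment at the beginning of a line or after whitespace. A short sketch, with a made-up URL:

```yaml
port: 8080  # a comment can follow a value on the same line
url: http://example.com/#top  # the "#" inside the URL has no space before it, so it stays in the value
```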

And long text written across many lines:

description: |
  This keeps
  each line separate.
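
The | shown above is the literal style. There is also a folded style, written with >, which joins lines instead of keeping them apart. A sketch of both, with made-up key names and the resulting strings noted in comments:

```yaml
# Literal style (|): line breaks are kept.
# Loads as "This keeps\neach line separate.\n"
literal: |
  This keeps
  each line separate.

# Folded style (>): single line breaks become spaces.
# Loads as "This joins the lines into one.\n"
folded: >
  This joins the lines
  into one.
```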

The famous traps

YAML is friendly until it quietly does something you did not ask for. These three catch almost everyone once.

Spaces, never tabs

Indentation is the structure in YAML, and the spec is strict about it [1]: tabs are not allowed for indentation at all. One stray tab at the start of a line and the whole file refuses to load, often with an error that points somewhere other than the real problem.
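
A small sketch of what correct indentation looks like, with made-up keys. Two spaces per level is only a common convention; the requirement is that sibling keys line up and that the indentation is made of spaces:

```yaml
server:
  host: localhost
  port: 8080
  tls:
    enabled: true
# If any of the indented lines above began with a tab instead of spaces,
# a conforming parser would reject the file.
```

Setting your editor to insert spaces when you press Tab in .yaml files avoids the problem entirely.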

The sneakiest one has a name. Suppose you write country: NO, meaning Norway.

In many parsers it becomes the boolean false, not the word “NO”, because YAML 1.1 treats unquoted yes, no, on and off (including capitalized forms such as NO) as booleans. This is the famous “Norway problem” [2]. YAML 1.2 dropped those extra spellings, but plenty of widely used parsers still follow the older rules. The fix is to quote it: country: "NO".
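
A small sketch of the trap and the fix. The results in the comments assume a parser that follows YAML 1.1 typing rules (PyYAML is one such parser); a strict YAML 1.2 parser would keep the unquoted values as strings. The key names are made up for illustration:

```yaml
# Unquoted: a YAML 1.1-style parser loads these as booleans.
country: NO          # false
newsletter: yes      # true
dark_mode: off       # false

# Quoted: always strings, whatever the parser.
country_code: "NO"
answer: "yes"
```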

The version-number trap

Write version: 1.10 and YAML reads it as the floating-point number 1.1, silently dropping the trailing zero. Always quote versions: version: "1.10".
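
The same thing as a sketch, with made-up key names and the loaded values noted in comments:

```yaml
unquoted: 1.10       # loads as the float 1.1
quoted: "1.10"       # loads as the string "1.10"
three_part: 1.10.2   # not a valid number, so it stays a string
```

Three-part versions happen to survive unquoted, but quoting every version keeps the rule simple.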

A little history

YAML has been around longer than most people think:

2001: First proposed. Clark Evans introduces the idea of a human-readable data language.

2004: YAML 1.0. The first official specification lands.

2009: YAML 1.2. Realigned with JSON, making JSON a strict subset of YAML.

2021: YAML 1.2.2. A cleanup release, still the current spec today.

Where I actually meet YAML

Roughly where my YAML-editing time goes these days (very unscientific, just vibes):

CI / GitHub Actions: 45%
Docker Compose: 30%
Static-site & app config: 25%

When to use it

“A human-friendly data serialization language for all programming languages.”

yaml.org, the official tagline

Use YAML for config files that you edit by hand. When two programs are talking to each other over the internet, JSON is usually the better choice.

References & further reading

  1. YAML Language Development Team. YAML Ain't Markup Language (YAML) version 1.2.2. 2021. The current specification.
  2. StrictYAML (hitchdev). Implicit typing removed — the "Norway problem". 2019.

Footnotes

  1. It’s a recursive acronym — the name literally contains itself. Programmers find this funnier than they probably should.
