JSON vs YAML: Which Should You Use and When?

The same data, two formats

Here is the same Docker Compose file, defining two services and a named volume, in both formats:

# YAML (docker-compose.yml)
services:
  web:
    image: nginx:latest
    ports:
      - "80:80"
    environment:
      - NODE_ENV=production
    depends_on:
      - db
  db:
    image: postgres:16
    volumes:
      - db_data:/var/lib/postgresql/data

volumes:
  db_data:

JSON equivalent:
{
  "services": {
    "web": {
      "image": "nginx:latest",
      "ports": ["80:80"],
      "environment": ["NODE_ENV=production"],
      "depends_on": ["db"]
    },
    "db": {
      "image": "postgres:16",
      "volumes": ["db_data:/var/lib/postgresql/data"]
    }
  },
  "volumes": {
    "db_data": null
  }
}

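The YAML version is noticeably shorter and easier to read at a glance. The JSON version is more explicit: every string is quoted and every structure is delimited, so no types are guessed, and nearly every language can parse it with a standard or widely available library.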
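
As a rough illustration of that explicitness, the sketch below parses a trimmed copy of the JSON above with Python's standard `json` module. The variable names are only for illustration, and the document is shortened to keep the example small.

```python
import json

# A trimmed copy of the Compose document above (illustrative only).
compose_json = """
{
  "services": {
    "web": {"image": "nginx:latest", "ports": ["80:80"]},
    "db": {"image": "postgres:16"}
  },
  "volumes": {"db_data": null}
}
"""

config = json.loads(compose_json)

# JSON types map directly onto Python types: object -> dict, array -> list,
# string -> str, null -> None. Nothing is inferred from how a value looks.
web_ports = config["services"]["web"]["ports"]
db_volume = config["volumes"]["db_data"]

print(web_ports)   # ['80:80']
print(db_volume)   # None
```

The port mapping stays a string because it was written as one; the parser never reinterprets it.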

JSON: strengths and weaknesses

Strengths:

Universal: nearly every programming language ships a JSON parser in its standard library or has a widely used package for it.
Unambiguous: no implicit type coercion, no indentation sensitivity.
The native data format for REST APIs and HTTP payloads.
Easy to validate with JSON Schema.
Tool support is excellent (formatters, linters, diff tools).

Weaknesses:

Verbose: every key and every string must be double-quoted, and every element needs a separating comma.
No comments: the format has no comment syntax, so you cannot add inline explanations to a JSON file.
Trailing commas are invalid: a common source of syntax errors when editing by hand.
No multi-line string syntax: line breaks inside a string must be escaped as \n, so embedding long strings or code snippets is awkward.
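
The strictness described in this list is easy to observe with a standard parser. The following is a minimal sketch using Python's `json` module; the helper `is_valid_json` and the sample strings are made up for this illustration.

```python
import json

def is_valid_json(text):
    """Return True if text parses as strict JSON, False otherwise."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True

# Hypothetical samples, one per weakness listed above.
samples = {
    "plain object": '{"port": 80}',
    "trailing comma": '{"port": 80,}',
    "line comment": '{"port": 80} // web port',
    "raw newline in string": '{"motd": "line one\nline two"}',
    "escaped newline": '{"motd": "line one\\nline two"}',
}

for label, text in samples.items():
    print(f"{label}: {is_valid_json(text)}")
```

Only the first and last samples parse. The trailing comma, the comment, and the unescaped line break are all rejected outright, which is annoying when editing by hand but also means a malformed file fails loudly instead of being misread.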

YAML: strengths and weaknesses

Strengths:

Human-readable — the config files for Docker Compose, Kubernetes, GitHub Actions, Ansible, and most CI/CD systems are YAML for good reason.
Supports comments with #.
Multi-line strings with | (literal) and > (folded) block scalars.
Anchors and aliases for DRY config files (&anchor / *alias).

Weaknesses:

Indentation-sensitive: a misplaced space can silently change the document structure.
Implicit typing is a trap (see the Norway Problem below).
Multiple ways to write the same thing lead to inconsistency.
Harder to validate programmatically than JSON.

The Norway Problem

YAML's implicit typing has a famous consequence for country codes: Norway's ISO code, NO, is read as a boolean. In YAML 1.1 (still implemented by many parsers), plain unquoted values that look like booleans are converted automatically:

# YAML 1.1 implicit type coercion
countries:
  - NO   # parsed as boolean false (!!)
  - YES  # parsed as boolean true
  - ON   # parsed as boolean true
  - OFF  # parsed as boolean false

# Workaround: always quote strings that could be misinterpreted
countries:
  - "NO"
  - "YES"
  - "ON"
  - "OFF"

The Norway Problem has caused real-world bugs in Kubernetes manifests, Ansible playbooks, and CI/CD configs. YAML 1.2's core schema treats only spellings of true and false as booleans, but many parsers (including PyYAML, which implements YAML 1.1) still apply the older rules. When in doubt, quote your strings.
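
To make the rule concrete, here is a small standard-library sketch that imitates how a YAML 1.1 parser types a plain scalar. It is not a YAML parser. The names `resolve_yaml11_scalar` and `needs_quotes` are invented for this example, the word list covers only the full-word boolean spellings from the YAML 1.1 bool type (the single-letter y/n forms are left out as a simplification), and everything that is not a boolean is assumed to stay a string.

```python
import re

# Full-word boolean spellings from the YAML 1.1 bool type.
# Assumption: the single-letter y/n forms are deliberately omitted here.
YAML11_BOOL = re.compile(
    r"yes|Yes|YES|no|No|NO|true|True|TRUE|false|False|FALSE"
    r"|on|On|ON|off|Off|OFF"
)
TRUTHY = {"yes", "true", "on"}

def resolve_yaml11_scalar(raw):
    """Imitate YAML 1.1 boolean resolution for a plain (unquoted) scalar."""
    if YAML11_BOOL.fullmatch(raw):
        return raw.lower() in TRUTHY
    # Simplification: real parsers also resolve numbers, null, dates, etc.
    return raw

def needs_quotes(raw):
    """True if the bare value would not survive as a string."""
    return not isinstance(resolve_yaml11_scalar(raw), str)

for code in ["NO", "SE", "DK", "ON"]:
    print(code, "->", repr(resolve_yaml11_scalar(code)), needs_quotes(code))
```

A check like `needs_quotes` could run in a linter or a pre-commit hook over values that are meant to be strings, which is one way to apply the "when in doubt, quote" advice mechanically.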

When to use JSON

API request/response bodies
Data stored in databases (PostgreSQL JSONB, MongoDB documents)
Package manifests (package.json, tsconfig.json)
Any data that will be consumed by programs rather than read or edited by humans
When you need strict schema validation

When to use YAML

CI/CD pipeline definitions (GitHub Actions, GitLab CI, CircleCI)
Container orchestration (Kubernetes manifests, Docker Compose)
Infrastructure-as-code (Ansible playbooks, Helm charts)
Application config files that developers edit by hand
Anything where comments and readability matter more than strict unambiguity

The short rule: use JSON for machines, use YAML for humans. When you need to convert between the two — say, a CI pipeline generates JSON that needs to be stored as a YAML config — a converter handles it in seconds.
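
For a sense of what such a conversion involves, here is a minimal sketch of a JSON-to-YAML converter using only Python's standard library. It is an illustration under stated assumptions, not a production tool: `json_to_yaml` and its helpers are hypothetical names, the output is block-style YAML with two-space indentation, and every key and string is emitted double-quoted. A real project would more likely rely on an established YAML library.

```python
import json

def _scalar(value):
    # JSON scalars and empty containers are also valid YAML, so they can be
    # emitted with the JSON encoder. Strings come out double-quoted.
    return json.dumps(value, ensure_ascii=False)

def _is_block(value):
    # Only non-empty dicts and lists need a nested, indented block.
    return isinstance(value, (dict, list)) and len(value) > 0

def to_yaml_lines(value, depth=0):
    """Render JSON-compatible data as a list of block-style YAML lines."""
    pad = "  " * depth
    if not _is_block(value):
        return [pad + _scalar(value)]
    lines = []
    if isinstance(value, dict):
        for key, item in value.items():
            if _is_block(item):
                lines.append(f"{pad}{_scalar(key)}:")
                lines.extend(to_yaml_lines(item, depth + 1))
            else:
                lines.append(f"{pad}{_scalar(key)}: {_scalar(item)}")
    else:
        for item in value:
            if _is_block(item):
                lines.append(f"{pad}-")
                lines.extend(to_yaml_lines(item, depth + 1))
            else:
                lines.append(f"{pad}- {_scalar(item)}")
    return lines

def json_to_yaml(text):
    """Convert a JSON document (as text) to YAML text."""
    return "\n".join(to_yaml_lines(json.loads(text))) + "\n"

print(json_to_yaml('{"countries": ["NO", "SE"], "debug": false}'))
```

Quoting every string costs some readability, but it means a value such as "NO" cannot be reinterpreted as a boolean by a YAML 1.1 parser on the way back in.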

mini-tools.dev