JSON vs YAML: Which Should You Use and When?
JSON and YAML are both ways of serializing structured data. They can represent the same information, and since version 1.2 YAML has been designed as a superset of JSON, so a valid JSON document is also valid YAML. But they have very different strengths, and the choice between them matters for readability, tooling, and correctness.
The same data, two formats
Here is the same Docker Compose service definition in both formats:
```yaml
# YAML (docker-compose.yml)
services:
  web:
    image: nginx:latest
    ports:
      - "80:80"
    environment:
      - NODE_ENV=production
    depends_on:
      - db
  db:
    image: postgres:16
    volumes:
      - db_data:/var/lib/postgresql/data
volumes:
  db_data:
```

JSON equivalent:

```json
{
  "services": {
    "web": {
      "image": "nginx:latest",
      "ports": ["80:80"],
      "environment": ["NODE_ENV=production"],
      "depends_on": ["db"]
    },
    "db": {
      "image": "postgres:16",
      "volumes": ["db_data:/var/lib/postgresql/data"]
    }
  },
  "volumes": {
    "db_data": null
  }
}
```

The YAML version is noticeably shorter and easier to read at a glance. The JSON version is more explicit and unambiguous, and it can be parsed directly by the JSON support built into most languages and HTTP libraries.
JSON: strengths and weaknesses
Strengths:
- Universal: nearly every mainstream programming language ships a JSON parser in its standard library or has a well-maintained one available.
- Unambiguous: no implicit type coercion, no indentation sensitivity.
- The native data format for REST APIs and HTTP payloads.
- Easy to validate with JSON Schema.
- Tool support is excellent (formatters, linters, diff tools).
Weaknesses:
- Verbose — all strings need quotes, all keys need quotes, commas everywhere.
- No comments — you cannot add inline explanations to a JSON file.
- Trailing commas are invalid — a common source of syntax errors when editing by hand.
- No multi-line string syntax — embedding long strings or code snippets is awkward.
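To make the strictness concrete, here is a small sketch using Python's standard json module. The helper name try_parse and the sample strings are made up for illustration; the point is only that a conforming parser rejects comments, trailing commas, and unquoted keys rather than guessing at what was meant.

```python
import json


def try_parse(text):
    """Return (True, value) if text is valid JSON, else (False, error message)."""
    try:
        return True, json.loads(text)
    except json.JSONDecodeError as exc:
        return False, exc.msg


# Hypothetical samples: one valid document and three common hand-editing mistakes.
samples = {
    "valid": '{"image": "nginx:latest", "ports": ["80:80"]}',
    "trailing comma": '{"ports": ["80:80",]}',
    "comment": '{"image": "nginx:latest" // web server\n}',
    "unquoted key": '{image: "nginx:latest"}',
}

for label, text in samples.items():
    ok, _detail = try_parse(text)
    print(f"{label}: {'accepted' if ok else 'rejected'}")
```

Only the first sample is accepted. The exact error messages vary between Python versions, so the sketch reports accepted or rejected rather than relying on the message text.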
YAML: strengths and weaknesses
Strengths:
- Human-readable — the config files for Docker Compose, Kubernetes, GitHub Actions, Ansible, and most CI/CD systems are YAML for good reason.
- Supports comments with #.
- Multi-line strings with | (literal) and > (folded) block scalars.
- Anchors and aliases for DRY config files (&anchor / *alias).
Weaknesses:
- Indentation-sensitive — a misplaced space changes the document structure silently.
- Implicit typing is a trap (see the Norway Problem below).
- Multiple ways to write the same thing lead to inconsistency.
- Harder to validate programmatically than JSON.
The Norway Problem
YAML's implicit type coercion has a famous consequence: country codes. In YAML 1.1 (used by many parsers), bare unquoted values that look like booleans are converted automatically:
```yaml
# YAML 1.1 implicit type coercion
countries:
  - NO   # parsed as boolean false (!!)
  - YES  # parsed as boolean true
  - ON   # parsed as boolean true
  - OFF  # parsed as boolean false
```

```yaml
# Workaround: always quote strings that could be misinterpreted
countries:
  - "NO"
  - "YES"
  - "ON"
  - "OFF"
```

The Norway Problem has caused real-world bugs in Kubernetes manifests, Ansible playbooks, and CI/CD configs. YAML 1.2 fixes most of these (its core schema treats only true and false as booleans), but many widely used parsers, including PyYAML, still apply 1.1 semantics. When in doubt, quote your strings.
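If you generate YAML or review it in CI, a quick check for risky values can catch the problem before a parser does. The sketch below is pure Python and uses no YAML library. It hard-codes the boolean spellings listed in the YAML 1.1 bool type definition, and the function name risky_scalars is hypothetical. Real parsers may recognize only a subset of these spellings, so treat it as a conservative lint, not as a model of any particular parser.

```python
import re

# Boolean spellings from the YAML 1.1 bool type definition.
# Assumption: a value is "risky" if it matches any of them exactly.
YAML11_BOOL = re.compile(
    r"y|Y|yes|Yes|YES|n|N|no|No|NO"
    r"|true|True|TRUE|false|False|FALSE"
    r"|on|On|ON|off|Off|OFF"
)


def risky_scalars(values):
    """Return the strings that a YAML 1.1 parser may read as booleans if left unquoted."""
    return [v for v in values if YAML11_BOOL.fullmatch(v)]


country_codes = ["SE", "NO", "DK", "FI"]
print(risky_scalars(country_codes))  # ['NO']
```

Anything the function returns should be quoted before it is written into a YAML file.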
When to use JSON
- API request/response bodies
- Data stored in databases (PostgreSQL JSONB, MongoDB documents)
- Package manifests and tool configuration (package.json, tsconfig.json)
- Any format that will be consumed programmatically, not read by humans
- When you need strict schema validation
When to use YAML
- CI/CD pipeline definitions (GitHub Actions, GitLab CI, CircleCI)
- Container orchestration (Kubernetes manifests, Docker Compose)
- Infrastructure-as-code (Ansible playbooks, Helm charts)
- Application config files that developers edit by hand
- Anything where comments and readability matter more than strict unambiguity
The short rule: use JSON for machines, use YAML for humans. When you need to convert between the two — say, a CI pipeline generates JSON that needs to be stored as a YAML config — a converter handles it in seconds.
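For a sense of what such a conversion involves, here is a minimal sketch of the JSON-to-YAML direction in pure Python. It is not how any particular converter works. The function names are hypothetical, and it assumes standard JSON input (no NaN or Infinity). It quotes every string and key using JSON string syntax, which YAML 1.2 accepts as double-quoted scalars, so values like NO come through as strings. Some numbers in exponent form may be read differently by YAML 1.1 parsers.

```python
import json


def render_scalar(value):
    """Render a scalar (or an empty container) using JSON syntax, which YAML 1.2 accepts."""
    return json.dumps(value, ensure_ascii=False)


def to_yaml_lines(value, depth=0):
    """Return block-style YAML lines for JSON-compatible data (sketch, two-space indent)."""
    pad = "  " * depth
    if isinstance(value, dict) and value:
        lines = []
        for key, item in value.items():
            name = json.dumps(str(key), ensure_ascii=False)
            if isinstance(item, (dict, list)) and item:
                # Non-empty containers go on the following lines, one level deeper.
                lines.append(f"{pad}{name}:")
                lines.extend(to_yaml_lines(item, depth + 1))
            else:
                lines.append(f"{pad}{name}: {render_scalar(item)}")
        return lines
    if isinstance(value, list) and value:
        lines = []
        for item in value:
            if isinstance(item, (dict, list)) and item:
                # A bare "-" followed by a more deeply indented block node.
                lines.append(f"{pad}-")
                lines.extend(to_yaml_lines(item, depth + 1))
            else:
                lines.append(f"{pad}- {render_scalar(item)}")
        return lines
    # Scalars and empty containers ({} and []) are emitted inline.
    return [pad + render_scalar(value)]


def json_to_yaml(text):
    """Parse a JSON document and return it as a YAML string."""
    return "\n".join(to_yaml_lines(json.loads(text))) + "\n"


print(json_to_yaml('{"services": {"web": {"ports": ["80:80"]}}, "volumes": {"db_data": null}}'))
```

The output is more heavily quoted than hand-written YAML, which is a deliberate trade-off here: it gives up some readability to avoid implicit typing surprises. A real converter would typically use a full YAML library and quote only where needed.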