DB-First Architecture: Why PostgreSQL Beats YAML for Developer Tools
How we migrated from 200+ YAML files to PostgreSQL — and why every developer tool should consider doing the same.
🎯 TL;DR
- Replaced 200+ YAML config files with 40+ PostgreSQL tables — query times dropped 98%
- Idempotent upsert migrations (ON CONFLICT DO UPDATE) made every migration safely re-runnable
- JSONB columns handle schema evolution without breaking changes
- YAML files kept as read-only exports — developers can still browse them, but DB is the source of truth
😤 The Problem: YAML at Scale
If you’re building a developer tool that manages projects, sprints, tasks, or configuration, you’ve probably started with YAML or JSON files. We did too. It works great at first — human-readable, version-controlled, easy to edit.
Then your tool grows. Ours looked like this:
sprints/
project-a/
sprint-01.yaml
sprint-02.yaml
...
sprint-55.yaml
project-b/
sprint-01.yaml
...
config/
learnings/
2026-01-15-cache-pattern.yaml
2026-01-16-build-fix.yaml
...213 files...
events.yaml # 4000+ lines, growing daily
The breaking point? Two automated processes tried to update the same YAML file simultaneously. One process’s changes silently overwrote the other’s. We lost data that had to be manually re-entered.
The lesson: The moment you have more than one writer — whether that’s a developer, a CI pipeline, or an AI agent — file-based storage becomes a race condition waiting to happen.
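The lost-update failure mode is easy to reproduce. Here is a minimal sketch of it (filenames and event strings are invented, and JSON from the standard library stands in for YAML so the snippet is self-contained): two writers each read the file, modify their own copy, and write the whole file back, so whichever write lands last silently discards the other's change.

```python
import json
import os
import tempfile

# Simulate two processes doing read-modify-write on one shared file.
# Neither holds a lock, so the last write clobbers the other.

path = os.path.join(tempfile.mkdtemp(), "events.json")
with open(path, "w") as f:
    json.dump({"events": []}, f)

# Process A reads its snapshot...
with open(path) as f:
    snapshot_a = json.load(f)
# ...and so does process B, before A has written anything back.
with open(path) as f:
    snapshot_b = json.load(f)

snapshot_a["events"].append("ci: build passed")
snapshot_b["events"].append("agent: sprint updated")

# A writes first; then B overwrites the file with its own stale snapshot.
with open(path, "w") as f:
    json.dump(snapshot_a, f)
with open(path, "w") as f:
    json.dump(snapshot_b, f)

with open(path) as f:
    final = json.load(f)
print(final["events"])  # A's event is gone: ['agent: sprint updated']
```

A database resolves this by serializing the two writes inside transactions; with files, correctness depends on every writer cooperating with a locking scheme that none of them enforce.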
🐘 Why PostgreSQL, Not Just “Any Database”
We chose PostgreSQL specifically for three features that YAML files can’t replicate:
- ACID transactions — concurrent writes are safe by default (MVCC handles isolation)
- JSONB columns — schema flexibility without losing query power. New fields go in JSONB first, graduate to real columns when stable
- pgvector extension — semantic search over knowledge entries using embedding vectors, something no file-based approach can match
🔄 The Migration Strategy: Idempotent Upserts
The key insight that made migration painless: every SQL statement must be safely re-runnable.
-- Every table uses IF NOT EXISTS
CREATE TABLE IF NOT EXISTS sprints (
id SERIAL PRIMARY KEY,
project_id INTEGER REFERENCES projects(id),
name VARCHAR(200) NOT NULL,
status VARCHAR(50) DEFAULT 'active',
metadata JSONB DEFAULT '{}',
created_at TIMESTAMPTZ DEFAULT NOW(),
updated_at TIMESTAMPTZ DEFAULT NOW(),
UNIQUE(project_id, name)
);
-- Every INSERT uses ON CONFLICT
INSERT INTO sprints (project_id, name, status, metadata)
VALUES (1, 'sprint-alpha', 'active', '{"source": "import"}')
ON CONFLICT (project_id, name) DO UPDATE
SET metadata = sprints.metadata || EXCLUDED.metadata,
    updated_at = NOW();
This means you can run any migration file twice (or ten times) and get the same result. No “migration already applied” errors. No manual rollbacks. This single pattern eliminated an entire class of deployment bugs.
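The property the ON CONFLICT pattern buys can be stated simply: applying the same migration to the same database a second time changes nothing. A small Python model of the merge-upsert semantics, with the table as a dict keyed on the unique constraint (all names here are illustrative):

```python
def upsert(table, key, row):
    # Mirrors INSERT ... ON CONFLICT DO UPDATE with a JSONB merge:
    # on conflict, merge the incoming metadata over the existing metadata.
    if key in table:
        merged = {**table[key].get("metadata", {}), **row.get("metadata", {})}
        table[key] = {**table[key], **row, "metadata": merged}
    else:
        table[key] = dict(row)
    return table

sprints = {}
migration = [
    (("proj-a", "sprint-alpha"),
     {"status": "active", "metadata": {"source": "import"}}),
]

# Apply the same "migration" twice; the second pass is a no-op.
for _ in range(2):
    for key, row in migration:
        upsert(sprints, key, row)

print(sprints[("proj-a", "sprint-alpha")])
```

Running the loop once or ten times leaves the table in the identical state, which is exactly the guarantee that makes a migration file safe to re-run blindly in every deploy.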
🧬 Schema Evolution: JSONB as an Escape Hatch
Traditional schema changes in developer tools are painful — you add a column, every config file needs updating, backward compatibility breaks. With JSONB:
-- Need a new field? Just add it to metadata
UPDATE sprints
SET metadata = metadata || '{"priority_score": 85}'
WHERE id = 42;
-- Query it immediately
SELECT name, metadata->>'priority_score' as priority
FROM sprints
WHERE (metadata->>'priority_score')::int > 80;
Our rule: new fields start in metadata JSONB. If they’re used consistently across 3+ release cycles, promote to a real column:
ALTER TABLE sprints ADD COLUMN IF NOT EXISTS priority_score INTEGER;
UPDATE sprints SET priority_score = (metadata->>'priority_score')::int
WHERE metadata ? 'priority_score';
This gives you the flexibility of YAML (add any key anytime) with the queryability of SQL.
⚡ Performance: The Numbers
| Operation | YAML (file-based) | PostgreSQL | Improvement |
|---|---|---|---|
| List all items | 2.3s (glob + parse 55 files) | 45ms (indexed query) | 51x faster |
| Full-text search | 1.8s (grep across 213 files) | 62ms (tsvector index) | 29x faster |
| Update one record | 340ms (read + parse + write + fsync) | 38ms (single UPDATE) | 9x faster |
| Concurrent writes | Race condition / data loss | ACID guaranteed | Safe |
| Cross-project query | 4.1s (scan all directories) | 78ms (JOIN) | 53x faster |
The 51x improvement on listing isn’t just about speed — it’s about what becomes possible. Sub-50ms queries mean you can build real-time dashboards, autocomplete, and interactive CLIs that feel instant.
📁 Keeping YAML Around (Read-Only Exports)
The biggest resistance to DB-first comes from developers who say: “But I can read YAML in my editor!” Fair point. The solution: keep YAML as read-only exports.
# Generate YAML snapshots from DB (for browsing, not editing)
dp export --format yaml --output sprints/
# The files are gitignored or auto-regenerated
# Manual edits get overwritten on next export
Developers still get human-readable files. But the source of truth is always the database. Any manual YAML edit gets overwritten on the next export cycle.
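An export step like this can be as simple as serializing query results to disk. A minimal hand-rolled sketch of the idea (a real exporter would use a YAML library and pull rows from the database; the row keys, values, and header comment below are invented for illustration):

```python
def to_yaml(row):
    # Hand-rolled scalar-only YAML emitter, enough for flat records.
    # The header marks the file as generated so nobody edits it by hand.
    lines = ["# GENERATED - do not edit; regenerated from the database"]
    for key, value in row.items():
        lines.append(f"{key}: {value}")
    return "\n".join(lines) + "\n"

row = {"name": "sprint-alpha", "status": "active", "priority_score": 85}
print(to_yaml(row))
```

Writing one such file per record reproduces the old directory layout for browsing, while the generated-file header makes the editing contract explicit.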
📏 Five Rules We Live By
After the migration, we codified these rules:
- DB is the only source of truth — if it’s not in PostgreSQL, it doesn’t exist
- Every INSERT has ON CONFLICT — no migration should fail on a second run
- YAML is export-only — generated from DB, never edited directly
- New fields start in JSONB — graduate to columns when stable
- IF NOT EXISTS on everything — CREATE TABLE, ADD COLUMN, CREATE INDEX
📊 Results
After migrating:
- Query times dropped 98% across the board
- Zero data-loss incidents (versus three in the YAML era, all caused by concurrent writes)
- Semantic search over knowledge entries via pgvector embeddings
- Real-time dashboards became possible (sub-50ms queries)
- Schema changes became non-breaking (JSONB escape hatch)
🤔 When to Make the Switch
You should consider DB-first when:
- You have more than one process writing to config/state files
- You need to query across projects or aggregate data
- Your YAML/JSON files exceed 50 files or 1000 lines
- You want full-text search or semantic search over your data
- You’re building a CLI or dashboard that needs fast lookups
You can stay with YAML when:
- Single-user, single-project tool
- Static configuration that rarely changes
- No need for cross-file queries
- Version control diff visibility is the top priority
The migration is worth it earlier than you think. The concurrent write bug that finally pushed us over was predictable in hindsight — we just didn’t hit it until our tool scaled.
Written by Kishore Rajendra, AI-assisted. DevPilot is a development velocity engine for solo developers and small teams.