Overview

Seedforge is a CLI and programmatic library that reads a database schema and fills it with realistic, foreign-key-correct seed data. Point it at a live PostgreSQL, MySQL, or SQLite database — or at a Prisma, Drizzle, TypeORM, or JPA schema file — and it produces INSERT statements where names go in name columns, emails in email columns, and prices in price columns. Install it with npm i -g @otg-dev/seedforge or import it as @otg-dev/seedforge in your tests.

The problem it solves

Hand-written fixtures rot the moment your schema changes. ORM seed scripts balloon into hundreds of lines the first time you add a many-to-many join. Random data breaks foreign key constraints, unique indexes, and CHECK constraints. And “production-like” volumes — the kind you actually need for PR preview environments, load tests, and customer demos — are tedious to hand-craft.

Seedforge replaces all of that with a single command that reads your schema and generates data that satisfies its constraints.

How it works

Introspect: connects to your database (reading pg_catalog, information_schema, or sqlite_master) or parses your ORM schema files.
Resolve the FK graph: builds a directed graph of foreign keys, detects cycles, and produces a topological insertion order, using deferred UPDATEs to close any cycles.
Map columns: matches 190+ column-name patterns across 10 domains (person, contact, location, finance, temporal, etc.) to the right Faker.js generator.
Generate: produces rows table by table in dependency order, threading FK references, enforcing uniqueness, and respecting NOT NULL, UNIQUE, CHECK, enums, and generated columns.
Insert: writes directly to the database, exports a .sql file, or prints a dry-run summary. PostgreSQL users get a --fast path that uses COPY for 10x or more the throughput of plain INSERT.
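The FK-graph step above can be illustrated with a small sketch. This is not Seedforge's actual implementation: the ForeignKey and InsertPlan types and the planInserts and findCycleEdge functions are hypothetical names chosen for the example. It assumes that any foreign key picked for deferral is nullable, so the row can be inserted with NULL and patched by an UPDATE afterwards.

```typescript
// Hypothetical types for illustration; not Seedforge's internal data model.
interface ForeignKey {
  table: string;      // table that holds the FK column
  column: string;     // the FK column itself
  references: string; // table the FK points at
}

interface InsertPlan {
  order: string[];        // tables in a safe insertion order
  deferred: ForeignKey[]; // FKs to insert as NULL and fill later with UPDATE
}

// Walk blocking edges until a table repeats; the edge that closes the loop
// is guaranteed to be part of a cycle.
function findCycleEdge(remaining: Set<string>, pending: ForeignKey[]): ForeignKey {
  const seen = new Set<string>();
  let current = Array.from(remaining)[0];
  for (;;) {
    seen.add(current);
    const edge = pending.find(
      fk => fk.table === current && remaining.has(fk.references)
    );
    if (!edge) throw new Error(`expected ${current} to be blocked`);
    if (seen.has(edge.references)) return edge;
    current = edge.references;
  }
}

function planInserts(tables: string[], fks: ForeignKey[]): InsertPlan {
  // Self-references cannot be satisfied by the first row, so defer them up front.
  const deferred = fks.filter(fk => fk.table === fk.references);
  let pending = fks.filter(fk => fk.table !== fk.references);
  const remaining = new Set(tables);
  const order: string[] = [];

  while (remaining.size > 0) {
    // A table is ready once none of its pending FKs point at an uninserted table.
    const ready = Array.from(remaining).filter(
      t => !pending.some(fk => fk.table === t && remaining.has(fk.references))
    );
    if (ready.length === 0) {
      // Every remaining table is blocked, so a cycle exists: defer one of its edges.
      const victim = findCycleEdge(remaining, pending);
      deferred.push(victim);
      pending = pending.filter(fk => fk !== victim);
      continue;
    }
    for (const t of ready) {
      order.push(t);
      remaining.delete(t);
    }
  }
  return { order, deferred };
}
```

For tables orders, users, and teams, where orders.user_id references users, users.team_id references teams, and teams.owner_id references users, this sketch yields the order teams, users, orders and defers teams.owner_id. A real planner would also need to pick a nullable column when choosing which edge to defer.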

When to reach for Seedforge

Spinning up a local dev database with data that looks like your real app
Populating PR preview environments so reviewers can click through real-looking UI
Seeding test databases for E2E and load tests with deterministic --seed N output
Building demo and sales environments that need thousands of believable rows
Dry-running a migration against a freshly populated copy of a schema
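To show why a fixed seed makes output reproducible, here is a minimal, dependency-free sketch. It does not use Seedforge or Faker.js: it swaps in the well-known mulberry32 generator as a stand-in for a seeded random source, and the UserRow type, FIRST_NAMES list, and generateUsers function are hypothetical names invented for the example.

```typescript
// mulberry32: a small seeded PRNG, used here as a stand-in for whatever
// seeded random source a real generator would rely on.
function mulberry32(seed: number): () => number {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296; // float in [0, 1)
  };
}

// Hypothetical row shape and value pool, for illustration only.
interface UserRow {
  id: number;
  firstName: string;
  age: number;
}

const FIRST_NAMES = ["Ada", "Grace", "Linus", "Margaret", "Ken"];

function generateUsers(count: number, seed: number): UserRow[] {
  const rand = mulberry32(seed);
  const rows: UserRow[] = [];
  for (let id = 1; id <= count; id++) {
    rows.push({
      id,
      // Every value is drawn from the seeded source, never from Math.random().
      firstName: FIRST_NAMES[Math.floor(rand() * FIRST_NAMES.length)],
      age: 18 + Math.floor(rand() * 60),
    });
  }
  return rows;
}

// Two runs with the same seed produce identical rows.
const firstRun = JSON.stringify(generateUsers(3, 42));
const secondRun = JSON.stringify(generateUsers(3, 42));
console.log(firstRun === secondRun);
```

The point of the design is that all randomness flows from one seeded source in a fixed order, so anything that changes the order of draws (such as adding a column) also changes the generated rows.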

When NOT to use it

It does not create your tables. Run your migrations first — Seedforge reads the schema you already have.
It is not an anonymization tool. If you need to scrub real PII out of a production dump, reach for a purpose-built anonymizer. Seedforge generates fresh data from scratch.
It does not import arbitrary datasets. If you have a specific CSV or JSON dump you need loaded, use psql's \copy or your ORM’s bulk-insert API.

What makes it different

CI-native. Deterministic output via --seed 42 means the same command produces the same rows across every CI run.
Library API with transaction rollback. withSeed() is a test helper that wraps each run in a transaction and rolls it back on teardown, so no state leaks between tests.
Four ORM parsers plus live introspection. Prisma, Drizzle, TypeORM, and JPA schemas work with or without a running database.
190+ column patterns. Multi-tier matching (exact → suffix → prefix → regex → semantic stem) means billing_email, user_email_address, and contact_email all get real email addresses.
PostgreSQL fast path. --fast uses COPY for large-dataset workloads where plain INSERT is too slow.
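The multi-tier matching order described above (exact, suffix, prefix, regex, semantic stem) can be sketched roughly as follows. The pattern tables here are tiny, hypothetical stand-ins for the real 190+ patterns, matchColumn and the type names are invented for the example, and the stem tier is simplified to a substring check, which is an assumption rather than a description of how Seedforge stems column names.

```typescript
// Hypothetical generator names and pattern tables, for illustration only.
type GeneratorName = "email" | "firstName" | "price" | "timestamp" | "text";

interface ColumnMatch {
  generator: GeneratorName;
  tier: "exact" | "suffix" | "prefix" | "regex" | "stem" | "fallback";
}

const EXACT = new Map<string, GeneratorName>([
  ["email", "email"],
  ["first_name", "firstName"],
  ["price", "price"],
]);
const SUFFIXES: Array<[string, GeneratorName]> = [
  ["_email", "email"],
  ["_price", "price"],
  ["_at", "timestamp"],
];
const PREFIXES: Array<[string, GeneratorName]> = [
  ["email_", "email"],
  ["price_", "price"],
];
const REGEXES: Array<[RegExp, GeneratorName]> = [
  [/(^|_)e?mail(_|$)/, "email"],
];
const STEMS: Array<[string, GeneratorName]> = [
  ["mail", "email"],
  ["pric", "price"],
];

// Try the cheapest, most specific tier first and fall through to looser ones.
function matchColumn(column: string): ColumnMatch {
  const name = column.toLowerCase();

  const exact = EXACT.get(name);
  if (exact) return { generator: exact, tier: "exact" };

  for (const [suffix, generator] of SUFFIXES) {
    if (name.endsWith(suffix)) return { generator, tier: "suffix" };
  }
  for (const [prefix, generator] of PREFIXES) {
    if (name.startsWith(prefix)) return { generator, tier: "prefix" };
  }
  for (const [pattern, generator] of REGEXES) {
    if (pattern.test(name)) return { generator, tier: "regex" };
  }
  // Simplified "semantic stem" tier: a plain substring check.
  for (const [stem, generator] of STEMS) {
    if (name.includes(stem)) return { generator, tier: "stem" };
  }
  return { generator: "text", tier: "fallback" };
}

for (const column of ["billing_email", "user_email_address", "contact_email"]) {
  console.log(column, matchColumn(column));
}
```

In this sketch billing_email and contact_email resolve at the suffix tier, while user_email_address falls through to the regex tier; all three map to the email generator. Ordering the tiers from strict to loose keeps a broad pattern from overriding a more specific one.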