Overview
Seedforge is a CLI and programmatic library that reads a database schema and fills it with realistic, foreign-key-correct seed data. Point it at a live PostgreSQL, MySQL, or SQLite database — or at a Prisma, Drizzle, TypeORM, or JPA schema file — and it produces INSERT statements where names go in name columns, emails in email columns, and prices in price columns. Install it with npm i -g @otg-dev/seedforge or import it as @otg-dev/seedforge in your tests.
The problem it solves
Hand-written fixtures rot the moment your schema changes. ORM seed scripts balloon into hundreds of lines the first time you add a many-to-many join. Random data breaks foreign key constraints, unique indexes, and CHECK constraints. And “production-like” volumes, the kind you actually need for PR preview environments, load tests, and customer demos, are tedious to hand-craft.
Seedforge replaces all of that with a single command that understands your schema and does the right thing.
How it works
- Introspect: connects to your database (reading pg_catalog, information_schema, or sqlite_master) or parses your ORM schema files.
- Resolve the FK graph: builds a directed graph of foreign keys, detects cycles, and produces a topological insertion order with deferred UPDATEs for the cycles.
- Map columns: matches 190+ column-name patterns across 10 domains (person, contact, location, finance, temporal, etc.) to the right Faker.js generator.
- Generate: produces rows table-by-table in dependency order, threading FK references, enforcing uniqueness, and respecting NOT NULL, UNIQUE, CHECK, enums, and generated columns.
- Insert: writes directly to the database, exports a .sql file, or prints a dry-run summary. PostgreSQL users get a --fast path using COPY for 10x+ throughput.
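The FK-graph step above can be illustrated with a small sketch. This is not Seedforge's actual implementation: the `ForeignKey` and `InsertPlan` types, the `planInsertOrder` function, and the rule of breaking a cycle by deferring the first nullable foreign key found are all assumptions made for illustration.

```typescript
// Hypothetical types for illustration; not Seedforge's real API.
interface ForeignKey {
  table: string;      // table that holds the FK column
  column: string;     // FK column name
  references: string; // table being referenced
  nullable: boolean;  // only a nullable FK can be inserted as NULL and filled later
}

interface InsertPlan {
  order: string[];        // tables in a safe insertion order
  deferred: ForeignKey[]; // FKs to insert as NULL, then set with a later UPDATE
}

// Assumes every referenced table is listed in `tables`.
function planInsertOrder(tables: string[], fks: ForeignKey[]): InsertPlan {
  const order: string[] = [];
  const deferred: ForeignKey[] = [];
  const placed = new Set<string>();
  let active = [...fks]; // FKs that still constrain the order

  while (placed.size < tables.length) {
    // An FK blocks its table until the referenced table has been placed.
    const blocking = active.filter((fk) => !placed.has(fk.references));
    const ready = tables.filter(
      (t) => !placed.has(t) && !blocking.some((fk) => fk.table === t),
    );

    if (ready.length > 0) {
      for (const t of ready) {
        placed.add(t);
        order.push(t);
      }
      continue;
    }

    // Nothing is ready, so the remaining tables contain at least one cycle.
    // Simple heuristic: defer the first nullable FK that is still blocking.
    const breakable = blocking.find((fk) => fk.nullable);
    if (!breakable) {
      throw new Error("FK cycle with no nullable column to defer");
    }
    deferred.push(breakable);
    active = active.filter((fk) => fk !== breakable);
  }

  return { order, deferred };
}

// Example schema: users and teams reference each other; orders references users.
const examplePlan = planInsertOrder(
  ["users", "teams", "orders"],
  [
    { table: "orders", column: "user_id", references: "users", nullable: false },
    { table: "users", column: "team_id", references: "teams", nullable: true },
    { table: "teams", column: "owner_id", references: "users", nullable: false },
  ],
);
// examplePlan.order    is ["users", "teams", "orders"]
// examplePlan.deferred holds the users.team_id foreign key
```

In this example, users.team_id is inserted as NULL and set by an UPDATE once teams has rows. The heuristic is deliberately simple and can defer more foreign keys than strictly necessary; a production resolver would likely pick the edges to defer more carefully.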
When to reach for Seedforge
- Spinning up a local dev database with data that looks like your real app
- Populating PR preview environments so reviewers can click through a real-looking UI
- Seeding test databases for E2E and load tests with deterministic --seed N output
- Building demo and sales environments that need thousands of believable rows
- Dry-running a migration against a freshly populated copy of a schema
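To show why a fixed seed makes output reproducible, here is a minimal sketch of seeded generation. It does not reflect how Seedforge generates data internally: `createRng` is the well-known mulberry32 PRNG, and `fakeUsers` with its name list is a hypothetical stand-in for a real generator.

```typescript
// Illustrative only: a small seeded PRNG (mulberry32), not Seedforge's generator.
function createRng(seed: number): () => number {
  let state = seed | 0;
  return () => {
    state = (state + 0x6d2b79f5) | 0;
    let t = state;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296; // float in [0, 1)
  };
}

interface FakeUser {
  id: number;
  name: string;
  age: number;
}

// Hypothetical row generator: every "random" choice flows through the seeded RNG,
// so the same seed always yields the same rows.
function fakeUsers(seed: number, count: number): FakeUser[] {
  const rng = createRng(seed);
  const names = ["Ada", "Grace", "Linus", "Margaret", "Dennis"];
  return Array.from({ length: count }, (_, i) => ({
    id: i + 1,
    name: names[Math.floor(rng() * names.length)],
    age: 18 + Math.floor(rng() * 60),
  }));
}

// Two runs with the same seed produce identical rows.
const firstRun = fakeUsers(42, 5);
const secondRun = fakeUsers(42, 5);
const runsMatch = JSON.stringify(firstRun) === JSON.stringify(secondRun);
```

Faker.js exposes the same idea through `faker.seed()`; whether Seedforge wires `--seed N` to it in exactly this way is an assumption.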
When NOT to use it
- It does not create your tables. Run your migrations first; Seedforge reads the schema you already have.
- It is not an anonymization tool. If you need to scrub real PII out of a production dump, reach for a purpose-built anonymizer. Seedforge generates fresh data from scratch.
- It does not import arbitrary datasets. If you have a specific CSV or JSON dump you need loaded, use psql's \copy or your ORM’s bulk-insert API.
What makes it different
- CI-native. Deterministic output via --seed 42 means the same command produces the same rows across every CI run.
- Library API with transaction rollback. withSeed() gives you a test helper that wraps each run in a transaction and rolls it back on teardown, so no state leaks between tests.
- Four ORM parsers plus live introspection. Prisma, Drizzle, TypeORM, and JPA schemas work with or without a running database.
- 190+ column patterns. Multi-tier matching (exact → suffix → prefix → regex → semantic stem) means billing_email, user_email_address, and contact_email all get realistic email addresses.
- PostgreSQL fast path. --fast uses COPY for large-dataset workloads where plain INSERT is too slow.
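The multi-tier column matching mentioned in the list above can be sketched as follows. The pattern entries, generator names, and `matchColumn` function are hypothetical and chosen for illustration; Seedforge's real pattern table is far larger and may define or order its tiers differently.

```typescript
// Hypothetical generator names; the real tool maps to Faker.js generators.
type GeneratorName = "email" | "firstName" | "price" | "timestamp" | "text";
type Tier = "exact" | "suffix" | "prefix" | "regex" | "stem" | "fallback";

// A tiny illustrative pattern table, one structure per tier.
const exactPatterns = new Map<string, GeneratorName>([
  ["email", "email"],
  ["first_name", "firstName"],
  ["price", "price"],
]);
const suffixPatterns: [string, GeneratorName][] = [
  ["_email", "email"],
  ["_price", "price"],
  ["_at", "timestamp"],
];
const prefixPatterns: [string, GeneratorName][] = [
  ["email_", "email"],
  ["price_", "price"],
];
const regexPatterns: [RegExp, GeneratorName][] = [
  [/(^|_)e?mail(_|$)/, "email"],
];
const stemPatterns = new Map<string, GeneratorName>([
  ["mail", "email"],
  ["cost", "price"],
  ["amount", "price"],
]);

// Try each tier in order, from most to least specific; the first hit wins.
function matchColumn(column: string): { generator: GeneratorName; tier: Tier } {
  const name = column.toLowerCase();

  const exact = exactPatterns.get(name);
  if (exact) return { generator: exact, tier: "exact" };

  for (const [suffix, generator] of suffixPatterns) {
    if (name.endsWith(suffix)) return { generator, tier: "suffix" };
  }
  for (const [prefix, generator] of prefixPatterns) {
    if (name.startsWith(prefix)) return { generator, tier: "prefix" };
  }
  for (const [pattern, generator] of regexPatterns) {
    if (pattern.test(name)) return { generator, tier: "regex" };
  }
  // "Semantic stem" is simplified here to a lookup on underscore-separated tokens.
  for (const token of name.split("_")) {
    const generator = stemPatterns.get(token);
    if (generator) return { generator, tier: "stem" };
  }
  return { generator: "text", tier: "fallback" };
}

const emailColumns = ["billing_email", "user_email_address", "contact_email"].map(
  (column) => ({ column, ...matchColumn(column) }),
);
```

With this toy table, billing_email and contact_email resolve at the suffix tier, while user_email_address falls through to the regex tier. All three map to the email generator.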