How pGenie Works¶
This page explains pGenie's internal architecture so you can understand what happens when you run pgn generate and why the tool is designed the way it is.
The Pipeline¶
Running pgn generate triggers the following stages:
Migrations (SQL) ──┐
├──▶ PostgreSQL container ──▶ Schema + query analysis ──▶ Dhall codegen ──▶ Artifacts
Queries (SQL) ──┘
1. Container startup¶
pGenie uses Testcontainers to spin up a disposable PostgreSQL container on the fly. This is why Docker must be running before you invoke pgn generate. The container is started fresh every time and torn down when generation is complete.
2. Schema application¶
All .sql files in your migrations/ directory are applied to the fresh database in natural sort order by filename. Natural sorting handles embedded numbers correctly — migration-10.sql is applied after migration-9.sql. This means your migrations must be named so that natural sorting produces the correct execution order (e.g. 1.sql, 2.sql, …, or 001.sql, 002.sql, …).
Once applied, the container holds the exact schema your queries will run against.
3. Query analysis¶
For each .sql file in queries/, pGenie:
- Prepares the statement against the live PostgreSQL instance using the extended query protocol. PostgreSQL's
DescribeStatementmessage returns the OIDs of all parameter types and result columns. - Resolves each OID to a full type description - including composite type fields, enumeration labels, and array element types - by querying the system catalog (
pg_type,pg_attribute, etc.). - Determines parameter nullability by consulting the schema and running the query with
NULLsupplied to each parameter and analysing the errors. - Compares the freshly-resolved signature against the existing signature file (
.sig1.pgn.yaml) alongside the.sqlfile, if one exists. If the signatures are compatible (or no sig file exists yet), generation continues; if they differ, the build fails, forcing you to acknowledge the change. When no signature file exists, pGenie writes one with the inferred types. Signature files are intended to be committed alongside your SQL and edited by hand when you want to tighten constraints such as parameter nullability. To force pGenie to regenerate a signature file from scratch, delete it and re-runpgn generate.
4. Code generation¶
pGenie loads the code generators configured in your project1.pgn.yaml. Each generator is a Dhall program referenced by URL (e.g. a raw GitHub URL). The generator receives a structured description of your entire project - schema, queries, type information - and produces a tree of output files.
Dhall is used for code generators because it is:
- Hermetic: Dhall expressions are pure and total. No side effects, no I/O, no hidden state.
- Composable: Generators can import shared utilities and templates.
- Cacheable: Dhall expressions are content-addressed. A generator at a pinned URL is evaluated once, its output hash is stored in
freeze1.pgn.yamland is used as a cache key for subsequent runs. - Secure: Since Dhall cannot perform arbitrary I/O, you can safely run generators from untrusted sources without risking your system. The freeze file ensures that you are running the exact generator you expect, even if the remote URL changes.
The generated files are written to artifacts/<generator-name>/.
Signature Files¶
Signature files (.sig1.pgn.yaml) are an important part of pGenie's design:
- They provide a stable, human-readable record of each query's type signature that can be reviewed in pull requests.
- They make schema drift impossible: on every run, pGenie re-resolves the query signature from the live database and compares it against the committed sig file. If a migration changes the type of a column referenced by a query, pGenie detects the mismatch and fails the build, forcing you to either update the signature file or fix the migration.
Lifecycle of a signature file:
- Created by pGenie the first time
pgn generate(orpgn analyse) is run for a query that has no existing signature file. - Read by pGenie on every subsequent run — pGenie compares the freshly-resolved signature against the committed file and fails the build if they differ.
- Updated manually by you when you want to tighten constraints (e.g. marking a parameter as non-nullable) or when a schema change requires acknowledging a type change.
- Regenerated by pGenie only if you delete the file and re-run generation.
pGenie never silently overwrites an existing signature file.
The Freeze File¶
freeze1.pgn.yaml pins the content hash of every generator URL referenced in project1.pgn.yaml. It is analogous to package-lock.json, Cargo.lock, or cabal.project.freeze.
- When the freeze file exists, pGenie will verify that each downloaded generator matches its recorded hash, ensuring reproducible generation across machines and over time.
- When you want to upgrade a generator, delete the relevant entry from the freeze file (or delete the whole file) and run
pgn generate- the new generator will be fetched and a new hash recorded.
See the Freeze File reference for the full format and lifecycle.
Index Analysis¶
The pgn manage-indexes command keeps your index set lean and correct as the application evolves. It connects to the same ephemeral container used during generation, inspects all existing non-primary indexes from the catalog, and runs EXPLAIN on every query to detect sequential scans. It then produces a SQL migration that:
- Drops indexes that are no longer used by any observed query (unused), are exact duplicates of another index (redundant), or have composite trailing columns that no query needs (excessive).
- Creates new indexes for columns that appear in
WHEREclauses but have no covering index.
This two-sided analysis means you never accumulate stale indexes that slow down writes without benefiting any read query.
Caching¶
pGenie caches several things between runs to keep generation fast:
| What | Where | Invalidated when |
|---|---|---|
| Docker image | Docker's local image store | You purge Docker images |
| Dhall generator bytecode | OS cache directory | Generator URL or hash changes |