Skip to content

How pGenie Works

This page explains pGenie's internal architecture so you can understand what happens when you run pgn generate and why the tool is designed the way it is.


The Pipeline

Running pgn generate triggers the following stages:

Migrations (SQL)  ──┐
                    ├──▶  PostgreSQL container  ──▶  Schema + query analysis  ──▶  Dhall codegen  ──▶  Artifacts
Queries (SQL)     ──┘

1. Container startup

pGenie uses Testcontainers to spin up a disposable PostgreSQL container on the fly. This is why Docker must be running before you invoke pgn generate. The container is started fresh every time and torn down when generation is complete.

2. Schema application

All .sql files in your migrations/ directory are applied to the fresh database in natural sort order by filename. Natural sorting handles embedded numbers correctly — migration-10.sql is applied after migration-9.sql. This means your migrations must be named so that natural sorting produces the correct execution order (e.g. 1.sql, 2.sql, …, or 001.sql, 002.sql, …).

Once applied, the container holds the exact schema your queries will run against.

3. Query analysis

For each .sql file in queries/, pGenie:

  1. Prepares the statement against the live PostgreSQL instance using the extended query protocol. PostgreSQL's DescribeStatement message returns the OIDs of all parameter types and result columns.
  2. Resolves each OID to a full type description - including composite type fields, enumeration labels, and array element types - by querying the system catalog (pg_type, pg_attribute, etc.).
  3. Determines parameter nullability by consulting the schema and running the query with NULL supplied to each parameter and analysing the errors.
  4. Compares the freshly-resolved signature against the existing signature file (.sig1.pgn.yaml) alongside the .sql file, if one exists. If the signatures are compatible (or no sig file exists yet), generation continues; if they differ, the build fails, forcing you to acknowledge the change. When no signature file exists, pGenie writes one with the inferred types. Signature files are intended to be committed alongside your SQL and edited by hand when you want to tighten constraints such as parameter nullability. To force pGenie to regenerate a signature file from scratch, delete it and re-run pgn generate.

4. Code generation

pGenie loads the code generators configured in your project1.pgn.yaml. Each generator is a Dhall program referenced by URL (e.g. a raw GitHub URL). The generator receives a structured description of your entire project - schema, queries, type information - and produces a tree of output files.

Dhall is used for code generators because it is:

  • Hermetic: Dhall expressions are pure and total. No side effects, no I/O, no hidden state.
  • Composable: Generators can import shared utilities and templates.
  • Cacheable: Dhall expressions are content-addressed. A generator at a pinned URL is evaluated once, its output hash is stored in freeze1.pgn.yaml and is used as a cache key for subsequent runs.
  • Secure: Since Dhall cannot perform arbitrary I/O, you can safely run generators from untrusted sources without risking your system. The freeze file ensures that you are running the exact generator you expect, even if the remote URL changes.

The generated files are written to artifacts/<generator-name>/.


Signature Files

Signature files (.sig1.pgn.yaml) are an important part of pGenie's design:

  • They provide a stable, human-readable record of each query's type signature that can be reviewed in pull requests.
  • They make schema drift impossible: on every run, pGenie re-resolves the query signature from the live database and compares it against the committed sig file. If a migration changes the type of a column referenced by a query, pGenie detects the mismatch and fails the build, forcing you to either update the signature file or fix the migration.

Lifecycle of a signature file:

  1. Created by pGenie the first time pgn generate (or pgn analyse) is run for a query that has no existing signature file.
  2. Read by pGenie on every subsequent run — pGenie compares the freshly-resolved signature against the committed file and fails the build if they differ.
  3. Updated manually by you when you want to tighten constraints (e.g. marking a parameter as non-nullable) or when a schema change requires acknowledging a type change.
  4. Regenerated by pGenie only if you delete the file and re-run generation.

pGenie never silently overwrites an existing signature file.


The Freeze File

freeze1.pgn.yaml pins the content hash of every generator URL referenced in project1.pgn.yaml. It is analogous to package-lock.json, Cargo.lock, or cabal.project.freeze.

  • When the freeze file exists, pGenie will verify that each downloaded generator matches its recorded hash, ensuring reproducible generation across machines and over time.
  • When you want to upgrade a generator, delete the relevant entry from the freeze file (or delete the whole file) and run pgn generate - the new generator will be fetched and a new hash recorded.

See the Freeze File reference for the full format and lifecycle.


Index Analysis

The pgn manage-indexes command keeps your index set lean and correct as the application evolves. It connects to the same ephemeral container used during generation, inspects all existing non-primary indexes from the catalog, and runs EXPLAIN on every query to detect sequential scans. It then produces a SQL migration that:

  • Drops indexes that are no longer used by any observed query (unused), are exact duplicates of another index (redundant), or have composite trailing columns that no query needs (excessive).
  • Creates new indexes for columns that appear in WHERE clauses but have no covering index.

This two-sided analysis means you never accumulate stale indexes that slow down writes without benefiting any read query.


Caching

pGenie caches several things between runs to keep generation fast:

What Where Invalidated when
Docker image Docker's local image store You purge Docker images
Dhall generator bytecode OS cache directory Generator URL or hash changes