Lesson 053: Audit-First Design

The Lesson

Before writing any code for a new feature, produce a written audit of the existing codebase: what exists, what can be reused, and where new code slots in. The audit prevents reimplementing existing functionality and pins down the exact extension points, which saves more time than the audit costs to write.
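One way to seed such an audit is to generate a mechanical inventory of the codebase and then fill in the judgment calls by hand. The sketch below is illustrative only and is not the tooling used in this project: the function names `inventory_package` and `render_audit_skeleton` are hypothetical, and it assumes a Python codebase where public top-level functions and classes are a reasonable first approximation of "what exists".

```python
import ast
from pathlib import Path


def inventory_package(root: str) -> dict[str, list[str]]:
    """Map each .py file under `root` to its public top-level functions and classes."""
    inventory: dict[str, list[str]] = {}
    for path in sorted(Path(root).rglob("*.py")):
        tree = ast.parse(path.read_text(encoding="utf-8"))
        # Keep only top-level definitions whose names do not start with "_".
        names = [
            node.name
            for node in tree.body
            if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef))
            and not node.name.startswith("_")
        ]
        inventory[path.relative_to(root).as_posix()] = names
    return inventory


def render_audit_skeleton(inventory: dict[str, list[str]]) -> str:
    """Render a plain-text skeleton with one entry per module, to be completed by hand."""
    lines = []
    for module, names in inventory.items():
        lines.append(module)
        lines.append("  exists: " + (", ".join(names) if names else "(no public names)"))
        lines.append("  reusable: TODO")          # filled in by the person writing the audit
        lines.append("  extension points: TODO")  # filled in by the person writing the audit
    return "\n".join(lines)
```

The script only answers "what exists". The "what can be reused" and "where new code slots in" entries still require reading the modules, which is where the audit earns its value.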

Context

A data science platform with 15+ modules, 168 tests, and 29 database tables needed a major extension: biased voting blocks, vision-language model tagging, block-aware statistics, and static site integration. The extension touched nearly every layer of the stack — synthetic data generation, feature extraction, statistical modeling, validation, web API, and frontend. Without an audit, the risk was high: reimplementing query patterns that already existed, creating parallel data paths that diverged from existing conventions, or missing reusable infrastructure buried in modules the developer hadn't recently touched.

What Happened

  1. Created a formal audit document (docs/artemis_bias_extension_audit.md) as the first commit of the project — before any code changes. The audit covered:

    • synthetic/ — existing voter profiles, vote generation, run metadata, extension points
    • features/ — visual features, CLIP embeddings, text features, how to add new extractors
    • models/ — Elo, Borda, Beta-Binomial, composite scoring, where block-aware variants slot in
    • validate/ — bias detection, calendar validation, extension points for block-level bias
    • static/ — JSON export, stats page, how new sections integrate
    • web/ — API routes, frontend architecture, CLI command registration
  2. The audit identified three cases where existing code could be reused directly:

    • The observe/run_manifest.py run tracking system — no need to build a new one for block vote runs
    • The PyArrow bulk insert pattern from features/embeddings.py — reusable for attribute batch writes
    • The config/sql_helpers.py run-resolution subqueries — reusable for block statistics queries
  3. The audit identified two patterns that needed deliberate deviation:

    • Vote generation used executemany for small batches, but the block generator would produce 50K+ ballot-image rows. The audit flagged this mismatch, which led to using PyArrow bulk insert instead and avoided the hanging bug from Lesson 020.
    • The existing bias detection in validate/ used chi-squared tests on position bias. Block-aware bias detection needed lift-based metrics — a different statistical approach. The audit clarified this wasn't a case of "extend the existing function" but "add a parallel analysis module."
  4. The audit took 15 minutes to write. The subsequent 10 implementation phases (including vision tagging, voting blocks, statistics, export, web integration, and acceptance tests) completed without any "oh, this already exists" rework moments. Three developers who reviewed the code found no dead code or reimplemented functionality.
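To make the second deviation in step 3 concrete, here is a minimal sketch of a lift-based block metric. The lesson does not define the exact metric the project used, so this assumes the standard definition of lift: a block's share of votes for an item divided by the overall share of votes for that item. The function name `block_lift` and the sample data are hypothetical.

```python
from collections import Counter


def block_lift(votes: list[tuple[str, str]]) -> dict[tuple[str, str], float]:
    """Compute lift for every (block, item) pair that appears in `votes`.

    Each vote is a (block_id, chosen_item) pair. Lift compares a block's share
    of votes for an item with the overall share for that item; values above
    1.0 mean the block picks the item more often than voters as a whole.
    """
    if not votes:
        return {}
    total = len(votes)
    item_counts = Counter(item for _, item in votes)
    block_counts = Counter(block for block, _ in votes)
    pair_counts = Counter(votes)

    lifts: dict[tuple[str, str], float] = {}
    for (block, item), n in pair_counts.items():
        share_in_block = n / block_counts[block]
        share_overall = item_counts[item] / total
        lifts[(block, item)] = share_in_block / share_overall
    return lifts


# Hypothetical data: block "A" leans toward item "x", block "B" splits evenly.
votes = [("A", "x")] * 3 + [("A", "y")] + [("B", "x")] * 2 + [("B", "y")] * 2
lifts = block_lift(votes)
```

Unlike a chi-squared test on position bias, which asks whether a distribution departs from expectation at all, lift reports the direction and size of each block's preference per item. That difference in output shape is one reason a parallel analysis module can be a cleaner fit than extending the existing function.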

Key Insights

Applicability

Audit-first design is valuable when:

Does NOT apply when:

Related Lessons