Artemis

Data science platform for Artemis II 13-month calendar image selection using CLIP embeddings, statistical modeling, and multi-objective optimization.

64 lessons

Lessons

Lesson 012: Bayesian Beta-Binomial Smoothing

The Artemis vote system shows 50 random images per ballot and asks voters to pick 5 favorites. With 500 ballots across 12,217 images, most images are shown only 1-2 times. A raw selection rate of "1 out of 1 shown = 100%" is meaningless — it tells you nothing about whether the image is actually pref...

Artemis 2026-05-24 algorithms

statistics
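
The smoothing idea in Lesson 012 can be sketched as a posterior mean under a Beta prior. This is a minimal illustration, not the project's code: the prior mean of 0.10 is an assumption taken from the ballot design (5 picks out of 50 shown), and the prior strength of 10 pseudo-showings and the function name `smoothed_rate` are hypothetical.

```python
def smoothed_rate(selected: int, shown: int,
                  prior_mean: float = 0.10,
                  prior_strength: float = 10.0) -> float:
    """Posterior mean selection rate under a Beta(alpha, beta) prior.

    prior_mean and prior_strength are illustrative assumptions:
    0.10 matches "pick 5 of 50", and 10.0 acts like 10 prior showings.
    """
    if shown < 0 or not 0 <= selected <= shown:
        raise ValueError("need 0 <= selected <= shown")
    alpha = prior_mean * prior_strength          # prior "selected" pseudo-count
    beta = (1.0 - prior_mean) * prior_strength   # prior "not selected" pseudo-count
    return (alpha + selected) / (alpha + beta + shown)


# An image picked 1 time out of 1 showing is pulled toward the prior,
# while a well-sampled image keeps a rate close to its raw value.
one_of_one = smoothed_rate(1, 1)        # (1 + 1) / (10 + 1)
never_shown = smoothed_rate(0, 0)       # falls back to the prior mean
well_sampled = smoothed_rate(30, 100)   # (1 + 30) / (10 + 100)
```

With few showings the estimate stays near the prior; as showings accumulate, the observed rate dominates.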

Lesson 013: Elo Rating for Image Comparison

The Artemis pairwise voting mode shows two images side by side and asks "which is better?" This produces binary outcomes (winner / loser) for specific pairs, not absolute ratings. We need to convert these relative comparisons into a single continuous strength score per image that can be combined wit...

Artemis 2026-05-24 algorithms

database statistics
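
As a rough sketch of how pairwise outcomes become a continuous score, the standard Elo update looks like the following. The K-factor of 32, the 400-point scale, and the starting rating of 1500 are conventional Elo defaults assumed here for illustration; the lesson excerpt does not state which values Artemis uses, and the helper names are hypothetical.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that A beats B under the Elo logistic model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def elo_update(winner: float, loser: float, k: float = 32.0):
    """Return the new (winner, loser) ratings after one comparison."""
    surprise = 1.0 - expected_score(winner, loser)  # large when an upset occurs
    delta = k * surprise
    return winner + delta, loser - delta


def rate_images(comparisons, initial: float = 1500.0, k: float = 32.0):
    """Apply updates in order to (winner_id, loser_id) pairs."""
    ratings = {}
    for winner_id, loser_id in comparisons:
        w = ratings.get(winner_id, initial)
        l = ratings.get(loser_id, initial)
        ratings[winner_id], ratings[loser_id] = elo_update(w, l, k)
    return ratings


ratings = rate_images([("img_a", "img_b"), ("img_a", "img_c")])
```

Each update moves the two ratings by equal and opposite amounts, so the total rating mass is conserved.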

Lesson 014: Bradley-Terry-Luce and When to Skip It

We have pairwise comparison data (image A beats image B) and want the best possible strength estimates. Bradley-Terry-Luce (BTL) is the textbook model for this — it's more principled than Elo. But with 2,000 comparisons across 12,217 images, we chose to skip BTL entirely. This lesson explains what B...

Artemis 2026-05-24 algorithms

data-modeling statistics

Lesson 015: Borda Count for Ranked Voting

The Artemis category voting mode asks voters to rank their top 3 images within a category. We need to convert these partial rankings into numeric scores that can be aggregated across voters and combined with batch and pairwise preference signals.

Artemis 2026-05-24 algorithms

statistics
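
One common way to score a top-3 ranking is a truncated Borda count, sketched below. The point scheme (3, 2, 1 for ranks one to three, 0 for unranked images) is an assumption for illustration; the excerpt does not specify the weights the project uses.

```python
from collections import defaultdict


def borda_scores(ballots, depth: int = 3):
    """Sum Borda points across ballots.

    Each ballot is a list of image ids in preference order. With depth=3
    the first place earns 3 points, second 2, third 1 (assumed scheme).
    """
    scores = defaultdict(int)
    for ranking in ballots:
        for position, image_id in enumerate(ranking[:depth]):
            scores[image_id] += depth - position
    return dict(scores)


ballots = [
    ["a", "b", "c"],
    ["b", "a", "d"],
    ["a", "d", "b"],
]
scores = borda_scores(ballots)  # a: 3+2+3, b: 2+3+1, c: 1, d: 1+2
```

Images that never appear on a ballot are absent from the result rather than scored as zero, which keeps "unranked" distinct from "ranked last".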

Lesson 016: Krippendorff's Alpha for Sparse Agreement

We want to measure whether voters agree on which images are good. With 100 voters and 12,217 images, the voter-image matrix is >98% missing — most voters never saw most images. Standard agreement metrics require complete matrices. We need a reliability measure that handles extreme sparsity.

Artemis 2026-05-24 algorithms

statistics

Lesson 017: Composite Scoring with Heterogeneous Signals

We have three different types of preference data — batch selection rates, Elo ratings from pairwise comparisons, and Borda scores from category rankings. Each covers a different subset of images, uses a different scale, and captures a different aspect of preference. Most images have data from only o...

Artemis 2026-05-24 algorithms

statistics

Lesson 018: Run-ID Partitioned Scoring

The scoring pipeline will be re-run as new vote data arrives, as scoring methods are tuned, or as bugs are fixed. Each run produces a full set of scores for all 12,217 images. If each run overwrites the previous scores, we lose the ability to compare methods, audit changes, or roll back to a known-g...

Artemis 2026-05-24 algorithms

database pipeline python

Lesson 019: NULL as Honest Missing Data

Several columns in the scoring output have no meaningful value for most images. Only ~200 images have Elo scores (from 2,000 pairwise votes). Only ~150 images have Borda scores (from 250 category rankings). The BTL model wasn't run at all. Fleiss' kappa and Kendall's W can't be computed with incompl...

Artemis 2026-05-24 implementation

database data-modeling python statistics

Lesson 020: DuckDB executemany Hangs — Use PyArrow Bulk Insert

DuckDB's `executemany` with parameterized INSERT statements can hang indefinitely at scale (10K+ rows). Replacing it with a PyArrow table and `INSERT INTO ... SELECT * FROM tbl` completes the same work in under a second. When DuckDB is your warehouse, bulk writes should go through its columnar inges...

Artemis 2026-05-24 data-engineering

database pipeline python statistics

Lesson 021: Calendar as Portfolio Optimization, Not Top-N Ranking

When selecting a fixed-size collection where the items must work together (a calendar, a playlist, a portfolio, a menu), the problem is constrained set optimization — not top-N ranking. The best collection often contains none of the individually top-ranked items, because collection-level properties...

Artemis 2026-05-24 implementation

ai statistics

Lesson 022: Heuristic Month-Fit Scoring Without Text Metadata

When images lack text metadata (titles, descriptions, captions), month or season suitability can still be approximated from visual features alone — color temperature, brightness, contrast, and content flags. The signal is coarse (3-4 seasonal buckets, not 13 distinct months) but sufficient to preven...

Artemis 2026-05-24 algorithms

statistics

Lesson 023: Maximum Marginal Relevance for Diversity-Aware Selection

Greedy Maximum Marginal Relevance (MMR) is the practical default for selecting a diverse, high-quality subset from a large pool. At each step, it picks the item that maximizes quality minus similarity to already-selected items. It runs in O(K × N) time, requires no optimization library, and naturall...

Artemis 2026-05-24 algorithms

serverless python ai
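
A minimal pure-Python sketch of greedy MMR follows. The trade-off weight `lam`, cosine similarity as the similarity measure, and the function names are assumptions made for illustration. Caching each candidate's maximum similarity to the selected set keeps the work at one pass over the pool per pick.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def mmr_select(quality, vectors, k, lam=0.7):
    """Pick up to k indices, maximizing lam*quality - (1-lam)*max_similarity.

    lam is an assumed trade-off weight: 1.0 ignores diversity entirely.
    """
    remaining = list(range(len(quality)))
    max_sim = [0.0] * len(quality)  # running max similarity to selected items
    selected = []
    while remaining and len(selected) < k:
        best = max(remaining,
                   key=lambda i: lam * quality[i] - (1 - lam) * max_sim[i])
        selected.append(best)
        remaining.remove(best)
        # Update the cache against only the newly selected item.
        for i in remaining:
            max_sim[i] = max(max_sim[i], cosine(vectors[i], vectors[best]))
    return selected


quality = [0.9, 0.85, 0.5]
vectors = [[1.0, 0.0], [1.0, 0.01], [0.0, 1.0]]  # items 0 and 1 are near-duplicates
diverse = mmr_select(quality, vectors, k=2, lam=0.5)
quality_only = mmr_select(quality, vectors, k=2, lam=1.0)
```

In this toy input the near-duplicate of the first pick is skipped in favor of a lower-quality but dissimilar item once diversity is weighted in.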

Lesson 024: Hungarian Algorithm for Optimal Assignment

When you need to assign N items to N slots where each item-slot pair has a fitness score, the Hungarian algorithm gives the provably optimal assignment in O(N^3) time. For small N (≤50), it runs in microseconds and eliminates the need for greedy heuristics, manual tuning, or iterative search. Use sc...

Artemis 2026-05-24 algorithms

pipeline python

Lesson 025: Multiple Selection Methods as Baselines

When building an optimizer, always generate multiple candidate solutions using different methods — including at least one naive baseline. The baseline proves the optimizer adds value. The alternatives expose the trade-off frontier. Without baselines, you can't distinguish "good optimization" from "e...

Artemis 2026-05-24 implementation

statistics

Lesson 026: Formalizing De Facto Dependencies

A dependency that's imported in production code but missing from the package manifest is a time bomb. It works on the developer's machine (where the package was installed for something else) and fails on fresh installs, CI, or new team members. Audit imports against declared dependencies whenever ad...

Artemis 2026-05-24 implementation

testing python

Lesson 027: Migration Ordering and Apply-on-Use Gaps

When a database migration creates a table that new code writes to, the migration must be applied before the code runs — not just before the next CLI invocation. If the code path that triggers the write doesn't call `apply_migrations()`, the table won't exist at runtime, even though the migration fil...

Artemis 2026-05-24 data-engineering

database deployment data-modeling pipeline

Lesson 028: Chi-Squared Tests for Bias Detection at Small Scale

We planted known biases in synthetic vote data — 10% of voters had position bias (preferring earlier-displayed images), 20% had visual-drama bias (preferring dramatic images). We need statistical tests that can detect these biases with only 100 voters and 500 ballots, without requiring heavy statist...

Artemis 2026-05-24 algorithms

statistics

Lesson 029: Ground-Truth Recovery as Optimizer Validation

We have a calendar optimizer that selects 13 images from 12,217 using a weighted objective function (popularity, diversity, month-fit, cover-fit, redundancy penalty). The optimizer reports an objective score, but a high score doesn't prove the optimizer is selecting the *right* images — it could be...

Artemis 2026-05-24 testing

pipeline statistics

Lesson 030: Reliability Delta as Noise Measurement

We know that 20% of synthetic voters are intentionally noisy (10% position-biased, 10% random). We compute Krippendorff's alpha on all voters and get a moderate value (~0.52). But how much of the low agreement is caused by these noisy voters vs. genuine preference diversity among neutral voters? We...

Artemis 2026-05-24 implementation

database pipeline statistics

Lesson 031: Read-Only DB Connections for Web Layers

When an embedded database (DuckDB, SQLite) serves both a batch pipeline and an interactive web app, the web layer should open the database in read-only mode. This avoids writer-lock conflicts entirely and makes the architecture self-documenting: the web app *cannot* mutate the warehouse, by construc...

Artemis 2026-05-24 data-engineering

database pipeline python
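
The read-only pattern can be illustrated with SQLite from the standard library, which accepts `mode=ro` in a URI filename. The table name and helper names below are hypothetical, and this is only a sketch of the idea. For DuckDB, the Python client exposes the same idea through the `read_only=True` argument of `duckdb.connect`.

```python
import sqlite3
import tempfile
from pathlib import Path


def open_read_only(db_path):
    """Open an existing SQLite file so that writes are rejected."""
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    return sqlite3.connect(uri, uri=True)


def demo():
    """Write with a pipeline-style connection, then read with a web-style one."""
    with tempfile.TemporaryDirectory() as tmp:
        db_path = Path(tmp) / "warehouse.db"

        writer = sqlite3.connect(db_path)  # batch pipeline: read-write
        writer.execute("CREATE TABLE scores (image_id INTEGER, score REAL)")
        writer.execute("INSERT INTO scores VALUES (1, 0.5)")
        writer.commit()
        writer.close()

        reader = open_read_only(db_path)   # web layer: read-only
        try:
            rows = reader.execute("SELECT image_id, score FROM scores").fetchall()
            try:
                reader.execute("INSERT INTO scores VALUES (2, 0.9)")
                write_blocked = False
            except sqlite3.OperationalError:
                write_blocked = True       # the database refuses the write
        finally:
            reader.close()
    return rows, write_blocked


rows, write_blocked = demo()
```

The guarantee comes from the connection itself, so no handler code has to remember to avoid writes.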

Lesson 032: Startup Cache for Interactive Scoring

When an interactive web app needs sub-100ms responses from a scoring function that depends on large lookup tables, load those tables into memory at startup rather than querying the database per request. The cache size is bounded (you know exactly what's in the warehouse), startup cost is a one-time...

Artemis 2026-05-24 algorithms

database pipeline python ai

Lesson 033: Vanilla JS SPA Without a Build Step

A hash-routed single-page application built with vanilla JavaScript, ES modules, and dynamic `import()` can deliver a functional multi-page experience — navigation, pagination, filtering, modals, live API calls — with zero build toolchain. For internal tools and single-user apps, this eliminates npm...

Artemis 2026-05-24 frontend

database security pipeline python javascript

Lesson 034: Reusing Query Modules Across CLI and Web

When a CLI pipeline and a web API need the same data, import the query functions directly rather than duplicating SQL. Add the serialization layer (Pydantic models, JSON responses) at the API boundary, not in the query module. The query module returns plain Python objects (dataclasses, dicts, tuples...

Artemis 2026-05-24 data-engineering

database data-modeling pipeline python

Lesson 035: Design System Portability via Tokens

A design system built on CSS custom properties (design tokens) can be shared across completely independent frontends — static HTML pages, vanilla JS SPAs, embedded widgets — by copying two files. The tokens provide visual consistency without requiring a shared component library, a build system, or a...

Artemis 2026-05-24 architecture

python

Lesson 036: Linter Rules vs. Framework Idioms

When a linter rule flags code that follows a framework's official pattern, suppress the rule per-line with `noqa` rather than restructuring the code. Linter rules encode general best practices; framework idioms encode domain-specific patterns that intentionally violate those practices. Restructuring...

Artemis 2026-05-24 implementation

security pipeline python

Lesson 037: Static Site Generation via Fetch Shim

A FastAPI + JavaScript SPA can be deployed to GitHub Pages without rewriting frontend code by using a **fetch shim** — a small JavaScript interceptor injected into `index.html` that redirects API calls to pre-generated JSON files and handles filtering, sorting, and pagination client-side. The build...

Artemis 2026-05-24 frontend

database security deployment api frontend

Lesson 038: CI Path Portability and Release Artifacts

When a project develops on Windows but deploys via CI on Linux, hardcoded paths like `D:/artemis/warehouse.duckdb` will fail silently or crash. Every path that differs between dev and CI must be configurable via environment variable. Similarly, large binary dependencies (databases, model weights) sh...

Artemis 2026-05-24 deployment

database deployment pipeline
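
One way to make such a path configurable is to resolve it from an environment variable with a relative default. The variable name `ARTEMIS_WAREHOUSE_PATH` and the default location are hypothetical, chosen only to illustrate the pattern.

```python
import os
from pathlib import Path

# Hypothetical default: a relative path that works on both Windows and Linux.
DEFAULT_WAREHOUSE = Path("data") / "warehouse.duckdb"


def warehouse_path(env=None) -> Path:
    """Resolve the warehouse location from the environment, else the default.

    ARTEMIS_WAREHOUSE_PATH is an assumed variable name, not a documented one.
    """
    env = os.environ if env is None else env
    raw = env.get("ARTEMIS_WAREHOUSE_PATH")
    return Path(raw) if raw else DEFAULT_WAREHOUSE


local_path = warehouse_path({})
ci_path = warehouse_path({"ARTEMIS_WAREHOUSE_PATH": "/tmp/ci/warehouse.duckdb"})
```

Passing the mapping as a parameter keeps the function testable without mutating the real process environment.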

Lesson 039: Mock Tagger Pattern for Vision Pipeline Testing

The vision tagging pipeline uses Qwen2.5-VL (a 7B-parameter vision-language model) to classify image attributes. Running the real model requires a GPU, takes seconds per image, and produces non-deterministic outputs. The full pipeline — config loading, tagging, derived label computation, DB persiste...

Artemis 2026-05-24 data-engineering

testing api pipeline ai statistics

Lesson 040: Controlled Vocabulary as Schema Contract

The vision tagging pipeline needs a consistent set of image attributes shared across five components: the vision model prompt, the attribute parser/validator, the database schema, the voting block config, and the cluster labeling engine. If any component uses an attribute code the others don't recog...

Artemis 2026-05-24 architecture

database data-modeling pipeline python

Lesson 041: Utility Function Design for Synthetic Voting Bias

Synthetic vote generation needs to produce votes that exhibit detectable attribute-based bias while remaining statistically plausible. A biased voter block that always votes for images with specific attributes produces trivially detectable (and unrealistic) bias. A block with too much noise produces...

Artemis 2026-05-24 implementation

data-modeling pipeline statistics

Lesson 042: Lift as the Primary Bias Detection Metric

Block-aware statistics need a metric that answers: "does this voting block select images with attribute X more than expected?" Raw selection counts don't work because blocks have different sizes. Rate differences (block rate - global rate) are hard to interpret when base rates vary widely. The metri...

Artemis 2026-05-24 implementation

pipeline statistics
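
Lift is usually defined as a ratio of rates: the share of a block's selections that carry an attribute, divided by the same share across all selections. The sketch below assumes that standard definition, since the excerpt is cut off before stating the project's exact formula, and the argument names are hypothetical.

```python
def lift(block_with_attr: int, block_total: int,
         global_with_attr: int, global_total: int):
    """Ratio of the block's attribute rate to the global attribute rate.

    Returns None when a rate is undefined, rather than inventing a value.
    A result above 1.0 means the block over-selects the attribute.
    """
    if block_total == 0 or global_total == 0 or global_with_attr == 0:
        return None
    block_rate = block_with_attr / block_total
    global_rate = global_with_attr / global_total
    return block_rate / global_rate


# 30 of a block's 100 selections have the attribute, versus 150 of 1000 overall.
drama_lift = lift(30, 100, 150, 1000)
```

Because it is a ratio, lift reads the same way for rare and common attributes, which is the weakness of raw rate differences noted above.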

Lesson 043: PII Sanitization in Static JSON Exports

The static site serves pre-built JSON files from a public URL. The warehouse database contains voter surrogate keys (`voter_sk`), hashed voter IDs (`voter_public_hash`), random seeds, config hashes, and local file paths. None of these should appear in public-facing JSON. The sanitization must be rel...

Artemis 2026-05-24 security

database security data-modeling
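
A simple form of this sanitization is a recursive filter that drops blocked keys at every nesting level before the JSON is written. The field names `voter_sk` and `voter_public_hash` come from the lesson text; the remaining key names and the function name are hypothetical stand-ins, and a deny-list is only one possible design.

```python
import json

BLOCKED_KEYS = frozenset({
    "voter_sk",            # named in the lesson
    "voter_public_hash",   # named in the lesson
    "random_seed",         # hypothetical key name
    "config_hash",         # hypothetical key name
    "local_path",          # hypothetical key name
})


def sanitize(value, blocked=BLOCKED_KEYS):
    """Return a copy of value with blocked keys removed at every depth."""
    if isinstance(value, dict):
        return {k: sanitize(v, blocked) for k, v in value.items()
                if k not in blocked}
    if isinstance(value, list):
        return [sanitize(item, blocked) for item in value]
    return value


payload = {
    "ballot_id": 7,
    "voter_sk": 42,
    "picks": [{"image_id": 1, "voter_public_hash": "abc"}],
}
public_json = json.dumps(sanitize(payload), sort_keys=True)
```

Running every export through one function gives a single place to audit, instead of relying on each query to omit sensitive columns.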

Lesson 044: Acceptance Tests as Executable Specifications

The biased voting blocks pipeline spans six components: config validation, vote generation, attribute analysis, cluster analysis, score/calendar impact, and static export. Unit tests cover each component in isolation, but the interesting behaviors — "does a biased block produce detectable lift in th...

Artemis 2026-05-24 testing

database testing data-modeling pipeline python

Lesson 045: Embedding-Based Deduplication for Image Collection Curation

When working with a large image collection from an automated source, assume near-duplicates dominate the pool until proven otherwise. Embedding cosine similarity with connected-component grouping reduces a collection to its unique members in minutes, but the threshold choice dramatically affects the...

Artemis 2026-05-24 algorithms

Lesson 046: Lazy Imports for Deployment Compatibility

Import heavy dependencies inside the function that uses them, not at module scope. A module-level `import numpy` means every consumer of that module — including lightweight build scripts, CI pipelines, and serverless functions — must have numpy installed, even if they never call the code path that n...

Artemis 2026-05-24 deployment

serverless database deployment pipeline python
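
The pattern looks like this in miniature. The standard-library `statistics` module stands in for a heavy dependency such as numpy so the sketch runs anywhere; the function names are hypothetical.

```python
def summarize(values):
    """Heavy code path: the dependency is imported only when this runs."""
    # Deferred import. `statistics` is a stand-in for something like numpy.
    import statistics

    return {
        "mean": statistics.mean(values),
        "stdev": statistics.pstdev(values),
    }


def export_manifest(names):
    """Lightweight code path: importing this module never loads the
    deferred dependency, so build scripts can call this without it."""
    return sorted(names)


manifest = export_manifest(["b.json", "a.json"])
summary = summarize([1, 2, 3])
```

The trade-off is that a missing dependency surfaces at call time instead of import time, so the heavy path still needs test coverage in an environment where the dependency is installed.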

Lesson 047: CLIP Zero-Shot as a Database Column Factory

A single CLIP model, used for zero-shot classification against descriptive text prompts, functions as a general-purpose column generator for structured databases. Each new prompt produces a new confidence column — no training, no fine-tuning, no labeled data. The cost of adding a column is one forwa...

Artemis 2026-05-24 implementation

database deployment data-modeling pipeline

Lesson 048: Greedy Max-Min Diversity Selection

To select k items that represent the diversity within a group, iteratively pick the item whose minimum distance to the already-selected items is largest. This greedy max-min approach is O(n×k), produces near-optimal diversity in practice, and sidesteps solving the NP-hard max-dispersion problem exactly.

Artemis 2026-05-24 algorithms

serverless ai
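
The greedy max-min procedure can be sketched in a few lines. Euclidean distance and a caller-chosen starting item are assumptions made here for illustration; with embeddings, a cosine distance would slot into the same place.

```python
import math


def greedy_max_min(points, k, start=0):
    """Select up to k indices by farthest-point traversal.

    min_dist[i] caches the distance from item i to its nearest selected
    item, so each pick costs one pass over the pool: O(n * k) overall.
    """
    n = len(points)
    if n == 0 or k <= 0:
        return []
    selected = [start]
    min_dist = [math.dist(p, points[start]) for p in points]
    min_dist[start] = -1.0  # mark as taken so it is never picked again
    while len(selected) < min(k, n):
        nxt = max(range(n), key=lambda i: min_dist[i])
        selected.append(nxt)
        min_dist[nxt] = -1.0
        for i in range(n):
            if min_dist[i] >= 0:
                min_dist[i] = min(min_dist[i], math.dist(points[i], points[nxt]))
    return selected


points = [(0, 0), (1, 0), (10, 0), (10, 10), (0, 1)]
picked = greedy_max_min(points, k=3)
```

Starting from the first point, the selection jumps to the far corner and then to the remaining outlier, leaving the two points clustered near the start for last.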

Lesson 049: Drag-and-Drop as the Simplest Viable Interaction

When the user's mental model is "put this thing in that slot," drag-and-drop is less code and more intuitive than alternatives like dropdowns, search dialogs, or multi-step wizards. The key is spatial co-visibility: the source pool and target slots must be on screen simultaneously so the user can se...

Artemis 2026-05-24 frontend

Lesson 050: Connected Components for Transitive Deduplication

When deduplicating by pairwise similarity, use graph connected components to group items rather than naive pair-based merging. Pairwise similarity is not transitive in theory (A~B and B~C do not guarantee A~C), but for near-duplicates in practice, transitivity usually holds and connected components correctly gro...

Artemis 2026-05-24 algorithms

database ai
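
Grouping by connected components can be done with a small union-find structure, sketched below. The input format (integer item indices plus a list of pairs that passed the similarity threshold) and the function name are assumptions for illustration.

```python
def group_duplicates(n_items, similar_pairs):
    """Group items 0..n_items-1 into connected components.

    similar_pairs holds (a, b) index pairs judged similar. Items linked
    through any chain of pairs end up in the same group.
    """
    parent = list(range(n_items))

    def find(x):
        # Walk to the root, halving the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in similar_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[max(root_a, root_b)] = min(root_a, root_b)

    groups = {}
    for i in range(n_items):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# 0~1 and 1~2 place 0, 1, 2 in one group even though 0~2 was never observed.
groups = group_duplicates(6, [(0, 1), (1, 2), (4, 5)])
```

Keeping one representative per group then yields the deduplicated collection; items with no similar partner form singleton groups.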

Lesson 051: Sigmoid Calibration for Domain-Specific CLIP Scores

CLIP logits have domain-specific distributions. Converting them to meaningful [0,1] confidence scores requires a sigmoid transform calibrated to the actual logit range in your image collection. A universal threshold doesn't work — the sigmoid center and scale must be tuned empirically by examining l...

Artemis 2026-05-24 algorithms

database data-modeling
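
The transform itself is a shifted and scaled logistic function. In this sketch the `center` and `scale` values are hypothetical placeholders; in practice they would be read off the observed logit distribution for the collection, as the lesson describes.

```python
import math


def calibrated_confidence(logit: float, center: float, scale: float) -> float:
    """Map a raw logit to [0, 1] with a sigmoid centered at `center`.

    center: the logit that should map to 0.5 (tuned per collection).
    scale:  how many logit units separate "unsure" from "confident".
    """
    if scale <= 0:
        raise ValueError("scale must be positive")
    z = (logit - center) / scale
    # Numerically stable form that avoids overflow for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


# Hypothetical calibration: logits in this collection cluster around 22.
CENTER, SCALE = 22.0, 2.0
scores = [calibrated_confidence(x, CENTER, SCALE) for x in (18.0, 22.0, 26.0)]
```

A logit at the center maps to 0.5, and the same raw logit can map to a very different confidence under another collection's center and scale, which is why a universal threshold fails.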

Lesson 052: Incremental Feature Extraction Over Full Re-runs

When adding new features to an existing collection, delete-and-rewrite only the new columns rather than re-processing everything. The key enabler is tagging each row with its source (model version, label source, attribute code) so that surgical deletes and inserts are possible without touching exist...

Artemis 2026-05-24 implementation

database data-modeling

Lesson 053: Audit-First Design

Before writing any code for a new feature, produce a written audit of the existing codebase: what exists, what can be reused, where new code slots in. The audit document prevents reimplementing existing functionality and identifies the exact extension points — saving more time than it costs to write...

Artemis 2026-05-24 process

database pipeline statistics

Lesson 054: Phased Autonomous Execution Plans

Breaking large projects into numbered, independently shippable phases — each with explicit entry criteria, exit criteria, and a commit checkpoint — transforms ambitious multi-session work from a coordination problem into a queue of self-contained tasks. The plan file is both the work instruction and...

Artemis 2026-05-24 process

testing deployment pipeline

Lesson 055: Session Continuity via Documentation Artifacts

When working across multiple AI-assisted sessions, continuity must be encoded in files, not in conversation history. A startup document, a plan file with status tracking, and a project CLAUDE.md that reflects current state eliminate ramp-up overhead and prevent context loss from session clears and c...

Artemis 2026-05-24 implementation

testing data-modeling

Lesson 056: Environment Self-Interference in AI-Assisted Development

An AI coding assistant that launches background processes (dev servers, database connections, build watchers) will fight with its own previous instances over shared resources like ports and file locks. Explicit cleanup before each launch — kill orphan processes, release locks, verify port availabili...

Artemis 2026-05-24 deployment

database deployment python docker

Lesson 057: Test-Gated Commits at Scale

Gate every commit on a passing test suite, not on "the feature looks done." With 1,500+ tests across a project, the suite catches regressions that visual inspection misses — wrong column names, broken imports, type mismatches, off-by-one errors. The test suite is the contract for "this commit is saf...

Artemis 2026-05-24 testing

testing python statistics

Lesson 058: DuckDB Cursor-Per-Request for Concurrent Web Handlers

When serving DuckDB through a multi-threaded web framework (FastAPI/uvicorn), never share a single connection object across concurrent request handlers. Instead, call `conn.cursor()` to create a per-request cursor. DuckDB's Python driver does not support concurrent queries on the same connection fro...

Artemis 2026-05-24 data-engineering

api python

Lesson 059: Derive Metrics from Immutable Fact Tables, Not Mutable State

When a dashboard metric can be computed either from a mutable state flag or from an immutable record table, always derive it from the immutable source. Mutable flags reflect the *current* state, which may not be the state your metric is trying to describe. Immutable fact tables preserve the *histori...

Artemis 2026-05-24 data-engineering

pipeline

Lesson 060: Context Blocks Turn a Tool Into a Teaching Artifact

Adding a brief "why this page matters" block at the top of every page in a data application transforms it from an internal tool into a self-guided case study. A single sentence of context lets a reviewer understand what they're looking at without reading documentation or having the author present to...

Artemis 2026-05-24 implementation

pipeline statistics

Lesson 061: Centralize Project Metadata to Prevent Count Drift

When the same project-level number (image count, cluster count, lesson count) appears in multiple frontend modules, centralize it in a single metadata object. Better still, fetch live counts from the API at render time and use the centralized constant only as a fallback. Hardcoded numbers scattered...

Artemis 2026-05-24 implementation

javascript ai

Lesson 062: A Guided Reviewer Path for Portfolio Projects

Add a numbered "review this project in N minutes" path to the homepage of any portfolio project or case study. Without explicit guidance, reviewers wander randomly through pages and miss the strongest parts of the work. A curated path ensures every reviewer sees the same narrative arc, regardless of...

Artemis 2026-05-24 implementation

pipeline statistics

Lesson 063: Promise.all as an Accidental Concurrency Test

Any frontend page that fires multiple `fetch()` calls via `Promise.all()` is an implicit concurrency test for the backend. If your API endpoints work individually via `curl` but fail when the browser loads a page that hits them simultaneously, you have a shared-state concurrency bug — not a data or...

Artemis 2026-05-24 testing

database testing api python javascript

Lesson 064: Noscript Fallback as SEO Baseline for SPAs

A single-page application rendered entirely in JavaScript is invisible to search engine crawlers that don't execute JS. Adding a `<noscript>` block with the project's core content — title, summary, key links, and attribution — provides a crawlable baseline that costs minutes to implement and ensures...

Artemis 2026-05-24 implementation

testing deployment frontend pipeline javascript

Lesson: Algorithm Selection — sklearn KMeans vs Pillow Quantize for Dominant Colors

The visual feature extraction pipeline ran sklearn's `KMeans` on every thumbnail to find 5 dominant colors, at ~147ms per image. For 12,217 images, that's ~30 minutes of CPU time on dominant color extraction alone, for a feature that contributes a single JSON column to the feature table.

Artemis 2026-05-24 algorithms

testing pipeline python

Lesson: Batch Database Operations — INSERT/UPDATE Patterns at Scale

Multiple pipeline stages in Artemis started with per-row INSERT or UPDATE patterns that worked fine during development (5-10 rows) but became bottlenecks at full scale (12,000+ rows). The per-row pattern appeared in three places:

Artemis 2026-05-24 data-engineering

database security pipeline python ai

Lesson: Building with Synthetic Data Before Real Data Arrives

The Artemis project needed voter preference data to build its statistical models and calendar optimizer. But real vote data from ArtemisTimeline.com wasn't yet available — the vote export hadn't been requested, and the site's API only exposes aggregate leaderboards, not raw ballots.

Artemis 2026-05-24 implementation

data-modeling pipeline python statistics

Lesson: Choosing k for k-Means Clustering in a Constrained Selection Problem

We needed to cluster 12,217 Artemis II mission photos into visually distinct groups. The clusters serve a specific downstream purpose: ensuring the final 13-image calendar selection has visual diversity. The choice of k (number of clusters) directly affects whether the optimization can produce a goo...

Artemis 2026-05-24 algorithms

Lesson: Concurrent HTTP Downloads with Connection Pooling

The original thumbnail downloader processed 12,217 images sequentially. Each download created a new `httpx.Client` instance, which meant a fresh TCP connection and TLS handshake for every single request — all to the same Cloudflare R2 CDN endpoint. At 0.1s rate limiting plus ~50-200ms connection ove...

Artemis 2026-05-24 data-engineering

cloudflare database pipeline python

Lesson: Debugging with Surrogate Key Ranges

When investigating why multimodal clustering produced zero results, the breakthrough came from a simple query:

Artemis 2026-05-24 data-engineering

database data-modeling pipeline ai

Lesson: Disjoint Data Populations Breaking Multimodal Joins

Multimodal clustering required images to have both CLIP image embeddings AND text embeddings. The intersection of these two sets was empty — 0 images qualified. The clustering silently logged a warning and returned 0 results. The pipeline appeared to work, but an entire analysis dimension produced n...

Artemis 2026-05-24 implementation

database pipeline python

Lesson: DuckDB Single-Writer Constraint in Concurrent Pipelines

While developing the concurrent thumbnail downloader, the DuckDB warehouse file (`warehouse.duckdb`) became locked by the download process. Any attempt to check progress, run `artemis-pipeline status`, or open a second connection failed with:

Artemis 2026-05-24 data-engineering

database pipeline python ai

Lesson: PEP 8 Compliance in Data Engineering Pipelines

Python scripts in the Artemis project span multiple roles: XML/JSON migration, schema validation, metadata enrichment, test harnesses, and lesson harvesting. Without a consistent style standard, each script drifts toward the author's (or AI assistant's) habits — camelCase here, inconsistent indentat...

Artemis 2026-05-24 data-engineering

database data-modeling pipeline python javascript

Lesson: Per-Record Overhead That Doesn't Matter at 10 Rows Kills You at 12,000

The original thumbnail downloader worked flawlessly on 5 images during development. When scaled to 12,217 images, it was unacceptably slow — not because of network latency, but because of per-image overhead that was invisible at small scale.

Artemis 2026-05-24 implementation

pipeline python

Lesson: Resume-Safe Pipeline Design — Surviving Interrupts and Partial Failures

The thumbnail download process was killed multiple times during development — once to change the rate limit, once to adjust the timeout, once at the user's request. Each time, the question was: how much progress was lost? Can we pick up where we left off?

Artemis 2026-05-24 data-engineering

database data-modeling pipeline python
