The Artemis vote system shows 50 random images per ballot and asks voters to pick 5 favorites. With 500 ballots across 1...
The Artemis pairwise voting mode shows two images side by side and asks "which is better?" This produces binary outcomes...
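A minimal sketch of how binary "A beat B" outcomes can be turned into ratings with an Elo-style update. The K-factor of 32, the starting rating of 1500, and the image names are illustrative assumptions, not values taken from the Artemis pipeline.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that A beats B under the Elo logistic model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(ratings: dict, winner: str, loser: str,
               k: float = 32.0, initial: float = 1500.0) -> None:
    """Apply one pairwise outcome to the ratings dict in place."""
    r_w = ratings.get(winner, initial)
    r_l = ratings.get(loser, initial)
    # The winner gains more when the win was unexpected.
    delta = k * (1.0 - expected_score(r_w, r_l))
    ratings[winner] = r_w + delta
    ratings[loser] = r_l - delta


# Hypothetical comparison log: (winner, loser) pairs.
ratings = {}
for winner, loser in [("img_a", "img_b"), ("img_a", "img_c"), ("img_b", "img_c")]:
    update_elo(ratings, winner, loser)
```

Each update is zero-sum, so the mean rating stays at the starting value. Ratings from sequential updates depend on the order of comparisons, which is worth keeping in mind when only a few comparisons exist per image.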
We have pairwise comparison data (image A beats image B) and want the best possible strength estimates. Bradley-Terry-Lu...
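One way to estimate Bradley-Terry strengths from win/loss pairs is the classic minorization-maximization (MM) iteration, sketched below. This is an illustrative implementation, not necessarily the one the project uses; the function name and input format are assumptions. It assumes winner and loser are always different items and that every item wins at least once, since otherwise the maximum-likelihood strength drifts toward zero.

```python
from collections import defaultdict


def fit_bradley_terry(outcomes, iterations=200, tol=1e-10):
    """Fit strengths from (winner, loser) pairs; strengths sum to 1."""
    wins = defaultdict(float)
    pair_counts = defaultdict(float)  # comparisons per unordered pair
    items = set()
    for winner, loser in outcomes:
        wins[winner] += 1.0
        pair_counts[frozenset((winner, loser))] += 1.0
        items.update((winner, loser))

    strength = {i: 1.0 / len(items) for i in items}
    for _ in range(iterations):
        new = {}
        for i in items:
            denom = 0.0
            for pair, n in pair_counts.items():
                if i in pair:
                    (j,) = pair - {i}
                    denom += n / (strength[i] + strength[j])
            new[i] = wins[i] / denom if denom else strength[i]
        total = sum(new.values())
        new = {i: v / total for i, v in new.items()}
        converged = max(abs(new[i] - strength[i]) for i in items) < tol
        strength = new
        if converged:
            break
    return strength


# A beats B three times out of four, so A's fitted strength is 0.75.
strengths = fit_bradley_terry([("A", "B")] * 3 + [("B", "A")])
```

Unlike sequential rating updates, the fitted strengths do not depend on the order in which comparisons were recorded.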
The Artemis category voting mode asks voters to rank their top 3 images within a category. We need to convert these part...
We want to measure whether voters agree on which images are good. With 100 voters and 12,217 images, the voter-image mat...
We have three different types of preference data — batch selection rates, Elo ratings from pairwise comparisons, and Bor...
The scoring pipeline will be re-run as new vote data arrives, as scoring methods are tuned, or as bugs are fixed. Each r...
Several columns in the scoring output have no meaningful value for most images. Only 200 images have Elo scores (from 2,...
DuckDB's executemany with parameterized INSERT statements can hang indefinitely at scale (10K+ rows). Replacing it with ...
When selecting a fixed-size collection where the items must work together (a calendar, a playlist, a portfolio, a menu),...
When images lack text metadata (titles, descriptions, captions), month or season suitability can still be approximated f...
Greedy Maximum Marginal Relevance (MMR) is the practical default for selecting a diverse, high-quality subset from a lar...
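The greedy MMR loop can be sketched in a few lines. The trade-off weight `lam`, the relevance scores, and the similarity function below are placeholders chosen for illustration; in practice they would come from the scoring pipeline and the image embeddings.

```python
def mmr_select(candidates, relevance, similarity, k, lam=0.7):
    """Greedily pick k items, trading relevance against redundancy."""
    selected = []
    remaining = list(candidates)
    while remaining and len(selected) < k:
        def mmr_score(item):
            # Redundancy is the similarity to the closest already-picked item.
            redundancy = max((similarity(item, s) for s in selected), default=0.0)
            return lam * relevance[item] - (1.0 - lam) * redundancy

        best = max(remaining, key=mmr_score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical data: "a" and "a2" are near-duplicates, "b" is distinct.
relevance = {"a": 0.9, "a2": 0.85, "b": 0.6}
pair_similarity = {frozenset(("a", "a2")): 0.95}


def similarity(x, y):
    return pair_similarity.get(frozenset((x, y)), 0.1)


picked = mmr_select(["a", "a2", "b"], relevance, similarity, k=2)
```

Here the second pick is `b` rather than the higher-scoring `a2`, because `a2` is penalized for being nearly identical to the first pick.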
When you need to assign N items to N slots where each item-slot pair has a fitness score, the Hungarian algorithm gives ...
When building an optimizer, always generate multiple candidate solutions using different methods — including at least on...
A dependency that's imported in production code but missing from the package manifest is a time bomb. It works on the de...
When a database migration creates a table that new code writes to, the migration must be applied before the code runs — ...
We planted known biases in synthetic vote data — 10% of voters had position bias (preferring earlier-displayed images), ...
We have a calendar optimizer that selects 13 images from 12,217 using a weighted objective function (popularity, diversi...
We know that 20% of synthetic voters are intentionally noisy (10% position-biased, 10% random). We compute Krippendorff'...
When an embedded database (DuckDB, SQLite) serves both a batch pipeline and an interactive web app, the web layer should...
When an interactive web app needs sub-100ms responses from a scoring function that depends on large lookup tables, load ...
A hash-routed single-page application built with vanilla JavaScript, ES modules, and dynamic import() can deliver a func...
When a CLI pipeline and a web API need the same data, import the query functions directly rather than duplicating SQL. A...
A design system built on CSS custom properties (design tokens) can be shared across completely independent frontends — s...
When a linter rule flags code that follows a framework's official pattern, suppress the rule per-line with noqa rather t...
A FastAPI + JavaScript SPA can be deployed to GitHub Pages without rewriting frontend code by using a fetch shim — a sma...
When a project develops on Windows but deploys via CI on Linux, hardcoded paths like D:/artemis/warehouse.duckdb will fa...
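One common way to avoid a hardcoded drive path is to resolve the database location from an environment variable with a relative fallback. The variable name `ARTEMIS_WAREHOUSE_PATH` and the fallback location are assumptions made for this sketch, not names confirmed by the project.

```python
import os
from pathlib import Path


def warehouse_path(env=None):
    """Resolve the warehouse file location without a hardcoded drive path."""
    env = os.environ if env is None else env
    # ARTEMIS_WAREHOUSE_PATH is a hypothetical variable name.
    configured = env.get("ARTEMIS_WAREHOUSE_PATH")
    if configured:
        return Path(configured)
    # Fall back to a path relative to the working directory.
    return Path("warehouse.duckdb")
```

On a Windows development machine the variable can point at the local drive, while CI on Linux sets it to a workspace path or relies on the relative default.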
The vision tagging pipeline uses Qwen2.5-VL (a 7B-parameter vision-language model) to classify image attributes. Running...
The vision tagging pipeline needs a consistent set of image attributes shared across five components: the vision model p...
Synthetic vote generation needs to produce votes that exhibit detectable attribute-based bias while remaining statistica...
Block-aware statistics need a metric that answers: "does this voting block select images with attribute X more than expe...
The static site serves pre-built JSON files from a public URL. The warehouse database contains voter surrogate keys (vot...
The biased voting blocks pipeline spans six components: config validation, vote generation, attribute analysis, cluster ...
When working with a large image collection from an automated source, assume near-duplicates dominate the pool until prov...
Import heavy dependencies inside the function that uses them, not at module scope. A module-level import numpy means eve...
A single CLIP model, used for zero-shot classification against descriptive text prompts, functions as a general-purpose ...
To select k items that maximally represent the diversity within a group, iteratively pick the item most distant from all...
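A minimal sketch of this iterative selection, interpreted here as farthest-first traversal: each step picks the point whose distance to its nearest already-chosen point is largest. Euclidean distance and the choice of starting index are assumptions; embeddings would normally replace the toy 2D points.

```python
import math


def farthest_point_sample(points, k, start=0):
    """Return indices of up to k points chosen by farthest-first traversal."""
    if not points or k <= 0:
        return []
    chosen = [start]
    # Distance from every point to its nearest chosen point so far.
    min_dist = [math.dist(p, points[start]) for p in points]
    while len(chosen) < min(k, len(points)):
        nxt = max(range(len(points)), key=min_dist.__getitem__)
        chosen.append(nxt)
        for i, p in enumerate(points):
            min_dist[i] = min(min_dist[i], math.dist(p, points[nxt]))
    return chosen


# Toy points: index 1 is almost identical to index 0 and is picked last.
points = [(0.0, 0.0), (0.1, 0.0), (10.0, 0.0), (5.0, 5.0)]
representatives = farthest_point_sample(points, k=3)
```

Keeping a running `min_dist` list makes each step linear in the number of points rather than recomputing all pairwise distances.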
When the user's mental model is "put this thing in that slot," drag-and-drop is less code and more intuitive than altern...
When deduplicating by pairwise similarity, use graph connected components to group items — not naive pair-based merging....
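Grouping by connected components can be sketched with a small union-find structure. The item names and the list of similar pairs below are illustrative; the pairs would normally come from a similarity threshold on embeddings or hashes.

```python
def group_duplicates(items, similar_pairs):
    """Group items into connected components of the similarity graph."""
    parent = {item: item for item in items}

    def find(x):
        # Walk to the root, halving the path along the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in similar_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    groups = {}
    for item in items:
        groups.setdefault(find(item), []).append(item)
    return list(groups.values())


# "a" and "c" are never compared directly but share the neighbor "b".
groups = group_duplicates(["a", "b", "c", "d"], [("a", "b"), ("b", "c")])
```

The transitive case is the one naive pair merging tends to miss: `a` and `c` end up in the same group even though no pair links them directly.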
CLIP logits have domain-specific distributions. Converting them to meaningful [0,1] confidence scores requires a sigmoid...
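As a rough illustration of sigmoid calibration, the sketch below maps a raw logit to a [0,1] score using a center and scale. Both parameters are hypothetical and would need to be fitted to the logit distribution observed in the target domain; the example values are not taken from the project.

```python
import math


def logit_to_confidence(logit, center, scale):
    """Map a raw logit to [0, 1] with a shifted, scaled sigmoid."""
    z = (logit - center) / scale
    # Clamp to avoid overflow in exp() for extreme inputs.
    z = max(-60.0, min(60.0, z))
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical calibration: logits near 22 are treated as a coin flip.
confidence = logit_to_confidence(25.0, center=22.0, scale=2.0)
```

The center sets which logit maps to 0.5, and the scale controls how quickly confidence saturates on either side of it.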
When adding new features to an existing collection, delete-and-rewrite only the new columns rather than re-processing ev...
Before writing any code for a new feature, produce a written audit of the existing codebase: what exists, what can be re...
Breaking large projects into numbered, independently shippable phases — each with explicit entry criteria, exit criteria...
When working across multiple AI-assisted sessions, continuity must be encoded in files, not in conversation history. A s...
An AI coding assistant that launches background processes (dev servers, database connections, build watchers) will fight...
Gate every commit on a passing test suite, not on "the feature looks done." With 1,500+ tests across a project, the suit...
The visual feature extraction pipeline ran sklearn's KMeans on every thumbnail to find 5 dominant colors. Each call took...
Multiple pipeline stages in Artemis started with per-row INSERT or UPDATE patterns that worked fine during development (...
The Artemis project needed voter preference data to build its statistical models and calendar optimizer. But real vote d...
We needed to cluster 12,217 Artemis II mission photos into visually distinct groups. The clusters serve a specific downs...
The original thumbnail downloader processed 12,217 images sequentially. Each download created a new httpx.Client instanc...
When investigating why multimodal clustering produced zero results, the breakthrough came from a simple SQL query: SELEC...
Multimodal clustering required images to have both CLIP image embeddings AND text embeddings. The intersection of these ...
While developing the concurrent thumbnail downloader, the DuckDB warehouse file (warehouse.duckdb) became locked by the ...
Python scripts in the Artemis project span multiple roles: XML/JSON migration, schema validation, metadata enrichment, t...
The original thumbnail downloader worked flawlessly on 5 images during development. When scaled to 12,217 images, it was...
The thumbnail download process was killed multiple times during development — once to change the rate limit, once to adj...
Future lessons identified from the Claude Code usage insights analysis (64 sessions, 150 commits, May 2026). These are p...