Lesson 012: Bayesian Beta-Binomial Smoothing
The Artemis vote system shows 50 random images per ballot and asks voters to pick 5 favorites. With 500 ballots across 1...
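As a rough illustration of the smoothing named in the lesson title, the sketch below shrinks a raw selection rate toward a prior mean. The prior mean of 0.1 follows from the 5-of-50 ballot design described above; the function name `smoothed_rate` and the `prior_strength` value are assumptions made for this example, not values taken from the lesson.

```python
def smoothed_rate(picks, shows, prior_mean=0.1, prior_strength=20.0):
    """Posterior mean of a Beta prior updated with binomial pick counts.

    prior_mean: expected pick rate before seeing data (5 picks / 50 shown).
    prior_strength: hypothetical pseudo-count weight given to the prior.
    """
    alpha = prior_mean * prior_strength          # prior "picked" pseudo-counts
    beta = (1.0 - prior_mean) * prior_strength   # prior "not picked" pseudo-counts
    return (picks + alpha) / (shows + alpha + beta)


# An image shown twice and picked twice has a raw rate of 1.0,
# but the smoothed estimate stays close to the prior.
few_views = smoothed_rate(picks=2, shows=2)
# An image with many views is dominated by its own data.
many_views = smoothed_rate(picks=100, shows=200)
```

With few views the estimate stays near the prior; as views accumulate, the observed rate takes over.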
The Artemis pairwise voting mode shows two images side by side and asks "which is better?" This produces binary outcomes...
We have pairwise comparison data (image A beats image B) and want the best possible strength estimates. Bradley-Terry-Lu...
The Artemis category voting mode asks voters to rank their top 3 images within a category. We need to convert these part...
We want to measure whether voters agree on which images are good. With 100 voters and 12,217 images, the voter-image mat...
We have three different types of preference data — batch selection rates, Elo ratings from pairwise comparisons, and Bor...
The scoring pipeline will be re-run as new vote data arrives, as scoring methods are tuned, or as bugs are fixed. Each r...
When images lack text metadata (titles, descriptions, captions), month or season suitability can still be approximated f...
Greedy Maximum Marginal Relevance (MMR) is the practical default for selecting a diverse, high-quality subset from a lar...
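A minimal sketch of greedy MMR selection follows, assuming per-item quality scores and a precomputed pairwise similarity matrix. The names `mmr_select`, `scores`, `similarity`, and the trade-off weight `lam` are hypothetical and chosen only for this example.

```python
def mmr_select(scores, similarity, k, lam=0.7):
    """Greedily pick up to k item indices, trading quality against redundancy.

    scores[i]: quality of item i (higher is better).
    similarity[i][j]: similarity between items i and j.
    lam: weight on quality; (1 - lam) weights the redundancy penalty.
    """
    selected = []
    remaining = list(range(len(scores)))
    while remaining and len(selected) < k:
        def marginal(i):
            # Redundancy is the similarity to the closest already-selected item.
            redundancy = max((similarity[i][j] for j in selected), default=0.0)
            return lam * scores[i] - (1.0 - lam) * redundancy

        best = max(remaining, key=marginal)
        selected.append(best)
        remaining.remove(best)
    return selected


# Items 0 and 1 are near-duplicates; item 2 is weaker but distinct.
scores = [0.9, 0.85, 0.5]
similarity = [
    [1.0, 0.95, 0.1],
    [0.95, 1.0, 0.1],
    [0.1, 0.1, 1.0],
]
diverse = mmr_select(scores, similarity, k=2, lam=0.5)      # skips the near-duplicate
quality_only = mmr_select(scores, similarity, k=2, lam=1.0)  # ignores redundancy
```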
When you need to assign N items to N slots where each item-slot pair has a fitness score, the Hungarian algorithm gives ...
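For illustration, here is a compact pure-Python sketch of the Hungarian algorithm (the O(N^3) potentials formulation) for a square fitness matrix. The function name `assign_slots` and the example matrix are assumptions; in practice a library routine such as SciPy's `linear_sum_assignment` may be the more convenient choice.

```python
def assign_slots(fitness):
    """Return a list where result[item] is the slot that maximizes total fitness.

    fitness[i][j]: fitness of placing item i in slot j (square matrix).
    The algorithm minimizes cost, so fitness is negated first.
    """
    n = len(fitness)
    cost = [[-f for f in row] for row in fitness]
    inf = float("inf")
    u = [0.0] * (n + 1)   # row potentials (1-indexed)
    v = [0.0] * (n + 1)   # column potentials (1-indexed)
    p = [0] * (n + 1)     # p[j] = row currently matched to column j
    way = [0] * (n + 1)   # back-pointers along the augmenting path
    for i in range(1, n + 1):
        p[0] = i
        j0 = 0
        minv = [inf] * (n + 1)
        used = [False] * (n + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = p[j0], inf, 0
            for j in range(1, n + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            # Shift potentials so at least one new zero-reduced-cost edge appears.
            for j in range(n + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:
                break
        # Flip matches along the augmenting path back to the virtual column 0.
        while j0 != 0:
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    result = [0] * n
    for j in range(1, n + 1):
        result[p[j] - 1] = j - 1
    return result


fitness = [
    [9, 2, 7],
    [6, 4, 3],
    [5, 8, 1],
]
slots = assign_slots(fitness)
total = sum(fitness[i][slots[i]] for i in range(len(fitness)))
```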
We planted known biases in synthetic vote data — 10% of voters had position bias (preferring earlier-displayed images), ...
When an interactive web app needs sub-100ms responses from a scoring function that depends on large lookup tables, load ...
When working with a large image collection from an automated source, assume near-duplicates dominate the pool until prov...
To select k items that maximally represent the diversity within a group, iteratively pick the item most distant from all...
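The iterative pick described above can be sketched as greedy farthest-point sampling. This version assumes that "most distant" means maximizing the minimum Euclidean distance to the items already chosen, and that the first item seeds the selection; both are assumptions, as are the names `farthest_point_sample` and `start`.

```python
import math


def farthest_point_sample(points, k, start=0):
    """Pick up to k indices, each maximizing its distance to the nearest pick so far."""
    if not points or k <= 0:
        return []
    selected = [start]
    # min_dist[i] tracks the distance from point i to its nearest selected point.
    min_dist = [math.dist(p, points[start]) for p in points]
    while len(selected) < min(k, len(points)):
        candidates = [i for i in range(len(points)) if i not in selected]
        nxt = max(candidates, key=lambda i: min_dist[i])
        selected.append(nxt)
        for i, p in enumerate(points):
            min_dist[i] = min(min_dist[i], math.dist(p, points[nxt]))
    return selected


# Point 1 sits almost on top of point 0, so it is chosen last.
points = [(0.0, 0.0), (0.1, 0.0), (10.0, 0.0), (5.0, 5.0)]
representatives = farthest_point_sample(points, k=3)
```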
When deduplicating by pairwise similarity, use graph connected components to group items — not naive pair-based merging....
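One way to compute connected components over similar pairs is a union-find structure, sketched below. The function name `group_duplicates` and the integer item ids are assumptions for this example. Because components are transitive, a chain such as 0~1 and 1~2 lands in a single group even when 0 and 2 were never directly compared.

```python
def group_duplicates(n_items, similar_pairs):
    """Group item ids 0..n_items-1 into connected components of the similarity graph."""
    parent = list(range(n_items))

    def find(x):
        # Follow parent links to the root, halving the path along the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in similar_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    groups = {}
    for i in range(n_items):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Pairs arrive in arbitrary order; the chain 0-1-2-3 still forms one group.
groups = group_duplicates(6, [(0, 1), (2, 3), (1, 2)])
```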
CLIP logits have domain-specific distributions. Converting them to meaningful [0,1] confidence scores requires a sigmoid...
The visual feature extraction pipeline ran sklearn's KMeans on every thumbnail to find 5 dominant colors. Each call took...
We needed to cluster 12,217 Artemis II mission photos into visually distinct groups. The clusters serve a specific downs...
Before choosing a similarity algorithm, understand whether your data uses binary membership (item has feature or doesn't...