Rule-Based Gap Detection Without ML

Seven heuristic rules detect when a RAG corpus can't answer a query, without training data or a classifier — transparent and debuggable, with known trade-offs.

The Lesson

When you need to detect quality gaps in a retrieval system but don't have labeled training data, a set of transparent heuristic rules — each testing one specific signal — is more maintainable and debuggable than a black-box classifier. The rules will be imperfect, but their imperfections are visible and individually fixable.

Context

A RAG chatbot over a corpus of 793 chunks from 116 lessons needed to detect when a user's query exposed a gap — a topic the corpus couldn't answer well. Detected gaps feed into a discovery pipeline that searches GitHub for candidate source material. There was no labeled dataset of "good answers" vs. "gap answers" to train a classifier. The system needed to work immediately with reasonable accuracy, and the maintainer needed to understand and tune why each detection fired.

What Happened

  1. Defined seven independent rules, each testing a different signal from the retrieval results and the generated answer. Each rule either fires or doesn't, appending a reason tag to a list (all seven are sketched in code after this list).
  2. Rule 1: No relevant chunks. If the maximum similarity score across all retrieved chunks is below 0.3, the corpus has nothing close to the query. This catches completely novel topics.
  3. Rule 2: Few distinct lessons. If fewer than 2 distinct lessons have chunks above the 0.3 threshold, coverage is thin — one lesson mentioning the topic in passing isn't real coverage.
  4. Rule 3: Related but unanswered. If chunks exist with scores between 0.0 and 0.5, and there are 3+ results, the corpus has related material but doesn't actually answer the question.
  5. Rule 4: Weak answer language. If the LLM's answer contains phrases like "does not appear to contain," "no relevant lessons," or "insufficient information" (10 phrases total), the model is signaling it couldn't ground its answer.
  6. Rule 5: Explicit absence query. If the user asks "do I have any lessons about X" and the answer uses weak language, this combines user intent with model acknowledgment.
  7. Rule 6: Missing platform. If the query mentions a platform keyword (AWS, Docker, Kubernetes, etc.) but none of the retrieved chunks contain that keyword, there's a platform-specific gap.
  8. Rule 7: General knowledge answer. If the answer is longer than 200 characters but fewer than 2 chunks scored above threshold, the model is filling in from general knowledge rather than corpus evidence.
  9. Multiple rules can fire simultaneously. The gap record captures all triggered reasons, enabling analysis of which combinations are most reliable.
  10. Gap records merge by normalized topic (MD5 hash). Repeated queries about the same topic append to additional_queries rather than creating duplicates (see the merge sketch below).
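
A minimal sketch of all seven rules in Python. The thresholds are the ones stated above; the chunk field names ("score", "lesson_id", "text"), the platform keyword list, and the exact phrase matching are assumptions for illustration:

```python
# Thresholds as stated in the rules above; all are tunable knobs.
MIN_SIMILARITY = 0.3    # Rules 1, 2, 7: relevance cutoff
RELATED_CEILING = 0.5   # Rule 3: upper edge of the "related" score band
MIN_ANSWER_LEN = 200    # Rule 7: answers longer than this need evidence

# Three of the ten weak-answer phrases named for Rule 4.
WEAK_PHRASES = ("does not appear to contain", "no relevant lessons",
                "insufficient information")

# Hypothetical platform keyword list for Rule 6.
PLATFORM_KEYWORDS = ("aws", "docker", "kubernetes", "terraform")


def detect_gap_reasons(query, chunks, answer):
    """Run all seven rules; return every reason tag that fired.

    `chunks` is assumed to be a list of dicts with keys "score"
    (similarity in [0, 1]), "lesson_id", and "text".
    """
    reasons = []
    q = query.lower()
    scores = [c["score"] for c in chunks]
    relevant = [c for c in chunks if c["score"] >= MIN_SIMILARITY]
    weak_answer = any(p in answer.lower() for p in WEAK_PHRASES)

    # Rule 1: nothing in the corpus is even close to the query.
    if max(scores, default=0.0) < MIN_SIMILARITY:
        reasons.append("no_relevant_chunks")

    # Rule 2: fewer than 2 distinct lessons clear the relevance cutoff.
    if len({c["lesson_id"] for c in relevant}) < 2:
        reasons.append("few_distinct_lessons")

    # Rule 3: 3+ results sit in the (0.0, 0.5) band: related material
    # that circles the topic without answering it.
    if len([s for s in scores if 0.0 < s < RELATED_CEILING]) >= 3:
        reasons.append("related_but_unanswered")

    # Rule 4: the model itself signals it couldn't ground the answer.
    if weak_answer:
        reasons.append("weak_answer_language")

    # Rule 5: explicit absence query plus model acknowledgment.
    # (Substring match is a stand-in for the real intent check.)
    if "do i have any lessons about" in q and weak_answer:
        reasons.append("explicit_absence_query")

    # Rule 6: the query names a platform no retrieved chunk mentions.
    for kw in PLATFORM_KEYWORDS:
        if kw in q and not any(kw in c["text"].lower() for c in chunks):
            reasons.append(f"missing_platform:{kw}")

    # Rule 7: a long answer backed by fewer than 2 relevant chunks is
    # likely general knowledge rather than corpus evidence.
    if len(answer) > MIN_ANSWER_LEN and len(relevant) < 2:
        reasons.append("general_knowledge_answer")

    return reasons
```

Each rule is a few lines testing one signal, so a misfiring rule can be inspected and tuned in isolation, which is the maintainability argument of the lesson.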
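The merge step from item 10 might look like the following, assuming gaps is a dict keyed by the MD5 of a normalized topic string (the normalization itself, lowercasing plus whitespace collapsing, is a guess):

```python
import hashlib


def topic_key(topic):
    # Normalization scheme assumed: lowercase and collapse whitespace,
    # then hash so variant phrasings of one topic share a key.
    normalized = " ".join(topic.lower().split())
    return hashlib.md5(normalized.encode("utf-8")).hexdigest()


def record_gap(gaps, topic, query, reasons):
    """Merge one detection into `gaps`, a dict keyed by topic hash."""
    key = topic_key(topic)
    if key in gaps:
        # Repeat query on a known topic: append, don't duplicate.
        gaps[key]["additional_queries"].append(query)
    else:
        gaps[key] = {"topic": topic, "first_query": query,
                     "reasons": [], "additional_queries": []}
    # Keep the union of every reason that has ever fired for this topic.
    gaps[key]["reasons"] = sorted(set(gaps[key]["reasons"]) | set(reasons))
    return gaps[key]
```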

Key Insights

Examples

Rule interaction example:

A query about "Terraform module best practices" against a corpus with no Terraform content fires several rules at once.
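
One plausible trace, reusing the detect_gap_reasons sketch from earlier; the scores, lesson IDs, and answer text below are invented for illustration:

```python
chunks = [
    {"score": 0.24, "lesson_id": "ci-cd-pipelines",
     "text": "Structuring CI jobs as reusable pipeline modules"},
    {"score": 0.21, "lesson_id": "ansible-roles",
     "text": "Best practices for reusable Ansible roles"},
    {"score": 0.18, "lesson_id": "docker-compose",
     "text": "Conventions for compose file layout"},
]
answer = ("The corpus does not appear to contain lessons on Terraform "
          "modules. In general, Terraform modules should stay small and "
          "composable, pin provider versions, expose few variables, and "
          "publish tagged releases so consumers can upgrade deliberately.")

print(detect_gap_reasons("Terraform module best practices", chunks, answer))
# ['no_relevant_chunks',          Rule 1: max score 0.24 < 0.3
#  'few_distinct_lessons',        Rule 2: zero lessons above threshold
#  'related_but_unanswered',      Rule 3: three scores in (0.0, 0.5)
#  'weak_answer_language',        Rule 4: "does not appear to contain"
#  'missing_platform:terraform',  Rule 6: keyword in no retrieved chunk
#  'general_knowledge_answer']    Rule 7: long answer, <2 relevant chunks
```

Six of the seven rules fire; Rule 5 stays silent because the query is not an explicit "do I have any lessons about" question. That overlap is the point: the gap record captures every triggered reason, and combinations of reasons are what the pipeline analyzes for reliability.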

Applicability

Rule-based gap detection works when:

  - There is no labeled dataset of "good answers" vs. "gap answers" to train on.
  - The maintainer needs to see, and individually tune, why each detection fired.
  - The system must work immediately with reasonable accuracy.
  - Every signal reduces to a simple test over retrieval scores, lesson counts, or answer text.

Consider a trained classifier instead when:

  - Labeled examples of good vs. gap answers are available, or cheap to collect.
  - The rule set and its interactions have grown beyond what can be reasoned about rule by rule.
  - Detection accuracy matters more than being able to explain why a detection fired.

Related Lessons