Legacy Artifact Removal

The Lesson

After a migration, the old system's artifacts (files, code, tests, scripts) must be actively removed in a deliberate cleanup pass; they don't disappear on their own. Removal is safe only once you can prove the new system is fully operational, and the cleanup itself needs a plan because the old artifacts have tentacles: tests that import them, scripts that process them, and docs that reference them.
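
One way to map those tentacles before deleting anything is to scan source text for imports of the legacy module. The sketch below is illustrative only: findDependents, the file paths, and the module name xmlParser are hypothetical, and a regex scan is an approximation that can miss dynamically built import paths.

```javascript
// Hypothetical helper: list the files whose source imports the legacy module.
// `sources` maps file path -> file contents; all names here are illustrative.
function findDependents(sources, legacyModule) {
  // Escape regex metacharacters so the module name is matched literally.
  const escaped = legacyModule.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
  // Matches require('...name') and `from '...name'`, with an optional .js suffix.
  const pattern = new RegExp(
    `(?:require\\(\\s*|from\\s+)['"][^'"]*${escaped}(?:\\.js)?['"]`
  );
  return Object.keys(sources)
    .filter((path) => pattern.test(sources[path]))
    .sort();
}

// Made-up inputs: two files import the legacy parser, one does not.
const exampleSources = {
  'test/integration.test.js': "const { XMLParser } = require('../src/xmlParser');",
  'test/json.test.js': "import { loadExam } from '../src/jsonLoader.js';",
  'scripts/convert.js': "import { XMLParser } from '../src/xmlParser.js';",
};

console.log(findDependents(exampleSources, 'xmlParser'));
```

Each file the scan reports then needs a decision: delete it because it only exists to serve the old format, or rewrite it because it tests something that still matters.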

Context

A quiz application migrated from XML to JSON for its exam data format. The migration was proven correct by 5,101 equivalence tests. But months after JSON became the production format, all 50 XML data files, the XML parser module, 4 XML-specific test files, 10 processing scripts, 33 audit reports, an XSD schema, and a fixture remained in the repository: about 100 files of dead weight. The XML files were larger than their JSON equivalents, the XML parser was imported by integration tests that didn't need it, and newcomers couldn't tell which format was authoritative.

What Happened

  1. A plan was written listing every XML artifact by category: data files (50), parser code (1), tests (4), scripts (10), audit reports (33), schema (1), and fixture (1).
  2. The one integration test file that tested non-XML functionality but happened to use XMLParser was identified and rewritten to load JSON fixtures directly. This was the only code modification needed.
  3. Tests were run to confirm the rewritten test file passed: 28 tests green.
  4. All XML-only files were deleted in bulk, starting with the data files (find data -name "*.xml" -delete), then the scripts, tests, audit reports, and the XSD.
  5. package.json was updated to remove the validate:xml script. CLAUDE.md and README.md were updated to remove XML references.
  6. A final test run confirmed 133 tests passing (down from 195; the 62 removed tests were all XML-specific).
  7. Verification grep confirmed zero .xml references in JS or test files.
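
The final verification step can also be scripted so it is repeatable. The following is a minimal sketch, not the command actually used: findXmlReferences and the sample file contents are hypothetical, and the function works on an in-memory map of path to contents so the matching logic can be read in isolation.

```javascript
// Sketch of the step-7 check: report every line in a .js file that still
// mentions a ".xml" path. `files` maps path -> contents (illustrative input).
function findXmlReferences(files) {
  const hits = [];
  for (const [path, contents] of Object.entries(files)) {
    if (!path.endsWith('.js')) continue; // only JS sources and tests
    contents.split('\n').forEach((text, index) => {
      // \b keeps identifiers such as ".xmlParser" from counting as a hit.
      if (/\.xml\b/i.test(text)) {
        hits.push({ path, line: index + 1, text: text.trim() });
      }
    });
  }
  return hits;
}

// Made-up repository snapshot with one leftover reference.
const exampleFiles = {
  'src/loader.js': "const data = require('./data/exam01.json');\n",
  'test/old.test.js': "// header\nconst raw = fs.readFileSync('data/exam01.xml');",
  'README.md': 'see exam.xml',
};

console.log(findXmlReferences(exampleFiles));
```

In a real repository the map would be filled by reading files from disk, and a non-empty result would fail the cleanup check.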

Key Insights

Applicability

This pattern applies after any migration: database schema changes (old tables), framework upgrades (compatibility shims), API version bumps (deprecated endpoints), language migrations (old source files). The prerequisite is always the same: proof that the new system handles all cases the old system did. Without that proof, deletion is risky; with it, deletion is overdue.
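
The prerequisite can be expressed as a gate: run every known input through both systems and refuse to delete while any output differs. This is a sketch under assumptions, not the project's actual equivalence suite; findMismatches and the two loader callbacks are hypothetical stand-ins for the old and new code paths.

```javascript
const { isDeepStrictEqual } = require('node:util');

// Hypothetical gate: collect the inputs whose outputs differ between the
// legacy and replacement systems. An empty result is the evidence that
// makes deletion safe.
function findMismatches(inputs, legacy, replacement) {
  return inputs.filter(
    (input) => !isDeepStrictEqual(legacy(input), replacement(input))
  );
}

// Made-up loaders: the replacement disagrees with the legacy one on input 3.
const exampleInputs = [1, 2, 3];
const legacyLoad = (n) => ({ id: n, tags: ['a'] });
const replacementLoad = (n) =>
  n === 3 ? { id: n, tags: [] } : { tags: ['a'], id: n };

const mismatches = findMismatches(exampleInputs, legacyLoad, replacementLoad);
console.log(mismatches.length === 0 ? 'safe to delete' : 'keep legacy', mismatches);
```

Deep equality is used here so that differences in property order do not count as mismatches; only differences in the data do.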

Related Lessons