XSS in Trusted-Data Applications

The Lesson

Rendering content from "your own" data files (XML, JSON, Markdown) with innerHTML is an XSS risk even when every file is self-authored today. The threat model changes as soon as the data pipeline does: outside content contributions, bulk imports from external sources, and AI-generated content can each introduce script injection. Sanitize all HTML inserted via innerHTML, regardless of how much you trust the source.

Context

The quiz application rendered question text, scenarios, and hints using innerHTML from parsed XML/JSON data. The data files were all self-authored and stored in the repository. A code review flagged this as an XSS risk despite the trusted-source argument.
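
To make the risk concrete, here is a minimal sketch of how an unsanitized renderer might turn a data record into markup. The record shape, the field names, and buildQuestionMarkup() are assumptions for illustration and are not taken from the actual app.js. The payload uses an onerror handler rather than a <script> tag because browsers do not execute <script> elements inserted through innerHTML, while event handler attributes do fire.

```javascript
// Hypothetical question record, e.g. from a bulk import; field names are assumptions.
const importedQuestion = {
  text: 'What is 2 + 2? <img src="x" onerror="alert(1)">',
  hint: 'Think of pairs.',
};

// Builds markup by plain string interpolation, with no sanitizing or escaping.
function buildQuestionMarkup(question) {
  return `<p class="question">${question.text}</p><p class="hint">${question.hint}</p>`;
}

const markup = buildQuestionMarkup(importedQuestion);

// In a browser, assigning this string to element.innerHTML would create the
// <img>, fail to load "x", and then run the onerror handler.
console.log(markup.includes('onerror="alert(1)"')); // true
```

Nothing in this path distinguishes a self-authored record from an imported one, which is why the trusted-source argument does not hold once the pipeline changes.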

The Remediation

  1. A sanitizeHTML() function was added that removes <script>, <style>, <iframe>, <object>, <embed>, and <form> elements, and strips event handler attributes (onclick, onerror, etc.)
  2. All innerHTML assignments in app.js were routed through this sanitizer
  3. A CSP meta tag was added as defense-in-depth
  4. A regression test injects <script>alert(1)</script> in question data and verifies it's not rendered as executable HTML
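
The first remediation step could be sketched roughly as follows. This is an illustrative approximation under stated assumptions, not the project's actual implementation: only the name sanitizeHTML() and the lists of removed elements and attributes come from the description above, while the helper names and the use of a <template> element for parsing are assumptions.

```javascript
// Elements the remediation removes outright (list taken from step 1).
const BLOCKED_TAGS = new Set(['script', 'style', 'iframe', 'object', 'embed', 'form']);

// Hypothetical helper: true for tags that should be dropped entirely.
function isBlockedTag(tagName) {
  return BLOCKED_TAGS.has(String(tagName).toLowerCase());
}

// Hypothetical helper: true for inline event handler attributes (onclick, onerror, ...).
function isEventHandlerAttribute(attrName) {
  return /^on/i.test(String(attrName));
}

// Browser-only: relies on the DOM, so it is defined here but needs a page to run.
function sanitizeHTML(html) {
  // A <template> parses markup into an inert fragment, so nothing in it
  // executes while it is being cleaned.
  const template = document.createElement('template');
  template.innerHTML = String(html);

  // querySelectorAll returns a static list, so removing nodes mid-loop is safe.
  for (const el of template.content.querySelectorAll('*')) {
    if (isBlockedTag(el.tagName)) {
      el.remove();
      continue;
    }
    // Copy the attribute list first because removeAttribute mutates it.
    for (const attr of Array.from(el.attributes)) {
      if (isEventHandlerAttribute(attr.name)) {
        el.removeAttribute(attr.name);
      }
    }
  }
  return template.innerHTML;
}
```

A call site would then read something like element.innerHTML = sanitizeHTML(question.text), where element and question are placeholders. A blocklist of this kind does not cover javascript: URLs in href or src attributes, so an allowlist-based library such as DOMPurify may be worth evaluating if the data sources widen further.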

Key Insights

Related Lessons