Criterion 10

Semantic HTML5 & Accessibility

Using proper HTML5 elements and accessibility attributes so AI systems can understand your content's structure, hierarchy, and meaning.

What It Is

Semantic HTML5 means using HTML elements that describe the meaning of content, not just how it looks. Instead of generic `<div>` containers, semantic elements tell machines what role each piece of content plays:

- `<main>`: The primary content of the page
- `<article>`: A self-contained piece of content (blog post, product, comment)
- `<section>`: A thematic grouping of content
- `<nav>`: Navigation links
- `<aside>`: Sidebar or tangentially related content
- `<header>` / `<footer>`: Introductory or closing content for a section
- `<figure>` / `<figcaption>`: Images with descriptive captions
- `<time>`: Dates and times in machine-readable format
- `<h1>` through `<h6>`: Content hierarchy (one H1 per page, logical nesting)

Accessibility attributes (ARIA labels, alt text, role attributes) further enhance machine understanding.

Why It Matters for AEO

AI systems parse HTML structure to understand content. Semantic elements are the roadmap:

- **Content Extraction**: AI crawlers can use `<main>` and `<article>` to find the actual content, skipping navigation, footers, and sidebars.
- **Heading Hierarchy**: H1 → H2 → H3 creates an outline that AI uses to understand topic structure. A page with no H1 or a broken hierarchy leaves parsers without a reliable outline.
- **Image Understanding**: For systems that read only the markup, alt text is the main way to "see" your images. An empty `alt=""` marks an image as decorative, so it contributes nothing to what those systems read.
- **Date Recognition**: `<time datetime="2026-01-15">` gives AI an exact, parseable date instead of ambiguous text.
- **Content Boundaries**: `<article>` tells AI where one piece of content ends and another begins, which is critical for blog listings and product grids.
- **Accessibility = AI-readability**: The same attributes that help screen readers also help AI systems understand your content.
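To make the content-extraction point concrete, here is a minimal Python sketch of how a parser might keep only the text inside `<main>` and drop navigation, sidebars, and footers. It is an illustration built on the standard library's `html.parser`, not a description of how any particular AI crawler works; the `SKIP_TAGS` set, the `MainContentExtractor` class, and the sample page are all assumptions made for this example.

```python
from html.parser import HTMLParser

# Assumption: these elements are treated as page chrome, not content.
SKIP_TAGS = {"nav", "aside", "footer", "script", "style"}


class MainContentExtractor(HTMLParser):
    """Collects text found inside <main>, ignoring nested chrome elements."""

    def __init__(self):
        super().__init__()
        self.main_depth = 0  # greater than 0 while inside <main>
        self.skip_depth = 0  # greater than 0 while inside a skipped element
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "main":
            self.main_depth += 1
        elif tag in SKIP_TAGS:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag == "main":
            self.main_depth = max(0, self.main_depth - 1)
        elif tag in SKIP_TAGS:
            self.skip_depth = max(0, self.skip_depth - 1)

    def handle_data(self, data):
        text = data.strip()
        if text and self.main_depth > 0 and self.skip_depth == 0:
            self.chunks.append(text)


def extract_main_text(html: str) -> str:
    """Return the visible text inside <main>, joined with single spaces."""
    parser = MainContentExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


# Hypothetical page used only for this illustration.
SAMPLE_PAGE = """
<nav><a href="/">Home</a></nav>
<main>
  <article>
    <h1>Article Title</h1>
    <p>Content here...</p>
    <aside>Related links</aside>
  </article>
</main>
<footer>Copyright notice</footer>
"""

print(extract_main_text(SAMPLE_PAGE))  # Article Title Content here...
```

A page built only from `<div>` elements gives this kind of parser nothing to anchor on: `extract_main_text` returns an empty string when there is no `<main>`.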

How to Implement

**1. Use one H1 per page**

```html
<!-- Every page needs exactly one H1 -->
<h1>Your Primary Page Title</h1>

<!-- Sub-sections use H2, sub-sub-sections use H3 -->
<h2>Major Section</h2>
<h3>Sub-section</h3>
```

**2. Wrap content in semantic elements**

```html
<main>
  <article>
    <header>
      <h1>Article Title</h1>
      <time datetime="2026-01-15">January 15, 2026</time>
      <span>By Author Name</span>
    </header>
    <section>
      <h2>First Section</h2>
      <p>Content here...</p>
      <figure>
        <img src="photo.jpg" alt="Descriptive text about the image">
        <figcaption>Caption explaining the image context</figcaption>
      </figure>
    </section>
  </article>
</main>
```
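The `<time>` element in the markup above is what makes the publication date machine-readable. As a rough sketch of what a consumer can do with it, the Python below collects `datetime` attributes and parses the date-only ones with the standard library. The `TimeCollector` class and `published_dates` function are hypothetical names for this example, and only plain `YYYY-MM-DD` values are handled; other valid `datetime` formats (times, durations, partial dates) are skipped here.

```python
from datetime import date
from html.parser import HTMLParser


class TimeCollector(HTMLParser):
    """Collects the datetime attribute of every <time> element."""

    def __init__(self):
        super().__init__()
        self.datetimes = []

    def handle_starttag(self, tag, attrs):
        if tag == "time":
            value = dict(attrs).get("datetime")
            if value:  # skip <time> elements without a usable attribute
                self.datetimes.append(value)


def published_dates(html: str) -> list:
    """Return date objects for every YYYY-MM-DD datetime attribute found."""
    collector = TimeCollector()
    collector.feed(html)
    collector.close()
    dates = []
    for value in collector.datetimes:
        try:
            dates.append(date.fromisoformat(value))
        except ValueError:
            # Not a plain calendar date; out of scope for this sketch.
            pass
    return dates


print(published_dates('<time datetime="2026-01-15">January 15, 2026</time>'))
```

Without the attribute, as in `<time>January 15, 2026</time>`, the function returns an empty list, and a consumer would have to guess the date from the visible text.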

**3. Write descriptive alt text**

```html
<!-- Bad: empty or generic alt text on a meaningful image -->
<img src="album.jpg" alt="">
<img src="album.jpg" alt="image">

<!-- Good: describes what the image actually shows -->
<img src="album.jpg" alt="Miles Davis - Kind of Blue original 1959 Columbia pressing, vinyl and jacket in VG+ condition">
```
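The bad patterns above can be flagged automatically. The following is a minimal Python sketch, assuming a small hand-picked list of generic words (`GENERIC_ALT`) and hypothetical names (`AltTextAuditor`, `audit_alt_text`). It cannot tell whether an image is decorative, so an empty `alt` is reported for human review rather than treated as a definite error.

```python
from html.parser import HTMLParser

# Assumption: a short, hand-picked list of alt values that describe nothing.
GENERIC_ALT = {"image", "photo", "picture", "img", "graphic"}


class AltTextAuditor(HTMLParser):
    """Records (src, problem) pairs for <img> tags with weak alt text."""

    def __init__(self):
        super().__init__()
        self.findings = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attributes = dict(attrs)
        src = attributes.get("src") or "(no src)"
        if "alt" not in attributes:
            self.findings.append((src, "missing alt attribute"))
            return
        alt = (attributes["alt"] or "").strip()
        if not alt:
            # Correct for decorative images, so this needs a human check.
            self.findings.append((src, "empty alt text"))
        elif alt.lower() in GENERIC_ALT:
            self.findings.append((src, "generic alt text"))


def audit_alt_text(html: str) -> list:
    """Return a list of (src, problem) tuples in document order."""
    auditor = AltTextAuditor()
    auditor.feed(html)
    auditor.close()
    return auditor.findings


page = (
    '<img src="album.jpg" alt="">'
    '<img src="cover.jpg" alt="image">'
    '<img src="good.jpg" alt="Miles Davis - Kind of Blue original 1959 Columbia pressing">'
)
for src, problem in audit_alt_text(page):
    print(f"{src}: {problem}")
```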

**4. Fix heading hierarchy**

Use browser extensions or Lighthouse to audit heading structure. Every page should have:

- Exactly one H1
- H2s for major sections
- H3s nested under H2s (never skip levels)
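The rules above are simple enough to check with a short script when a browser extension or Lighthouse is not at hand. This is a minimal Python sketch, not a replacement for those tools; `HeadingCollector` and `audit_headings` are hypothetical names, and the script looks only at heading tags in source order, ignoring headings injected by JavaScript.

```python
import re
from html.parser import HTMLParser

HEADING_TAG = re.compile(r"h([1-6])")


class HeadingCollector(HTMLParser):
    """Records the level (1-6) of each heading in document order."""

    def __init__(self):
        super().__init__()
        self.levels = []

    def handle_starttag(self, tag, attrs):
        match = HEADING_TAG.fullmatch(tag)
        if match:
            self.levels.append(int(match.group(1)))


def audit_headings(html: str) -> list:
    """Return a list of problems; an empty list means the outline is clean."""
    collector = HeadingCollector()
    collector.feed(html)
    collector.close()
    levels = collector.levels

    problems = []
    h1_count = levels.count(1)
    if h1_count != 1:
        problems.append(f"expected exactly one h1, found {h1_count}")

    # Going deeper by more than one level at a time skips a level.
    for previous, current in zip(levels, levels[1:]):
        if current > previous + 1:
            problems.append(f"h{previous} followed by h{current} skips a level")
    return problems


print(audit_headings("<h1>Title</h1><h3>Sub-section</h3>"))
# ['h1 followed by h3 skips a level']
```

Moving back up the outline (an H3 followed by an H2) is not flagged, since that is how a new major section normally starts.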

**5. Add figcaption to images**

This matters most for product images and article photos, where captions provide additional context that alt text alone doesn't cover.

Common Mistakes

- No H1 on the homepage (or multiple H1s on a single page)
- Skipping heading levels (H1 → H3, missing H2)
- Empty `alt=""` on meaningful images (only decorative images should have empty alt)
- Using `<div>` and `<span>` for everything instead of semantic elements
- Blog listing titles in `<p>` or `<div>` instead of heading tags
- Not using `<time>` elements with `datetime` attributes for dates
- Typos in visible headings that undermine credibility

External Resources

- MDN Web Docs: HTML elements reference
- web.dev/learn/html/semantic-html: Google's semantic HTML guide
- WAVE Web Accessibility Evaluator: Test accessibility
- Lighthouse: Audit SEO, accessibility, and performance