Table & List Extractability
Your comparison table looks great in the browser. But it's built with divs and CSS Grid, so ChatGPT sees a blob of text. Here's what that costs you.
Part of the AEO scoring framework - the current 48 criteria that measure how ready a website is for AI-driven search across ChatGPT, Claude, Perplexity, and Google AIO.
Quick Answer
Table and list extractability measures whether your structured content uses semantic HTML: tables with thead/tbody/th, lists with ol/ul/li. AI engines restructure tables and lists into their answers, but only when the HTML is properly formed. Div-based tables are invisible as structured content.
Audit Note
In our audits, we've measured Table & List Extractability on live sites, compared implementations, and documented the gaps that keep scores low.
- Prefers HTML tables
- Extracts comparison data
- Cites tabular facts
- Struggles with image tables
- Reads nested lists well
- Parses definition lists
- Prefers semantic markup
- Handles complex tables
Before & After
Before - Div-based table, invisible to AI
<div class="grid grid-cols-3">
  <div class="font-bold">Feature</div>
  <div class="font-bold">Basic</div>
  <div class="font-bold">Pro</div>
  <div>Live Chat</div>
  <div>Yes</div>
  <div>Yes</div>
</div>
After - Semantic HTML table, extractable
<table>
<caption>Plan Comparison</caption>
<thead>
<tr><th>Feature</th><th>Basic</th><th>Pro</th></tr>
</thead>
<tbody>
<tr><td>Live Chat</td><td>Yes</td><td>Yes</td></tr>
</tbody>
</table>

What Does Table and List Extractability Measure?
We're measuring the quality of your tabular and list content from a machine-parsing perspective. When AI engines encounter comparison queries ("X vs Y," "best tools for Z," "pros and cons of..."), they search for structured content they can extract and reformulate. Well-formed HTML tables and lists are the primary extraction targets.
Three content categories get evaluated. Tabular data: comparison tables, pricing tables, feature matrices, spec sheets. Ordered lists: step-by-step instructions, rankings, numbered procedures. Unordered lists: feature lists, pros/cons, bullet-pointed benefits. Each is assessed for semantic HTML correctness and extraction readiness.
For tables, we check whether the HTML uses proper <table>, <thead>, <tbody>, <th>, and <td> elements with appropriate scope attributes. Tables built from CSS Grid or Flexbox with <div> elements - a common pattern in modern web dev - are counted as "visually tabular" but marked as non-extractable. Put on Claude's glasses: it sees a pile of divs, not a table.
For lists, we verify <ol> and <ul> with <li> elements. Content presenting items in list format visually (line breaks, dashes, custom CSS) but without list HTML is flagged as non-extractable. We also check whether list items contain enough content to be meaningful - single-word items with no context provide less citation value than descriptive items with explanatory text.
The primary metric: "extractable structured content ratio" - the percentage of pages with visual tables or lists that actually use proper semantic HTML.
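The ratio can be sketched with Python's built-in HTML parser. This is a simplified illustration, not our production auditor: the bare `grid` class check stands in for the fuller Grid/Flexbox detection heuristics described below, and it assumes Tailwind-style class names.

```python
from html.parser import HTMLParser

class TableDetector(HTMLParser):
    """Counts semantic <table> elements vs. div containers styled as
    grids. The 'grid' class check is an illustrative assumption."""
    def __init__(self):
        super().__init__()
        self.semantic = 0      # real <table> elements
        self.visual_only = 0   # div grids that only look tabular

    def handle_starttag(self, tag, attrs):
        classes = dict(attrs).get("class", "")
        if tag == "table":
            self.semantic += 1
        elif tag == "div" and "grid" in classes.split():
            self.visual_only += 1

def extractable_ratio(html: str) -> float:
    """Share of visually tabular structures that use semantic HTML."""
    d = TableDetector()
    d.feed(html)
    total = d.semantic + d.visual_only
    return d.semantic / total if total else 1.0
```

A page with one div-grid "table" and one real <table> scores 0.5 on this metric; an all-semantic page scores 1.0.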
Why Are Div-Based Tables Invisible to AI?
Comparison and list queries are among the highest-value query types. When someone asks "What are the best live chat tools?" or "Compare Intercom vs Zendesk features," the AI constructs a structured response - typically a table or bulleted list. The AI's preferred source is content already in an extractable format.
Here's what ChatGPT actually sees: when your comparison table uses proper HTML table elements, AI systems parse the rows and columns, understand the relationship between headers and data cells, and restructure the information into their response with a citation. When the same comparison is built with CSS Grid divs, the AI sees a block of text without structural meaning and tries to reconstruct the tabular relationship from visual cues. That process frequently fails.
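The restructuring step is easy to demonstrate: a few lines of Python can rebuild the header-to-cell relationships from a semantic table, which is roughly what an answer engine does before reformatting your data. `TableReader` is an illustrative sketch (it treats the first row as the header row), not a model of any engine's actual parser - and note there is simply no equivalent starting point for a div-based grid.

```python
from html.parser import HTMLParser

class TableReader(HTMLParser):
    """Rebuilds header-to-cell relationships from a semantic table."""
    def __init__(self):
        super().__init__()
        self.headers, self.rows = [], []
        self._row, self._cell = None, None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("th", "td"):
            self._cell = ""

    def handle_data(self, data):
        if self._cell is not None:
            self._cell += data

    def handle_endtag(self, tag):
        if tag in ("th", "td"):
            self._row.append(self._cell.strip())
            self._cell = None
        elif tag == "tr":
            if not self.headers:
                self.headers = self._row   # first row = column labels
            else:
                self.rows.append(dict(zip(self.headers, self._row)))
            self._row = None
```

Fed the "After" table from the top of this article, it recovers `{"Feature": "Live Chat", "Basic": "Yes", "Pro": "Yes"}` - exactly the header-to-value mapping an AI needs to restate your comparison with a citation.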
The impact is measurable. Analysis of AI-generated comparison responses shows that 78% of cited sources for tabular data use proper HTML table elements. Sources using div-based layouts get cited at less than half that rate for the same content. The HTML structure isn't just a best practice - it's a competitive differentiator for AI visibility.
At the site level, consistent semantic tables and lists build a pattern AI systems recognize. A domain known for cleanly structured comparison data becomes a preferred source for comparison queries. This is especially valuable for product review sites, B2B comparison platforms, and knowledge bases where tabular content is core - exactly the kind of content we audit across the live chat vertical.
How Is Table and List Extractability Checked?
Two-pass analysis on each page. First pass identifies all visual occurrences of tabular and list content regardless of HTML implementation. Second pass evaluates whether those structures use proper semantic HTML.
First pass detection: for tables, we find <table> elements, CSS Grid containers with row/column patterns, Flexbox containers with repeating child structures, and div-based layouts with table styling. For lists: <ol>/<ul> elements, div containers with repeating single-item children, paragraphs with numbered prefixes (1., 2., 3.), and text blocks with dash or bullet prefixes.
Second pass classification: a table is "semantically correct" when it uses <table> with at least <th> header cells in the first row. Bonus points for <thead>/<tbody> separation, scope="col" or scope="row" attributes, and <caption> elements. A table built from divs - regardless of how polished it looks - is "semantically broken."
For lists, semantic correctness requires <ol> or <ul> with <li> children. Nested lists need proper nesting (<li> containing a child <ul>/<ol>). Lists built from <div> elements with CSS bullets, <br>-separated items, or paragraph-based numbering are semantically broken.
We also evaluate content quality. Tables with fewer than 2 columns or 2 rows are trivially simple. Lists with fewer than 3 items are too short. Empty or icon-only header cells are non-descriptive. The audit produces a page-by-page inventory of all structured content elements with their semantic status, quality rating, and fixes.
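A stripped-down version of the second-pass checks, using string matching as a stand-in for real DOM inspection (an assumption that only holds for lowercase, well-formed markup):

```python
import re

def classify_table(table_html: str) -> dict:
    """Second-pass classification sketch: is the table semantically
    correct, and which bonus features does it carry?"""
    h = table_html
    return {
        # <th[\s>] avoids matching <thead> when looking for header cells
        "semantic": "<table" in h and re.search(r"<th[\s>]", h) is not None,
        "thead_tbody": "<thead" in h and "<tbody" in h,
        "caption": "<caption" in h,
        "scoped_headers": 'scope="col"' in h or 'scope="row"' in h,
    }
```

The "After" example from this article passes every check except scoped headers; the div-grid "Before" example fails the semantic check outright.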
How Is Table and List Extractability Scored?
Extractability scoring evaluates semantic correctness and content quality:
1. Semantic HTML ratio for tables (3 points):
- 90%+ of visual tables use proper <table>/<th>/<td>: 3/3 points
- 70-89% semantic: 2/3 points
- 50-69% semantic: 1/3 points
- Below 50% or all tables built from divs: 0/3 points
- No tables detected: score redistributed to list metrics
2. Semantic HTML ratio for lists (3 points):
- 90%+ of visual lists use <ol>/<ul>/<li>: 3/3 points
- 70-89% semantic: 2/3 points
- 50-69% semantic: 1/3 points
- Below 50% or lists built from divs/paragraphs: 0/3 points
3. Table quality (2 points):
- Tables include <thead>/<tbody> AND <caption> or aria-label: 2/2 points
- <thead>/<tbody> without caption: 1.5/2 points
- <th> headers but no thead/tbody distinction: 1/2 points
- No header cells: 0/2 points
4. Content substance (2 points):
- Average table has 3+ columns and 4+ rows; average list has 4+ items with descriptions: 2/2 points
- Tables 2+ columns, 3+ rows; lists 3+ items: 1.5/2 points
- Mostly trivial tables (2x2) or very short lists (2 items): 1/2 points
- Too simple for extraction value: 0/2 points
Deductions:
- -1 point if more than 25% of tables have merged cells (colspan/rowspan) that break parsing
- -0.5 points if list nesting exceeds 3 levels
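The rubric translates directly into code. The sketch below implements the point bands above; the tier labels and the exact redistribution rule for table-free sites (doubling the list points) are our simplifying assumptions, and the tier inputs are assumed to be computed elsewhere in the audit.

```python
def extractability_score(table_ratio, list_ratio, quality_tier,
                         substance_tier, merged_cell_frac=0.0,
                         max_list_depth=1, tables_present=True):
    """Sketch of the 10-point extractability rubric."""
    def ratio_points(r):
        if r >= 0.90: return 3.0
        if r >= 0.70: return 2.0
        if r >= 0.50: return 1.0
        return 0.0

    quality = {"full": 2.0, "no_caption": 1.5, "th_only": 1.0, "none": 0.0}
    substance = {"rich": 2.0, "basic": 1.5, "trivial": 1.0, "none": 0.0}

    if tables_present:
        score = ratio_points(table_ratio) + ratio_points(list_ratio)
    else:
        # no tables detected: table points redistributed to list metrics
        # (doubling the list band is our assumption)
        score = 2 * ratio_points(list_ratio)

    score += quality[quality_tier] + substance[substance_tier]
    if merged_cell_frac > 0.25:
        score -= 1.0            # merged cells break parsing
    if max_list_depth > 3:
        score -= 0.5            # excessive nesting
    return max(score, 0.0)
```

A site with 95% semantic tables and lists, full table quality, and rich substance scores the maximum 10; push a quarter of its tables into heavy colspan/rowspan merging and it drops to 9.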
Sites with modern component libraries often score poorly (3-5) because they render tables as styled divs. Sites using traditional HTML or server-rendered content typically score 6-9.
Score Impact in Practice
Sites scoring 8+ on table and list extractability use semantic HTML consistently across their content. Documentation sites, B2B comparison platforms, and knowledge bases built with server-rendered HTML and native table elements score highest. A product comparison page using <table> with <thead>, <caption>, and descriptive <th> cells gives AI engines a perfectly structured data source for comparison queries. These sites get cited in AI-generated comparison tables at significantly higher rates.
Sites scoring 3-5 typically use modern frontend frameworks (React, Vue, Angular) where developers build "tables" using CSS Grid or Flexbox with <div> elements for full styling control. The rendered result looks identical to a real table in the browser, but the underlying HTML contains no table semantics. AI crawlers parsing the HTML see a flat sequence of div elements with no machine-readable column or row relationships.
The fix for most framework-built sites is straightforward: replace the div-based table component with a styled <table> element. CSS can achieve identical visual results with semantic HTML. The component change is a one-time effort that propagates to every page using that component. Sites making this switch typically jump 3-4 points because the percentage of extractable structured content shifts from near-zero to near-complete in a single deployment.
Common Mistakes
Using CSS Grid for tabular data is the most widespread extractability failure on modern sites. Developers choose Grid or Flexbox because they offer pixel-perfect responsive control, but the resulting markup - <div class="grid grid-cols-4"> with child divs - carries zero table semantics. AI engines processing this HTML cannot determine which cells are headers, which are data, or how rows relate to columns.
Tables without header cells (<th>) lose critical context. A table with <td> elements everywhere forces AI engines to guess which row or column contains labels. A pricing table where "Feature," "Basic," and "Pro" are all <td> cells looks identical to the data cells below them from a parsing perspective. Using <th scope="col"> for column headers and <th scope="row"> for row headers gives AI engines the structural context they need.
Image-based tables are completely invisible to AI. Screenshots of spreadsheets, pricing tables rendered as PNG images, or comparison charts embedded as SVG without text elements provide zero extractable data. AI crawlers that cannot execute OCR - which includes most retrieval systems - see an image tag with alt text at best and nothing at worst.
Excessive merged cells (colspan and rowspan) make tables harder to parse reliably. While HTML supports cell merging, AI extraction systems frequently misalign data when rows span multiple columns or cells span multiple rows. Keep table structures simple and regular - one data point per cell, consistent column counts across rows.
Lists built from line breaks instead of list elements are another common failure. Content formatted as "Feature 1\nFeature 2\nFeature 3" using <br> tags or paragraph elements is not a list from an HTML perspective. AI engines cannot reliably extract this as a structured list of items.
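This last fix is mechanical enough to automate. A hypothetical cleanup helper - a sketch, not a production sanitizer - that rewrites <br>-separated items into a real list:

```python
import re

def brs_to_ul(fragment: str) -> str:
    """Rewrite '<br>'-separated lines into a semantic <ul>.
    Assumes a plain-text fragment with <br>/<br/> separators."""
    items = [i.strip() for i in re.split(r"<br\s*/?>", fragment) if i.strip()]
    return "<ul>" + "".join(f"<li>{i}</li>" for i in items) + "</ul>"
```

The output is the same content the visitor already saw, but now each item sits in an <li> that AI engines can extract.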
How AI Engines Evaluate This
ChatGPT actively restructures HTML table data when constructing comparison responses. It parses <thead> to identify column labels, iterates through <tbody> rows to extract per-item data, and reassembles this into its own formatted comparison. Pages with proper table HTML contribute directly to these structured responses. Pages with div-based tables are processed as unstructured text, requiring ChatGPT to infer tabular relationships - a process that frequently produces incomplete or inaccurate comparisons.
Perplexity displays tabular data from sources in formatted tables within its answers. When Perplexity retrieves a page containing a semantic HTML table matching the user's comparison query, it can extract and display the table data with a direct citation. This is one of the most visually prominent citation types in Perplexity's interface. Sources with proper table HTML have a significant advantage for any query type that expects a tabular answer.
Claude's retrieval system evaluates list and table semantics as part of content structure quality. Pages with proper ordered and unordered lists using <ol>, <ul>, and <li> elements are easier to parse for step-by-step instructions, feature comparisons, and enumerated recommendations. This structured parsing capability directly affects how accurately Claude can extract and cite procedural or comparative content from your pages.
Google AI Overviews frequently feature bulleted lists and comparison tables extracted from source pages. Pages with semantic list and table HTML are the primary candidates for these featured displays. Sites without semantic markup for structured content are effectively excluded from the most visible positions in AI Overview results for comparison and list-format queries.
Key Takeaways
- Use native HTML <table>, <thead>, <th>, and <td> elements - not CSS Grid or Flexbox divs for tabular data.
- Use <ol> and <ul> with <li> for lists - divs with CSS bullets are invisible as structured content to AI.
- Add <caption> or aria-label to tables and use scope attributes on header cells for better parsing.
- Keep tables substantive (3+ columns, 4+ rows) - trivially small tables provide minimal citation value.
How does your site score on this criterion?
Get a free AEO audit and see where you stand across every criterion in the framework.