Document Weight: How Heavy Is Your Page for AI to Lift?
A 100KB HTML page with 600 DOM nodes loads fast for AI crawlers. A 500KB page with 2,000 DOM nodes and 100KB of inline CSS/JS is a burden. Document Weight measures the overall heft of your HTML document - total size, DOM complexity, inline code, and embedded payloads - to determine how efficiently AI can process it.
Part of the AEO scoring framework - the current 48 criteria that measure how ready a website is for AI-driven search across ChatGPT, Claude, Perplexity, and Google AIO.
Quick Answer
Keep total HTML under 100KB (perfect) or 250KB (good). Keep DOM nodes under 600. Keep combined inline CSS and JavaScript under 20KB. Eliminate inline payloads (style/script blocks) over 25KB. This criterion (1% weight, Technical Foundation pillar) measures four dimensions of page weight that affect how quickly and completely AI crawlers can process your content.
Four sub-scores combined (0-10)
Before & After
Before - Heavy document (680KB, 1800 nodes)
<html>
<!-- 1,800 DOM nodes -->
<style>/* 45KB inline Tailwind */</style>
<style>/* 35KB inline component styles */</style>
<script>/* 90KB inline Next.js state */</script>
<!-- Total: 680KB HTML -->
After - Lean document (95KB, 550 nodes)
<html>
<!-- 550 DOM nodes -->
<link rel="stylesheet" href="/styles.css">
<script src="/app.js" defer></script>
<script type="application/ld+json">
{"@type":"Organization",...}
</script>
<!-- Total: 95KB HTML -->
What Is Document Weight?
Document Weight measures four dimensions of how heavy your HTML document is for AI crawlers to process:
1. Total HTML size: The raw byte count of the entire HTML response. This overlaps with Response Efficiency but contributes differently to the sub-score.
2. DOM node count: The number of HTML elements in the document. More nodes means more parsing work for crawlers.
3. Inline CSS + JS size: The combined byte count of all inline style and script blocks (excluding JSON-LD).
4. Large blob count: The number of individual inline style or script blocks that exceed 25KB.
The distinction from Response Efficiency is granularity. Response Efficiency looks at the total payload size. Document Weight breaks down where that weight comes from. A 300KB page could have 300KB of content (fine) or 30KB of content with 270KB of inline CSS/JS (bad). Document Weight catches the second case by penalizing inline bloat and DOM complexity separately.
How Does the Scorer Work?
The scorer calculates four sub-scores:
Total HTML (0-4 points):
- Under 100KB: 4 points
- 100-250KB: 3 points
- 250-500KB: 2 points
- 500-800KB: 1 point
- Over 800KB: 0 points
DOM nodes (0-2 points): The scorer counts opening HTML tags using a regex pattern. This approximates DOM complexity.
- Under 600 nodes: 2 points
- 600-1200 nodes: 1 point
- Over 1200 nodes: 0 points
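A minimal Python sketch of this tag-counting approach follows. The source does not publish the scorer's actual regex, so the pattern and the names `OPENING_TAG`, `count_opening_tags`, and `dom_points` are illustrative assumptions, as is the handling of the exact boundary values (600 and 1200).

```python
import re

# Assumed pattern: "<" followed by a tag name. Closing tags ("</div>"),
# comment openers ("<!--"), and the doctype do not match because the
# character after "<" must be a letter.
OPENING_TAG = re.compile(r"<[a-zA-Z][a-zA-Z0-9-]*(?=[\s/>])")


def count_opening_tags(html: str) -> int:
    """Approximate the DOM node count by counting opening tags."""
    return len(OPENING_TAG.findall(html))


def dom_points(node_count: int) -> int:
    """Map a node count to the 0-2 DOM sub-score (boundary handling assumed)."""
    if node_count < 600:
        return 2
    if node_count <= 1200:
        return 1
    return 0


sample = "<!DOCTYPE html><html><body><!-- note --><div><p>Hi</p><br/></div></body></html>"
nodes = count_opening_tags(sample)  # html, body, div, p, br
print(nodes, dom_points(nodes))
```

A regex over raw text is only an approximation: tag-like strings inside comments or inline scripts would also be counted, which is one reason the result can differ from a browser's real DOM node count.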
Inline CSS + JS (0-2 points):
The scorer extracts all <style> blocks and all <script> blocks (excluding those with type="application/ld+json") and sums their byte counts.
- Under 20KB combined: 2 points
- 20-80KB combined: 1 point
- Over 80KB combined: 0 points
Large blobs (0-2 points): Any single inline style or script block over 25KB is a "large blob." These are individual payloads that significantly inflate the document.
- 0 large blobs: 2 points
- 1 large blob: 1 point
- 2+ large blobs: 0 points
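The inline and large-blob measurements can be sketched in Python as below. This is an illustration under stated assumptions, not the scorer's implementation: the regexes, the name `measure_inline`, and the choice of 1KB = 1024 bytes are all hypothetical.

```python
import re

# Assumed patterns; the scorer's real extraction logic is not published.
STYLE_BLOCK = re.compile(r"<style\b[^>]*>(.*?)</style>", re.IGNORECASE | re.DOTALL)
SCRIPT_BLOCK = re.compile(r"<script\b([^>]*)>(.*?)</script>", re.IGNORECASE | re.DOTALL)
JSON_LD = re.compile(r"""type\s*=\s*["']?application/ld\+json""", re.IGNORECASE)

LARGE_BLOB_BYTES = 25 * 1024  # assumes 1KB = 1024 bytes


def inline_block_sizes(html: str) -> list:
    """Return the byte size of each inline style/script block, skipping JSON-LD."""
    sizes = [len(m.group(1).encode("utf-8")) for m in STYLE_BLOCK.finditer(html)]
    for m in SCRIPT_BLOCK.finditer(html):
        attrs, body = m.group(1), m.group(2)
        if JSON_LD.search(attrs):
            continue  # structured data is excluded from the calculation
        if body.strip():  # external scripts (src=...) have an empty body
            sizes.append(len(body.encode("utf-8")))
    return sizes


def measure_inline(html: str) -> tuple:
    """Return (combined inline bytes, number of blocks over 25KB)."""
    sizes = inline_block_sizes(html)
    return sum(sizes), sum(1 for size in sizes if size > LARGE_BLOB_BYTES)


page = (
    "<style>" + "a" * 30000 + "</style>"
    '<script type="application/ld+json">{"@type":"Organization"}</script>'
    '<script src="/app.js" defer></script>'
    "<script>" + "b" * 1000 + "</script>"
)
print(measure_inline(page))  # (31000, 1)
```

In this sample the 30,000-byte style block is the only large blob, and neither the JSON-LD block nor the external script adds to the inline total.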
Maximum score: 10 (4+2+2+2).
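As a worked example, the four thresholds can be combined into one function. This is a sketch only: `PageMeasurements` and the helper names are hypothetical, 1KB is assumed to be 1024 bytes, and the treatment of values that land exactly on a boundary is an assumption. The two sample pages use the figures from the Before & After example.

```python
from dataclasses import dataclass

KB = 1024  # assumption: 1KB = 1024 bytes


@dataclass
class PageMeasurements:
    html_bytes: int    # total HTML response size
    dom_nodes: int     # approximate count of opening tags
    inline_bytes: int  # combined inline CSS + JS, JSON-LD excluded
    large_blobs: int   # inline blocks over 25KB


def html_points(size: int) -> int:
    if size < 100 * KB:
        return 4
    if size <= 250 * KB:
        return 3
    if size <= 500 * KB:
        return 2
    if size <= 800 * KB:
        return 1
    return 0


def dom_points(nodes: int) -> int:
    return 2 if nodes < 600 else 1 if nodes <= 1200 else 0


def inline_points(size: int) -> int:
    return 2 if size < 20 * KB else 1 if size <= 80 * KB else 0


def blob_points(count: int) -> int:
    return 2 if count == 0 else 1 if count == 1 else 0


def document_weight_score(m: PageMeasurements) -> int:
    """Sum the four sub-scores for a total between 0 and 10."""
    return (
        html_points(m.html_bytes)
        + dom_points(m.dom_nodes)
        + inline_points(m.inline_bytes)
        + blob_points(m.large_blobs)
    )


# "Before": 680KB, 1,800 nodes, 45 + 35 + 90 = 170KB inline in three blocks over 25KB.
before = PageMeasurements(680 * KB, 1800, 170 * KB, 3)
# "After": 95KB, 550 nodes, no inline CSS/JS (the JSON-LD block is excluded).
after = PageMeasurements(95 * KB, 550, 0, 0)

print(document_weight_score(before))  # 1
print(document_weight_score(after))   # 10
```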
How Do You Reduce Document Weight?
Step 1: Externalize CSS and JavaScript
This is the single most impactful fix for Document Weight. Move all inline styles and scripts to external files. This improves three sub-scores at once: total HTML size, inline CSS/JS, and large blob count.
```html
<!-- Heavy: 80KB inline -->
<style>/* 45KB Tailwind utilities */</style>
<script>/* 35KB app initialization */</script>

<!-- Light: 0KB inline, loaded externally -->
<link rel="stylesheet" href="/styles.css">
<script src="/app.js" defer></script>
```
Step 2: Reduce DOM complexity
Pages with deeply nested component structures (common in React/Vue apps) generate excessive DOM nodes. Flatten unnecessary wrapper divs:
```html
<!-- Unnecessary nesting: 4 nodes for one text -->
<div class="wrapper">
  <div class="container">
    <div class="inner">
      <p>Actual content</p>
    </div>
  </div>
</div>

<!-- Flat: 1 node -->
<p class="content">Actual content</p>
```
Step 3: Paginate long content
Pages with 50+ FAQ items, 100+ product listings, or very long articles generate thousands of DOM nodes. Consider paginating, using "load more" patterns, or splitting into multiple pages.
Step 4: Audit framework overhead
Server-side rendered frameworks often inject large state blobs. Check for __NEXT_DATA__, __NUXT__, or similar patterns. Reduce the data passed through server props to the minimum needed for initial render.
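One way to check for these state blobs is to scan the inline script blocks for the marker names and report each block's size. The Python sketch below is a hypothetical helper (`find_state_blobs` is not part of any framework or of the scorer) and assumes the markers appear either in the script tag's attributes or in its body.

```python
import re

# Marker names taken from the text above; extend the tuple for other frameworks.
MARKERS = ("__NEXT_DATA__", "__NUXT__")
SCRIPT_BLOCK = re.compile(r"<script\b([^>]*)>(.*?)</script>", re.IGNORECASE | re.DOTALL)


def find_state_blobs(html: str) -> list:
    """Return (marker, body size in bytes) for each inline script holding framework state."""
    found = []
    for match in SCRIPT_BLOCK.finditer(html):
        attrs, body = match.group(1), match.group(2)
        for marker in MARKERS:
            if marker in attrs or marker in body:
                found.append((marker, len(body.encode("utf-8"))))
                break
    return found


page = (
    '<script id="__NEXT_DATA__" type="application/json">{"props":{}}</script>'
    "<script>window.__NUXT__={}</script>"
    '<script src="/app.js" defer></script>'
)
for marker, size in find_state_blobs(page):
    print(f"{marker}: {size} bytes")
```

Running a check like this before and after trimming server props gives a direct measure of how much the state payload shrank.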
Step 5: Audit third-party embeds
Chat widgets, analytics dashboards, and social media embeds often inject large inline scripts. Move these to external loading patterns or lazy-load them after the main content renders.
Score Impact in Practice
Document Weight carries 1% weight in the Technical Foundation pillar. Combined with Response Efficiency (1%) and Critical Path Efficiency (1%), the Page Speed trio contributes 3% to the total score. These three criteria share an overlap group, meaning the fix plan consolidates their recommendations.
Most content-focused sites with external CSS/JS score 7-10/10. The failures concentrate in:
- SPA frameworks that inline large state objects: these inject 50-200KB of JSON into the HTML
- CSS-in-JS libraries that embed framework-generated styles: can add 40-100KB of inline CSS
- E-commerce pages with hundreds of product cards: DOM counts of 2000+ are common
- Long-form pages with embedded media and complex layouts: DOM complexity scales with content length
The most efficient fix path is externalizing inline CSS and JS. A single change (moving styles to an external file) can improve three of the four sub-scores: total HTML drops, inline CSS/JS drops, and large blob count drops.
How AI Engines Evaluate This
AI crawlers parse HTML to extract content. The content-to-markup ratio determines how efficiently a crawler can find the useful parts of your page. A page where 90% of the HTML is actual content parses faster and more completely than a page where 90% is framework overhead.
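To make the content-to-markup ratio concrete, the Python sketch below divides visible text bytes by total HTML bytes. The definition is an assumption made for illustration: it is not a documented metric of any crawler or of the scorer, and `TextExtractor` and `content_ratio` are hypothetical names.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect text outside <script> and <style> blocks."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def content_ratio(html: str) -> float:
    """Visible text bytes divided by total HTML bytes (assumed definition)."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    text = " ".join("".join(parser.chunks).split())  # collapse whitespace
    total = len(html.encode("utf-8"))
    return len(text.encode("utf-8")) / total if total else 0.0


lean = "<p>Hello</p>"
heavy = "<p>Hello</p><style>" + "p{}" * 100 + "</style>"
print(round(content_ratio(lean), 2), round(content_ratio(heavy), 2))
```

The same five bytes of text make up a much smaller share of the heavy page, which is the effect the paragraph above describes.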
GPTBot processes HTML by stripping markup and extracting text content. Pages with high DOM complexity and large inline payloads require more stripping work. GPTBot handles this efficiently for an individual page, but the overhead compounds when it crawls thousands of pages on a site.
ClaudeBot evaluates content density as a quality signal. Pages with a high content-to-markup ratio indicate content-first architecture, which correlates with higher content quality. Pages where the actual text content is a small fraction of the total HTML suggest framework-first architecture with content as an afterthought.
Perplexity's real-time processing is most sensitive to document weight because it processes multiple pages concurrently under time constraints. Lighter documents get processed more completely, meaning Perplexity is more likely to find and cite the specific passage that answers the user's query.
Google's crawling infrastructure handles document weight efficiently but still applies crawl budget optimization. Lighter pages consume less crawl budget, allowing Google to crawl more pages on your site and keep its index fresher.
Key Takeaways
- Total HTML size: under 100KB scores 4/4, under 250KB scores 3/4, under 500KB scores 2/4, under 800KB scores 1/4.
- DOM nodes: under 600 scores 2/2, under 1200 scores 1/2. Over 1200 scores 0.
- Inline CSS + JS combined: under 20KB scores 2/2, under 80KB scores 1/2. Over 80KB scores 0.
- Large blob penalty: any single inline style or script block over 25KB counts as a large blob. Zero blobs = 2/2, one blob = 1/2, two or more = 0/2.
- JSON-LD script blocks are excluded from the inline JS and blob calculations - structured data does not penalize you.