AEO Scoring Criteria Criterion #413

Document Weight: How Heavy Is Your Page for AI to Lift?

A 100KB HTML page with 600 DOM nodes loads fast for AI crawlers. A 500KB page with 2,000 DOM nodes and 100KB of inline CSS/JS is a burden. Document Weight measures the overall heft of your HTML document - total size, DOM complexity, inline code, and embedded payloads - to determine how efficiently AI can process it.

One of 53 criteria in AEO Rank, the citation-readiness score we run against every site we audit.

By Alex Shortov

Effort: medium. Impact: low.

Quick Answer

Keep total HTML under 100KB (perfect) or 250KB (good). Keep DOM nodes under 600. Keep combined inline CSS and JavaScript under 20KB. Eliminate inline payloads (style/script blocks) over 25KB. This criterion (1% weight, Technical Foundation pillar) measures four dimensions of page weight that affect how quickly and completely AI crawlers can process your content.

Audit Note

In our audits, we've measured Document Weight on live sites, compared implementations, and audited the gaps that keep scores low.

What is Document Weight and how does it differ from page load speed?

Document Weight measures total HTML size, DOM nodes, inline CSS and JS, and large blob count, while page speed only tracks load time and rendering.

How many DOM nodes is too many for AI crawlers?

Keep DOM nodes under 600 for a full score and under 1,200 for partial credit; anything above 1,200 nodes signals bloat that AI crawlers struggle to parse.

What counts as a "large blob" in the Document Weight scorer?

A large blob is any single inline style or script block over 25 kilobytes, and even one blob costs you a point on the four-part Document Weight score.


Document Weight Scoring (four sub-scores combined, 0-10)

  • +4: HTML of 100KB or less. Lean document, optimal for AI processing.
  • +2: 600 DOM nodes or fewer. Simple structure, easy to parse.
  • +2: 20KB or less of inline CSS/JS. Minimal embedded code.
  • +2: 0 large blobs. No oversized inline payloads (blocks of 25KB or more).

Key takeaways

  • Total HTML size: under 100KB scores 4/4, under 250KB scores 3/4, under 500KB scores 2/4, under 800KB scores 1/4. Over 800KB scores 0.
  • DOM nodes: under 600 scores 2/2, under 1,200 scores 1/2. Over 1,200 scores 0.
  • Inline CSS + JS combined: under 20KB scores 2/2, under 80KB scores 1/2. Over 80KB scores 0.
  • Large blob penalty: any single inline style or script block over 25KB counts as a large blob. Zero blobs = 2/2, one blob = 1/2, two or more = 0/2.
  • JSON-LD script blocks are excluded from the inline JS and blob calculations - structured data does not penalize you.

What Is Document Weight?

Document Weight measures four dimensions of how heavy your HTML document is for AI crawlers to process:

  1. Total HTML size: The raw byte count of the entire HTML response. This overlaps with Response Efficiency but contributes differently to the sub-score.
  2. DOM node count: The number of HTML elements in the document. More nodes means more parsing work for crawlers.
  3. Inline CSS + JS size: The combined byte count of all inline style and script blocks (excluding JSON-LD).
  4. Large blob count: The number of individual inline style or script blocks that exceed 25KB.

The distinction from Response Efficiency is granularity. Response Efficiency looks at the total payload size. Document Weight breaks down where that weight comes from. A 300KB page could have 300KB of content (fine) or 30KB of content with 270KB of inline CSS/JS (bad). Document Weight catches the second case by penalizing inline bloat and DOM complexity separately.

Document weight measures how much HTML an AI crawler must parse to reach your actual content.

Page Weight | Crawler Behavior | AEO Rank Impact
Under 100 KB | Full parse, every page | Top score
100-300 KB | Reliable parse | Strong score
300-700 KB | Partial parse, may skip embedded content | Moderate score
Over 700 KB | Frequent timeouts | Severe penalty

How Does the Scorer Work?

The scorer assigns sub-scores for total HTML (0-4 points), DOM node count (0-2 points), combined inline CSS plus JS (0-2 points), and large blobs (0-2 points).

The scorer calculates four sub-scores:

Total HTML (0-4 points):

  • Under 100KB: 4 points
  • 100-250KB: 3 points
  • 250-500KB: 2 points
  • 500-800KB: 1 point
  • Over 800KB: 0 points

DOM nodes (0-2 points): The scorer counts opening HTML tags using a regex pattern. This approximates DOM complexity.

  • Under 600 nodes: 2 points
  • 600-1200 nodes: 1 point
  • Over 1200 nodes: 0 points
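
As a rough illustration of the tag-counting approach described above, here is a minimal Python sketch. The exact regex the scorer uses is not published in this article, so the pattern `OPENING_TAG` and the helper name `count_opening_tags` are assumptions chosen for the example.

```python
import re

# Assumed pattern: "<" followed by a letter starts an opening tag.
# Closing tags ("</p>"), comments ("<!--") and doctypes ("<!DOCTYPE") do not match.
OPENING_TAG = re.compile(r"<[a-zA-Z][^>]*>")


def count_opening_tags(html: str) -> int:
    """Approximate the DOM node count by counting opening tags."""
    return len(OPENING_TAG.findall(html))


nested = (
    '<div class="wrapper"><div class="container">'
    '<div class="inner"><p>Actual content</p></div></div></div>'
)
flat = '<p class="content">Actual content</p>'

print(count_opening_tags(nested))  # 4
print(count_opening_tags(flat))    # 1
```

A regex count like this is only an approximation: tag-like text inside comments or script strings is counted as well, which is one reason the result may differ from a real DOM parse.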

Inline CSS + JS (0-2 points): The scorer extracts all <style> blocks and all <script> blocks (excluding those with type="application/ld+json") and sums their byte counts.

  • Under 20KB combined: 2 points
  • 20-80KB combined: 1 point
  • Over 80KB combined: 0 points

Large blobs (0-2 points): Any single inline style or script block over 25KB is a “large blob.” These are individual payloads that significantly inflate the document.

  • 0 large blobs: 2 points
  • 1 large blob: 1 point
  • 2+ large blobs: 0 points
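
The two inline measurements above can be approximated with a short Python sketch. This is not the scorer's actual code: the regexes, the function names, the choice to measure only the block body (not the tags), and the 1KB = 1024 bytes convention are all assumptions made for illustration.

```python
import re

LARGE_BLOB_BYTES = 25 * 1024  # assumption: 1KB = 1024 bytes

STYLE_BLOCK = re.compile(r"<style\b[^>]*>(.*?)</style\s*>", re.IGNORECASE | re.DOTALL)
SCRIPT_BLOCK = re.compile(r"<script\b([^>]*)>(.*?)</script\s*>", re.IGNORECASE | re.DOTALL)


def inline_block_sizes(html: str) -> list[int]:
    """Byte size of each inline <style>/<script> body, skipping JSON-LD."""
    sizes = [len(body.encode("utf-8")) for body in STYLE_BLOCK.findall(html)]
    for attrs, body in SCRIPT_BLOCK.findall(html):
        if "application/ld+json" in attrs.lower():
            continue  # structured data is excluded
        if not body.strip():
            continue  # external script (src=...) with no inline body
        sizes.append(len(body.encode("utf-8")))
    return sizes


def inline_weight(html: str) -> tuple[int, int]:
    """Return (combined inline bytes, number of blocks over 25KB)."""
    sizes = inline_block_sizes(html)
    return sum(sizes), sum(1 for size in sizes if size > LARGE_BLOB_BYTES)


page = (
    "<style>" + "a" * 30000 + "</style>"
    '<script type="application/ld+json">{"@type": "Article"}</script>'
    '<script src="/app.js" defer></script>'
    "<script>" + "b" * 50 + "</script>"
)
print(inline_weight(page))  # (30050, 1)
```

In this sample the JSON-LD block and the external script contribute nothing, while the 30,000-byte style block counts both toward the combined total and as one large blob.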

Maximum score: 10 (4+2+2+2).
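
The tiers above can be combined into a single function. The sketch below takes the four measurements as inputs and returns the 0-10 score. Function names are hypothetical, and two details are assumptions because the article states them both ways or not at all: boundaries are treated as inclusive (a page of exactly 100KB gets 4 points), and 1KB is taken as 1024 bytes.

```python
KB = 1024  # assumption: the article does not say whether 1KB is 1000 or 1024 bytes


def tier(value, thresholds):
    """Return the points of the first (upper_bound, points) tier that value fits under."""
    for upper_bound, points in thresholds:
        if value <= upper_bound:
            return points
    return 0


def document_weight_score(total_bytes, dom_nodes, inline_bytes, large_blobs):
    """Sum the four sub-scores: HTML size (0-4), DOM (0-2), inline (0-2), blobs (0-2)."""
    html_pts = tier(total_bytes, [(100 * KB, 4), (250 * KB, 3), (500 * KB, 2), (800 * KB, 1)])
    dom_pts = tier(dom_nodes, [(600, 2), (1200, 1)])
    inline_pts = tier(inline_bytes, [(20 * KB, 2), (80 * KB, 1)])
    blob_pts = tier(large_blobs, [(0, 2), (1, 1)])
    return html_pts + dom_pts + inline_pts + blob_pts


# A lean page: 80KB of HTML, 500 nodes, 10KB inline, no large blobs.
print(document_weight_score(80 * KB, 500, 10 * KB, 0))   # 10

# A 300KB page carrying 270KB of inline CSS/JS in two large blocks (500 nodes assumed).
print(document_weight_score(300 * KB, 500, 270 * KB, 2))  # 4
```

The second call shows how inline bloat is penalized separately from total size: the page keeps 2 points for HTML size and 2 for DOM nodes but loses all 4 points available for inline code and large blobs.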

How Do You Reduce Document Weight?

Externalize CSS and JavaScript, flatten unnecessary wrapper divs, paginate long content (50 or more items), audit framework overhead that injects inline state into your HTML, and audit third-party embeds.

Step 1: Externalize CSS and JavaScript

This is the single most impactful fix for Document Weight. Move all inline styles and scripts to external files. This improves three of the four sub-scores at once: total HTML size, inline CSS/JS, and large blob count.

<!-- Heavy: 80KB inline -->
<style>/* 45KB Tailwind utilities */</style>
<script>/* 35KB app initialization */</script>

<!-- Light: 0KB inline, loaded externally -->
<link rel="stylesheet" href="/styles.css">
<script src="/app.js" defer></script>

Step 2: Reduce DOM complexity

Pages with deeply nested component structures (common in React/Vue apps) generate excessive DOM nodes. Flatten unnecessary wrapper divs:

<!-- Unnecessary nesting: 4 nodes for one piece of text -->
<div class="wrapper">
  <div class="container">
    <div class="inner">
      <p>Actual content</p>
    </div>
  </div>
</div>

<!-- Flat: 1 node -->
<p class="content">Actual content</p>
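
To find candidates for flattening, one option is to look for divs that contain exactly one child element and no text of their own. The Python sketch below does this with the standard library's `html.parser`. The class name `WrapperFinder` and the "one child, no text" heuristic are assumptions for illustration, and the code assumes reasonably well-formed HTML with matching end tags.

```python
from html.parser import HTMLParser

# Elements that never have an end tag, so they must not be pushed on the stack.
VOID = {"area", "base", "br", "col", "embed", "hr", "img", "input",
        "link", "meta", "source", "track", "wbr"}


class WrapperFinder(HTMLParser):
    """Count divs whose only content is a single child element."""

    def __init__(self):
        super().__init__()
        self.stack = []    # entries: [tag, child_element_count, has_text]
        self.wrappers = 0

    def handle_starttag(self, tag, attrs):
        if self.stack:
            self.stack[-1][1] += 1  # the parent gains one child element
        if tag not in VOID:
            self.stack.append([tag, 0, False])

    def handle_endtag(self, tag):
        if tag in VOID or not self.stack:
            return
        name, children, has_text = self.stack.pop()
        if name == "div" and children == 1 and not has_text:
            self.wrappers += 1

    def handle_data(self, data):
        if self.stack and data.strip():
            self.stack[-1][2] = True  # the current element holds real text


nested = """
<div class="wrapper">
  <div class="container">
    <div class="inner">
      <p>Actual content</p>
    </div>
  </div>
</div>
"""
finder = WrapperFinder()
finder.feed(nested)
finder.close()
print(finder.wrappers)  # 3
```

Each reported div is only a candidate: a wrapper may still be needed for layout or styling, so review the list before removing anything.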

Step 3: Paginate long content

Pages with 50+ FAQ items, 100+ product listings, or very long articles generate thousands of DOM nodes. Consider paginating, using “load more” patterns, or splitting into multiple pages.
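
As a minimal sketch of the pagination idea, the Python helper below splits a long list of items into fixed-size pages. The function name `paginate` and the page size of 25 are assumptions for the example; choose a size that keeps each rendered page within the DOM node targets.

```python
def paginate(items, page_size):
    """Split items into consecutive pages of at most page_size entries."""
    if page_size < 1:
        raise ValueError("page_size must be at least 1")
    return [items[start:start + page_size] for start in range(0, len(items), page_size)]


faq_items = [f"FAQ item {n}" for n in range(1, 121)]  # 120 hypothetical FAQ entries
pages = paginate(faq_items, 25)

print(len(pages))      # 5
print(len(pages[-1]))  # 20
```

Each page then renders at most 25 items, so the per-page DOM count stays bounded however long the full list grows.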

Step 4: Audit framework overhead

Server-side rendered frameworks often inject large state blobs. Check for __NEXT_DATA__, __NUXT__, or similar patterns. Reduce the data passed through server props to the minimum needed for initial render.
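
One way to check for these state blobs is to scan the script blocks of the rendered HTML for the marker names and report their sizes. The Python sketch below looks only for the two markers named above. The function name `find_state_blobs` and the regex-based extraction are assumptions for illustration, not part of any framework's tooling.

```python
import re

SCRIPT_BLOCK = re.compile(r"<script\b([^>]*)>(.*?)</script\s*>", re.IGNORECASE | re.DOTALL)
STATE_MARKERS = ("__NEXT_DATA__", "__NUXT__")  # markers named in the text above


def find_state_blobs(html: str) -> list[tuple[str, int]]:
    """Return (marker, body size in bytes) for each script carrying framework state."""
    found = []
    for attrs, body in SCRIPT_BLOCK.findall(html):
        for marker in STATE_MARKERS:
            # The marker may appear as an id attribute or as a global assignment.
            if marker in attrs or marker in body:
                found.append((marker, len(body.encode("utf-8"))))
                break
    return found


payload = '{"props": {"items": [' + ", ".join(["1"] * 1000) + "]}}"
page = (
    '<script id="__NEXT_DATA__" type="application/json">' + payload + "</script>"
    '<script src="/app.js" defer></script>'
)
for marker, size in find_state_blobs(page):
    print(f"{marker}: {size} bytes")
```

Running this against a few representative pages before and after trimming server props gives a quick measure of how much state is being inlined.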

Step 5: Audit third-party embeds

Chat widgets, analytics dashboards, and social media embeds often inject large inline scripts. Move these to external loading patterns or lazy-load them after the main content renders.

Score Impact in Practice

Document Weight carries 1% weight and belongs to the Page Speed trio overlap group. SPA frameworks that inline state and CSS-in-JS libraries often drop sites into the 4-6 range.

Document Weight carries 1% weight in the Technical Foundation pillar. Combined with Response Efficiency (1%) and Critical Path Efficiency (1%), the Page Speed trio contributes 3% to the total score. These three criteria share an overlap group, meaning the fix plan consolidates their recommendations.

Most content-focused sites with external CSS/JS score 7-10/10. The failures concentrate in:

  • SPA frameworks that inline large state objects: these inject 50-200KB of JSON into the HTML
  • CSS-in-JS libraries that embed framework-generated styles: can add 40-100KB of inline CSS
  • E-commerce pages with hundreds of product cards: DOM counts of 2000+ are common
  • Long-form pages with embedded media and complex layouts: DOM complexity scales with content length

The most efficient fix path is externalizing inline CSS and JS. A single change (moving styles to an external file) can improve three of the four sub-scores: total HTML drops, inline CSS/JS drops, and large blob count drops.

How AI Engines Evaluate This

GPTBot strips markup before extracting text, ClaudeBot weighs content-to-markup ratio as a quality signal, and PerplexityBot’s tight crawl budget punishes heavy DOM and inline payloads.

AI crawlers parse HTML to extract content. The content-to-markup ratio determines how efficiently a crawler can find the useful parts of your page. A page where 90% of the HTML is actual content parses faster and more completely than a page where 90% is framework overhead.
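
To get a rough feel for a page's content-to-markup ratio, you can compare the bytes of visible text with the bytes of the whole document. The Python sketch below is one possible definition, not the method any particular crawler uses: the names `TextExtractor` and `content_ratio`, and the choice to ignore script and style bodies and collapse whitespace, are assumptions.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect text outside <script> and <style> elements."""

    def __init__(self):
        super().__init__()
        self.skip_depth = 0
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        if not self.skip_depth:
            self.parts.append(data)


def content_ratio(html: str) -> float:
    """Visible text bytes divided by total HTML bytes (0.0 for an empty document)."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    text = " ".join("".join(parser.parts).split())  # collapse whitespace runs
    total = len(html.encode("utf-8"))
    return len(text.encode("utf-8")) / total if total else 0.0


light = "<p>" + "word " * 100 + "</p>"
heavy = "<style>" + "a" * 5000 + "</style><p>hi</p>"

print(round(content_ratio(light), 2))
print(round(content_ratio(heavy), 4))
```

The first sample is almost entirely text, while the second is dominated by an inline style block, so its ratio falls close to zero.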

GPTBot processes HTML by stripping markup and extracting text content. Pages with high DOM complexity and large inline payloads require more stripping work. GPTBot handles this efficiently for individual pages, but the overhead compounds when it crawls thousands of pages on a site.

ClaudeBot evaluates content density as a quality signal. Pages with a high content-to-markup ratio indicate content-first architecture, which correlates with higher content quality. Pages where the actual text content is a small fraction of the total HTML suggest framework-first architecture with content as an afterthought.

Perplexity’s real-time processing is most sensitive to document weight because it processes multiple pages concurrently under time constraints. Lighter documents get processed more completely, meaning Perplexity is more likely to find and cite the specific passage that answers the user’s query.

Google’s crawling infrastructure handles document weight efficiently but still applies crawl budget optimization. Lighter pages consume less crawl budget, allowing Google to crawl more pages on your site and keep its index fresher.
