Speakable Schema Markup
Voice assistants need to pick which paragraph to read aloud from your page. Without Speakable markup, they guess, and they frequently guess wrong, sometimes reading out your cookie notice.
Questions this article answers
- What is Speakable schema and how does it help with voice search?
- How do I mark content for voice assistants like Google Assistant to read aloud?
- Is Speakable markup worth implementing for AI visibility in 2026?
Quick Answer
Speakable schema identifies content sections optimized for audio delivery by voice assistants. Adding Speakable markup to key paragraphs tells Google Assistant and other voice AI exactly which parts to read aloud. Fewer than 2% of publishers have this, which is your early-mover window.
Before & After
Before - No Speakable markup
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Article",
  "headline": "What Is AEO?",
  "author": "Alex Shortov"
}
</script>
<!-- Voice assistant guesses which paragraph to read -->
After - Speakable targets the quick answer
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Article",
  "headline": "What Is AEO?",
  "speakable": {
    "@type": "SpeakableSpecification",
    "cssSelector": [".quick-answer", ".article-lead"]
  }
}
</script>
What This Actually Measures
We're evaluating whether your content includes Schema.org Speakable markup that tells voice assistants and text-to-speech systems which sections are suitable for audio playback. The Speakable spec lets publishers mark specific parts of a page (concise summaries, key definitions, direct answers) as the recommended sections for voice delivery.
Three aspects get measured: presence (do any pages include Speakable?), coverage (what percentage of voice-query-relevant pages include it?), and quality (do the speakable sections actually work as spoken content: concise, self-contained, free of visual-only references like "see the chart below"?).
The Speakable property is applied to Article, WebPage, or BlogPosting schema using CSS selectors or XPath expressions pointing to specific DOM elements. We validate that these selectors resolve to existing elements, that the targeted content is substantive (at least 20 words), and that sections don't contain tables, code blocks, or image references, all of which are confusing when read aloud.
This is a conditional criterion: it is only scored for sites with voice query relevance. We determine relevance based on content type: news publishers, knowledge bases, FAQ-heavy sites, and local business sites with "near me" potential all qualify. Pure e-commerce catalogs and code-heavy technical docs may be excluded.
Why Voice Visibility Is an Untapped Channel
Voice search and voice-first AI interactions are growing as a content consumption channel. Google Assistant, Siri, Alexa, and standalone voice products all need to select which section of your page to vocalize. Without Speakable markup, they guess. We've seen voice assistants read navigation text, author bios, and cookie notices instead of the actual answer.
Speakable gives you editorial control over what voice assistants say when citing your content. You choose the sentences that represent your core message, ensuring the spoken version is accurate, concise, and compelling. Think of it as a meta description for voice: the meta description controls what appears visually in search results, while Speakable controls what's spoken aloud.
At the site level, consistent Speakable implementation positions your domain as voice-optimized. As AI systems track which domains provide Speakable markup, they build a preference for those domains when handling voice queries. A domain where every article includes Speakable sections becomes a preferred voice source, similar to how sites with consistent FAQ schema become preferred for FAQ queries.
The competitive window is wide open. Fewer than 2% of news and article publishers implement Speakable as of early 2026. Early adopters who deploy it across their content pages gain voice visibility with almost zero competitive pressure. Start here: add Speakable to your top 10 articles targeting "What is..." or "How to..." queries.
How We Check This
First, we determine whether Speakable applies to the target site. We evaluate the content profile: what percentage of pages are article-format (blog posts, news, guides, knowledge base entries), whether the site serves informational queries (detected through content analysis and heading patterns), and whether the domain appears in voice search datasets. Sites with fewer than 20% informational articles may get "not applicable."
For applicable sites, we scan all JSON-LD blocks for Speakable properties. Speakable appears as a property of Article, WebPage, or BlogPosting schemas, with a SpeakableSpecification object (containing cssSelector or xpath properties) or an array of them.
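The scan described above can be sketched in Python using only the standard library. This is a minimal illustration under stated assumptions, not the audit's actual implementation: the names `JsonLdCollector` and `find_speakable_selectors` are hypothetical, and the sketch only looks at top-level JSON-LD objects or arrays (it does not walk `@graph` containers).

```python
import json
from html.parser import HTMLParser

# Schema types the article lists as carriers of the speakable property.
SPEAKABLE_HOSTS = {"Article", "WebPage", "BlogPosting"}


class JsonLdCollector(HTMLParser):
    """Collects the raw text of every <script type="application/ld+json"> block."""

    def __init__(self):
        super().__init__()
        self.blocks = []
        self._in_jsonld = False
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.blocks.append("".join(self._buffer))
            self._in_jsonld = False


def as_list(value):
    """JSON-LD allows a single value or an array; normalize to a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def find_speakable_selectors(page_html):
    """Return (kind, selector) pairs, where kind is 'cssSelector' or 'xpath'."""
    collector = JsonLdCollector()
    collector.feed(page_html)
    found = []
    for raw in collector.blocks:
        try:
            data = json.loads(raw)
        except json.JSONDecodeError:
            continue  # skip malformed blocks instead of failing the whole scan
        for node in as_list(data):
            if not isinstance(node, dict):
                continue
            if not set(as_list(node.get("@type"))) & SPEAKABLE_HOSTS:
                continue
            for spec in as_list(node.get("speakable")):
                if not isinstance(spec, dict):
                    continue
                for kind in ("cssSelector", "xpath"):
                    for selector in as_list(spec.get(kind)):
                        found.append((kind, selector))
    return found
```

Run against the "After" snippet from the top of this article, the function would return the two CSS selectors in order; a page with no JSON-LD returns an empty list.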
When found, we validate each selector by rendering the page and resolving the CSS selector or XPath against the DOM. Selectors matching no elements are broken. Selectors matching fewer than 20 words are too short for meaningful voice delivery. Selectors matching tables, code blocks, or images are inappropriate targets.
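As a rough illustration of those three failure modes, here is a sketch that classifies a selector against static HTML. It makes strong simplifying assumptions: it handles only single-class selectors such as `.quick-answer`, it parses the raw markup with Python's standard `html.parser` rather than rendering the page, and the names `ClassSelectorResolver` and `check_selector` are hypothetical. A real check would use a full CSS/XPath engine on the rendered DOM.

```python
from html.parser import HTMLParser

VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}
UNSUITABLE_TAGS = {"table", "pre", "code", "img"}  # tables, code blocks, images
MIN_WORDS = 20


class ClassSelectorResolver(HTMLParser):
    """Records the text and tag names inside every element carrying one class."""

    def __init__(self, class_name):
        super().__init__()
        self.class_name = class_name
        self.matches = []  # one {"tags": set, "text": list} per matched element
        self._open = []    # stack of (tag, match-or-None) for open elements

    def handle_starttag(self, tag, attrs):
        # Every enclosing match sees this tag as part of its content.
        for _, match in self._open:
            if match is not None:
                match["tags"].add(tag)
        classes = (dict(attrs).get("class") or "").split()
        match = None
        if self.class_name in classes:
            match = {"tags": {tag}, "text": []}
            self.matches.append(match)
        if tag not in VOID_TAGS:
            self._open.append((tag, match))

    def handle_endtag(self, tag):
        # Pop back to the nearest open element with this tag; ignore strays.
        for i in range(len(self._open) - 1, -1, -1):
            if self._open[i][0] == tag:
                del self._open[i:]
                break

    def handle_data(self, data):
        for _, match in self._open:
            if match is not None:
                match["text"].append(data)


def check_selector(page_html, css_selector):
    """Return 'broken', 'inappropriate', 'too_short', or 'ok'."""
    if not css_selector.startswith("."):
        raise ValueError("this sketch only handles single-class selectors")
    resolver = ClassSelectorResolver(css_selector[1:])
    resolver.feed(page_html)
    resolver.close()
    if not resolver.matches:
        return "broken"
    tags = set().union(*(m["tags"] for m in resolver.matches))
    if tags & UNSUITABLE_TAGS:
        return "inappropriate"
    words = sum(len("".join(m["text"]).split()) for m in resolver.matches)
    return "too_short" if words < MIN_WORDS else "ok"
```

The order of the checks is a design choice in this sketch: a selector that hits a table is reported as inappropriate even if it is also short, since shortening or lengthening the text would not fix it.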
We also evaluate the selected content's voice-readability: Does the text make sense without visual context? Is it self-contained? Is it concise enough for voice delivery (under 150 words per section)? Does it avoid references to visual elements ("as shown in the diagram," "click the button below")?
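Two of these questions (length and visual references) lend themselves to a simple automated check. The sketch below is a crude heuristic under our own assumptions: the function name `voice_readability_issues` and the phrase list in `VISUAL_REFERENCE` are illustrative, not the patterns the audit actually uses, and whether a passage is self-contained still needs a human or a stronger model to judge.

```python
import re

MAX_WORDS = 150  # "under 150 words per section"

# A pointing verb followed, within the same sentence, by a visual noun or
# a positional word. Illustrative only; expect false positives and misses.
VISUAL_REFERENCE = re.compile(
    r"\b(?:see|shown|pictured|click|tap)\b[^.?!]{0,40}?"
    r"\b(?:chart|diagram|figure|image|table|button|screenshot|below|above)\b",
    re.IGNORECASE,
)


def voice_readability_issues(text):
    """Return a list of issue labels; an empty list means no issue was found."""
    issues = []
    if len(text.split()) >= MAX_WORDS:
        issues.append("too_long")
    if VISUAL_REFERENCE.search(text):
        issues.append("visual_reference")
    return issues
```

For example, a sentence containing "as shown in the diagram" or "Click the button below" would be flagged with `visual_reference`, while a plain one-sentence definition would come back with no issues.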
For pages without Speakable, we identify candidate sections that'd make strong targets, typically the first paragraph after the H1, Quick Answer boxes, and lead sentences under H2 headings. These get reported as recommendations for efficient implementation.
How We Score It
Speakable uses the "conditional criterion" framework: it is only scored when the site's content profile includes voice-relevant pages.
Applicability check:
- Fewer than 20% article/informational pages: "not applicable" (no score impact)
- 20%+ article/informational: criterion is scored
1. Speakable coverage (4 points):
- 60%+ of article pages include valid Speakable markup: 4/4 points
- 40-59%: 3/4 points
- 20-39%: 2/4 points
- 1-19% (a few pages): 1/4 points
- No Speakable detected: 0/4 points
2. Selector validity (3 points):
- All selectors resolve to existing, substantive DOM elements: 3/3 points
- 80%+ resolve correctly: 2/3 points
- 50-79% resolve: 1/3 points
- Below 50% or selectors point to non-existent elements: 0/3 points
3. Content suitability for voice (3 points):
- All sections pass voice-readability (self-contained, no visual references, under 150 words): 3/3 points
- 80%+ pass: 2/3 points
- 50-79% pass: 1/3 points
- Below 50% or sections include tables/code/images: 0/3 points
Bonus:
- +0.5 points if sections target Quick Answer boxes or summary paragraphs
Deductions:
- -1 point if selectors target the entire page body (defeats the purpose)
- -0.5 points if content exceeds 200 words per section (too long for voice)
Given current adoption, most sites score 0. Even a basic implementation (Speakable on 20% of articles) earns 2 of the 4 coverage points, before selector validity and content suitability are counted.
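The rubric in this section can be written out as a small function. This is a sketch of the arithmetic only, with hypothetical function and parameter names; it assumes that any coverage above zero earns the 1-point tier, that a site with no Speakable at all scores 0 overall, and that the bonus and deductions are simply added without clamping, none of which the rubric states explicitly.

```python
def coverage_points(coverage_pct):
    """Speakable coverage, 0-4 points."""
    if coverage_pct >= 60:
        return 4
    if coverage_pct >= 40:
        return 3
    if coverage_pct >= 20:
        return 2
    return 1 if coverage_pct > 0 else 0


def share_points(pass_pct):
    """0-3 ladder shared by selector validity and voice suitability."""
    if pass_pct >= 100:
        return 3
    if pass_pct >= 80:
        return 2
    return 1 if pass_pct >= 50 else 0


def speakable_score(informational_pct, coverage_pct, selectors_valid_pct,
                    sections_pass_pct, *, targets_summary=False,
                    targets_whole_body=False, has_unsuitable_section=False,
                    has_overlong_section=False):
    """Return the criterion score, or None when it is not applicable."""
    if informational_pct < 20:
        return None  # "not applicable": no score impact
    if coverage_pct <= 0:
        return 0.0   # assumption: nothing to validate, so nothing to earn
    score = float(coverage_points(coverage_pct))
    score += share_points(selectors_valid_pct)
    # Sections containing tables, code, or images zero out suitability.
    score += 0 if has_unsuitable_section else share_points(sections_pass_pct)
    if targets_summary:
        score += 0.5  # bonus: Quick Answer boxes or summary paragraphs
    if targets_whole_body:
        score -= 1.0  # deduction: selector covers the entire page body
    if has_overlong_section:
        score -= 0.5  # deduction: a section exceeds 200 words
    return score
```

For instance, a site with 65% coverage, 85% valid selectors, 60% of sections passing voice-readability, and selectors aimed at Quick Answer boxes would score 4 + 2 + 1 + 0.5 = 7.5 under these assumptions.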
Key Takeaways
- Speakable markup gives you editorial control over what voice assistants read aloud from your page.
- Target concise, self-contained paragraphs - avoid tables, code blocks, or references to visual elements.
- Fewer than 2% of publishers implement Speakable - the early-mover window is wide open.
- Start with your top 10 articles targeting "What is..." or "How to..." queries.