Internationalization Signals: Why AI Engines Cite the Wrong Language Version
ISPsystem has 452 pages in 3 languages. Without hreflang, AI engines guess which version to cite. A Spanish user asks about VMmanager and gets an English answer - or worse, Russian. One tag set fixes it.
Part of the AEO scoring framework - the current 48 criteria that measure how ready a website is for AI-driven search across ChatGPT, Claude, Perplexity, and Google AIO.
Quick Answer
Internationalization signals - html lang attributes and hreflang alternate links - tell AI engines which language version of a page to cite. Without them, a multilingual site looks like a pile of duplicate content. Add lang to your html tag, hreflang links for each language variant, and an x-default fallback. Single-language sites just need the lang attribute.
Audit Note
In our audits, we've measured internationalization signals on live sites, we've compared implementations, and we've audited...
Before & After
Before - No language signals
<html>
<head>
  <!-- No lang attribute -->
  <!-- No hreflang links -->
  <!-- AI engines guess the language -->
</head>
After - Proper internationalization
<html lang="en">
<head>
  <link rel="alternate" hreflang="en" href="https://example.com/" />
  <link rel="alternate" hreflang="es" href="https://example.com/es/" />
  <link rel="alternate" hreflang="ru" href="https://example.com/ru/" />
  <link rel="alternate" hreflang="x-default" href="https://example.com/" />
</head>
Why AI Engines Need Language Signals
When a user asks ChatGPT a question in Spanish, the engine needs to know which version of your page to cite. Without hreflang tags, AI engines face a choice problem: your English page at /pricing, your Spanish page at /es/pricing, and your Russian page at /ru/pricing all look like separate pages about the same topic.
The worst outcome is not that AI ignores your site - it is that AI cites the wrong language version. A Spanish-speaking user gets an English quote. A Russian user sees pricing in a language they cannot read. The answer is technically correct but practically useless, and the user learns not to trust citations from your domain.
The html lang attribute is the baseline. It tells every crawler - AI or traditional - what language the page content is written in. Without it, engines rely on language detection heuristics that fail on short pages, mixed-language content, and technical documentation.
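To make the baseline concrete, here is a minimal sketch of how a checker might read the lang attribute and reject empty or malformed values. The names declared_lang and LangFinder are hypothetical, and the regular expression only checks the general shape of a language tag (language, optional script, optional region). It is not a full BCP-47 validator and will not catch a made-up code that happens to have a valid shape.

```python
import re
from html.parser import HTMLParser

# Simplified shape check (an assumption, not the full BCP-47 grammar):
# 2-3 letter language, optional 4-letter script, optional region.
LANG_TAG = re.compile(r"^[A-Za-z]{2,3}(-[A-Za-z]{4})?(-([A-Za-z]{2}|[0-9]{3}))?$")


class LangFinder(HTMLParser):
    """Records the lang attribute of the first <html> start tag."""

    def __init__(self):
        super().__init__()
        self.lang = None
        self.seen_html = False

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag == "html" and not self.seen_html:
            self.seen_html = True
            self.lang = dict(attrs).get("lang")


def declared_lang(html_text):
    """Return the declared language tag, or None if missing or malformed."""
    finder = LangFinder()
    finder.feed(html_text)
    lang = (finder.lang or "").strip()
    return lang if LANG_TAG.match(lang) else None
```

Calling declared_lang('<html lang="zh-Hans">') returns "zh-Hans", while a page with no lang attribute or an empty one returns None.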
Hreflang links are the relationship layer. They tell AI engines: "This page exists in these other languages, and here are the URLs." This lets the engine choose the right version for the user's language context. Google has used hreflang for years in traditional search. AI engines inherit this signal because they build on the same crawl infrastructure.
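The relationship layer is easy to inspect yourself. The sketch below collects every link rel="alternate" tag that carries an hreflang value into a dictionary. HreflangFinder and hreflang_map are hypothetical names chosen for illustration. The sketch only looks at link tags in the HTML, not at hreflang declared in sitemaps or HTTP headers.

```python
from html.parser import HTMLParser


class HreflangFinder(HTMLParser):
    """Collects hreflang -> href from <link rel="alternate"> tags."""

    def __init__(self):
        super().__init__()
        self.alternates = {}

    def handle_starttag(self, tag, attrs):
        # Self-closing <link ... /> tags also arrive here by default.
        if tag != "link":
            return
        a = dict(attrs)
        rel = (a.get("rel") or "").lower().split()
        if "alternate" in rel and a.get("hreflang") and a.get("href"):
            # hreflang values are case-insensitive, so normalize to lowercase.
            self.alternates[a["hreflang"].lower()] = a["href"]


def hreflang_map(html_text):
    """Return {hreflang: href} for all alternates declared in the page."""
    finder = HreflangFinder()
    finder.feed(html_text)
    return finder.alternates
```

A page with no alternate links yields an empty dictionary, which is the situation described above where the engine has to guess.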
What We Check and How Scoring Works
The Internationalization Signals criterion evaluates 5 components, each contributing to a 0-10 score:
1. HTML lang attribute (3 points): Does the page declare its language with a valid BCP-47 code? "en", "es", "ru", "zh-Hans" are valid. Empty strings, made-up codes, or missing attributes score zero.
2. Hreflang alternate links (3 points): Does the page link to its language variants? We look for at least 2 distinct hreflang values (e.g., "en" and "es"). A single hreflang is incomplete - you need at least one alternate to create a language relationship.
3. x-default hreflang (1 point): Is there a fallback language declared? The x-default hreflang tells engines which version to show when the user's language does not match any available translation.
4. Self-referencing hreflang (1 point): Does the page include a hreflang link pointing to itself? This confirms to engines that the page is the canonical version for its declared language.
5. Language-URL consistency (2 points): Do URL path prefixes match declared languages? A page at /ru/pricing should declare lang="ru", not lang="en". Mismatches confuse engines and create trust issues.
Single-language sites can still score well. A properly declared lang attribute (3 points) plus URL consistency (2 points) gives you 5/10 without any hreflang implementation. This is appropriate - hreflang is only relevant for sites with multiple language versions.
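The point breakdown above can be sketched as a small function. This is an illustration under stated assumptions, not the audit's actual implementation: score_i18n is a hypothetical name, x-default is not counted toward the two distinct hreflang values, self-reference is an exact URL string match, and a URL with no two-letter path prefix is treated as consistent with any validly declared language.

```python
import re
from urllib.parse import urlparse

# Shape check only (assumption): language, optional script, optional region.
LANG_TAG = re.compile(r"^[A-Za-z]{2,3}(-[A-Za-z]{4})?(-([A-Za-z]{2}|[0-9]{3}))?$")
# Heuristic (assumption): a leading /xx/ or /xx-yy/ segment is a language prefix.
PATH_PREFIX = re.compile(r"^/([a-z]{2})(?:-[a-z]{2})?(?:/|$)", re.IGNORECASE)


def score_i18n(page_url, lang, alternates):
    """Return (total, breakdown) for one page.

    alternates maps hreflang values to absolute URLs, as declared on the page.
    """
    breakdown = {}

    # 1. HTML lang attribute (3 points)
    lang_ok = bool(lang) and bool(LANG_TAG.match(lang))
    breakdown["lang"] = 3 if lang_ok else 0

    # 2. At least two distinct hreflang languages (3 points)
    codes = {code.lower() for code in alternates}
    breakdown["alternates"] = 3 if len(codes - {"x-default"}) >= 2 else 0

    # 3. x-default fallback (1 point)
    breakdown["x_default"] = 1 if "x-default" in codes else 0

    # 4. Self-referencing hreflang (1 point), exact string match
    breakdown["self_reference"] = 1 if page_url in alternates.values() else 0

    # 5. Language-URL consistency (2 points)
    prefix = PATH_PREFIX.match(urlparse(page_url).path)
    if not lang_ok:
        consistent = False
    elif prefix is None:
        consistent = True  # no language prefix in the URL to contradict lang
    else:
        consistent = prefix.group(1).lower() == lang.split("-")[0].lower()
    breakdown["url_consistency"] = 2 if consistent else 0

    return sum(breakdown.values()), breakdown
```

Under these assumptions, score_i18n("https://example.com/pricing", "en", {}) totals 5, matching the single-language case, while a page at /ru/pricing that declares lang="en" loses the 2 consistency points.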
Implementation Guide
For single-language sites, add the lang attribute to every page:
<html lang="en">
For multilingual sites, add hreflang links in the <head> of every page. Each page must reference all its language variants including itself:
<link rel="alternate" hreflang="en" href="https://example.com/pricing" />
<link rel="alternate" hreflang="es" href="https://example.com/es/pricing" />
<link rel="alternate" hreflang="ru" href="https://example.com/ru/pricing" />
<link rel="alternate" hreflang="x-default" href="https://example.com/pricing" />
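Because the same tag set has to appear on every language version, it is usually safer to generate it than to write it by hand. The following is a minimal sketch, assuming you keep a mapping of language code to absolute URL for each page. hreflang_links is a hypothetical helper, not part of any library, and it rejects URLs without a scheme because hreflang href values should be fully qualified.

```python
from html import escape


def hreflang_links(variants, default_lang):
    """Build the <link> tags for one page's full language set.

    variants maps hreflang codes to absolute URLs, including the page itself.
    default_lang names the variant that x-default should point to.
    """
    if default_lang not in variants:
        raise ValueError(f"default language {default_lang!r} is not in variants")
    for url in variants.values():
        if not url.startswith(("https://", "http://")):
            raise ValueError(f"hreflang URLs must be absolute: {url!r}")

    lines = [
        f'<link rel="alternate" hreflang="{escape(code)}" href="{escape(url)}" />'
        for code, url in variants.items()
    ]
    lines.append(
        f'<link rel="alternate" hreflang="x-default" '
        f'href="{escape(variants[default_lang])}" />'
    )
    return "\n".join(lines)
```

Emitting the output of one call into the head of every variant keeps the set identical across languages, so each page carries its self-reference and its return links.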
Key rules:
- Every language version of a page must link to every other language version.
- Every page must include a self-referencing hreflang.
- Use consistent URL patterns. If English uses /pricing, Spanish should use /es/pricing, not /precios.
- The x-default should point to your primary language version.
- Use ISO 639-1 language codes (en, es, ru) or language-region combinations (en-US, pt-BR) for regional variants.
Common mistakes:
- Missing self-referencing hreflang (engines cannot confirm the page belongs to the language set).
- Inconsistent hreflang across language versions (English page links to Spanish, but Spanish page does not link back to English).
- Using country codes instead of language codes (hreflang="us" is wrong, hreflang="en-US" is correct).
- Forgetting x-default (engines have no fallback when the user's language is not available).
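Missing return links are the hardest of these mistakes to spot by eye, because each page looks fine on its own. Here is a rough sketch of a reciprocity check, assuming you have already collected the hreflang map declared on each crawled page. missing_return_links is a hypothetical name, and URLs are compared as exact strings, so trailing-slash differences would be reported as problems.

```python
def missing_return_links(site):
    """Find one-way hreflang relationships.

    site maps each page URL to the {hreflang: href} dict declared on that page.
    Returns sorted (source, target) pairs where source links to target but
    target does not link back. A target that was never crawled is reported too.
    """
    problems = []
    for source, alternates in site.items():
        for target in set(alternates.values()):
            if target == source:
                continue  # self-reference, nothing to reciprocate
            back_links = site.get(target, {})
            if source not in back_links.values():
                problems.append((source, target))
    return sorted(problems)
```

An empty result means every declared alternate links back to the page that declared it.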
How AI Engines Use These Signals
ChatGPT inherits Bing's crawl data, which processes hreflang tags to determine language relationships between pages. When a user writes in Spanish, ChatGPT preferentially cites pages with hreflang="es" or lang="es" over English equivalents. Without hreflang, ChatGPT may cite your English page to a Spanish-speaking user because it appeared more authoritative in the crawl index.
Claude processes the lang attribute as part of its HTML parsing pipeline. When assembling an answer, Claude considers the declared language of source pages relative to the conversation language. Pages with explicit lang declarations get a relevance boost for matching-language queries.
Perplexity assembles answers from multiple sources and inherits search engine language signals. Pages with proper hreflang implementation appear as distinct language variants rather than duplicate content, which improves their chance of being cited in the correct language context.
Google AI Overviews uses hreflang extensively from its existing search infrastructure. Pages with proper internationalization signals are more likely to appear in AI Overviews for queries in matching languages, and less likely to be filtered as duplicate content.
Key Takeaways
- Add lang="en" (or appropriate BCP-47 code) to your <html> tag on every page - this is the minimum signal.
- For multilingual sites, add <link rel="alternate" hreflang="xx"> for each language version including a self-referencing link.
- Always include hreflang="x-default" pointing to your primary language as the fallback.
- URL path prefixes must match declared language - /es/ pages should declare lang="es", not lang="en".
How does your site score on this criterion?
Get a free AEO audit and see where you stand across all 48 criteria.