AI search doesn’t just translate or localize results. It decides which sources, institutions, and versions of reality get surfaced in the first place.
Catalonia offers a useful stress test for that system. Two languages share the same geography, which makes retrieval patterns easier to spot.
When the same queries are run in Catalan and Spanish across Google AI Overviews and ChatGPT, the differences go far beyond wording — and reveal broader problems that extend well beyond multilingual regions.
Catalonia as a stress test for AI search
Did you know that if you search for Tradicions de Sant Jordi — Saint George’s Traditions, written in Catalan — Google Translate will identify the source language as Occitan?
Probably not. Most Catalan speakers don’t know it either, partly because Translate’s language guess isn’t exactly wrong: Catalan and Occitan share a common Romance ancestry, and some classification systems group them together.
The answer is technically defensible. It’s also, statistically, an odd call — and the kind of small anecdote that points at a much larger problem in the infrastructure underneath.


Occitan has roughly 200,000 speakers, mostly in southern France. Catalan has roughly 9 million speakers and is a co-official language of Catalonia, one of Europe’s wealthier regions and home to a city Google has operated in for over 20 years.
Asked from a Barcelona IP, Google’s translation product decides that the more plausible source language is the one with more than an order of magnitude fewer speakers, in another country. Translate then renders Sant Jordi into Spanish as San Jorge — castilianizing the proper name of the patron saint of Catalonia, a name that doesn’t need translating in the first place.
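To make the "odd call" concrete, here is a minimal sketch of how a speaker-count prior would weigh on an ambiguous language guess. The speaker figures are the rough ones quoted above; the likelihood values and the `posterior` helper are hypothetical, chosen only for illustration, and nothing here describes how Google's language identification actually works.

```python
# Speaker counts are the article's rough figures; the likelihoods below are
# invented for illustration and do not describe any real Google system.
SPEAKERS = {"catalan": 9_000_000, "occitan": 200_000}


def posterior(likelihoods, speakers=SPEAKERS):
    """Combine per-language text likelihoods with a speaker-count prior (Bayes' rule)."""
    total_speakers = sum(speakers[lang] for lang in likelihoods)
    weighted = {
        lang: likelihoods[lang] * speakers[lang] / total_speakers
        for lang in likelihoods
    }
    norm = sum(weighted.values())
    return {lang: w / norm for lang, w in weighted.items()}


# Catalan has 45 times as many speakers as Occitan under these figures.
ratio = SPEAKERS["catalan"] / SPEAKERS["occitan"]

# Hypothetical case: the text alone looks equally plausible in both languages.
even = posterior({"catalan": 0.5, "occitan": 0.5})

# Hypothetical case: the text alone favors Occitan three to one.
skewed = posterior({"catalan": 0.25, "occitan": 0.75})

print(f"speaker ratio: {ratio:.0f}x")
print(f"even text evidence   -> P(catalan) = {even['catalan']:.3f}")
print(f"text favors Occitan  -> P(catalan) = {skewed['catalan']:.3f}")
```

Under these assumed numbers, Catalan stays the more probable guess even when the text evidence leans toward Occitan, which is why the Translate result reads as statistically surprising. A real system may use no such prior at all, or may weight training-data volume rather than speaker counts.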
This single Translate quirk is anecdotal. What it points at isn’t. It’s a language-identification problem that has lived inside Google’s infrastructure for years — and Google itself has publicly acknowledged it.
In January 2023, the company’s Search Liaison account responded to a wave of complaints from Catalan-speaking users about Catalan results being downgraded in favor of Spanish ones. Google called the issue “a priority” and committed to continuing its investigation. The acknowledgment was even posted in Catalan, a tacit admission that the affected audience was real and large enough to warrant a direct response.
Google pushed updates later that year that measurably improved Catalan visibility in traditional search results pages (SERPs). But the underlying language-identification layer was never structurally repaired. When a Catalan speaker today watches Google’s AI Overview answer a Catalan-language query in Spanish, it isn’t a new bug. It’s an old bug now sitting underneath a synthesis layer that propagates it.
AI search, when it…