HREFLANG for multiple country/language combinations
-
We have a site set up with English, German, French, Spanish, and Italian. We offer all of these languages for every European country (over 30). That means there are 150+ different URL combinations, as we use a /country/language/ subdirectory structure.
Should I list out every combination in hreflang? Or should I simply choose the most applicable combinations (/de/de/, /fr/fr/, etc.)? If we go the latter route, should I block Googlebot from crawling the atypical combinations?
Best,
Sam
-
Hi Sam,
Apologies for the slow response. Your question slipped through the net.
This is an interesting case!
In an ideal world, you'd specify the relationship between all of those pages, in each direction. That's 150+ tags per page, though, which is going to cause some headaches. Even if you shift the tagging to an XML sitemap, that's a _lot_ of weight and processing.
Anecdotally, I know that hreflang tagging starts to break down at those kinds of scales, especially on large sites where the resulting XML sitemaps can reach many gigabytes, or where Google is crawling faster than it's processing the hreflang directives. Tagging everything isn't going to be a viable approach.
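To put the scale in context: in an XML sitemap implementation, every <url> entry needs one alternate link per combination, so the full set of 150+ lines repeats for every page in every locale. A rough sketch, with example.com as a placeholder domain:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:xhtml="http://www.w3.org/1999/xhtml">
  <url>
    <loc>https://www.example.com/de/de/</loc>
    <xhtml:link rel="alternate" hreflang="de-DE"
                href="https://www.example.com/de/de/" />
    <xhtml:link rel="alternate" hreflang="fr-DE"
                href="https://www.example.com/de/fr/" />
    <xhtml:link rel="alternate" hreflang="en-DE"
                href="https://www.example.com/de/en/" />
    <!-- ...and so on, one line per country/language combination,
         repeated inside every single <url> entry on the site -->
  </url>
</urlset>
```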
I'd suggest picking out and implementing hreflang for _only_ the primary combinations*, as you suggest, and reducing the site-wide mapping to the primary variant in each case.
* Bear in mind that the valuable/primary combinations might not just be the _/xx/xx/_ or _/yy/yy/_ versions; there may be some mixed country/language combinations which are worth including.
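On each primary page, the reduced hreflang set might then look something like this minimal sketch (example.com is a placeholder domain, and I'm assuming the UK English version lives at /gb/en/; adjust the pairs to whatever your primary combinations actually are):

```html
<!-- hreflang limited to the primary country/language pairs -->
<link rel="alternate" hreflang="de-DE" href="https://www.example.com/de/de/" />
<link rel="alternate" hreflang="fr-FR" href="https://www.example.com/fr/fr/" />
<link rel="alternate" hreflang="es-ES" href="https://www.example.com/es/es/" />
<link rel="alternate" hreflang="it-IT" href="https://www.example.com/it/it/" />
<link rel="alternate" hreflang="en-GB" href="https://www.example.com/gb/en/" />
<!-- Optional catch-all for visitors who don't match any primary pair -->
<link rel="alternate" hreflang="x-default" href="https://www.example.com/" />
```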
For the atypical variants, I think that you have a few options (illustrative snippets for each follow the list):

1. Use meta robots (or X-Robots-Tag) directives to set noindex attributes. This will keep those pages out of the index, but doesn't guarantee that you're effectively managing/consolidating value across near-duplicates; you may be quietly harming performance without realising it, as those pages represent points of crawl and value wastage/leakage.
2. Use robots.txt to prevent Google from accessing the atypical variants. That won't necessarily stop them from showing up in search results, though, and isn't without problems: you risk creating crawl dead-ends, writing off the value of any inbound links to those pages, and other issues.
3. Use canonical URLs on all of the atypical variations, referencing the nearest primary version, to attempt to consolidate value/relevance etc. However, that risks the wrong language/content showing up in the wrong country, as you're explicitly _un_optimising the location component.
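As a rough sketch of each mechanism, assuming /de/fr/ (French content for Germany) is an atypical variant, /fr/fr/ is its nearest primary version, and example.com is a placeholder domain:

```html
<!-- Option 1: meta robots directive in the <head> of each atypical page -->
<meta name="robots" content="noindex, follow" />
```

```
# Option 2: robots.txt rules blocking the atypical folders
# (illustrative paths only; every atypical combination needs its own rule)
User-agent: *
Disallow: /de/fr/
Disallow: /de/es/
Disallow: /de/it/
```

```html
<!-- Option 3: canonical on the atypical page, pointing at the nearest
     primary language version -->
<link rel="canonical" href="https://www.example.com/fr/fr/" />
```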
I think that #1 is the best approach, as per your thinking. That removes the requirement to do anything clever or manipulative with hreflang tagging, and fits neatly with the idea that the atypical combinations aren't useful/valuable enough to warrant their own identities - Google should be smart enough to fall back to the nearest 'generic' equivalent.
I'd also take care to set up your Google Search Console country targeting for each country-level folder, to reduce the risk of people ending up in the wrong sections.