HREFLANG for multiple country/language combinations
-
We have a site set up with English, German, French, Spanish, and Italian. We offer all of these languages for every European country (over 30). That means there are 150+ different URL combinations, as we use a /country/language/ subdirectory structure.
Should I list out every combination in hreflang? Or should I simply choose the most applicable combinations (/de/de/, /fr/fr/, etc.)? If we go the latter route, should I block Googlebot from crawling the atypical combinations?
Best,
Sam
-
Hi Sam,
Apologies for the slow response. Your question slipped through the net.
This is an interesting case!
In an ideal world, you'd specify the relationship between all of those pages, in each direction. That's 150+ tags per page, though, which is going to cause some headaches. Even if you shift the tagging to an XML sitemap, that's a _lot_ of weight and processing.
Anecdotally, I know that hreflang tagging starts to break down at those kinds of scales, especially on large sites where the resulting XML sitemaps can reach many gigabytes, or where Google is crawling faster than it's processing the hreflang directives. Tagging everything isn't going to be a viable approach.
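To put the scale in context: in an XML sitemap implementation, every <url> entry needs one alternate link per combination, so the full set of 150+ lines repeats for every page in every locale. A rough sketch, with example.com as a placeholder domain:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:xhtml="http://www.w3.org/1999/xhtml">
  <url>
    <loc>https://www.example.com/de/de/</loc>
    <xhtml:link rel="alternate" hreflang="de-DE"
                href="https://www.example.com/de/de/" />
    <xhtml:link rel="alternate" hreflang="fr-DE"
                href="https://www.example.com/de/fr/" />
    <xhtml:link rel="alternate" hreflang="en-DE"
                href="https://www.example.com/de/en/" />
    <!-- ...and so on, one line per country/language combination,
         repeated inside every single <url> entry on the site -->
  </url>
</urlset>
```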
I'd suggest picking out and implementing hreflang for _only_ the primary combinations*, as you suggest, and reducing the site-wide mapping to the primary variant in each case.
* Bear in mind that the valuable/primary combinations might not just be the _/xx/xx/_ or _/yy/yy/_ versions; there may be some mixed country/language combinations which are worth including.
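On each primary page, the reduced hreflang set might then look something like this minimal sketch (example.com is a placeholder domain, and I'm assuming the UK English version lives at /gb/en/; adjust the pairs to whatever your primary combinations actually are):

```html
<!-- hreflang limited to the primary country/language pairs -->
<link rel="alternate" hreflang="de-DE" href="https://www.example.com/de/de/" />
<link rel="alternate" hreflang="fr-FR" href="https://www.example.com/fr/fr/" />
<link rel="alternate" hreflang="es-ES" href="https://www.example.com/es/es/" />
<link rel="alternate" hreflang="it-IT" href="https://www.example.com/it/it/" />
<link rel="alternate" hreflang="en-GB" href="https://www.example.com/gb/en/" />
<!-- Optional catch-all for visitors who don't match any primary pair -->
<link rel="alternate" hreflang="x-default" href="https://www.example.com/" />
```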
For the atypical variants, I think that you have a few options (illustrative snippets for each follow the list):

1. Use meta robots (or X-Robots-Tag) directives to set noindex attributes. This will keep those pages out of the index, but doesn't guarantee that you're effectively managing/consolidating value across near-duplicates; you may be quietly harming performance without realising it, as those pages represent points of crawl and value wastage/leakage.
2. Use robots.txt to prevent Google from accessing the atypical variants. That won't necessarily stop them from showing up in search results, though, and isn't without problems: you risk creating crawl dead-ends, writing off the value of any inbound links to those pages, and other issues.
3. Use canonical URLs on all of the atypical variations, referencing the nearest primary version, to attempt to consolidate value/relevance etc. However, that risks the wrong language/content showing up in the wrong country, as you're explicitly _un_optimising the location component.
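As a rough sketch of each mechanism, assuming /de/fr/ (French content for Germany) is an atypical variant, /fr/fr/ is its nearest primary version, and example.com is a placeholder domain:

```html
<!-- Option 1: meta robots directive in the <head> of each atypical page -->
<meta name="robots" content="noindex, follow" />
```

```
# Option 2: robots.txt rules blocking the atypical folders
# (illustrative paths only; every atypical combination needs its own rule)
User-agent: *
Disallow: /de/fr/
Disallow: /de/es/
Disallow: /de/it/
```

```html
<!-- Option 3: canonical on the atypical page, pointing at the nearest
     primary language version -->
<link rel="canonical" href="https://www.example.com/fr/fr/" />
```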
I think that #1 is the best approach, as per your thinking. That removes the requirement to do anything clever or manipulative with hreflang tagging, and fits neatly with the idea that the atypical combinations aren't useful/valuable enough to warrant their own identities - Google should be smart enough to fall back to the nearest 'generic' equivalent.
I'd also take care to set up your Google Search Console country targeting for each country-level folder, to reduce the risk of people ending up in the wrong sections.