Multilingual site with untranslated content
-
We are developing a site that will be available in several languages.
There will be several thousand pages, and the default language will be English. Several sections of the site will not be translated at first, so on those pages the main content will be in English while the navigation and boilerplate will be translated.
We have hreflang alternate tags set up on each individual page, pointing to each of the other language versions, e.g. in the English version we have:
etc
In the Spanish version, we would point to the French version and the English version, etc.
My question is: is this sufficient to avoid a duplicate content penalty from Google for the untranslated pages?
I am aware that from a user perspective, having untranslated content is bad, but in this case it is unavoidable at first.
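The per-page hreflang setup described above can be sketched roughly as follows. This is only an illustration: the `hreflang_links` helper and the example.com URLs are hypothetical, not the tags actually used on the site. The sketch emits the full set of alternates, including the page's own language, which is how I understand Google's hreflang documentation to describe the annotation.

```python
from html import escape


def hreflang_links(urls_by_lang):
    """Build one <link rel="alternate"> tag per language version of a page.

    urls_by_lang maps an hreflang value (e.g. "en", "es", "fr") to the URL
    of that language version. The same list of tags would go into the
    <head> of every version of the page.
    """
    return [
        f'<link rel="alternate" hreflang="{escape(lang)}" href="{escape(url)}" />'
        for lang, url in urls_by_lang.items()
    ]


# Hypothetical URLs, for illustration only.
pages = {
    "en": "https://example.com/en/page",
    "es": "https://example.com/es/page",
    "fr": "https://example.com/fr/page",
}

for tag in hreflang_links(pages):
    print(tag)
```

Generating the tags from a single mapping keeps the annotations reciprocal: every version lists exactly the same set of alternates.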
-
Thanks for your comments Gianluca.
I think Google's guidelines are somewhat ambiguous. Here it does state that "if you're providing the same content to the same users on different URLs (for instance, if both example.de/ and example.com/de/ show German language content for users in Germany), you should pick a preferred version and redirect (or use the rel=canonical link element) appropriately."
https://support.google.com/webmasters/answer/182192?hl=en
I think you've explained it nicely though.
-
At first that would be fine.
That said, this is a very specific case where you can use both hreflang and the cross-domain rel="canonical".
Remember that these two markups are totally independent of each other, though.
If you use them both, as I wrote in my reply to Yusuf, on the one hand you are telling Google that you want it to show a specific URL for a specific geo-targeted country/language, and on the other hand you are also telling Google that the geo-targeted URL is an exact copy of the canonical one.
What Google will do is show the geo-targeted URL in the SERPs, but with the title and meta description of the canonical one.
One more thing, and this is a strong reason for urging a complete translation in a short period of time:
if the content of a URL on the French site, for instance, is in English, you cannot put "fr-FR" in the hreflang; it has to be "en-FR". The consequence is that the URL will tend to be shown only for English queries made in Google.fr, not for French queries... and that means losing a lot of traffic opportunities.
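To make the "fr-FR" versus "en-FR" point concrete, here is a minimal sketch. The `page_hreflang` function and its parameters are hypothetical names chosen for illustration; the only assumption taken from the thread is that untranslated pages fall back to English.

```python
def page_hreflang(site_language, target_country, is_translated, default_language="en"):
    """Return the hreflang value for a page on a geo-targeted site.

    The language part must describe the language the main content is
    actually written in, not the language the site is meant to have.
    Untranslated pages are assumed to fall back to default_language.
    """
    content_language = site_language if is_translated else default_language
    return f"{content_language.lower()}-{target_country.upper()}"


# A translated page on the French site.
print(page_hreflang("fr", "FR", is_translated=True))   # fr-FR

# A page on the French site whose main content is still in English.
print(page_hreflang("fr", "FR", is_translated=False))  # en-FR
```

Once a page is translated, flipping `is_translated` changes its value from "en-FR" to "fr-FR", which is the point at which it can start appearing for French queries.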
-
Yusuf,
I'm sorry, but I have to correct you.
If two pages are in the same language but target different countries (e.g. USA and UK), then even if the content is the same or substantially the same, you not only can use hreflang, you should use it, in order to tell Google that one URL must be shown to US users and the other to UK users.
Obviously, if you want, you can always decide to use the cross-domain rel="canonical" instead.
Remember, though, that if you use the canonical together with hreflang, Google will show the snippet components (title and meta description) of the canonical URL, even though it will show the geo-targeted URL. If instead you opt not to use hreflang, people will see the canonical URL's snippet (web address included).
-
Have you taken a look at the following:
https://support.google.com/webmasters/answer/182192?hl=en#1
https://sites.google.com/site/webmasterhelpforum/en/faq-internationalisation
"Duplicate content and international sites
Websites that provide content for different regions and in different languages sometimes create content that is the same or similar but available on different URLs. This is generally not a problem as long as the content is for different users in different countries. While we strongly recommend that you provide unique content for each different group of users, we understand that this may not always be possible. There is generally no need to "hide" the duplicates by disallowing crawling in a robots.txt file or by using a "noindex" robots meta tag. However, if you're providing the same content to the same users on different URLs (for instance, if both example.de/ and example.com/de/ show German language content for users in Germany), you should pick a preferred version and redirect (or use the rel=canonical link element) appropriately. In addition, you should follow the guidelines on rel-alternate-hreflang to make sure that the correct language or regional URL is served to searchers."
-
Hi Jorge
The rel="alternate" hreflang="x" tag is not suitable for pages that are in the same language, as these are essentially duplicates rather than alternative language versions.
I'd use the rel="canonical" tag to point to the main page until the translations of those pages are available.
Webmaster Tools should allow you to see any issues.