Duplicate pages or note? Variations just due to language changes?
-
I have some pages marked as duplicates, so I want to do what I can to solve the issues concerned.
One issue concerns duplicates where the page content is indeed the same except for the language that the content is offered in.
The URL for example of the documentation page of the site, in English is as follows:
http://www.domain.com/support/documentationWe then have the same content in German, French, Russian using the following URLs.
http://www.domain.com/de/support/documentation
http://www.domain.com/fr/support/documentation
http://www.domain.com/ru/support/documentationEach page has links to PDFs which are all in fact in English so the links to the docs are the same. Moz is flagging up all these pages as being duplicate content (which it is when translated back into English, but is not if you just consider that they are using completely different languages!)
Has anyone any thoughts on how to solve this? Or is this something not to worry about / disregard?
Many thanks
Simon
-
Ryan - thank you too for taking the time to respond.
Had a quick peek at the blog you noted - going to go back and read it v-e-r-y slowly!
Ditto, many thanks re the webconfs link - plenty of fun tools to try out there. Am sure it will all deepen my learning / confusion!Thanks again!
-
Don - thank you so much for responding
I think you may well have identified the issue too! The documentation page is the same in each case - has a table with e.g. 4 columns and 20 or so rows - and I guess much of the structural content of the pages, irrespective of the linguistic variations of the text shown on screen, is the same.
Will look closer at this.
Assuming this is the case - the next logical questions would be: Does it matter in terms of SEO? Or is it a kind of 'false positive' which can be noted but ignored? What could I do about it anyway? I guess the answer is implied in your answer above: change the template for each language?
Allied to this, is the fact that since the site is 'growing' with multiple language versions, the problem seen with this sample page will potentially be replicated all over the site. Again, the big question is about the effect on SEO. Web pages are scoring well for brand terms and other important words, and while there are new phrases and words to focus on, I am unsure whether correcting these is to prevent strict penalties or simply to make already-decent rankings as good as they can be.
Thanks in advance for any further points you care to make.
-
Hi Simon. Don has given you some good guidance. Here's a recent Moz Dev Blog post on the subject: https://mza.bundledseo.com/devblog/near-duplicate-detection/. Note their images explaining much of what Don described. Two pages having enough shared phrases (because of the header, footer, nav, etc) can trigger the duplicate warning. While the latter part of the dev blog post certainly gets technical, it should explain why you might be getting duplicate content warnings even further if that's your bent.
Since each tool is a bit different you can also check your pages with other tools, such as: http://www.webconfs.com. Cheers!
-
Hi Simon,
Okay so crawlers can crawl PDF's unless they are encrypted / encoded. However since they link to the PDF that shouldn't be the issue.Ref: googleblog
How much content are on these pages? I ask because when there is thin content you may find that the template itself is causing the duplication problem, unless of course you are using different templates for each language as well.
Take for example a page that reads.
en: The woman eats frozen fruit daily.
de: Die Frau isst gefrorenes Gemüse jegen tag.
es: La mujer come las verduras congeladas diariaNow surround each of those pages with a header content, footer content, right / left column content same images same alt tags and the deviation of content is so small it is not noticed.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Thousands of 404-pages, duplicate content pages, temporary redirect
Hi, i take over the SEO of a quite large e-commerce-site. After checking crawl issues, there seems to be +3000 4xx client errors, +3000 duplicate content issues and +35000 temporary redirects. I'm quite desperate regarding these results. What would be the most effective way to handle that. It's a magento shop. I'm grateful for any kind of help! Thx,
Technical SEO | | posthumus
boris0 -
3,511 Pages Indexed and 3,331 Pages Blocked by Robots
Morning, So I checked our site's index status on WMT, and I'm being told that Google is indexing 3,511 pages and the robots are blocking 3,331. This seems slightly odd as we're only disallowing 24 pages on the robots.txt file. In light of this, I have the following queries: Do these figures mean that Google is indexing 3,511 pages and blocking 3,331 other pages? Or does it mean that it's blocking 3,331 pages of the 3,511 indexed? As there are only 24 URLs being disallowed on robots.text, why are 3,331 pages being blocked? Will these be variations of the URLs we've submitted? Currently, we don't have a sitemap. I know, I know, it's pretty unforgivable but the old one didn't really work and the developers are working on the new one. Once submitted, will this help? I think I know the answer to this, but is there any way to ascertain which pages are being blocked? Thanks in advance! Lewis
Technical SEO | | PeaSoupDigital0 -
Redirects on multi language site and language detection
Hello! I have a multi language site in German and English. The site ranked well for the brand name and for German keywords. But after switching to a contend delivery network and changing the language detection method from browser language to IP location the site had indexing and ranking problems. Also in SERP the English homepage is shown for German keywords. On the other hand the language detection method is more accurate now. Current setup: The languages are separated via a folder structure for the languages: www.site.com/**en **and www.site.com/de. If the users IP is in Germany he is redirected via 302 from .com to .com/de. The rest of the world is redirected via 302 to .com/en. So the root www.site.com/ doesn't exist but has the most of the backlinks. Each folder has one sitemap under /de/sitemap.xml and /en/sitemap.xml. Each site and the root (.com, .com/de, .com/en) was added to WMT (no geo targeting) and the sitemaps were added (on the root domain both sitemaps and on the language specific sites just one). The sitemaps have no hreflang tag. Each page has an hreflang tag in the header pointing to itself and the alternate language. hreflang="x-default" is not set anywhere. Also on each page is a link to change language. Goals: From an SEO point of view we primarily target German speaking people. But a lot of international people (US, South America, Europe) search for our brand name so we want to serve them the English site. Therefore we want to: Get all the link juice when someone links to www.site.com to the German site Show Germans the German site in SERP and all others the English one Still serve the language automatically based on the location Do you have any idea how to achieve this? I think our main problem is that we want to push the German site the most but still serve the English site for most people (and therefore the Google Bot). Also does submitting the same sitemap twice (on the domain site and folders) do any harm? Any help oder links to ressources are greatly appreciated. I read a ton of articles but they are mostly for the case that english is the default language. Thanks for you help Moz community! Alex
Technical SEO | | AlexBLN1 -
Duplicate Page Content error but I can't see it
Hi All We're getting a lot of Duplicate Page Content errors but I can't match it up. For example this page: http://www.daytripfinder.co.uk/attractions/32-antique-cottage It is saying the on page properties as follows: Title DayTripFinder - Things to do reviewed by you - 7,000 attractions <dt style="color: #5e5e5e; font-family: Helvetica, Arial, sans-serif; font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; line-height: normal;">Meta Description</dt> <dt style="color: #5e5e5e; font-family: Helvetica, Arial, sans-serif; font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; line-height: normal;">Read Reviews, Browse Opening Hours and Prices. View Photos, Maps. 7,000 UK Visitor Attractions.</dt> <dt style="color: #5e5e5e; font-family: Helvetica, Arial, sans-serif; font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; line-height: normal;">But this isn't the page title or meta description.
Technical SEO | | KateWaite85
</dt> <dt style="color: #5e5e5e; font-family: Helvetica, Arial, sans-serif; font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; line-height: normal;">And it's showing five (many others) example pages that share it. Again the page titles and description are different.</dt> <dt style="color: #5e5e5e; font-family: Helvetica, Arial, sans-serif; font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; line-height: normal;">http://www.daytripfinder.co.uk/attractions/mckinlay-theatre</dt> <dt style="color: #5e5e5e; font-family: Helvetica, Arial, sans-serif; font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; line-height: normal;">http://www.daytripfinder.co.uk/attractions/bakers-dolphin</dt> <dt style="color: #5e5e5e; font-family: Helvetica, Arial, sans-serif; font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; line-height: normal;">http://www.daytripfinder.co.uk/attractions/shipley-park-fishing</dt> <dt style="color: #5e5e5e; font-family: Helvetica, Arial, sans-serif; font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; line-height: normal;">http://www.daytripfinder.co.uk/attractions/king-johns-lodge-and-gardens</dt> <dt style="color: #5e5e5e; font-family: Helvetica, Arial, sans-serif; font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; line-height: normal;">http://www.daytripfinder.co.uk/attractions/city-hall
</dt> Any ideas? Not sure if I'm missing something here! Thanks!0 -
Big on-page changes: all at once or little by little?
Hello, I have a website with bad urls (http://www.domain.com/site/page.php will become http://www.domain.com/page.php) and bad titles (keyword stuffing) and I have to fix all of them. Should I change the urls (and, yes, make the proper 301 redirects) and get crawled by the S.E., before I give a unique title for each page or viceversa? Or maybe shoud I do everything at once? What's the best practice to minimize the ranking drop and how long will it take to resume (and increase) the previous ranking?
Technical SEO | | DoMiSoL0 -
Fixing Duplicate Pages Titles/Content
I have a DNN site, which I created friendly URL's for; however, the creation of the friendly URL's then created duplicate page content and titles. I was able to fix all but two URL's with rel="canonical" links. BUT The two that are giving me the most issues are pointing to my homepage. When I added the rel = "canonical" link the page then becomes not indexable. And for whatever reason, I can't add a 301 redirect to the homepage because it then gives me "can't display webpage" error message. I am new to SEO and to DNN, so any help would be greatly appreciated.
Technical SEO | | VeronicaCFowler0 -
Duplicate page error
SEO Moz gives me an duplicate page error as my homepage www.monteverdetours.com is the same as www.monteverdetours.com/index is this actually en error? And is google penalizing me for this?
Technical SEO | | Llanero0 -
Search/Search Results Page & Duplicate Content
If you have a page whose only purpose is to allow searches and the search results can be generated by any keyword entered, should all those search result urls be no index or rel canonical? Thanks.
Technical SEO | | cakelady0