"Duplicate" Page Titles and Content
-
Hi All,
This is a rather lengthy one, so please bear with me!
SEOmoz has recently crawled 10,000 webpages from my site, FrenchEntree, and has returned 8,000 duplicate page content errors. The main reason I have so many is the directories I have on the site.
The site is broken down into two levels of hierarchy: "Weblets" and "Articles". A weblet is a landing page, and articles are created within these weblets. Weblets can hold any number of articles, from 0 to 1,000,000 (in theory), and an article must be assigned to a weblet in order for it to work. Here's roughly how it looks in URL form: http://www.mysite.com/[weblet]/[articleID]/
Now, our directory results pages are weblets with standard content in the left and right hand columns, but the information in the middle column is pulled in from our directory database following a user query. This happens by adding the query string to the end of the URL. We have 3 main directory databases, but perhaps around 100 weblets promoting various 'canned' queries that users may want to navigate straight into. However, any one of the 100 directory-promoting weblets could return any query from the parent directory database, given the correct query string. The problem with this method (as pointed out by the 8,000 errors) is that each possible search permutation is considered to be its own URL, and therefore its own page.
The example I will use is the first alphabetically. "Activity Holidays in France":
http://www.frenchentree.com/activity-holidays-france/ - This link shows you a results weblet without the query at the end, and therefore only displays the left and right hand columns as populated.
http://www.frenchentree.com/activity-holidays-france/home.asp?CategoryFilter= - This link shows you the same weblet with an 'open' query on the end, i.e. display all results from this database. Listings are displayed in the middle.
There are around 500 different URL permutations for this weblet alone when you take into account the various categories and cities a user may want to search in.
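To make the scale of this concrete, here is a rough Python sketch of how a crawler's URL list could be grouped by weblet, ignoring the query string. The function name group_by_weblet and the idea of treating the first path segment as the weblet are my own assumptions for illustration, not how SEOmoz or the site actually works.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def group_by_weblet(urls):
    """Group URLs by (host, first path segment), ignoring query strings.

    Assumption: the first path segment identifies the weblet, as in
    http://www.mysite.com/[weblet]/[articleID]/.
    """
    groups = defaultdict(list)
    for url in urls:
        parts = urlsplit(url)
        segments = [s for s in parts.path.split("/") if s]
        weblet = segments[0] if segments else ""
        groups[(parts.netloc, weblet)].append(url)
    return dict(groups)


# URLs taken from this thread; a real crawl would list hundreds per weblet.
crawled = [
    "http://www.frenchentree.com/activity-holidays-france/",
    "http://www.frenchentree.com/activity-holidays-france/home.asp?CategoryFilter=",
    "http://www.frenchentree.com/activity-holidays-france/home.asp"
    "?order=Sort1&option=&CategoryFilter=&webname=activity-holidays-france&pagenumber=2",
]

for (host, weblet), members in group_by_weblet(crawled).items():
    print(f"{host}/{weblet}/ has {len(members)} crawled URL variants")
```

Every URL in a group shares the same left and right hand column content, which is presumably why the crawler flags them as duplicates of one another.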
What I'd like to do is to prevent SEOmoz (and therefore search engines) from counting each individual query permutation as a unique page, without harming the visibility that the directory results received in SERPs. We often appear in the top 5 for quite competitive keywords and we'd like it to stay that way. I also wouldn't want the search engine results to only display (and therefore direct the user through to) an empty weblet by some sort of robot exclusion or canonical classification.
Does anyone have any advice on how best to remove the "duplication" problem, whilst keeping the search visibility? All advice welcome.
Thanks
Matt
-
Thanks for the swift response, Gianluca. I think I understand the problem you have pointed out, but I'm rather surprised that it has been set up in such a way... or that it would have more of an adverse effect than multiple URLs with the same standard content. I'm willing to change that to see if it fixes the problem, though.
Please take all of the time you need... It is a very large site which has been pieced together, bit-by-bit, over many years!
Matt
-
In addition to Gianluca's response there: on the pages that you tag with "noindex,follow" (i.e. the duplicates), add a canonical tag pointing at the original page.
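As a minimal sketch of that suggestion, the Python below builds the head tags for a page given which URL is treated as the original. The function name head_tags is hypothetical, and treating the 'open' query URL as the original is only an assumption (Matt said he doesn't want the empty weblet to be the one shown in results); the site itself runs on ASP, so this only illustrates the logic.

```python
from html import escape


def head_tags(page_url, original_url):
    """Return the <head> tags for page_url, given the URL chosen as the original.

    Duplicates get a noindex,follow robots tag plus a canonical link;
    the original only gets a self-referencing canonical link.
    """
    # Escape the URL so that "&" in query strings is valid inside an attribute.
    canonical = '<link rel="canonical" href="%s">' % escape(original_url, quote=True)
    if page_url == original_url:
        return [canonical]
    return ['<meta name="robots" content="noindex,follow">', canonical]


# Assumption: the "open" query page is the one to keep in the index.
original = "http://www.frenchentree.com/activity-holidays-france/home.asp?CategoryFilter="
duplicate = (
    "http://www.frenchentree.com/activity-holidays-france/home.asp"
    "?order=Sort1&option=&CategoryFilter=&webname=activity-holidays-france&pagenumber=2"
)

for tag in head_tags(duplicate, original):
    print(tag)
```

Which URL counts as the original is the real decision here; the code just keeps the two tags consistent once that choice is made.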
-
I think your duplicate content problem is also due to the pagination of your category (or no-category) search results. Checking the second URL you gave, http://www.frenchentree.com/activity-holidays-france/home.asp?order=Sort1&option=&CategoryFilter=&webname=activity-holidays-france&webname=activity-holidays-france&pagenumber=1, and its "second" page, http://www.frenchentree.com/activity-holidays-france/home.asp?order=Sort1&option=&CategoryFilter=&webname=activity-holidays-france&pagenumber=2, I noticed that you have the meta robots in the head... therefore the bots see and index all of this paginated content, which is a substantial duplicate of page 1. I suggest you start by adding the noindex,follow meta robots to these pages. About the other duplication issues... give me time, as your site is not so easy.
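The pagination rule could be sketched roughly like this in Python. It reads the pagenumber parameter seen in the URLs above and returns the robots value to print in the head. The function name robots_for is hypothetical, and leaving page 1 indexable is my assumption about what "these pages" means.

```python
from urllib.parse import parse_qs, urlsplit


def robots_for(url):
    """Return the meta robots value for a directory results URL.

    Assumption: page 1 (or no pagenumber at all) stays indexable and
    every later page gets noindex,follow.
    """
    query = parse_qs(urlsplit(url).query, keep_blank_values=True)
    values = query.get("pagenumber", ["1"])
    try:
        page = int(values[-1])
    except ValueError:
        page = 1  # treat a malformed page number as the first page
    return "noindex,follow" if page > 1 else "index,follow"


base = (
    "http://www.frenchentree.com/activity-holidays-france/home.asp"
    "?order=Sort1&option=&CategoryFilter=&webname=activity-holidays-france"
)

for page in (1, 2):
    url = f"{base}&pagenumber={page}"
    print(page, f'<meta name="robots" content="{robots_for(url)}">')
```

With "follow" kept on every page, bots can still reach the listings linked from page 2 onwards even though those pages themselves are kept out of the index.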