"Duplicate" Page Titles and Content
-
Hi All,
This is a rather lengthy one, so please bear with me!
SEOmoz has recently crawled 10,000 webpages from my site, FrenchEntree, and has returned 8,000 errors of duplicate page content. The main reason I have so many is because of the directories I have on site.
The site is broken down into 2 levels of hierachy. "Weblets" and "Articles". A weblet is a landing page, and articles are created within these weblets. Weblets can hold any number of articles - 0 - 1,000,000 (in theory) and an article must be assigned to a weblet in order for it to work. Here's how it roughly looks in URL form - http://www.mysite.com/[weblet]/[articleID]/
Now; our directory results pages are weblets with standard content in the left and right hand columns, but the information in the middle column is pulled in from our directory database following a user query. This happens by adding the query string to the end of the URL. We have 3 main directory databases, but perhaps around 100 weblets promoting various 'canned' queries that users may want to navigate straight into. However, any one of the 100 directory promoting weblets could return any query from the parent directory database with the correct query string. The problem with this method (as pointed out by the 8,000 errors) is that each possible permutation of search is considered to be it's own URL, and therefore, it's own page.
The example I will use is the first alphabetically. "Activity Holidays in France":
http://www.frenchentree.com/activity-holidays-france/ - This link shows you a results weblet without the query at the end, and therefore only displays the left and right hand columns as populated.
http://www.frenchentree.com/activity-holidays-france/home.asp?CategoryFilter= - This link shows you the same weblet with the an 'open' query on the end. I.e. display all results from this database. Listings are displayed in the middle.
There are around 500 different URL permutations for this weblet alone when you take into account the various categories and cities a user may want to search in.
What I'd like to do is to prevent SEOmoz (and therefore search engines) from counting each individual query permutation as a unique page, without harming the visibility that the directory results received in SERPs. We often appear in the top 5 for quite competitive keywords and we'd like it to stay that way. I also wouldn't want the search engine results to only display (and therefore direct the user through to) an empty weblet by some sort of robot exclusion or canonical classification.
Does anyone have any advice on how best to remove the "duplication" problem, whilst keeping the search visibility? All advice welcome.
Thanks
Matt
-
Thanks for the swift response, Gianluca. I think I understand the problem you have pointed out, but I'm rather surprised that it has been set up in such a way... Or that that would have more of an adverse affect than multiple URLs with the same standard content. I'm willing to change that to see if it fixes the problem though.
Please take all of the time you need... It is a very large site which has been pieced together, bit-by-bit, over many years!
Matt
-
In addition to Gianluca's response there, the pages that you tag with "noindex,follow" (i.e. the duplicates) add a canonical tag pointing at the original page.
-
I think your problem of duplicated content is also due the pagination your categories (or no categories search result) have. Checking the second url you gave http://www.frenchentree.com/activity-holidays-france/home.asp?order=Sort1&option=&CategoryFilter=&webname=activity-holidays-france&webname=activity-holidays-france&pagenumber=1 and it "second" page http://www.frenchentree.com/activity-holidays-france/home.asp?order=Sort1&option=&CategoryFilter=&webname=activity-holidays-france&pagenumber=2 I noticed that you have the meta robots in the head... therefore the bots see and index all these paginated content, that is a substantial duplicate of page 1. I suggest you to start adding the noindex,follow meta robots in these pages. About other duplication issues... give me time, as your site is not so easy
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
After adding a ssl certificate to my site I encountered problems with duplicate pages and page titles
Hey everyone! After adding a ssl certificate to my site it seems that every page on my site has duplicated it's self. I think that is because it has combined the www.domainname.com and domainname.com. I would really hate to add a rel canonical to every page to solve this issue. I am sure there is another way but I am not sure how to do it. Has anyone else ran into this problem and if so how did you solve it? Thanks and any and all ideas are very appreciated.
Intermediate & Advanced SEO | | LovingatYourBest0 -
Duplicate Content... Really?
Hi all, My site is www.actronics.eu Moz reports virtually every product page as duplicate content, flagged as HIGH PRIORITY!. I know why. Moz classes a page as duplicate if >95% content/code similar. There's very little I can do about this as although our products are different, the content is very similar, albeit a few part numbers and vehicle make/model. Here's an example:
Intermediate & Advanced SEO | | seowoody
http://www.actronics.eu/en/shop/audi-a4-8d-b5-1994-2000-abs-ecu-en/bosch-5-3
http://www.actronics.eu/en/shop/bmw-3-series-e36-1990-1998-abs-ecu-en/ate-34-51 Now, multiply this by ~2,000 products X 7 different languages and you'll see we have a big dupe content issue (according to Moz's Crawl Diagnostics report). I say "according to Moz..." as I do not know if this is actually an issue for Google? 90% of our products pages rank, albeit some much better than others? So what is the solution? We're not trying to deceive Google in any way so it would seem unfair to be hit with a dupe content penalty, this is a legit dilemma where our product differ by as little as a part number. One ugly solution would be to remove header / sidebar / footer on our product pages as I've demonstrated here - http://woodberry.me.uk/test-page2-minimal-v2.html since this removes A LOT of page bloat (code) and would bring the page difference down to 80% duplicate.
(This is the tool I'm using for checking http://www.webconfs.com/similar-page-checker.php) Other "prettier" solutions would greatly appreciated. I look forward to hearing your thoughts. Thanks,
Woody 🙂1 -
How to avoid too many "On Page Links"?
Hi everyone I don't seem to be able to keep big G off my back, even though I do not engage in any black hat or excessive optimization practices. Due to another unpleasant heavy SERP "fluctuation" I am in investigation mode yet again and want to take a closer look at one of the warnings within the SEOmoz dashboard, which is "Too many on page links". Looking at my statistics this is clearly the case. I wonder how you can even avoid that at times. I have a lot of information on my homepage that links out to subpages. I get the feeling that even the links within the roll-over menus (or dropdown) are counted. Of course, in that case then you will end up with a crazy amount of on page links. What about blog-like news entries on your homepage that link to other pages as well? And not to forget the links that result from the tags underneath a post? What am I trying to get at? Well, do you feel that a bad website template may cause this issue i.e. are the links from roll-over menus counted as links on the homepage even though they are not directly visible? I am not sure how to cut down on the issue as the sidebar modules are present on every page and thus up the links count wherever you are on the site. On another note, I've seen plenty of homepages with excessive information and links going out, would they be suffering from the search engines' hammer too? How do you manage the too many on page links issue? Many thanks for your input!
Intermediate & Advanced SEO | | Hermski0 -
Frequent FAQs vs duplicate content
It would be helpful for our visitors if we were to include an expandable list of FAQs on most pages. Each section would have its own list of FAQs specific to that section, but all the pages in that section would have the same text. It occurred to me that Google might view this as a duplicate content issue. Each page _does _have a lot of unique text, but underneath we would have lots of of text repeated throughout the site. Should I be concerned? I guess I could always load these by AJAX after page load if might penalize us.
Intermediate & Advanced SEO | | boxcarpress0 -
Advice needed on how to handle alleged duplicate content and titles
Hi I wonder if anyone can advise on something that's got me scratching my head. The following are examples of urls which are deemed to have duplicate content and title tags. This causes around 8000 errors, which (for the most part) are valid urls because they provide different views on market data. e.g. #1 is the summary, while #2 is 'Holdings and Sector weightings'. #3 is odd because it's crawling the anchored link. I didn't think hashes were crawled? I'd like some advice on how best to handle these, because, really they're just queries against a master url and I'd like to remove the noise around duplicate errors so that I can focus on some other true duplicate url issues we have. Here's some example urls on the same page which are deemed as duplicates. 1) http://markets.ft.com/Research/Markets/Tearsheets/Summary?s=IVPM:LSE http://markets.ft.com/Research/Markets/Tearsheets/Holdings-and-sectors-weighting?s=IVPM:LSE http://markets.ft.com/Research/Markets/Tearsheets/Summary?s=IVPM:LSE&widgets=1 What's the best way to handle this?
Intermediate & Advanced SEO | | SearchPM0 -
Is a "Critical Acclaim" considered duplicate content on an eCommerce site?
I have noticed a lot of wine sites use "Critical Acclaims" on their product pages. These short descriptions made by industry experts are found on thousands of other sites. One example can be found on a Wine.com product page. Wine.com also provides USG through customer reviews on the page for original content. Are the "Critical Acclaim" descriptions considered duplicate content? Is there a way to use this content and it not be considered duplicate (i.e. link to the source)?
Intermediate & Advanced SEO | | mj7750 -
HTTPS Duplicate Content?
I just recieved a error notification because our website is both http and https. http://www.quicklearn.com & https://www.quicklearn.com. My tech tells me that this isn't actually a problem? Is that true? If not, how can I address the duplicate content issue?
Intermediate & Advanced SEO | | QuickLearnTraining0 -
What constitutes duplicate content?
I have a website that lists various events. There is one particular event at a local swimming pool that occurs every few months -- for example, once in December 2011 and again in March 2012. It will probably happen again sometime in the future too. Each event has its own 'event' page, which includes a description of the event and other details. In the example above the only thing that changes is the date of the event, which is in an H2 tag. I'm getting this as an error in SEO Moz Pro as duplicate content. I could combine these pages, since the vast majority of the content is duplicate, but this will be a lot of work. Any suggestions on a strategy for handling this problem?
Intermediate & Advanced SEO | | ChatterBlock0