SEO Myth-Busters -- Isn't there a "duplicate content" penalty by another name here?
Where is that guy with the mustache in the funny hat and the geek when you truly need them?
So SEL (SearchEngineLand) said recently that there's no such thing as "duplicate content" penalties.
http://searchengineland.com/myth-duplicate-content-penalty-259657
By the way, I'd love to get Rand or Eric or other Mozzers (aka TAGFEE'ers) to weigh in here if possible.
The reason for this question is to double-check a possible "duplicate content" type penalty (possibly by another name?) that might accrue in the following situation.
1 - Assume a domain has a 30 Domain Authority (per OSE)
2 - The site on the current domain has about 100 pages, all hand coded. It does very well in SEO because we designed it to do so. The site is about 6 years old in its current incarnation, with a very simple e-commerce cart (again, basically hand coded). I will not name the site for obvious reasons.
3 - Business is good. We're upgrading to a new CMS. (hooray!) In doing so we are implementing categories and faceted search (with plans to try to keep the site to under 100 new "pages" using a combination of rel canonical and noindex). I will also not name the CMS for obvious reasons.
In simple terms, as the site is built out and launched in the next 60 - 90 days, assume we have 500 products and 100 categories. That yields at least 50,000 pages, and with other aspects of the faceted search it could easily create 10X that many pages.
4 - In ScreamingFrog tests of the DEV site, it is quite evident that there are many tens of thousands of unique URLs that are basically the textbook illustration of a duplicate content nightmare. ScreamingFrog has also been known to crash while spidering, and we've discovered thousands of URLs of live sites using the same CMS.
There is no question that spiders are somehow triggering some sort of infinite page generation - and we can see that both on our DEV site as well as out in the wild (in Google's Supplemental Index).
5 - Since there is no "duplicate content penalty" and there never was, are there other risks here that are caused by infinite page generation? Like burning up a theoretical "crawl budget," or having the bots miss pages, or other negative consequences?
6 - Is it also possible that bumping a site that ranks well for 100 pages up to 10,000 pages or more might very well carry a linkjuice penalty as a result of all this (honest but inadvertent) duplicate content? In other words, is inbound linkjuice and ranking power essentially divided by the number of pages on a site? Sure, it may be somewhat mitigated by internal page linkjuice, but what are the actual big-dog issues here?
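To put rough numbers on the scale described in point 3, here is a minimal back-of-the-envelope sketch. The 500 products, 100 categories, and 10X facet multiplier come from the figures above; the function name and the assumption that every product is reachable under every category are purely illustrative and not how any particular CMS behaves.

```python
# Worst-case URL count for the faceted-search scenario in point 3.
# Only the 500 / 100 / 10X figures come from the post; everything else
# (names, the "every product under every category" model) is an assumption.

def estimate_urls(products: int, categories: int, facet_multiplier: int = 1) -> int:
    """Upper bound: each product reachable under each category,
    times extra variants created by facets, sorting, pagination, etc."""
    return products * categories * facet_multiplier

TARGET_INDEXABLE_PAGES = 100  # the "under 100 new pages" goal from point 3

base = estimate_urls(500, 100)
with_facets = estimate_urls(500, 100, facet_multiplier=10)

print(base)                                   # 50000
print(with_facets)                            # 500000
print(with_facets // TARGET_INDEXABLE_PAGES)  # 5000 crawlable URLs per page we want indexed
```

In this worst case, a spider would have to wade through thousands of URL variants for every page we actually want indexed, which is where the crawl budget worry in point 5 comes from.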
So has SEL's "duplicate content myth" truly been myth-busted in this particular situation?
???
Thanks a million!
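For anyone wanting to size the problem on their own crawl, here is a rough sketch of how a list of crawled URLs (for example, an export from a spidering tool) could be collapsed to one canonical URL per page by stripping facet parameters. The parameter names in `FACET_PARAMS` and the example.com URLs are hypothetical; the real CMS may use entirely different query strings, so treat this as an illustration rather than a working rule set.

```python
from collections import defaultdict
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical facet/sort/pagination parameters; adjust to the actual CMS.
FACET_PARAMS = {"color", "size", "sort", "page"}

def canonical_url(url: str) -> str:
    """Drop facet parameters and sort the remaining ones so that
    URL variants of the same page map to a single canonical form."""
    parts = urlsplit(url)
    kept = sorted(
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in FACET_PARAMS
    )
    return urlunsplit(
        (parts.scheme, parts.netloc.lower(), parts.path, urlencode(kept), "")
    )

def group_duplicates(urls):
    """Map each canonical URL to the list of crawled variants behind it."""
    groups = defaultdict(list)
    for url in urls:
        groups[canonical_url(url)].append(url)
    return dict(groups)

crawled = [
    "https://example.com/widgets?color=red&sort=price",
    "https://example.com/widgets?sort=price&color=blue",
    "https://example.com/widgets",
    "https://example.com/gadgets?id=7&page=2",
]
groups = group_duplicates(crawled)
for canonical, variants in groups.items():
    print(canonical, len(variants))
```

The ratio of crawled variants to canonical URLs gives a quick read on how much of the crawl is duplicate, and the canonical form is what a rel canonical tag on each variant would point to.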
Related Questions
-
What is the impact of HTTP/2 on SEO ?
I think it's good for the user experience and speeds up websites, especially if your site has a lot of requests. But I'm not sure if there are other side effects, or if there's an impact on SEO or technical configuration. Most of my websites are built with WordPress, some with Joomla.
Algorithm Updates | | Croco_Web_Solutions1 -
Will Russia's New Data Protection Law Impact SEOs and SMBs Outside of Russia?
We've all seen the news recently that Google will be closing its engineering offices in Russia due to new data protection laws coming into place in January 2015. The same law has also led to Adobe pulling out of Russia earlier in the year. I was wondering how you think this will impact SEOers and small/medium businesses that market to Russia, but are based outside of the country? Personal data has been defined in the new legislation as: Personal data means any information directly or indirectly related to any identified or potentially identifiable person. It includes, among other things, first name and family name, date and place of birth, address, information about family status, education, profession, income Source For those businesses which don't process personal data (affiliates etc), will there be any foreseeable impact? On the flipside, are there any benefits here for affiliate businesses inside of Russia? I'm using affiliates as an example to get the ball rolling, but I'm sure there are numerous others. Personally, I'd be interested to hear if you think this may impact corporate websites which don't process personal data, but operate outside of Russia.
Algorithm Updates | | ecommercebc0 -
Pdfs for SEO - benefits, downfalls and promotional methods
Hi fellow Mozzers, We're just in the middle of relaunching our website (a design agency), and I had a few questions re: SEO of our service keywords. The designers want the site to seem light on content, despite my advice that this would reduce the terms we can rank for. With that in mind, I was going to include advice pages that can be found via the site map, site search or text links but aren't promoted via the top level or second level nav. Another alternative I was going to explore was using pdfs for design case studies, so the site would feature a light case study, but with a more in-depth pdf available if wanted. I have located numerous articles highlighting how best to optimise pdfs, but I have a few queries aside from the technical standpoint. So: Is this the best way of getting round the issue of keeping the site 'light' on content? Are there stats that show CTRs on pdf pages over HTML? As well as optimising the pdf content and promoting them on our social media channel, is there a benefit from including them on the likes of Scribd, Edocr and so on (from either an SEO or simply from a promotional viewpoint, or both)? Hopefully that's all clear! Nick
Algorithm Updates | | themegroup0 -
Guides to determine if a client's website has been penalized?
Has anyone come across any great guides to pair with client data to help you determine if their website has been penalized? I'm also not talking about an obvious drop in traffic/rankings, but I want to know if there's a guide out there for detecting the subtleties that may be found in a client's website data. One that also helps you take into account all the different variables that may not be related to the engines. Thanks!
Algorithm Updates | | EEE30 -
Why is there no compiled list of the different types of search results on Google, and what the content qualifications are to generate those results?
Seems to me that this list should exist out there somewhere, but I can't seem to find it. Am I just not as good of a Googler as I thought I was?
Algorithm Updates | | Draftfcb0 -
2 Domains With Same Name But 1 With A Number
We have been marketing a website for a client with a domain name example2.com. Their main site example.com is used to post information about their services and example2.com is their eCommerce site they use to sell their products. After the Google Penguin update, we have lost all rankings for example2.com. We did not do any unethical, black hat SEO and I am pretty sure it wasn't just mistakenly blocked. We use the same strategy for our other clients and they have not been impacted. Do you guys think the domain name has anything to do with it? What's odd is example.com is now ranking for some keywords that example2.com used to rank for. We have never marketed that website, nor has anyone else for that matter. I have been scratching my head over this one for the past week and this is the only feasible problem I can think of.
Algorithm Updates | | ArgosSEM0 -
Forum software penalties
I'm hoping to solicit some feedback on what people feel would be SEO best practices for message board/forum software. Specifically, while message boards that are healthy can generate tons of unique content, they can also generate a fair share of thin content pages. These pages include: Calendar pages, which can have a page for each day of each month for 10 years (that's about 3,650 pages of just links). User profile pages, which depending on your setup can tend to be thin. The board I work with has 20k registered members, hence 20k user profile pages. User lists, which can run to several hundred pages. I believe Google is pretty good at understanding what is message board content, but there is still a good chance that one could be penalized for these harmless pages. Do people feel that the above pages should be noindexed? Another issue is that of unrelated content. Many forums have their off-topic areas (the Pub or Hangout or whatever). On our forum up to 40% of the content is off-topic (when I say content I mean number of posts versus raw word count). What are the advantages and disadvantages of such content? On one hand it expands the keywords you can rank for. On the other hand it might generate Google organic traffic which you might not want because of a high bounce rate. Does too much indexable content that is unique dilute your good content?
Algorithm Updates | | entropytc1