Why are there significant changes in the amount of duplicate content without any known action?
-
I've noticed a surprisingly rapid change in duplicate content over the past month. I'd noticed ~6,000 instances of duplicate content; after disavowing bad links we went down to ~3,000, which made sense to me. But after that, without doing anything whatsoever, from last Thursday the 20th to yesterday, the instances of duplicate content decreased again, down to ~2,000. Could this just be delayed indexing of pages, or are there other factors at work here? Thanks for the help.
-
Come to think of it, the only thing we did do was fire the SEO company we had working for us and start doing SEO in house, but that doesn't explain such rapid shifts in duplicate content.
-
Without really being involved, it is very hard to figure this out exactly.
For now, I wouldn't worry unless you start to see problems, such as a drop in the number of pages actually indexed, a drop in traffic, or fewer searches in which you appear.
-Andy
-
To the best of my knowledge we've changed nothing about our site recently, which is why I'm trying to attribute this rapid drop to something, and the only thing we've done is disavow the links. The disavow was just a shot in the dark to try to understand these changes.
-
Are you using any parameters (tracking/session IDs) on your site? Also, as Andy said, disavowing wouldn't decrease this number; it was something else.
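To illustrate the point about parameters: tracking or session IDs can make a single page reachable under many distinct URLs, which a crawler may then report as duplicate content. Here's a minimal Python sketch of normalizing such URLs; the parameter names and URLs are made-up examples, not an exhaustive list:

```python
# Tracking/session parameters create several distinct URLs that all serve
# the same page, which crawlers can report as duplicate content.
# Stripping those parameters collapses them back to one logical page.
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Common examples only; a real list depends on your site.
TRACKING_PARAMS = {"utm_source", "utm_medium", "utm_campaign", "sessionid", "sid"}

def normalize(url: str) -> str:
    """Drop known tracking/session parameters so equivalent URLs compare equal."""
    parts = urlsplit(url)
    kept = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
            if k.lower() not in TRACKING_PARAMS]
    return urlunsplit((parts.scheme, parts.netloc, parts.path,
                       urlencode(kept), ""))

urls = [
    "http://example.com/page?utm_source=mail",
    "http://example.com/page?sessionid=abc123",
    "http://example.com/page",
]
# All three normalize to the same URL, i.e. one logical page.
print(len({normalize(u) for u in urls}))  # 1
```

If a crawl tool suddenly changes how it treats such parameters, the reported duplicate count can move without you touching the site.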
-
You can get problems with duplicate content from all over the web, but a disavow would have absolutely no impact on this; a disavow exists to distance you from external links that you don't wish to be associated with.
As this is something related to the Moz products, I can't give you an answer on that, I'm afraid.
Have you really made no changes to the site that could account for this? If you can, re-categorise this post to include Product Support.
-Andy
-
Can you look at your crawl diagnostics and see the difference in how many pages were crawled at each of those intervals? That would help diagnose what's happening here.
Thanks
-
I was under the impression that duplicate content can be caused not only by duplicates within the site itself but also by outside sites, even notable ones, directly duplicating the content. http://moz.com/blog/duplicate-content-in-a-post-panda-world
See below:
(3) Cross-domain Duplicates
A cross-domain duplicate occurs when two websites share the same piece of content.
These duplicates could be either “true” or “near” duplicates. Contrary to what some people believe, cross-domain duplicates can be a problem even for legitimate, syndicated content.
Anyway, we're using Moz's dashboard to give us insights into duplicate content.
-
Hi,
First of all, disavowing will have nothing to do with the number of duplication warnings you get. A disavow only affects inbound links, and even then you won't see any drop in these through Webmaster Tools.
What are you using to see the duplicate pages?
-Andy
Related Questions
-
Changing the way SEOmoz Detects Duplicate Content
Hey everyone, I wanted to highlight today's blog post in case you missed it. In short, we're using a different algorithm to detect duplicate pages: http://moz.com/blog/visualizing-duplicate-web-pages If you see a change in your crawl results and you haven't done anything, this is probably why. Here's more information taken directly from the post:
1. Fewer duplicate page errors: a general decrease in the number of reported duplicate page errors. However, it bears pointing out that:
- We may still miss some near-duplicates. Like the current heuristic, only a subset of the near-duplicate pages is reported.
- Completely identical pages will still be reported. Two pages that are completely identical will have the same simhash value, and thus a difference of zero as measured by the simhash heuristic. So, all completely identical pages will still be reported.
2. Speed, speed, speed: the simhash heuristic detects duplicates and near-duplicates approximately 30 times faster than the legacy fingerprints code. This means that soon, no crawl will spend more than a day working its way through post-crawl processing, which will facilitate significantly faster delivery of results for large crawls.
Moz Pro | | KeriMorgret2 -
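For anyone curious how a simhash-style heuristic works in principle, here is a toy Python sketch. It illustrates the general technique only; it is not Moz's actual implementation, and the tokenization (whitespace split) and hash function (MD5) are arbitrary choices made for the example:

```python
# Toy sketch of the simhash near-duplicate heuristic: each token votes on
# every bit position, and the sign of each column gives one bit of the
# final fingerprint. Similar documents tend to produce nearby fingerprints.
import hashlib

def simhash(text: str, bits: int = 64) -> int:
    """Combine per-token hashes bitwise; the sign of each column gives one bit."""
    counts = [0] * bits
    for token in text.lower().split():
        h = int(hashlib.md5(token.encode()).hexdigest(), 16) % (1 << bits)
        for i in range(bits):
            counts[i] += 1 if (h >> i) & 1 else -1
    return sum(1 << i for i in range(bits) if counts[i] > 0)

def hamming(a: int, b: int) -> int:
    """Number of differing bits; 0 means the pages hash identically."""
    return bin(a ^ b).count("1")

page_a = "cheap widgets for sale buy widgets online today"
page_b = "cheap widgets for sale buy widgets online now"
page_c = "completely unrelated article about search engines"

# Near-duplicates usually land close together; unrelated text usually far apart.
print(hamming(simhash(page_a), simhash(page_b)))
print(hamming(simhash(page_a), simhash(page_c)))
```

Identical inputs always hash to a distance of zero, which matches the point above that completely identical pages will still be reported.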
Can I force another crawl on my site to see if it recognizes my changes?
I had a problem with duplicate content and titles on my site. I fixed them immediately, and I'm wondering if I can run another crawl on my site to see if my changes were recognized. Thanks, Shaun
Moz Pro | | daugherty0 -
Crawl report shows URLs with duplicate content, but that's not the case
Hi guys!
Some hours ago I received my crawl report. I noticed several records with URLs flagged for duplicate content, so I went and opened those URLs one by one. None of those URLs really had duplicate content, but I have a concern: the website is a product showcase, and many articles are just images with an href behind them. Many of those articles use the same images, so maybe that's why the SEOmoz crawler raises the duplicate-content flag. I wonder if Google has a problem with that too. See for yourself how it looks: http://by.vg/NJ97y and http://by.vg/BQypE are both flagged as duplicates; please mind the language (Greek) and try to focus on the URLs and content. PS: my example is simplified just for the purpose of my question.
Moz Pro | | MakMour0 -
About Duplicate Content found by SEOMOZ... that is not duplicate
Hi folks, I am hunting for duplicate content using SEOmoz's great tool for that 🙂 I have some pages that are flagged as duplicates, but I can't say why. They are video pages. The content is minimal, so I guess it might be because all the navigation is the same, but for instance http://www.nuxeo.com/en/resource-center/Videos/Nuxeo-World-2010/Nuxeo-World-2010-Presentation-Thierry-Delprat-CTO and http://www.nuxeo.com/en/resource-center/Videos/Nuxeo-World-2010/Nuxeo-World-2010-Presentation-Cheryl-McKinnon-CMO are flagged as duplicates. Any idea? Is it hurting us? Cheers,
Moz Pro | | nuxeo0 -
On-page Optimization Grade Change
I can see the grade change for my on-page optimization in the weekly email; however, when I load the summary page, only the rank change shows, and the grade change is blank across the board. I also tried downloading the report and see the same results. Is this a bug on the website? Thanks!
Moz Pro | | leighw0 -
Will a canonical tag get rid of duplicate page title errors?
I have a directory on my website, paginated in groups of 10. On page 2 of the results, the title tag is the same as on the first page, as it is on the 3rd page and so on. This is giving me duplicate page title errors. If I use rel=canonical tags on the subsequent pages, with the href pointing to the first page of my results, will my duplicate page title warnings go away? Thanks.
Moz Pro | | fourthdimensioninc0 -
How can I clean up my crawl report from duplicate records?
I am viewing my Crawl Diagnostics Report. My report is filled with data which really shouldn't be there. For example, I have a page: http://www.terapvp.com/forums/Ghost/ This is a main forum page. It contains a list of many threads, and the list can be sorted on many values. The page is canonicalized, and has been since it was created. Yet my crawl report shows this page listed 15 times:
http://www.terapvp.com/forums/Ghost/?direction=asc
http://www.terapvp.com/forums/Ghost/?direction=desc
http://www.terapvp.com/forums/Ghost/?order=post_date
and so forth. Each of those pages uses the same canonical reference shared above. I have three questions:
1. Why is this data appearing in my crawl report? These pages are properly canonicalized.
2. If these pages are supposed to appear in the report for some reason, how can I remove them? My desire is to focus on any pages which may have an issue which needs to be addressed. This site has about 50 forum pages, and when you add an extra 15 entries per forum, it becomes a lot harder to locate actionable data. To make matters worse, these forum indexes often have many pages, so if I have a "Corvette" forum that is 10 pages long, there will be 150 extra entries just for that forum in my crawl report.
3. Is there anything I am missing? To the best of my knowledge everything is set up according to best SEO practices. If there are any other opinions, I would like to hear them.
Moz Pro | | RyanKent0 -
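As a stopgap while the report includes these parameter variants, one could post-filter the exported URL list so each base URL appears only once. A hypothetical Python sketch (the URLs mirror the examples above, and stripping every query string is a blunt assumption that only works when no query parameter changes the content):

```python
# Collapse a crawl-report URL list so each base URL appears once,
# hiding sort/pagination parameter variants that point at the same page.
# Assumption: no query parameter on these URLs changes the content.
from urllib.parse import urlsplit, urlunsplit

def strip_query(url: str) -> str:
    """Remove the query string and fragment, keeping scheme, host, and path."""
    p = urlsplit(url)
    return urlunsplit((p.scheme, p.netloc, p.path, "", ""))

report = [
    "http://www.terapvp.com/forums/Ghost/",
    "http://www.terapvp.com/forums/Ghost/?direction=asc",
    "http://www.terapvp.com/forums/Ghost/?direction=desc",
    "http://www.terapvp.com/forums/Ghost/?order=post_date",
]

seen, cleaned = set(), []
for url in report:
    base = strip_query(url)
    if base not in seen:  # keep only the first URL seen for each base
        seen.add(base)
        cleaned.append(url)

print(cleaned)  # ['http://www.terapvp.com/forums/Ghost/']
```

This doesn't fix why the crawler reports the variants, but it makes the exported data actionable in the meantime.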
Tool for scanning the content of the canonical tag
Hey All, question for you. What is your favorite tool/method for scanning a website for specific tags? Specifically (as my situation dictates now) canonical tags. I am looking for a tool that is flexible, hopefully free, and highly customizable (for instance, you can specify the tag to look for). I like the concept of using Google Docs with the importXML feature, but as you can only use 50 of those commands at a time it is very limiting (http://www.distilled.co.uk/blog/seo/how-to-build-agile-seo-tools-using-google-docs/). I do have a campaign set up using the tools, which is great, but I need something that returns results faster and can get data from more than 10,000 links. Our CMS unfortunately puts out some odd canonical tags depending on how a page is rendered, and I am trying to catch them quickly before they get indexed and cause problems. Eventually I would also like to be able to scan for other specific tags, hence the customization concern. If we have to write a VB script to get it into Excel, I suppose we can do that. Cheers, Josh
Moz Pro | | prima-2535090
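Since the question above asks about scanning pages for canonical tags, here is a minimal standard-library Python sketch of the idea. The sample HTML and URLs are placeholders, and a real tool would add crawling, rate limiting, and error handling:

```python
# Minimal canonical-tag scanner using only the Python standard library.
# The sample HTML below is a placeholder; point a real run at your own pages.
from html.parser import HTMLParser

class CanonicalFinder(HTMLParser):
    """Collects the href of every <link rel="canonical"> in a page."""
    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        if tag == "link":
            d = dict(attrs)
            if d.get("rel", "").lower() == "canonical" and "href" in d:
                self.canonicals.append(d["href"])

def scan(html: str) -> list:
    """Return every canonical href found in the given HTML."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonicals

# To fetch live pages, one could wrap this with urllib.request.urlopen
# and feed the decoded response body to scan().
sample = '<html><head><link rel="canonical" href="http://example.com/page"></head></html>'
print(scan(sample))  # ['http://example.com/page']
```

Pages that emit zero or multiple canonical hrefs (the "odd tags" the CMS produces) are exactly the cases such a scanner would surface.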