Duplicate page report
-
We exported a CSV spreadsheet of our crawl diagnostics related to duplicate URLs, after waiting 5 days with no response on how Rogerbot can be made to filter.
My IT lead tells me he thinks the label on the spreadsheet is showing “duplicate URLs”, and that is – literally – what the spreadsheet is showing.
It thinks that a database ID number is the only valid part of a URL. To replicate: Just filter the spreadsheet for any number that you see on the page. For example, filtering for 1793 gives us the following result:
|
URL
http://truthbook.com/faq/dsp_viewFAQ.cfm?faqID=1793
http://truthbook.com/index.cfm?linkID=1793
http://truthbook.com/index.cfm?linkID=1793&pf=true
http://www.truthbook.com/blogs/dsp_viewBlogEntry.cfm?blogentryID=1793
http://www.truthbook.com/index.cfm?linkID=1793
|
There are a couple of problems with the above:
1. It gives the www result, as well as the non-www result.
2. It is seeing the print version as a duplicate (&pf=true) but these are blocked from Google via the noindex header tag.
3. It thinks that different sections of the website with the same ID number are the same thing (faq / blogs / pages).
In short: this particular report tells us nothing at all.
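For anyone who wants to reproduce the filter outside the spreadsheet, here is a rough Python sketch. It uses the five URLs listed above; the helper names `describe` and `filter_by_id` are made up for illustration, and this is only a guess at how to mimic the filter, not how Rogerbot or the export actually works.

```python
from urllib.parse import urlsplit, parse_qsl

# URLs copied from the filtered spreadsheet rows above.
URLS = [
    "http://truthbook.com/faq/dsp_viewFAQ.cfm?faqID=1793",
    "http://truthbook.com/index.cfm?linkID=1793",
    "http://truthbook.com/index.cfm?linkID=1793&pf=true",
    "http://www.truthbook.com/blogs/dsp_viewBlogEntry.cfm?blogentryID=1793",
    "http://www.truthbook.com/index.cfm?linkID=1793",
]


def describe(url):
    """Split a URL into the parts that tell one page from another."""
    parts = urlsplit(url)
    return (parts.netloc, parts.path, tuple(parse_qsl(parts.query)))


def filter_by_id(urls, id_value):
    """Mimic the spreadsheet filter: keep URLs with a query value equal to id_value."""
    return [
        u for u in urls
        if any(value == id_value for _, value in parse_qsl(urlsplit(u).query))
    ]


matches = filter_by_id(URLS, "1793")
distinct = {describe(u) for u in matches}

for host, path, query in sorted(distinct):
    print(host, path, dict(query))
```

Every matching row differs in host, path, or query string, so the only thing the rows share is the ID value itself.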
I am trying to get a perspective from someone at SEOMoz to determine whether he is reading the result correctly or whether there is something he is missing.
Please help. Jim
-
Hi Jim!
Thanks for the question. One thing we should clarify before we move forward is that the Pro app doesn't actually report on duplicate URLs, but we do report when we find duplicate title tags or content.
Duplicate titles just refer to when we find the same title tag on more than one page. In one example from your diagnostics, we're reporting that the title tag 'Truthbook Religious News' is being used on multiple pages (http://screencast.com/t/GYCKNfAoj).
Duplicate content is content we see in the source code of your pages that is identical or nearly identical and would cause the pages to compete against each other for rankings. To fix either of these you have several options:
- Set up a 301 redirect to have the pages you would consider duplicate redirect to the main page.
- Change the content/title tags enough that they won't be considered duplicates.
- Canonicalize the content you would consider duplicates.
Most developers will go for the latter two options so that the pages will still be reachable by visitors. You can find out more about how to implement these in our Help Hub.
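As a rough illustration of the canonicalization option, the Python sketch below maps the www and print variants of a URL onto one preferred URL and builds the matching `<link rel="canonical">` tag. The preferred host `truthbook.com` and the idea that `pf` only toggles the print layout are assumptions for the example, and `canonical_url` and `canonical_link_tag` are hypothetical helper names, not part of any Moz tool.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

CANONICAL_HOST = "truthbook.com"  # assumption: the preferred host is the site owner's call
DROP_PARAMS = {"pf"}              # assumption: pf=true only switches to the print layout


def canonical_url(url):
    """Return the preferred URL for a page, ignoring host and print variants."""
    parts = urlsplit(url)
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in DROP_PARAMS
    ]
    return urlunsplit((parts.scheme, CANONICAL_HOST, parts.path, urlencode(query), ""))


def canonical_link_tag(url):
    """Build the tag that would go in the <head> of every variant of the page."""
    return '<link rel="canonical" href="{}" />'.format(escape(canonical_url(url), quote=True))


print(canonical_link_tag("http://www.truthbook.com/index.cfm?linkID=1793&pf=true"))
```

The same mapping could also drive 301 redirects, but serving the tag instead keeps the print version reachable for visitors.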
To answer your other questions:
1 - At the time of the crawl, we were able to get to sub domain pages from other pages on your site. The sub domains were also resolving separately, but they seem to be redirecting to your root domain now, so your next crawl should reflect this.
2 - Running a curl for the print versions of your pages, I see "no follow" tags related to Wikipedia links embedded (http://screencast.com/t/reYjeLLPvWG3) in the doc, but I'm not finding any "no index tags" (http://screencast.com/t/DsXMZInngSzH). This would be why you're seeing us crawling those pages.
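If you'd like to run the same kind of check yourself, here is a minimal Python sketch that looks for a robots meta tag containing `noindex` in a page's HTML. It only inspects meta tags in the markup you pass in; it does not fetch the page and does not look at an `X-Robots-Tag` HTTP header, and the class and function names are just illustrative. Note that `rel="nofollow"` on a link is a different thing and is ignored here.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect directives from <meta name="robots"> style tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {key: (value or "") for key, value in attrs}
        if attr.get("name", "").lower() in ("robots", "googlebot"):
            for token in attr.get("content", "").split(","):
                token = token.strip().lower()
                if token:
                    self.directives.add(token)


def has_noindex(html_text):
    """Return True if the HTML contains a robots meta tag with noindex."""
    parser = RobotsMetaParser()
    parser.feed(html_text)
    return "noindex" in parser.directives


# Made-up sample markup, not the actual source of any truthbook.com page.
with_nofollow_link = '<html><body><a rel="nofollow" href="#">wiki</a></body></html>'
with_noindex_meta = '<html><head><meta name="robots" content="noindex, follow"></head></html>'

print(has_noindex(with_nofollow_link))
print(has_noindex(with_noindex_meta))
```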
3 - As I mentioned above, our crawler looks for similarities in the source code of pages when reporting on duplicate content. Since no one knows exactly how similar content would need to be for the search engines to consider it a duplicate, we err on the side of caution and recommended best practices when reporting them. Using one of the methods mentioned above and detailed in our Help Hub should resolve this for you.
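To give a feel for what "nearly identical" can mean, here is a small Python sketch that scores two pieces of text by the overlap of their word shingles (Jaccard similarity). This is a generic technique used purely for illustration; it is not a description of how our crawler measures similarity, and the shingle size of 3 is an arbitrary choice.

```python
import re


def shingles(text, size=3):
    """Return the set of overlapping word sequences of the given size."""
    words = re.findall(r"\w+", text.lower())
    if len(words) < size:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard(text_a, text_b, size=3):
    """Score two texts from 0.0 (no shared shingles) to 1.0 (same shingle set)."""
    set_a, set_b = shingles(text_a, size), shingles(text_b, size)
    if not set_a and not set_b:
        return 1.0
    return len(set_a & set_b) / len(set_a | set_b)


# Made-up sample text for the demonstration.
page = "Daily religious news headlines and commentary from around the world"
print_version = "Print this page Daily religious news headlines and commentary from around the world"
other_page = "Frequently asked questions about the site and how to contact us"

print(jaccard(page, print_version))
print(jaccard(page, other_page))
```

A print version that repeats the same body text scores much closer to 1.0 than an unrelated page, which is the sort of pair a duplicate content report would surface.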
Let me know if you have any other questions!
Best,
Sam
Moz Helpster
Related Questions
-
Unsolved How does Moz compile the "Important pages on your site are returning a 4xx error!" report?
I have over 200 links in this report (mostly from a staging site). I have deleted that staging site and I cannot find the reference to the other links. So my question is, where is Moz finding these links?
Moz Pro | | nomad_blogger0 -
Pages with URL Too Long
Hello Mozzers! MOZ keeps kindly telling me the URLs are too long. However, this is largely due to the structure of the e-commerce site, which has to include 'brand', 'range' and 'products' keywords. For example - https://www.choicefurnituresuperstore.co.uk/Devonshire-Rustic-Oak-Bedside-Cabinet-1-Drawer-p40668.html MOZ recommends no more than 75 characters. This means we have 25-30 characters for both the brand name and product name. Questions:
If it is an issue, how to fix it on my site?
If it's not an issue, how can we turn off this alert from MOZ?
Anyone know how big an issue URLs are as a ranking factor? I thought pretty low.
Moz Pro | | tigersohelll0 -
URL Parameters causing duplicate content - Login/Registration page
All, I just recently acquired a new client and right away I noticed an abundance of duplicate content being recorded after the Moz crawl diagnostics was completed. After a quick digest of the issue, it seems that the majority (90%) of the outlined duplicated content is stemming from the client's Login/Registration page. Upon clicking (without being logged in) any asset or forum discussion board link within the site, the user is automatically redirected to the Login/Registration page, which seems to create this massive redirect loop associated with dynamic URL parameters. Ex. After clicking on a select internal link (asset or discussion board) the user is redirected to the Login/Register page, which presents the page and a URL that looks a lot like this: Ex. 1 https://www.clientsite.com/register-login?ReturnUr...xxxx%xxxx%xxxx%...... Ex. 2 https://www.clientsite.com/register-login?returnurl=/register-login?returnurl=/register-login?returnurl=/page-title/ These URLs seem to be becoming larger and larger... The client wants to ensure users have to Login/Register within their site before they're allowed to view the content. This process doesn't allow for any type of preview page to be viewed by a user prior to clicking on the internal link, which in turn doesn't allow any preview pages to be indexed. Right now, Moz is picking up all of the redirects and labeling them as duplicate page content/duplicate page titles based on the Login/Registration page. Questions/Comments: Would it be wise to create preview pages for the asset pages and discussion board pages to allow for proper indexing? Could this be a CMS issue? The CMS currently being used is Kentico. There are thousands of pages being recorded in the crawl as duplicate, however only 14 seem to be indexing with duplicate title tags. 301 or canonical redirect strategy? Moz crawl data issue? Again, this is my first look at this issue, so more information is bound to come out soon!
Please let me know if anyone has run into this issue and if you have a possible solution to get rid of this redirect loop process. Thanks! -T
Moz Pro | | MattLacuesta0 -
One page reports are empty!
Hi Rogerbot, I now have no SEOmoz one page reports for any campaign 😞 What happened? I previously had several reports. Thanks,
Moz Pro | | Max840 -
Duplicate Page Content and Title - Miva - How to fix?
Hi, I'm new to SEOmoz and just diving into it. I'm feeling a bit overwhelmed. I use Miva Merchant as my storefront interface. SEOmoz is returning a bunch of duplicate page content and duplicate page titles and I can't figure out what to do about it. It seems it may have something to do with Miva shortlinks. I click on the duplicate URLs in SEOmoz and it brings me to a dead page. I can't figure out where it's coming from. I know without seeing the actual information it'll probably be tough to help me but any suggestions would be appreciated. I try to fix them and come to a point (after about three hours of getting nowhere) where it becomes too frustrating. Thanks! Gary
Moz Pro | | musicforkids0 -
Best Automated Report?
I would like to implement a reporting function to my website to offer a bit of value and information to potential clients. I am thinking along the lines of a simple input form to include a business name and url. The output would be a clean, branded (my business, url, phone) report that shows opportunities, lowest lying fruit, keywords most prominent, and any errors. I found this site - http://www.analyticsseo.com/ but, it seems pretty expensive. Does anyone have any suggestions on another suite that might work? Many thanks!
Moz Pro | | adell500 -
On-Page URL
Hopefully I am missing something basic... I can't see how to specifically add and delete On-Page reports. It seems like running a report adds it, but how do I delete one? Also, how does one change the URL for a report? I have re-organized some pages and can't seem to get the on-page report to keep my URL change. Here is what I tried. From the On-Page report card for a keyword I changed the URL and ran the test. The test runs OK, but if I navigate back to the summary my old bad URL is still there.
Moz Pro | | Banknotes0 -
Redirecting duplicate .asp pages??
Hi all, I have a bit of a problem with duplicate content on our website. The CMS has been creating identical duplicate pages depending on which menu route a user takes to get to a product (i.e. via the side menu button or the top menu bar). Anyway, the web design company we use are sorting it out going forward, and creating 301 redirects on the duplicate pages. My question is, some of the duplicates take two different forms. E.g. for the home page:
www.<my domain>.co.uk
www.<my domain>.co.uk/index.html
www.<my domain>.co.uk/index.asp
Now I understand the 'index.html' page should be redirected, but does the 'index.asp' need to be redirected also? What makes this more confusing is when I run the SEOMoz diagnostics report (which brought my attention to the duplicate content issue in the first place - thanks SEOMoz), not all the .asp pages are identified as duplicates. For example, the above 'index.asp' page is identified as a duplicate, but 'contact-us.asp' is not highlighted as a duplicate of 'contact-us.html'? I'm a bit new to all this (I'm not an IT specialist), so any clarification anyone can give would be appreciated. Thanks, Gareth
Moz Pro | | gdavies090319770