Sorting Dupe Content Pages
-
Hi,
I'm no excel pro, and I'm having a bit of a challenge interpreting the Crawl Diagnostics export .csv file.
I'd like to see at a glance which of my pages (and I have many) are the worst offenders for dupe content – ie. which have the most "Other URLs" associated with them.
Thanks, would appreciate any advice on how other people are using this data, and/or how 'Moz recommends to do it.
-
CMC is correct - thats how I do it for larger sites.
- delete all columns except the URL column (col A) and the duplicate pages column (now Col B)
- in cell C2, enter this formula: =len(b2) it will calculate the characters in dupe pages cell
- drag that cell down to last row
- select all three columns and sort col c by largest to smallest
Obviously this isn't going to give you an exact number of dupe pages since URL text strings can vary in length, but it does give you a pretty good idea of the worst offenders....
-
I've found this a little frustrating, too. The display on the web will show the number of duplicate URLs, but the exported spreadsheet does not. It does, however, list all of the duplicate URLs in one cell -- so you could calculate the character length of that cell and then sort by that column, and that would give you a rough ranking.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Sudden increase in on-page links
Within one week, the site crawl shows my on-page links going from ~10 to ~100+ Does anyone know of a rational reason for this? Sucuri shows my site is clean, so I don't think it's a hack. 10 seems too low to begin with anyway. Any ideas? Thanks 🙂
Moz Pro | | pupstar0 -
Only One page crawled..Need help
I have run a website in Seomoz which have many URLs with it. But when I saw the seomoz report that showing Pages Crawled: 1. Why this is happen my campaign limit is OK Tell me what to do for all page crawling in seomoz report. wV6fMWx
Moz Pro | | lucidsoftech0 -
Duplicate content analysis
Good morning everyone, I have just run a test from SEOmoz PRO tool and I got more than 2,000 double content errors. How might I see which are the 2 pages whose content is double? My website is just newly revamped and can't find these similarities on my own: for me there are not). Thanks for your help! Francesca
Moz Pro | | astojanov0 -
Duplicate page title
Hello my page has this Although with seomoz crawl it says that this pages has duplicate titles. If my blog has 25 pages, i have according seomoz 25 duplicate titles. Can someone tell me if this is correct or if the seomoz crawl cannot recognize rel="next" or if there is another better way to tell google when there a pages generated from the blog that as the same title Should i ignore these seomoz errors thank you,
Moz Pro | | maestrosonrisas0 -
Truncate page URLs
We have some pages (for example a contact us form) for which the URL is modified by the CMS depending on the referring page (this helps to put the form submission in context for the sales reps who get the contact submission). The SEOmoz crawler considers each URL a new page -- and so numbers like in diagnostics are all inflated as the same page is listed multiple times (e.g. for too many links) Is there a setting to change what the crawler considers to be the same page? Here are two URLs for the same page that the reports treat as separate pages: http://www.spirent.com/About-Us/Contact_us.aspx?referurl=0F528F4D703D8BB3523738D6373AA8AD http://www.spirent.com/About-Us/Contact_us.aspx?referurl=10ACDA6055244E369395223437FDCF30 The page is actually: http://www.spirent.com/About-Us/Contact_us.aspx Thanks Ken
Moz Pro | | spirent.marcom0 -
20000 site errors and 10000 pages crawled.
I have recently built an e-commerce website for the company I work at. Its built on opencart. Say for example we have a chair for sale. The url will be: www.domain.com/best-offers/cool-chair Thats fine, seomoz is crawling them all fine and reporting any errors under them url great. On each product listing we have several options and zoom options (allows the user to zoom in to the image to get a more detailed look). When a different zoom type is selected it adds on to the url, so for example: www.domain.com/best-offers/cool-chair?zoom=1 and there are 3 different zoom types. So effectively its taking for urls as different when in fact they are all one url. and Seomoz has interpreted it this way, and crawled 10000 pages(it thinks exist because of this) and thrown up 20000 errors. Does anyone have any idea how to solve this?
Moz Pro | | CompleteOffice0 -
On-Page URL
Hopefully I am missing something basic... I can't see how to specifically add and delete On-Page reports. It seems like running a report adds it but how to delete? Also, how does one change the URL for a report? I have re-organized some pages and can't seem the get the on-page report to keep my URL change. Here is what I tried. From the On-Page report card for a keyword I changed the URL and ran the test. Test runs ok but if I navigate back to the summary my old bad URL is still there.
Moz Pro | | Banknotes0 -
Should I worry about duplicate content errors caused by backslashes?
Frequently we get red-flagged for duplicate content in the MozPro Crawl Diagnostics for URLs with and without a backslash at the end. For example: www.example.com/ gets flagged as being a duplicate of www.example.com I assume that we could rel=canonical this, if needed, but our assumption has been that Google is clever enough to discount this as a genuine crawl error. Can anyone confirm or deny that? Thanks.
Moz Pro | | MackenzieFogelson0