Duplicate content nightmare
-
Hey Moz Community
I ran a crawl test, and there is a lot of duplicate content, but I cannot work out why. It seems that when I publish a post, secondary URLs are created based on its tags and categories. Or at least, that is what it looks like. I don't know why this is happening, nor do I know if I need to do anything about it.
Help? Please.
-
You bet Russ. I appreciate what you guys do and am happy to participate and share as I can!
-
Thanks for helping out on Q/A! It is hugely helpful to have professionals answering questions here!
-
No problem!
-
I'm sure Russ has you covered - no reason for me to look into this. Just wanted to make sure you didn't think I'd forgotten you :).
-
Wow, thank you very much. I will read up on that link.
-
Hey Todd. MobileDay.com is the website and the crawl test was using the Moz tool. Crawled the entire domain.
-
Hey,
I took a peek and it looks like most of your duplicate content is coming from /tag/ issues in WordPress. Luckily, this is very easy to solve, especially since you have the Yoast SEO plugin in place.
- Log into the WordPress admin for your site.
- Click on the Yoast SEO plugin in the left-hand navigation
- Choose "Titles and Meta"
- Choose "Taxonomies"
- Click "noindex, follow" under "Tags"
- Choose "Save Changes"
This will tell Google that your tags pages are useful for categorization and linking but do not contain unique content. Your other alternative is to add content to your tags pages that makes them unique. If you do a good job of creating truly useful tag pages, they can become very important parts of your SEO strategy. You need to be careful, though, so as not to create sprawling, thin-content sections of your site. Here is a decent guide on the practice.
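If you want to confirm the setting took effect, you can look at the source of one of your tag pages for a robots meta tag. Here is a minimal sketch in Python, assuming the plugin writes a standard `<meta name="robots" content="noindex, follow">` tag into the page head (I have not checked the exact markup on your site, and the `sample` HTML and function names below are made up for illustration). It only parses HTML you hand it; fetching the page is left to you.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives found in <meta name="robots"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values can be None (e.g. <meta name>), so default to "".
        attr = {key.lower(): (value or "") for key, value in attrs}
        if attr.get("name", "").lower() != "robots":
            return
        for part in attr.get("content", "").split(","):
            part = part.strip().lower()
            if part:
                self.directives.add(part)


def robots_directives(html):
    """Return the set of robots directives declared in the given HTML."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


def is_noindex_follow(html):
    """True if the page asks to be kept out of the index but still followed."""
    directives = robots_directives(html)
    return "noindex" in directives and "nofollow" not in directives


# Hypothetical tag-page markup, assumed for illustration only.
sample = (
    '<html><head><meta name="robots" content="noindex, follow"/>'
    "</head><body>Tag archive</body></html>"
)
print(is_noindex_follow(sample))
```

Paste the source of a real /tag/ page into `sample` (or read it from a saved file) to check your own site. A page with no robots meta tag at all returns False here, which is what you would expect to see before changing the setting.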
HTH
-
Hey MobileDay -- any chance you'd be willing to share the site you're looking at so I can help diagnose the issue? Also, what tool did you use to run the crawl test? What exactly did you crawl?
Related Questions
-
Duplicate content issue
I'm getting duplicate content warnings from Moz for various slideshows on my posts and pages in WordPress. It seems when I create a slideshow it exists as its own page, and as these have no text Moz sees them as duplicates. Here is an example: http://www.weddingphotojournalist.co.uk/?gallery_page=slideshow&pp_gallery_id=1331991312 which Moz says is a duplicate of http://www.weddingphotojournalist.co.uk/?gallery_page=slideshow&pp_gallery_id=1000144730 The second of those two slideshows is on this page, http://www.weddingphotojournalist.co.uk/menorca-wedding/ but also exists as the page above. How can I avoid these being seen as duplicate content?
Moz Pro | | simonatkinsphoto0 -
Duplicate Content
My website is hosted by Hubspot. With each blog I write I can tag it to be listed in a specific category. As an example, one blog article may have three tags or categories that it fits in. SEOmoz is seeing this as a duplication of content. In other words, if you go to the different category pages the same article would be listed on all three pages, even though it is just one article. However, I only have 36 duplicate content warnings and I have 150 blog articles, each having 2 or 3 tags (categories), so there should be many more than 36 duplications. Is this something that affects my SEO, or should I just ignore the problem and check these warnings as fixed? Thanks, Ron
Moz Pro | | Rong0 -
How does SEOmoz pull its duplicate page title and content information?
I ask because I am getting errors based on URLs that do not even exist on our site. For example: http://www.robots.com/applications/abb/panasonic/robots this URL does not even exist for our site, but somehow it is listed in the error section of page title duplication tool. http://www.robots.com/applications/ exists, but there is no place to get to an ABB or a Panasonic robot from this page, not to mention an ABB/Panasonic (which for sure does not exist). ?? We have quite a few of these out there and just wondering how to find out where the link is coming from. When we checked our URLs through Integrity, links like the one listed above (which we had 29 of them listed) that do not show up. Thoughts? Thanks! Janelle
Moz Pro | | jwanner0 -
I'm seeing duplicate links listed in open site explorer. Can anyone explain why this might be happening?
For each link listed in Open Site Explorer, I am seeing two identical entries. You can see why this might be a problem for both workflow and data accuracy. Any ideas why this might be happening? Is there a setting or filter that I'm missing?
Moz Pro | | StephenEggett0 -
Roger keeps telling me my canonical pages are duplicates
I've got a site that's brand spanking new that I'm trying to get the error count down to zero on, and I'm basically there except for this odd problem. Roger got into the site like a naughty puppy a bit too early, before I'd put the canonical tags in, so there were a couple thousand 'duplicate content' errors. I put canonicals in (programmatically, so they appear on every page) and waited a week and sure enough 99% of them went away. However, there's about 50 that are still lingering, and I'm not sure why they're being detected as such. It's an ecommerce site, and the duplicates are being detected on the product page, but why these 50? (there's hundreds of other products that aren't being detected). The URLs that are 'duplicates' look like this according to the crawl report: http://www.site.com/Product-1.aspx http://www.site.com/product-1.aspx And so on. Canonicals are in place, and have been for weeks, and as I said there's hundreds of other pages just like this not having this problem, so I'm finding it odd that these ones won't go away. All I can think of is that Roger is somehow caching stuff from previous crawls? According to the crawl report these duplicates were discovered '1 day ago' but that simply doesn't make sense. It's not a matter of messing up one or two pages on my part either; we made this site to be dynamically generated, and all of the SEO stuff (canonical, etc.) is applied to every single page regardless of what's on it. If anyone can give some insight I'd appreciate it!
Moz Pro | | icecarats0 -
How to resolve Duplicate Content crawl errors for Magento Login Page
I am using the Magento shopping cart, and 99% of my duplicate content errors come from the login page. The URL looks like: http://www.site.com/customer/account/login/referer/aHR0cDovL3d3dy5tbW1zcGVjaW9zYS5jb20vcmV2aWV3L3Byb2R1Y3QvbGlzdC9pZC8xOTYvY2F0ZWdvcnkvNC8jcmV2aWV3LWZvcm0%2C/ Or, the same url but with the long string different from the one above. This link is available at the top of every page in my site, but I have made sure to add "rel=nofollow" as an attribute to the link in every case (it is done easily by modifying the header links template). Is there something else I should be doing? Do I need to try to add canonical to the login page? If so, does anyone know how to do it using XML?
Moz Pro | | kdl01 -
How can I clean up my crawl report from duplicate records?
I am viewing my Crawl Diagnostics Report. My report is filled with data which really shouldn't be there. For example I have a page: http://www.terapvp.com/forums/Ghost/ This is a main forum page. It contains a list of many threads. The list can be sorted on many values. The page is canonicalized, and has been since it was created. My crawl report shows this page listed 15 times. http://www.terapvp.com/forums/Ghost/?direction=asc http://www.terapvp.com/forums/Ghost/?direction=desc http://www.terapvp.com/forums/Ghost/?order=post_date and so forth. Each of those pages uses the same canonicalization reference shared above. I have three questions:
- Why is this data appearing in my crawl report? These pages are properly canonicalized.
- If these pages are supposed to appear in the report for some reason, how can I remove them? My desire is to focus on any pages which may have an issue which needs to be addressed. This site has about 50 forum pages and when you add an extra 15 pages per forum, it becomes a lot harder to locate actionable data. To make matters worse, these forum indexes often have many pages. So if I have a "Corvette" forum there that is 10 pages long, then there will be 150 extra pages just for that particular forum in my crawl report.
- Is there anything I am missing? To the best of my knowledge everything is set up according to the best SEO practices. If there are any other opinions, I would like to hear them.
Moz Pro | | RyanKent0 -
SEOmoz Bot indexing JSON as content
Hello, We have a bunch of pages that contain local JSON we use to display a slideshow. This JSON has a bunch of <a> links in it. For some reason, these <a> links that are in JSON are being indexed and recognized by the SEOmoz bot and showing up as legit links for the page. One example page this is happening on is: http://www.trendhunter.com/trends/a2591-simplifies-product-logos . Searching for the string '<a' yields 1100+ results (all of which are recognized as links for that page in SEOmoz), however, ~980 of these are in JSON code and not actual links on the page. This leads to a lot of invalid links for our site, and a super inflated count of on-page links for the page. Is this a bug in the SEOmoz bot? And if not, does Google work the same way?
Moz Pro | | trendhunter-1598370