Duplicate Content
-
The crawl shows a lot of duplicate content on my site. Most of the URLs it's showing are categories and tags (WordPress).
So what does this mean exactly? Are the categories too similar to other categories? And what is the best way to go about fixing this?
Thanks
-
Greg
Thanks so much for helping out! If you don't mind, I'm just going to correct a few finer details so people don't get anything confused.
"Essentially the tags display the exact content as the original URL so the pages are identical but the URL is different."
It's totally true that this happens, but this is not what causes the duplicate content error in the crawl report. The errors usually come from the sub-pages of a given tag archive all having the same title tag.
"Remove the tags"
By this I'm sure you just mean noindex tags. You don't need to remove them from the site altogether, just remove them from the index.
"If you want the Tags and Categories for user experience, Install Yoast SEO plugin which allows you to insert a canonical URL on the duplicate category pages."
You should leave categories indexed and noindex tags. Yoast does canonicals no matter what, so you don't need to think about them, and they are not what handles duplicate category pages.
Everything else stated is more or less OK, but I just don't want people to be confused.
Thanks again!
-Dan
-
Justin
Sorry to hear of your trouble with making the new settings. For one, my guide on SEOmoz about setting up WordPress for SEO should be helpful. I'd recommend familiarizing yourself with that.
In these cases, the "duplicate content" is usually not the page itself but rather just the title tags.
To see why, imagine you have tag archives like this:
- mydomain.com/tag/pink-elephants/
- mydomain.com/tag/pink-elephants/page/2/
- mydomain.com/tag/pink-elephants/page/3/
Usually the title tags respectively end up being the same:
- Pink Elephants | My Domain
- Pink Elephants | My Domain <-- title tag for page 2
- Pink Elephants | My Domain <-- title tag for page 3
This happens for every single tag "subpage".
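To make the title tag point concrete, here is a minimal Python sketch of how a crawler might flag these pages. It assumes you already have a list of (URL, title tag) pairs from a crawl; the `pages` data and the `find_duplicate_titles` helper are made up for illustration (the `/category/news/` URL is not from the example above), and this is not how any particular crawl tool is actually implemented.

```python
from collections import defaultdict

# Hypothetical crawl output: (URL, title tag) pairs.
# The three tag URLs mirror the example above; the category URL is an assumption.
pages = [
    ("mydomain.com/tag/pink-elephants/", "Pink Elephants | My Domain"),
    ("mydomain.com/tag/pink-elephants/page/2/", "Pink Elephants | My Domain"),
    ("mydomain.com/tag/pink-elephants/page/3/", "Pink Elephants | My Domain"),
    ("mydomain.com/category/news/", "News | My Domain"),
]


def find_duplicate_titles(pages):
    """Return {title: [urls]} for every title shared by more than one URL."""
    by_title = defaultdict(list)
    for url, title in pages:
        by_title[title].append(url)
    return {title: urls for title, urls in by_title.items() if len(urls) > 1}


duplicates = find_duplicate_titles(pages)
for title, urls in duplicates.items():
    print(f"{len(urls)} URLs share the title {title!r}")
```

Only the tag archive and its subpages are reported, because the page content differs from one subpage to the next but the title does not.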
Normally, the protocol would be to:
- Noindex subpages
- Noindex tags
- Noindex dated archives
- Disable author archives (single author blog only)
- Index categories
You can still link to tag pages and use tags within the site all you want, but you just don't want to index them.
These are just default settings. It's impossible to know exactly what you should be doing without seeing your site, but I hope all of that gets you going in the right direction!
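As a rough sketch, the defaults listed above can be written as a small decision function in Python. The `robots_meta` name, the archive type strings, and the choice of "noindex, follow" (rather than "noindex, nofollow") are assumptions made for illustration. In practice you would set these options in your SEO plugin rather than write code.

```python
NOINDEX = "noindex, follow"  # assumption: links on the page may still be followed
INDEX = "index, follow"


def robots_meta(archive_type, page_number=1, single_author=True):
    """Return the robots meta content for an archive page.

    Returns None when the archive should be disabled entirely.
    archive_type is one of "category", "tag", "date", "author" (hypothetical labels).
    """
    if archive_type == "author" and single_author:
        return None  # author archives are disabled on a single-author blog
    if page_number > 1:
        return NOINDEX  # subpages (/page/2/, /page/3/, ...) are noindexed
    if archive_type in ("tag", "date"):
        return NOINDEX  # tag and dated archives are noindexed
    # Categories stay indexed. Multi-author author archives also land here,
    # which is an assumption, since the list above does not cover that case.
    return INDEX


print(robots_meta("category"))                # index, follow
print(robots_meta("category", page_number=2)) # noindex, follow
print(robots_meta("tag"))                     # noindex, follow
print(robots_meta("author"))                  # None
```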
-Dan
-
You should only no-follow your tags and archives, not your categories.
In the plugin settings, under permalinks, there is an option: "Strip the category base (usually /category/) from the category URL." This will just stop the duplicate pages from appearing. Blocking the categories must have caused the drop.
Greg
-
Changed to Yoast. I ticked no-follow on archives, categories, and tags. One hour later, the website went from #7 to page four.
-
Well, the duplicate content alone is causing issues. Google does not like duplicate pages at all.
If you select which are your primary pages and tell Google to ignore the rest, it can only help your ranking.
With the Yoast SEO plugin, all you need to do is set tags to no-follow and no-index, and also strip the category from the URL. (It redirects automatically as well.)
Greg
-
Thanks for the reply. Would this affect ranking, or can it be left alone?
-
WordPress does this when you use tags.
Essentially the tags display the exact content as the original URL so the pages are identical but the URL is different.
Two options that I can think of:
1.) Remove the tags and strip the category segment in the URL and stop using them in future. This will require redirects from the duplicate URLs to the main article (this will take planning and a lot of time, and is quite complicated).
2.) If you want the Tags and Categories for user experience, Install Yoast SEO plugin which allows you to insert a canonical URL on the duplicate category pages. This tells Google where the original page can be found. Tags are only there for user experience, so you can set these to no-follow and no-index.
Greg