Duplicate URL errors when URLs are unique
-
Hi All,
I'm running through the Moz Analytics site crawl report, and it is showing numerous duplicate URL errors, but the URLs appear to be unique. I see that most of each URL is the same, but shouldn't the different brands make them unique to one another?
http://www.sierratradingpost.com/clearance~1/clothing~d~5/tech-couture~b~33328/
http://www.sierratradingpost.com/clearance~1/clothing~d~5/zobha~b~3072/
Any ideas as to why these would be shown as duplicate URL errors?
-
There is a long article on the dev blog about how they determine whether pages are duplicates: https://moz.com/devblog/near-duplicate-detection/. It's quite technical, but this is the part that might interest you:
"This leads to one of the questions we get asked a lot: Why do I see duplicate content warnings in the context of Custom Crawl for pages that I see as different. Ultimately, it’s always because of the same reason: because no dechroming is done, there is a small amount of unique content relative to the total content. One of the places where this crops up a lot is web stores, where there’s a large amount of chrome layout, but only a short product description associated with it."
Dechroming: removing things like the navigation, footer, etc. from the page (the exact definition is in the article).
If you compare both pages, apart from the image and product title there isn't much difference between them, so the crawler sees only a very small percentage of content that differs and marks them as duplicates.
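To make the effect concrete, here is a rough sketch in Python. It is not Moz's actual algorithm (the article describes their real approach); it just compares word shingles with Jaccard similarity, and the page text, the `chrome` string, and the function names are all made up for illustration. It shows how a large block of shared layout text can swamp a short unique part:

```python
def shingles(text, k=3):
    """Return the set of k-word shingles (overlapping word windows) in text."""
    words = text.lower().split()
    return {" ".join(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets: size of intersection over size of union."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


# Hypothetical page text: 200 placeholder words stand in for the shared
# navigation, filters and footer ("chrome"); only the last few words differ.
chrome = " ".join(f"nav{i}" for i in range(200))
unique_a = "tech couture clearance clothing"
unique_b = "zobha clearance clothing"

# Comparison of the full pages, chrome included (no dechroming).
similarity = jaccard(shingles(chrome + " " + unique_a),
                     shingles(chrome + " " + unique_b))
print(f"similarity with chrome: {similarity:.2f}")

# Comparison of the unique parts only (what dechroming would leave).
dechromed = jaccard(shingles(unique_a), shingles(unique_b))
print(f"similarity without chrome: {dechromed:.2f}")
```

With the shared text included, the two made-up pages score above 0.9; with it stripped, the short unique parts share no shingles at all. Adding more unique content per page (longer descriptions, brand copy) is what would pull the first number down.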
Dirk