Duplicate content warning: same page but different URLs?
-
Hi guys, a friend of mine has a site, and when I tested it with Moz I noticed there are 80 duplicate content warnings. For instance:
Page 1 is http://yourdigitalfile.com/signing-documents.html
The warning page is http://www.yourdigitalfile.com/signing-documents.html
Another example:
Page 1 is http://www.yourdigitalfile.com/
The same page at a second URL is http://yourdigitalfile.com
I noticed that the whole website is like this: nearly every page has another version at a different URL. Any ideas why the dev would do this? Also, the pages that received the warnings are not redirected to the newer pages; you can go to either one.
Thanks very much
-
Thanks Tim. Do you have any examples of what those problems might be? With such a large catalog, managing those rel canonical tags will be difficult (I don't even know if the store allows them; it's a hosted store solution and little code customization is allowed).
-
Hi there AspenFasteners, in this instance, rather than an .htaccess rule, I would suggest applying a rel canonical tag that points to the page you deem to be the original master source.
Using robots.txt to try to hide things could potentially cause you more issues, as your categories may struggle to be indexed correctly.
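As a rough illustration of what such a tag looks like, here is a minimal Python sketch that builds one. It assumes, purely for the example, that the 8314.htm URL from the question is the one chosen as the master; the helper name `canonical_link` is hypothetical, and whether the hosted store lets you place the tag in the page head is a separate question.

```python
from html import escape


def canonical_link(master_url):
    """Build the <link> element that goes in the <head> of every duplicate page."""
    return '<link rel="canonical" href="{}">'.format(escape(master_url, quote=True))


# Assumption for illustration: 8314.htm is treated as the master version.
tag = canonical_link("http://www.aspenfasteners.com/Self-Tapping-Sheet-Metal-s/8314.htm")
print(tag)
```

The same tag would be placed on both 8314.htm and 8315.htm, so that either navigation path reports the same master URL.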
-
We have a similar problem, but much more complex to handle as we have a massive catalog of 80,000 products and growing.
The problem occurs legitimately because our catalog is so large that we offer different navigation paths to the same content.
http://www.aspenfasteners.com/Self-Tapping-Sheet-Metal-s/8314.htm
http://www.aspenfasteners.com/Self-Tapping-Sheet-Metal-s/8315.htm
(If you look at the "You are here" breadcrumb trail, you will see the subtle differences in the navigation paths: with 8314.htm, the user went through Home > Screws; with 8315.htm, via Home > Security Fasteners > Screws.)
Our hosted web store does not give us access to .htaccess, so I am thinking of excluding the redundant navigation points via robots.txt.
My question: is there any reason NOT to do this?
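For reference, this is roughly what such an exclusion would do, sketched with Python's standard `urllib.robotparser`. The `Disallow` rule below is an assumption made up for illustration (it treats 8315.htm as the redundant path); it is not taken from the actual site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content: block only the redundant navigation path.
robots_lines = [
    "User-agent: *",
    "Disallow: /Self-Tapping-Sheet-Metal-s/8315.htm",
]

parser = RobotFileParser()
parser.parse(robots_lines)

master = "http://www.aspenfasteners.com/Self-Tapping-Sheet-Metal-s/8314.htm"
redundant = "http://www.aspenfasteners.com/Self-Tapping-Sheet-Metal-s/8315.htm"

master_allowed = parser.can_fetch("*", master)
redundant_allowed = parser.can_fetch("*", redundant)
print("master crawlable:", master_allowed)
print("redundant crawlable:", redundant_allowed)
```

Note that a rule like this only stops compliant crawlers from fetching the page; it does not tell them which URL is the master version.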
-
Oh, OK.
The only reason I was thinking it is duplicate content is the warnings I got on the Moz crawl; see below.
75 Duplicate Page Content
6 4xx Client Error
5 Duplicate Page Title
44 Missing Meta Description Tag
5 Title Element is Too Short
I have found over 80 typos, grammatical errors, punctuation errors and pieces of incorrect information, which led me to believe the quality of the work and their attention to detail was rather poor, which is why I thought this was a possibility.
Thanks again for your time,
it's really appreciated.
-
I wouldn't say that they have created two pages. It is just that you have two versions of the domain (www and non-www) and have not set a preferred version, so the site is getting indexed twice. .htaccess changes are under the hood of the website, and this could simply have been an oversight.
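To see how much of a crawl report comes down to this www/non-www split, a quick script can group URLs that differ only by the `www.` prefix. This is a minimal sketch: the URL list is just the examples from the question, and the function names are made up for illustration.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def strip_www(host):
    """Lower-case the host and drop a leading 'www.' if present."""
    host = host.lower()
    return host[4:] if host.startswith("www.") else host


def find_host_duplicates(urls):
    """Group URLs that are identical apart from the www prefix."""
    groups = defaultdict(set)
    for url in urls:
        parts = urlsplit(url)
        # Treat an empty path the same as "/".
        key = (strip_www(parts.netloc), parts.path or "/")
        groups[key].add(url)
    return {key: sorted(group) for key, group in groups.items() if len(group) > 1}


crawled = [
    "http://yourdigitalfile.com/signing-documents.html",
    "http://www.yourdigitalfile.com/signing-documents.html",
    "http://www.yourdigitalfile.com/",
    "http://yourdigitalfile.com",
]

duplicates = find_host_duplicates(crawled)
for (host, path), variants in duplicates.items():
    print(host + path, "->", variants)
```

If nearly every flagged page shows up as a pair like this, the warnings point to a missing redirect rather than to pages that were deliberately built twice.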
-
Hey Tim
Thanks for your answer. It's really weird. Other than laziness on the dev's part in not removing old or previous versions of pages, have you any idea why they would create multiple versions of the same page with different URLs? Is there any legitimate reason, like one version serving mobile or something?
Just wondering.
Thanks for replying
-
OK, so in this instance the only issue you have is that you need to choose your preferred start point: www or non-www.
I would add a bit of code to your .htaccess file to point to your preferred choice. I personally prefer a www. domain. Something like the below would work.
# Redirect the bare domain to the www. version with a permanent (301) redirect
RewriteEngine On
RewriteCond %{HTTP_HOST} ^example\.com$ [NC]
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]
As your site is already indexed, I would also, for the time being and as more of a safety measure, add canonicals to the pages that point to the www. version of your site.
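To make the effect of the rule above concrete, here is a small Python sketch that mirrors its logic. It is only an illustration of which requests get redirected, not how Apache evaluates the rule internally; the function name is hypothetical, and the case-insensitive host match is an assumption (in .htaccess that behaviour needs the `[NC]` flag on the condition).

```python
import re

# Same pattern as the RewriteCond: the bare domain only, dots escaped.
NON_WWW_HOST = re.compile(r"^example\.com$", re.IGNORECASE)


def redirect_target(host, path):
    """Return the 301 target for a request, or None if the rule would not fire."""
    if NON_WWW_HOST.match(host):
        return "http://www.example.com/" + path.lstrip("/")
    return None


print(redirect_target("example.com", "/signing-documents.html"))
print(redirect_target("www.example.com", "/signing-documents.html"))
```

Requests that already use the www. host fall through untouched, so the rule cannot cause a redirect loop.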
Also, if you have a Google Search Console account, you can select your preferred domain prefix in there. This will again help with your indexation.
Hopefully I have covered most things.
Cheers
Tim