Duplicate content warning: Same page but different urls???
-
Hi guys i have a friend of mine who has a site i noticed once tested with moz that there are 80 duplicate content warnings, for instance
Page 1 is http://yourdigitalfile.com/signing-documents.html
the warning page is http://www.yourdigitalfile.com/signing-documents.html
another example
Page 1 http://www.yourdigitalfile.com/
same second page http://yourdigitalfile.com
i noticed that the whole website is like the nealry every page has another version in a different url?, any ideas why they dev would do this, also the pages that have received the warnings are not redirected to the newer pages you can go to either one???
thanks very much
-
Thanks Tim. Do you have any examples of what those problems might be? With such a large catalog managing those rel canonical tags will be difficult (I don't even know if the store allows them, it's a hosted store solution and little code customization is allowed).
-
Hi there AspenFasteners, in this instance rather than a .HTAccess rule I would suggest applying a rel canonical tag which points to the page you deem as the original master source.
Using the robots to try and hide things could potentially cause you more issues as your categories may struggle to be indexed correctly.
-
We have a similar problem, but much more complex to handle as we have a massive catalog of 80,000 products and growing.
The problem occurs legitimately because our catalog is so large that we offer different navigation paths to the same content.
http://www.aspenfasteners.com/Self-Tapping-Sheet-Metal-s/8314.htm
http://www.aspenfasteners.com/Self-Tapping-Sheet-Metal-s/8315.htm
(If you look at the "You are here" breadcrumb trail, you will see the subtle differences in the navigation paths, with 8314.htm, the user went through Home > Screws, with 8315.htm, via Home > Security Fasteners > Screws).
Our hosted web store does not offer us htaccess, so I am thinking of excluding the redundant navigation points via robots.txt.
My question: is there any reason NOT to do this?
-
Oh ok
The only reason i was thinking it is duplicate content is the warnings i got on the moz crawl, see below.
75 Duplicate Page Content
6 4xx Client Error
5 Duplicate Page Title
44 Missing Meta Description Tag
5 Title Element is Too Short
I have found over 80 typos, grammatical errors, punctuation errors and incorrect information which was leading me to believe the quality of the work and their attention to detail was rather bad, which is why i thought this was a possibility.
Thanks again for your time
its really appreciated
-
I wouldn't say that they have created two pages, it is just that because you have two versions of the domain and not set a preferred version that you are getting it indexing twice. .HTaccess changes are under the hood of the website and could have simply been an oversight.
-
Hey Tim
Thanks for your answer. It's really weird, other than lazyness on the devs part not to remove old or previous versions of pages?, have you any idea why they would create multiple versions of the same page with different url's?? is there any legit reason like ones severs mobile or something??
Just wondering
thanks for replying
-
OK, so in this instance the only issue you have is that you need to choose your preferred start point - www or non www.
I would add a bit of code to your htaccess file to point to your preferred choice. I personally prefer a www. domain. Something like the below would work.
RewriteCond %{HTTP_HOST} ^example.com$
RewriteRule (.*) http://www.example.com/$1 [R=301,L]As your site is already indexed I would also for the time being and as more of a safety measure add canonicals to the pages that point to the www. version of your site.
Also if you have a Google Search Console account, you can select your prefered domain prefix in there. this will again help with your indexation.
Hopefully I have covered most things.
Cheers
Tim
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Is this campaign of spammy links to non-existent pages damaging my site?
My site is built in Wordpress. Somebody has built spammy pharma links to hundreds of non-existent pages. I don't know whether this was inspired by malice or an attempt to inject spammy content. Many of the non-existent pages have the suffix .pptx. These now all return 403s. Example: https://www.101holidays.co.uk/tazalis-10mg.pptx A smaller number of spammy links point to regular non-existent URLs (not ending in .pptx). These are given 302s by Wordpress to my homepage. I've disavowed all domains linking to these URLs. I have not had a manual action or seen a dramatic fall in Google rankings or traffic. The campaign of spammy links appears to be historical and not ongoing. Questions: 1. Do you think these links could be damaging search performance? If so, what can be done? Disavowing each linking domain would be a huge task. 2. Is 403 the best response? Would 404 be better? 3. Any other thoughts or suggestions? Thank you for taking the time to read and consider this question. Mark
White Hat / Black Hat SEO | | MarkHodson0 -
Page drops from index completely
We have a page that is ranking organically at #1 but over the past couple of months the page has twice dropped from a search term entirely. There don't appear to be any issues with the page in Search Console and adding the page on https://www.google.com/webmasters/tools/submit-url seems to fix the issue. The search term we're tracking that drops is in the URL for the page and is the h1 of the page. Here is a screenshot of the ranking over the past few months: https://jmp.sh/akvaKGF What could cause this to happen? There is nothing in search console that shows any problems with the page. The last time this happened the page completely dropped on all search terms and showed up again after submitting the url to google manually. This time it dropped on just one search term and reappeared the next day after manually submitting the page again. akvaKGF
White Hat / Black Hat SEO | | russell_ms0 -
How to improve PA of Shortened URLs
Why some of shortened urls like bitly/owly/googl has PA>40? I tried everything to improve PA of my shortened urls like facebook shares, retweets and backlinks to them but still i have PA-1. Checkout this URL: https://mza.bundledseo.com/blog/state-of-links in MOZ OSE and you will many 301 links from shortners
White Hat / Black Hat SEO | | igains
I asked many seo experts about this but no one answered this question so today subscribed MOZ pro for the solution. Please give me the answer.0 -
What can I put on a 404 page?
When it comes to SEO what can I put on a 404 page? I want to add content that actually makes the page useful so visitors will more likely stay on the website. Most pages just have a big image of 404 and a couple sentences saying what happened. I am wondering if Google would like if there was blog suggestions or navigational functions?
White Hat / Black Hat SEO | | JoeyGedgaud0 -
How can I 100% safe get some of my keywords ranking on second & third page?
Hi, I want to know how can I rank some of my keywords which are in the second and third page on google on page one 100% save, so it will pass all penguin, pandas etc as quick as possible? Kind Regards
White Hat / Black Hat SEO | | rodica70 -
Glossary pages - keyword stuffing danger?
I've put together a glossary of terms related to my industry that have SEO value and am planning on building out a section on our site with unique pages for each term. However, most of these terms have synonyms or are highly similar to other valuable terms. If I were to make a glossary, and on each page (that will have high-quality, valuable, and accurate definitions and more), wrote something like "{term}, also commonly referred to as {synonym}, {synonym}," would I run the risk of keyword stuffing penalties? My only other idea beyond creating a glossary with separate pages defining each synonym is to use schema.org markup to add synonyms to the HTML of the page, but that could be seen as even more grey-hat type keyword stuffing. I guess one other option would be to work the synonyms into the definition so that the presence of the keyword reads more organically. Thanks!
White Hat / Black Hat SEO | | alecfwilson0 -
Looking for a Way to Standardize Content for Thousands of Pages w/o Getting Duplicate Content Penalties
Hi All, I'll premise this by saying that we like to engage in as much white hat SEO as possible. I'm certainly not asking for any shady advice, but we have a lot of local pages to optimize :). So, we are an IT and management training course provider. We have 34 locations across the US and each of our 34 locations offers the same courses. Each of our locations has its own page on our website. However, in order to really hone the local SEO game by course topic area and city, we are creating dynamic custom pages that list our course offerings/dates for each individual topic and city. Right now, our pages are dynamic and being crawled and ranking well within Google. We conducted a very small scale test on this in our Washington Dc and New York areas with our SharePoint course offerings and it was a great success. We are ranking well on "sharepoint training in new york/dc" etc for two custom pages. So, with 34 locations across the states and 21 course topic areas, that's well over 700 pages of content to maintain - A LOT more than just the two we tested. Our engineers have offered to create a standard title tag, meta description, h1, h2, etc, but with some varying components. This is from our engineer specifically: "Regarding pages with the specific topic areas, do you have a specific format for the Meta Description and the Custom Paragraph? Since these are dynamic pages, it would work better and be a lot easier to maintain if we could standardize a format that all the pages would use for the Meta and Paragraph. For example, if we made the Paragraph: “Our [Topic Area] training is easy to find in the [City, State] area.” As a note, other content such as directions and course dates will always vary from city to city so content won't be the same everywhere, just slightly the same. It works better this way because HTFU is actually a single page, and we are just passing the venue code to the page to dynamically build the page based on that venue code. So they aren’t technically individual pages, although they seem like that on the web. If we don’t standardize the text, then someone will have to maintain custom text for all active venue codes for all cities for all topics. So you could be talking about over a thousand records to maintain depending on what you want customized. Another option is to have several standardized paragraphs, such as: “Our [Topic Area] training is easy to find in the [City, State] area. Followed by other content specific to the location
White Hat / Black Hat SEO | | CSawatzky
“Find your [Topic Area] training course in [City, State] with ease.” Followed by other content specific to the location Then we could randomize what is displayed. The key is to have a standardized format so additional work doesn’t have to be done to maintain custom formats/text for individual pages. So, mozzers, my question to you all is, can we standardize with slight variations specific to that location and topic area w/o getting getting dinged for spam or duplicate content. Often times I ask myself "if Matt Cutts was standing here, would he approve?" For this, I am leaning towards "yes," but I always need a gut check. Sorry for the long message. Hopefully someone can help. Thank you! Pedram1 -
Article Re-posting / Duplication
Hi Mozzers! Quick question for you all. This is something I've been unsure of for a while. But when a guest post you've written goes live on someone's blog. Is it then okay it post the same article to your own blog as well as Squidoo for example? Would the search engines still see it as duplication if I have a link back to the original?
White Hat / Black Hat SEO | | Webrevolve0