Multiple Instances of the Same Article
-
Hi, I have a problem with duplicate article URLs that I cannot solve.
As you will see from the attached images, I have a page with multiple variants of the same URL in the Google index, as well as duplicate title tags reported in Search Console (Webmaster Tools). For several months I have been using canonical link tags to resolve the issue, i.e. declaring that all variants point to a single URL, but the problem remains. It's not just old articles that stay like that: even new articles show the same behaviour as soon as they are published, even though they are served with correct canonical links and sitemap entries, as you will see from the example below.
Example URLs from the attached image
-
All URLs belonging to the same article ID have the same canonical link inside the HTML head.
-
Because I have a separate mobile site, I also include in every desktop URL an "alternate" link to the mobile site.
-
On the mobile version of the site, I have another canonical link pointing back to the original desktop URL. So the mobile version of the article also has
-
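To make the setup described above concrete, here is a minimal Python sketch of the three link tags involved. The URL patterns and the media query are taken from the sitemap entry in this post; the function names are hypothetical, and the real site may well emit these tags differently.

```python
from html import escape

# URL patterns as they appear in the sitemap entry in this post.
DESKTOP_URL = "http://www.neakriti.gr/?page=newsdetail&DocID={doc_id}"
MOBILE_URL = "http://mobile.neakriti.gr/fullarticle.php?docid={doc_id}"
MOBILE_MEDIA = "only screen and (max-width: 640px)"


def desktop_head_links(doc_id):
    """Tags that every desktop variant of an article would carry (hypothetical helper)."""
    canonical = escape(DESKTOP_URL.format(doc_id=doc_id), quote=True)
    mobile = escape(MOBILE_URL.format(doc_id=doc_id), quote=True)
    return [
        # Same canonical target, whichever desktop variant was requested.
        '<link rel="canonical" href="%s">' % canonical,
        # Pointer from the desktop page to its mobile counterpart.
        '<link rel="alternate" media="%s" href="%s">' % (MOBILE_MEDIA, mobile),
    ]


def mobile_head_links(doc_id):
    """Tag the mobile page would carry, pointing back to the desktop canonical."""
    canonical = escape(DESKTOP_URL.format(doc_id=doc_id), quote=True)
    return ['<link rel="canonical" href="%s">' % canonical]


for tag in desktop_head_links(1300357) + mobile_head_links(1300357):
    print(tag)
```

Note that the `&` in the query string is written as `&amp;` inside the attribute value, which is what `escape` takes care of here.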
Now, when it comes to the XML sitemap, I list only the canonical URL and none of the other possible variants (to avoid multiple indexing), and I also point to the mobile version of the article.
<url>
<loc>http://www.neakriti.gr/?page=newsdetail&amp;DocID=1300357</loc>
<xhtml:link rel="alternate" media="only screen and (max-width: 640px)" href="http://mobile.neakriti.gr/fullarticle.php?docid=1300357" />
<lastmod>2016-02-20T21:44:05Z</lastmod>
<priority>0.6</priority>
<changefreq>monthly</changefreq>
<image:image>
<image:loc>http://www.neakriti.gr/NewsASSET/neakriti-news-image.aspx?Doc=1300297</image:loc>
<image:title>ΟΦΗ</image:title>
</image:image>
</url>
The above Sitemap snippet Source: http://www.neakriti.gr/WebServices/sitemap.aspx?&year=2016&month=2
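For reference, a minimal Python sketch of how an entry like the one above could be generated so that it is guaranteed to be well-formed XML. This is only an illustration: the site itself runs on ASP.NET, the function and parameter names are hypothetical, and the namespace URIs are the standard sitemap, XHTML and Google image-sitemap ones.

```python
import xml.etree.ElementTree as ET

# Namespaces for the sitemap itself plus the mobile-alternate and image extensions.
NS = {
    "": "http://www.sitemaps.org/schemas/sitemap/0.9",
    "xhtml": "http://www.w3.org/1999/xhtml",
    "image": "http://www.google.com/schemas/sitemap-image/1.1",
}
for prefix, uri in NS.items():
    ET.register_namespace(prefix, uri)


def q(prefix, tag):
    """Return the namespace-qualified tag name that ElementTree expects."""
    return "{%s}%s" % (NS[prefix], tag)


def build_url_entry(loc, mobile_href, lastmod, image_loc, image_title):
    """Build one <url> element (hypothetical helper, not the site's real code)."""
    url = ET.Element(q("", "url"))
    ET.SubElement(url, q("", "loc")).text = loc
    # The alternate link is an empty element; nothing is nested inside it.
    ET.SubElement(url, q("xhtml", "link"), {
        "rel": "alternate",
        "media": "only screen and (max-width: 640px)",
        "href": mobile_href,
    })
    ET.SubElement(url, q("", "lastmod")).text = lastmod
    image = ET.SubElement(url, q("image", "image"))
    ET.SubElement(image, q("image", "loc")).text = image_loc
    ET.SubElement(image, q("image", "title")).text = image_title
    return url


entry = build_url_entry(
    "http://www.neakriti.gr/?page=newsdetail&DocID=1300357",
    "http://mobile.neakriti.gr/fullarticle.php?docid=1300357",
    "2016-02-20T21:44:05Z",
    "http://www.neakriti.gr/NewsASSET/neakriti-news-image.aspx?Doc=1300297",
    "ΟΦΗ",
)
print(ET.tostring(entry, encoding="unicode"))
```

Building the entry through an XML library, instead of concatenating strings, means the `&` in the URL is escaped automatically and every element is properly closed. The priority and changefreq elements are left out here only to keep the sketch short.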
The main sitemap of the website: http://www.neakriti.gr/WebServices/sitemap-index.aspx
Despite my efforts, you can see that Webmaster Tools reports three variants of the desktop URL, and Google search reports four URLs (three different desktop variants and the mobile URL).
This is what I get when I search Google for the article code to see what is indexed: site:neakriti.gr 1300297
So far I believe I have done all I could to resolve the issue: canonical links, alternate links, and a correct sitemap.xml entry. I don't know what else to do... This was done several months ago and there has been absolutely no improvement.
Here is a more recent example, an article added 5 days ago (10 April 2016). Just type
site:neakriti.gr 1300357
into Google search and you will see the variants of the same article in the Google cache. Open a cached page and you will see that it contains the canonical link, but Google doesn't follow the directive given there.
Please help!
-
-
Hi all,
Sorry for the delay. I am away on a business trip, which is why I stopped communicating over the past few days.
I can confirm that the latest entries (those after March) appear as a single instance.
However, there are some minor exceptions, like the one here.
Example of a recent article indexed under both the desktop URL (even though this desktop URL is not the canonical one) and the mobile URL:
https://www.google.gr/search?q=site:neakriti.gr&biw=1527&bih=899&source=lnms&sa=X&ved=0ahUKEwiIxODGt5_MAhUsKpoKHdcUAkYQ_AUIBigA&dpr=1.1#q=site:neakriti.gr+1315539&tbs=qdr:w&filter=0
Also, I noticed that with the "alternate" and "canonical" links in place, the mobile version of the site no longer gets indexed (with minor exceptions like the one above).
-
Hi Ioannis!
How's this going? We'd love an update.
-
Hmm, interestingly, when I followed your link, I only saw the canonical version of the article. Is this what you're seeing now?
Also, in response to your earlier question, yes, you can disallow parameters with robots.txt. If these canonical issues continue, that may be the best next step.
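On the question of regular expressions: as far as I know, robots.txt does not support full regular expressions, but Google documents two wildcards, `*` (any sequence of characters) and `$` (end of the URL). The Python sketch below is a simplified illustration of how such patterns would match the parameter URLs discussed in this thread. The Disallow patterns and the sample paths are hypothetical examples, and the matcher ignores Allow rules and rule precedence, so treat it as a rough check and not as Google's actual implementation.

```python
import re

# Hypothetical Disallow patterns for the "page" parameter values mentioned in this thread.
DISALLOW_PATTERNS = [
    "/*page=printerfriendly",
    "/*page=newsdetailsports",
]


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern using * and $ into a compiled regex."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape everything literally, then let each * stand for "any characters".
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_disallowed(path_and_query, patterns=DISALLOW_PATTERNS):
    """True if any pattern matches from the start of the path (simplified check)."""
    return any(pattern_to_regex(p).match(path_and_query) for p in patterns)


for path in (
    "/?page=printerfriendly&DocID=1300357",
    "/default.aspx?page=newsdetailsports&DocID=1300357",
    "/?page=newsdetail&DocID=1300357",
):
    print(path, is_disallowed(path))
```

One thing to keep in mind before going this route: a URL that is blocked in robots.txt is not crawled, so a canonical link or noindex meta tag on that page will no longer be seen by the crawler.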
-
Thank you for your response, I will take a look at this.
However, I have a few questions regarding your suggestion:
- Since I have canonical links on the loaded page, doesn't that resolve the issue?
- The printerfriendly variation has a noindex meta tag in the head. Shouldn't that be taken into account?
- Can I put regular expressions in my robots.txt? How can I block URL parameters? printerfriendly and newsdetailsports are both values of the "page" GET parameter.
In fact, the printerfriendly page contains both a canonical link and a noindex meta tag, to tell search engines not to index the content and to let them know where the original content lives.
-
Hi there
The printer-friendly URL comes from the "print this article" button (attached), and the /default.aspx URL comes from the ^ TOP button (attached).
What you could do is use your robots.txt to block these URLs. You can also tell Google which URL parameters to ignore, but please be EXTREMELY careful doing this. It's a hatchet, not a fine-tooth comb.
Let me know if you have any questions or comments, good luck!
Patrick