20 000 duplicates in Moz crawl due to Joomla URL parameters. How to fix?
-
We have a massive duplicate content problem in Joomla. Here is an example of the "base" URL: http://www.binary-options.biz/index.php/Web-Pages/binary-options-platforms.html
For some reason Joomla creates many versions of this URL, for example:
or
So it adds the URL parameter ?q= and then repeats part of the preceding URL. This leads to tens of thousands of duplicate pages on our content-heavy site.
Any ideas how to fix this? Thanks so much!
-
These are caused by the links to your language pages. If you open one of the language links from within the source code (not on the page), it redirects to a URL with '?q=/index.php/Web-Pages/binary-options-platforms.html' added. If you then open the same language link on that page, it redirects again, to another URL with the previous one added to the end:
?q=/index.php/Web-Pages/binary-options-platforms.html?q=/index.php/Web-Pages/binary-options-platforms.html
For example, on the example page, view source, search for "German" and click the link below:
This link 301 redirects to:
http://www.binary-options.biz/index.php/Web-Pages/binary-options-platforms.html?q=/index.php/Web-Pages/binary-options-platforms.html
Then, if you view source, search for "German" and click the link again:
This link 301 redirects to:
So basically, every time a web crawler follows a language link, a new URL is created with the previous URL added to the end. The crawl never ends, because each new URL contains a language link that produces yet another new URL.
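To make the loop concrete, here is a minimal Python sketch of the behaviour described above. It assumes that each click on a language link redirects to the current URL with '?q=' plus the page path appended, which is inferred from the redirects observed here and not from the Joomla or plugin code. The function names `follow_language_link`, `crawl_variants` and `base_url` are made up for illustration.

```python
from urllib.parse import urlsplit

BASE = "http://www.binary-options.biz/index.php/Web-Pages/binary-options-platforms.html"


def follow_language_link(url: str, base: str = BASE) -> str:
    """Model one click on a language link (assumed behaviour):
    the redirect target is the current URL plus '?q=' and the page path."""
    path = urlsplit(base).path
    return f"{url}?q={path}"


def crawl_variants(base: str, clicks: int) -> list[str]:
    """Return the URLs a crawler would discover after `clicks` follows."""
    variants = []
    url = base
    for _ in range(clicks):
        url = follow_language_link(url, base)
        variants.append(url)
    return variants


def base_url(url: str) -> str:
    """Collapse a generated variant back to the base page by cutting at the first '?'."""
    return url.split("?", 1)[0]


variants = crawl_variants(BASE, 3)
for v in variants:
    print(len(v), v)

# Every variant is a distinct URL, yet all of them collapse to the same page.
print(len(set(variants)), {base_url(v) for v in variants})
```

Each pass yields a longer, distinct URL for the same page, which is why the duplicate count in a crawl report keeps growing. Grouping a crawl export by something like `base_url` is one way to check how many of the reported duplicates come from this pattern.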
I don't think this is connected with Joomla SEF, as Chris suggested, because your URLs are already SEF.
However, it is not easy to say how to fix the issue with the language links. You should probably speak to the developer who implemented them and/or to the creator of the plugin, if it is a plugin.
Also, do you even need this functionality? None of the language links work; they just redirect back to the main site.
-
Your URL structure does not look right. Can you please try this fix and update me?
http://docs.joomla.org/Enabling_Search_Engine_Friendly_(SEF)_URLs_on_Apache