20 000 duplicates in Moz crawl due to Joomla URL parameters. How to fix?
-
We have a massive duplicate content problem in Joomla. Here is an example of the "base" URL: http://www.binary-options.biz/index.php/Web-Pages/binary-options-platforms.html
For some reason Joomla creates many versions of this URL, for example:
or
So it appends the URL parameter ?q= and then repeats part of the preceding URL. This leads to tens of thousands of duplicate pages on our content-heavy site.
Any ideas how to fix this? Thanks so much!
-
These are caused by the links to your language pages. If you click one of the language links from within the source code (not on the page), it redirects to a URL with '?q=/index.php/Web-Pages/binary-options-platforms.html' added. Then, if you click the same language link on that page, it redirects again to another page with the previous URL added to the end:
?q=/index.php/Web-Pages/binary-options-platforms.html?q=/index.php/Web-Pages/binary-options-platforms.html
For example:
On the example page view source, search for German and click the link below:
This link 301 redirects to:
http://www.binary-options.biz/index.php/Web-Pages/binary-options-platforms.html?q=/index.php/Web-Pages/binary-options-platforms.html
Then, if you view source, search for German and click the link again:
This link 301 redirects to:
So basically, every time a web crawler follows a language link, a new URL is created with the previous URL added to the end. This causes a never-ending crawl, because there is always another new URL to discover.
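To make the pattern concrete, here is a minimal Python sketch of what the crawler appears to run into. It is only a model reconstructed from the URLs quoted above, not the site's actual code: `follow_language_link`, `strip_q_chain`, and `crawl` are hypothetical helper names, and the assumption is that each hop simply appends `?q=` plus the page path.

```python
from typing import List
from urllib.parse import urlsplit

BASE = "http://www.binary-options.biz/index.php/Web-Pages/binary-options-platforms.html"


def follow_language_link(url: str) -> str:
    """Model one hop: the page path is appended again as ?q=<path>.

    Assumption based on the redirects described in this thread.
    """
    path = urlsplit(BASE).path
    return f"{url}?q={path}"


def strip_q_chain(url: str) -> str:
    """Collapse a variant back to the base URL by cutting at the first '?q='."""
    head, _sep, _tail = url.partition("?q=")
    return head


def crawl(start: str, hops: int) -> List[str]:
    """Follow the language link `hops` times and collect every URL produced."""
    seen = []
    url = start
    for _ in range(hops):
        url = follow_language_link(url)
        seen.append(url)
    return seen


if __name__ == "__main__":
    variants = crawl(BASE, 3)
    for variant in variants:
        # Each hop yields a longer, distinct URL for the same content.
        print(len(variant), variant)
    # Every variant collapses to the same base page.
    print({strip_q_chain(v) for v in variants} == {BASE})
```

Every hop produces a URL the crawler has not seen before, yet all of them reduce to the same base page, which is why the duplicate count keeps growing.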
I don't think this is connected with the Joomla SEF settings that Chris pointed to, as your URLs are already SEF.
However, it's not easy to identify how to fix the issue with the language links. You should probably speak to the developer who implemented them and/or the creator of the plugin, if it is a plugin.
Also, do you even need this functionality? None of the language links work; they just redirect back to the main site.
-
Your URL structure is surely not fine. Can you please try this fix and update me?
http://docs.joomla.org/Enabling_Search_Engine_Friendly_(SEF)_URLs_on_Apache