20 000 duplicates in Moz crawl due to Joomla URL parameters. How to fix?
-
We have a problem of massive duplicate content in Joomla. Here is an example of the "base" URL: http://www.binary-options.biz/index.php/Web-Pages/binary-options-platforms.html
For some reason Joomla creates many versions of this URL, for example:
or
So it adds the URL parameter ?q= and then repeats part of the preceding URL. This leads to tens of thousands of duplicate pages on our content-heavy site.
Any ideas how to fix this? Thanks so much!
-
These are caused by the links to your language pages. If you click one of the language links from within the source code (not on the page), it redirects to a URL with '?q=/index.php/Web-Pages/binary-options-platforms.html' appended. If you then click the same language link on that page, it redirects again, to another URL with the previous URL added to the end:
?q=/index.php/Web-Pages/binary-options-platforms.html?q=/index.php/Web-Pages/binary-options-platforms.html
For example:
On the example page, view source, search for "German" and click the link below:
This link 301 redirects to:
http://www.binary-options.biz/index.php/Web-Pages/binary-options-platforms.html?q=/index.php/Web-Pages/binary-options-platforms.html
Then, if you view source, search for "German" and click the link again:
This link 301 redirects to:
So basically, every time a web crawler follows a language link, a new URL is created with the previous URL added to the end. This causes a never-ending crawl, because an unlimited number of new pages can always be created.
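To make the loop concrete, here is a rough Python sketch of what seems to be happening, plus one way to collapse the trap URLs back to the base URL when you go through a crawl export. This is only an illustration: `follow_language_link` and `strip_q_param` are names I made up, and the first one just imitates the redirect behaviour described above; it is not the site's or Joomla's actual code.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

BASE = ("http://www.binary-options.biz/index.php/Web-Pages/"
        "binary-options-platforms.html")


def follow_language_link(url: str) -> str:
    """Imitate the observed redirect (assumption): the target is the
    current URL with '?q=<path of the base page>' appended."""
    return f"{url}?q={urlsplit(BASE).path}"


def strip_q_param(url: str) -> str:
    """Drop the 'q' query parameter and keep every other parameter."""
    parts = urlsplit(url)
    kept = [(key, value)
            for key, value in parse_qsl(parts.query, keep_blank_values=True)
            if key != "q"]
    return urlunsplit((parts.scheme, parts.netloc, parts.path,
                       urlencode(kept), parts.fragment))


if __name__ == "__main__":
    url = BASE
    for hop in range(1, 4):
        # Each hop yields a URL the crawler has never seen before.
        url = follow_language_link(url)
        print(hop, len(url), strip_q_param(url) == BASE)
```

Every hop produces a longer, "new" URL, yet all of them reduce to the same base page once the q parameter is removed, which is why the crawler reports them as duplicates. Whether stripping q is safe on the live site depends on whether anything else relies on that parameter.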
I don't think this is connected with Joomla SEF, as Chris suggested, because your URLs are already SEF.
However it's not an easy thing to identify how to fix the issue with the language links. You should probably speak to the developer who implemented it and/or the creator of the plugin if it is a plugin.
Also, do you even need this functionality? None of the language links work; they just redirect back to the main site.
-
Surely your URL structure is not fine. Can you please try this fix and update me?
http://docs.joomla.org/Enabling_Search_Engine_Friendly_(SEF)_URLs_on_Apache