20 000 duplicates in Moz crawl due to Joomla URL parameters. How to fix?
-
We have a problem of massive duplicate content in Joomla. Here is an example of the "base" URL: http://www.binary-options.biz/index.php/Web-Pages/binary-options-platforms.html
For some reason Joomla creates many versions of this URL, for example:
or
So it lists the URL parameter ?q= and then repeats part of the beforegoing URL. This leads to tens of thousands duplicate pages in our content heavy site.
Any ideas how to fix this? Thanks so much!
-
These are caused by the links to your language pages. If you click one of the language links from within the source code (not on the page) it redirects to a URL with '?q=/index.php/Web-Pages/binary-options-platforms.html' added. Then if you click the same language link on that page it again redirects to another page with previous URL added to the end:
?q=/index.php/Web-Pages/binary-options-platforms.html?q=/index.php/Web-Pages/binary-options-platforms.html.e.g:
On the example page view source, search for German and click the link below:
This link 301 redirects too:
http://www.binary-options.biz/index.php/Web-Pages/binary-options-platforms.html?q=/index.php/Web-Pages/binary-options-platforms.html Then if you view source, search for German and click the link again:
This link 301 redirects too:
So basically every time a web crawler follows a language link, new URLs are being created with the previous URL added to the end, causing a never ending crawl as an infinite amount of new pages will always be created.
I don't think this is connected with the Joomla SEF as Chris pointed out, as your URLs are already SEF.
However it's not an easy thing to identify how to fix the issue with the language links. You should probably speak to the developer who implemented it and/or the creator of the plugin if it is a plugin.
Also do you even need this functionality? As none of the language links work, they just redirect back the main site.
-
Surely your URL structure is not fine.Can you please try this fix and update me?
http://docs.joomla.org/Enabling_Search_Engine_Friendly_(SEF)_URLs_on_Apache
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Search Console - Should I request to index redirected URL or Mark as fixed?
Hi all, Many blog posts used to be showing 404s when doing crawl tests and in search console (despite being there when visited.) I realized it was an issue with URL structure. It used to be example.com/post-name I've fixed the issue by changing the URL structure in Wordpress so that they now follow the structure of example.com/post-type/post-name According to sitemaps, Google has now indexed all posts in /post-type/post-name. My question is what to do with crawl errors in Search Console that are still there for example.com/postname. When I fetch, I get a redirect status (which is accurate). At this point should I request to index or mark as fixed? Thank you!
Technical SEO | | MouthyPR0 -
Duplicate Content due to CMS
The biggest offender of our website's duplicate content is an event calendar generated by our CMS. It creates a page for every day of every year, up to the year 2100. I am considering some solutions: 1. Include code that stops search engines from indexing any of the calendar pages 2. Keep the calendar but re-route any search engines to a more popular workshops page that contains better info. (The workshop page isn't duplicate content with the calendar page). Are these solutions possible? If so, how do the above affect SEO? Are there other solutions I should consider?
Technical SEO | | ycheung0 -
Mobile URL parameter (Redirection to desktop)
Hello, We have a parallel mobile website and recently we implemented a link pointing to the desktop website. This redirect is happening via a javascript code and results in a url followed by this paramenter: ?m=off Example:
Technical SEO | | echo1
http://www.m.website.com redirects to:
http://www.website.com/?m=off Questions: Will the "http://www.website.com/?m=off" be considered duplicate content with "http://www.website.com" since they both return the same content? Is there any possibility that Google will take into consideration the url ending in "/?m=off"? How should we treat this new url? The webmaster tools URL parameter configuration at the moment isn't experiencing problems but should we submit the parameter anyway in order not to be indexed or should we wait first and see the error response? In case we should submit this for removal... what's the best way to do it? Like this? Parameter: ?m=off Does this parameter change page content seen by the user? - doesn't affect page content Any help is much appreciated.
Thank you!0 -
Problem with duplicate content
Hi, My problem is this: SEOmoz tells me I have duplicate content because it is picking up my index page in three different ways: http://www.web-writer-articles.co.uk http://www.web-writer-articles.co.uk/ and http://www.web-writer-articles.co.uk/index.php Can someone give me some advice as to how I can deal with this issue? thank you for your time, louandel15
Technical SEO | | louandel150 -
Shorter URLs
Hi Is there a real value in having the keywords in the URL structure? we could use the URL: Mybrand.com/software/tablets/ipad/supertrader.html Or instead have the CMS create the shorter version mybrand.com/supertrader.html and just optimize this page for the keyword 'supertrader ipad software'
Technical SEO | | FXDD1 -
Keyword and URL
I have a client who has a popular name (like 'Joe Smith'). His blog URL has only his first name and the name of his company in it, like joe.company.com. His blog doesn't rank well at all in the first 3-4 Google SERPs. I was thinking of advising him to change the URL of his blog to joesmith.company.com, and having his webmaster do 301 redirects from the old URL to the new one. Do you think this is a good strategy, or would you recommend something else? I realize ranking isn't just about the URL, it's about links, etc. But I think making his URL more specific to his name could help. Any advice greatly appreciated! Jim
Technical SEO | | JamesAMartin0 -
Duplicate content handling.
Hi all, I have a site that has a great deal of duplicate content because my clients list the same content on a few of my competitors sites. You can see an example of the page here: http://tinyurl.com/62wghs5 As you can see the search results are on the right. A majority of these results will also appear on my competitors sites. My homepage does not seem to want to pass link juice to these pages. Is it because of the high level of Dup Content or is it because of the large amount of links on the page? Would it be better to hide the content from the results in a nofollowed iframe to reduce duplicate contents visibilty while at the same time increasing unique content with articles, guides etc? or can the two exist together on a page and still allow link juice to be passed to the site. My PR is 3 but I can't seem to get any of my internal pages(except a couple of pages that appear in my navigation menu) to budge of the PR0 mark even if they are only one click from the homepage.
Technical SEO | | Mulith0