Google Crawler Error / restricting crawling
-
Hi
On a Magento Instance we manage there is an advanced search. As part of the ongoing enhancement of the instance we altered the advance search options so there are less and more relevant.
The issue is Google has crawled and catalogued the advanced search with the now removed options in the query string. Google keeps crawling these out of date advanced searches. These stale searches now create a 500 error.
Currently Google is attempting to crawl these pages twice a day.
I have implemented the following to stop this:-
1. Submitted requested the url be removed via Webmaster tools, selecting the directory option using uri:
http://www.domian.com/catalogsearch/advanced/result/
2. Added Disallow to robots.txt
Disallow: /catalogsearch/advanced/result/* Disallow: /catalogsearch/advanced/result/
3. Add rel="nofollow" to the links in the site linking to the advanced search.
Below is a list of the links it is crawling or attempting to crawl, 12 links crawled twice a day each resulting in a 500 status.
Can anything else be done?
-
Seems like you've done everything right. You could also add a Meta robots "NOINDEX, FOLLOW" to those pages.
I'd also double check the referring "linked from" referrer in Webmasters tools just to make sure you haven't missed any live followed links pointing to those pages.
When did you submit the removal request, and what is the status? (approved, denied, pending?) Another question, are those pages in Google's index?
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Moving from www.domain.com/nameofblog to www.domain.com/blog
Describe your question in detail. The more information you give, the better! It helps give context for a great answer I have had my blog located at www.legacytravel.com/ramblings for a while. I now believe that, from an SEO perspective, it would be preferable to move it to www.legacytravel.com/blog. So, I want to be able to not lose any links (few though they may be) with the move. I believe I would need to do a 301 redirect in the htaccess file of www.legacytravel.com that will tell anyone who comes knocking on the door of www.legacytravel.com/ramblings/blah blah blah that now what they want is at www.legacytravel.com/blog/blah blah blah Is that correct? What would the entry look like in the htaccess? Thank you in advance.
Technical SEO | | cathibanks0 -
Disallow: /404/ - Best Practice?
Hello Moz Community, My developer has added this to my robots.txt file: Disallow: /404/ Is this considered good practice in the world of SEO? Would you do it with your clients? I feel he has great development knowledge but isn't too well versed in SEO. Thank you in advanced, Nico.
Technical SEO | | niconico1011 -
Is it needed to use http:// or not?
Hi, When doing link building and getting my URL to other websites, is it necessary that the other websites include http:// before the domain? Example: I want to increase the number of links to my site www.example.com . When I ask other websites to add a link to my site, should I ask them to use http://www.example.com or is www.example.com enough (without http://)? Or it really does not matter? Thanks in advance, Sam
Technical SEO | | Awaraman0 -
Google Dancing?
Hello, I was wondering why my website for some keywords goes from 2nd 3rd page in Google to 7th or even more sometimes? This happens since a while. Any suggestion? Thanks. Eugenio
Technical SEO | | socialengaged0 -
De-indexed from Google
Hi Search Experts! We are just launching a new site for a client with a completely new URL. The client can not provide any access details for their existing site. Any ideas how can we get the existing site de-indexed from Google? Thanks guys!
Technical SEO | | rikmon0 -
Remove more than 1000 crawl errors from GWT in one day?
In google webmasters tools you have the feature "Crawl Errors". This one displays the top 1000 crawl errors google have on your site. I have around 16k crawl errors at the moment, which all are fixed. But i can only mark 1000 of them as fixed each day/each time google crawls the site. (This as it only displays top 1000 errors. When i have marked those as fixed it won't show other errors for a while.) Does anyone know if it's possible to mark ALL errors as fixed in one operation?
Technical SEO | | Host10 -
Why is this url showing as "not crawled" on opensiteexplorer, but still showing up in Google's index?
The below url is showing up as "not crawled" on opensitexplorer.com, but when you google the title tag "Joel Roberts, Our Family Doctors - Doctor in Clearwater, FL" it is showing up in the Google index. Can you explain why this is happening? Thank you http://doctor.webmd.com/physician_finder/profile.aspx?sponsor=core&pid=14ef09dd-e216-4369-99d3-460aa3c4f1ce
Technical SEO | | nicole.healthline0