Google Crawler Error / restricting crawling
-
Hi
On a Magento Instance we manage there is an advanced search. As part of the ongoing enhancement of the instance we altered the advance search options so there are less and more relevant.
The issue is Google has crawled and catalogued the advanced search with the now removed options in the query string. Google keeps crawling these out of date advanced searches. These stale searches now create a 500 error.
Currently Google is attempting to crawl these pages twice a day.
I have implemented the following to stop this:-
1. Submitted requested the url be removed via Webmaster tools, selecting the directory option using uri:
http://www.domian.com/catalogsearch/advanced/result/
2. Added Disallow to robots.txt
Disallow: /catalogsearch/advanced/result/* Disallow: /catalogsearch/advanced/result/
3. Add rel="nofollow" to the links in the site linking to the advanced search.
Below is a list of the links it is crawling or attempting to crawl, 12 links crawled twice a day each resulting in a 500 status.
Can anything else be done?
-
Seems like you've done everything right. You could also add a Meta robots "NOINDEX, FOLLOW" to those pages.
I'd also double check the referring "linked from" referrer in Webmasters tools just to make sure you haven't missed any live followed links pointing to those pages.
When did you submit the removal request, and what is the status? (approved, denied, pending?) Another question, are those pages in Google's index?
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Redirect Error
Hello, I was sent a report from a colleague containing redirect errors: The link to "http://www.xxxx.com/old-page/" has resulted in HTTP redirection to "http://www.xxxx.com/new-page".Search engines can only pass page rankings and other relevant data through a single redirection hop. Using unnecessary redirects can have a negative impact on page ranking. Our site is host on Microsoft Servers (IIS). I'm not sure what is causing these errors. Would it be the way the redirect was implemented.
Technical SEO | | 3mobileIreland0 -
When to use mod rewrite / canonical / 301 redirect
Hello, I have taken over the management of a site which has a big problem with duplicate content. The duplicate content is caused by two things: Upper and lower case urls e.g: www.mysite.com/blog and www.mysite.com/Blog The other reason is the use of product filters / pagination which mean you can get to the same 'page' via different filters. The filters generate separate URLs. http://www.mysite.com/casestudy
Technical SEO | | Barques-Design
http://www.mysite.com/casestudy/filter?page=1
http://www.mysite.com/casestudy/filter?solution=0&page=1
http://www.mysite.com/casestudy?page=1
http://www.cpio.co.uk/casestudy/filter?solution=0" Am I right to assume that for the case sensitive URLs I should use a 301 redirect because I only want the lower page to be shown? For the issue with dynamic URLs should we implement a mod-rewrite and 301 to one page? Any advice would be greatly appreciated.
Mat0 -
4XX(Client Error)
Hello there Please help! I am getting this kind of error in the whole site. http://www.mileycyrus-online.co.uk/leaked-hannah-montana-the-movie-pictures.html/comments Running on wordpress site. I chagned the template few times.. most of the error ends with a /comments. Infact all my post has the same issue: http://www.mileycyrus-online.co.uk/miley-cyrus-at-golden-globes-ceremony.html/comments http://www.mileycyrus-online.co.uk/miley-cyrus-at-president-obamas-inauguration-concert.html/comments 404 Error.
Technical SEO | | ExpertSolutions0 -
Google Analytics
I usually have Google analytics "real-time" running on one of my monitors, occasionally I glance at the screen to see that are TOP KEYWORDS people are using, lately there have been a lot of long-tail keywords. If I try to copy and paste the queries into google, i can never seem to find us organically for the long-tail searches? Is the real-time feature accurate? Thank you!
Technical SEO | | TP_Marketing0 -
Google Search Parameters
Couple quick questions. Is using the parameter pws=0 still useful for turning off personalization? Is there a way to set my location as a URL parameter as well? For instance, I want to set my location to United States, can this be done with a URL param the same way as pws=0?
Technical SEO | | nbyloff0 -
Google plus
With Google search plus your world, would i see results ONLY from Google plus followers ? or from someone who is my facebook friend as well.
Technical SEO | | seoug_20050 -
How to increase the crawl rate?
hello, Our site was hosted in North America and Google was crawling it reasonably fast. Since our traffic is mostly from India we moved it to India, now the crawling is terribly slow from Google. Is there anyway to fix the crawl rate(we have increased the crawl rate in GWT)
Technical SEO | | greyniumseo0 -
Google crawl index issue with our website...
Hey there. We've run into a mystifying issue with Google's crawl index of one of our sites. When we do a "site:www.burlingtonmortgage.biz" search in Google, we're seeing lots of 404 Errors on pages that don't exist on our site or seemingly on the remote server. In the search results, Google is showing nonsensical folders off the root domain and then the actual page is within that non-existent folder. An example: Google shows this in its index of the site (as a 404 Error page): www.burlingtonmortgage.biz/MQnjO/idaho-mortgage-rates.asp The actual page on the site is: www.burlingtonmortgage.biz/idaho-mortgage-rates.asp Google is showing the folder MQnjO that doesn't exist anywhere on the remote. Other pages they are showing have different folder names that are just as wacky. We called our hosting company who said the problem isn't coming from them... Has anyone had something like this happen to them? Thanks so much for your insight!
Technical SEO | | ILM_Marketing
Megan0