Google Crawler Error / restricting crawling
-
Hi
On a Magento Instance we manage there is an advanced search. As part of the ongoing enhancement of the instance we altered the advance search options so there are less and more relevant.
The issue is Google has crawled and catalogued the advanced search with the now removed options in the query string. Google keeps crawling these out of date advanced searches. These stale searches now create a 500 error.
Currently Google is attempting to crawl these pages twice a day.
I have implemented the following to stop this:-
1. Submitted requested the url be removed via Webmaster tools, selecting the directory option using uri:
http://www.domian.com/catalogsearch/advanced/result/
2. Added Disallow to robots.txt
Disallow: /catalogsearch/advanced/result/* Disallow: /catalogsearch/advanced/result/
3. Add rel="nofollow" to the links in the site linking to the advanced search.
Below is a list of the links it is crawling or attempting to crawl, 12 links crawled twice a day each resulting in a 500 status.
Can anything else be done?
-
Seems like you've done everything right. You could also add a Meta robots "NOINDEX, FOLLOW" to those pages.
I'd also double check the referring "linked from" referrer in Webmasters tools just to make sure you haven't missed any live followed links pointing to those pages.
When did you submit the removal request, and what is the status? (approved, denied, pending?) Another question, are those pages in Google's index?
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Will adding /blog/ to my urls affect SEO rankings?
Following advice from an external SEO agency I removed /blog/ from our permalinks late last year. The logic was that it a) doesn't help SEO and b) reduces the character count for the slug. Both points make sense. However, it makes segmenting blog posts from other content in Google Analytics impossible. If I were to add /blog/ back into my URLs, and redirected the permalinks, would it harm my rankings? Thanks!
Technical SEO | | GerardAdlum0 -
Googlebot crawl error Javascript method is not defined
Hi All, I have this problem, that has been a pain in the ****. I get tons of crawl errors from "Googlebot" saying a specific Javascript method does not exist in my logs. I then go to the affected page and test in a web browser and the page works without any Javascript errors. Can some help with resolving this issue? Thanks in advance.
Technical SEO | | FreddyKgapza0 -
Crawl depth and www
I've run a crawl on a popular amphibian based tool, just wanted to confirm... should http://www.homepage be at crawl depth 0 or 1? The audit shows http://homepage at level 0 and http://www.homepage at level 1 through a redirect. Thanks
Technical SEO | | Focus-Online-Management0 -
Google Bot Noindex
If a site has the tag, can it still be flagged for duplicate content?
Technical SEO | | MayflyInternet0 -
Google webmaster tools says access denied error 403
Hi, this keeps on happening, just check early today and it tells me i have access denied and 403 errors I have this from time to time in my google webmaster tools and i have checked the pages and they work properly, so i am puzzled why this has happened. I have contacted my hosting company who have said there is not a problem but there must be a problem somewhere which could affect my site rankings. can anyone let me know what this could be please. i work in joomla | parenting-magazine | 403 | 8/10/13 |
Technical SEO | | ClaireH-184886
| | 2 | personal-finance-money-advice | 403 | 8/10/13 |
| | 3 | 201308081607/emmerdale/emmerdale-chas-confronts-cameron-over-affair-with-debbie | 403 | 8/10/13 |
| | 4 | 201308081606/emmerdale/emmerdale-declan-gets-a-visit-from-the-police | 403 | 8/10/13 |
| | 5 | 201308081608/emmerdale/emmerdale-cameron-debbie-affair-is-out-in-the-open | 403 | 8/10/13 |
| | 6 | 201308081614/uk-holiday-news/visitscotland-launch-campaign-to-boost-tourism | 403 | 8/10/13 |
| | 7 | dog-advice/training-your-puppy-a-beginners-guide | 403 | 8/10/13 |
| | 8 | gadgets/hp-envy-13-laptop-review | 403 | 8/10/13 |
| | 9 | gadget-talk/everyday-smartphone-gadgets-which-could-revolutionise-your-life | 403 | 8/10/13 |
| | 10 | news-gadgets/the-htc-one-mobile-phone-review | 403 | 8/10/13 |
| | 11 | gadget-talk/five-iphone-apps-for-home-improvement | 403 | 8/10/13 |
| | 12 | gadget-talk/are-android-apps-useful-for-business-success | 403 | 8/10/13 |
| | 13 | gadget-talk/television-gadgets-the-future-of-television-is-coming | 403 | 8/10/13 | | | |0 -
Blog.furnacefilterscanada.com/ or furnacefilterscanada.com/blog/
My shopping cart does not allow to instal a WordPress blog on a sub-domain like: furnacefilterscanada.com/blog/ But I can host my blog on another server with a sub-domain like: blog.furnacefilterscanada.com In a SEO point of view is there a difference between the 2? Link juice? Page authority? Thank you, BigBlaze
Technical SEO | | BigBlaze2050 -
How do I keep Google from crawling my PPC landing page?
Question, I don't want Google to catch my PPC landing pages, but I'm wondering if I make the page no-follow, no-index will it still crawl my landing page for quality score? Just to clarify I do want it to crawl the landing page to boost my quality score on PPC, but I do not want it to index it for SEO. Thanks 🙂
Technical SEO | | jhinchcliffe0