Disallowed "Search" results with robots.txt and Sessions dropped
-
Hi
I've started working on our website and I've found millions of "Search" URLs which I don't think should be getting crawled and indexed (e.g. .../search/?q=brown&prefn1=brand&prefv1=C.P. COMPANY|AERIN|NIKE|Vintage Playing Cards|BIALETTI|EMMA PAKE|QUILTS OF DENMARK|JOHN ATKINSON|STANCE|ISABEL MARANT ÉTOILE|AMIRI|CLOON KEEN|SAMSONITE|MCQ|DANSE LENTE|GAYNOR|EZCARAY|ARGOSY|BIANCA|CRAFTHOUSE|ETON). I tried to disallow them in the robots.txt file, but our Sessions dropped about 10% and our Average Position in Search Console dropped 4-5 positions over 1 week. It looks like over 50 million URLs have been blocked, and all of them look like the example above and aren't getting any traffic to the site.
I've allowed them again, and we're starting to recover. We've been fixing problems with getting the site crawled properly (Sitemaps weren't added correctly, products blocked from spiders on Categories pages, canonical pages being blocked from Crawlers in robots.txt) and I'm thinking Google were doing us a favour and using these pages to crawl the product pages as it was the best/only way of accessing them.
Should I be blocking these "Search" URLs, or is there a better way of going about it? I can't see any value in these pages except Google using them to crawl the site.
-
If you have a site with at least 30k URLs, looking at only 300 keywords won't reflect the general status of the whole site. If you are trying to explain a 10% loss in traffic, I'd start by chasing the pages that lost the most traffic, then analyzing whether they lost rankings or whether there are some other issues.
Another way to find where the traffic was lost is in Search Console, looking at keywords that aren't among the 300 you track. There might be a lot to analyze.
It's not a big deal having a lot of pages blocked in robots.txt when what's blocked is correctly blocked. Keep in mind that GSC will flag those pages with warnings as they were previously indexed and now are blocked. That's just how they've set up flags.
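One way to check that what's blocked is "correctly blocked" is to test a sample of URLs against the rules before deploying them. Below is a minimal sketch using Python's standard `urllib.robotparser`. The `Disallow: /search/` rule, the `example.com` URLs and the `blocked_urls` helper are assumptions for illustration, not the site's real file. Note that this parser only does simple prefix matching and does not understand the wildcard patterns Google supports, so treat it as a rough sanity check only.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; the real rules and paths will differ.
ROBOTS_TXT = """\
User-agent: *
Disallow: /search/
"""


def blocked_urls(robots_txt: str, urls: list[str], agent: str = "Googlebot") -> list[str]:
    """Return the URLs from `urls` that the given rules disallow for `agent`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(agent, url)]


# Hypothetical sample: one internal search URL plus two pages that must stay crawlable.
sample = [
    "https://www.example.com/search/?q=brown&prefn1=brand",
    "https://www.example.com/sale/",
    "https://www.example.com/brands/nike/",
]

for url in blocked_urls(ROBOTS_TXT, sample):
    print("blocked:", url)
```

If a Sale, Brand or product URL ever shows up in the blocked list, the rule is too broad.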
Hope it helps.
Best luck.
Gaston
-
If you have a general site which happens to have a search facility, blocking search results is quite usual. If your site is all 'about' searching (e.g. Compare The Market, stuff like that) then the value-add of your site is how it helps people to find things. In THAT type of situation, you absolutely do NOT want to block all your search URLs.
Also, don't rule out seasonality. Traffic naturally goes up and down, especially at this time of year when everyone is on holiday. How many people spend their holidays buying stuff or doing business stuff online? They're all at the beach - mate!
-
Hi Gaston
"Search/" pages were getting a small amount of traffic, and a tiny bit of revenue, but I definitely don't think they need to be indexed or are important to users. We're down mainly in "Sale" & "Brand" pages, and I've heard the Sale in general across the store isn't going well, but I don't think I can go back to management with that excuse.
I think my sitemaps are sorted now. I've broken them down into 6 x 5,000 URL files, and all the canonical tags seem to be fine and pointing to these URLs. I am a bit concerned that URLs "blocked by robots.txt" shot up from 12M to 73M, although all the URLs Search Console is showing me look like they need to be blocked!
We're also tracking nearly 300 keywords, and they've actually had good improvements in the same period. Finding it hard to explain!
-
Hi Frankie,
My guess is that the traffic you lost was traffic driven by the /search pages.
The questions you should be asking are:
- Are those /search pages getting traffic?
- Are they important to users?
- After being disallowed, which pages were losing traffic?
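The last question can be approached by comparing clicks per page for the week before and the week after the disallow. Here is a rough sketch in Python, assuming you have exported landing-page clicks for both periods (from Search Console or analytics) into two dictionaries. The function names and all figures below are made up for illustration.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def traffic_losers(before: dict[str, int], after: dict[str, int], top: int = 10):
    """Return (url, clicks_before, clicks_after, change) rows, biggest loss first."""
    rows = []
    for url, clicks_before in before.items():
        clicks_after = after.get(url, 0)  # a page missing afterwards counts as 0 clicks
        change = clicks_after - clicks_before
        if change < 0:
            rows.append((url, clicks_before, clicks_after, change))
    rows.sort(key=lambda row: row[3])  # most negative change first
    return rows[:top]


def loss_by_section(rows) -> dict[str, int]:
    """Sum the click change per first path segment (e.g. 'search', 'sale')."""
    totals: dict[str, int] = defaultdict(int)
    for url, _before, _after, change in rows:
        section = urlsplit(url).path.strip("/").split("/")[0]
        totals[section] += change
    return dict(totals)


# Hypothetical click counts for one week before and one week after the change.
before = {"/search/?q=brown": 120, "/sale/": 900, "/brands/nike/": 400, "/product/123": 50}
after = {"/search/?q=brown": 0, "/sale/": 700, "/brands/nike/": 410, "/product/123": 45}

losers = traffic_losers(before, after)
print(losers)
print(loss_by_section(losers))
```

If most of the loss sits under /search, the disallow explains it; if it sits under Sale or Brand pages, something else is going on.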
As a general rule, Google doesn't want to crawl or index internal search pages, unless they have some value to users.
On another matter, the crawlability of your product pages can be easily solved with a sitemap file. If you are worried about the size of it, remember that it can contain up to 50k URLs and you can create several sitemaps and list them in a sitemap index.
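As a rough illustration of that, the sketch below splits a list of product URLs into sitemap files of at most 50,000 URLs each and builds a sitemap index that lists them. It uses only Python's standard library. The `example.com` URLs, the file names and the function names are placeholders, not anything from your site.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_SITEMAP = 50_000  # the per-file limit mentioned above


def chunk(urls: list[str], size: int = MAX_URLS_PER_SITEMAP):
    """Yield consecutive slices of `urls` with at most `size` entries each."""
    for start in range(0, len(urls), size):
        yield urls[start:start + size]


def build_sitemap(urls: list[str]) -> str:
    """Return the XML for one sitemap file listing `urls`."""
    root = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        ET.SubElement(ET.SubElement(root, "url"), "loc").text = url
    return ET.tostring(root, encoding="unicode")


def build_sitemap_index(sitemap_urls: list[str]) -> str:
    """Return the XML for a sitemap index pointing at each sitemap file."""
    root = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for url in sitemap_urls:
        ET.SubElement(ET.SubElement(root, "sitemap"), "loc").text = url
    return ET.tostring(root, encoding="unicode")


# Hypothetical input: 30,000 product URLs split into files of 5,000.
product_urls = [f"https://www.example.com/product/{i}" for i in range(30_000)]
sitemaps = [build_sitemap(part) for part in chunk(product_urls, size=5_000)]
index_xml = build_sitemap_index(
    [f"https://www.example.com/sitemap-{n}.xml" for n in range(1, len(sitemaps) + 1)]
)
print(len(sitemaps), "sitemap files")
```

Each string in `sitemaps` would be written to its own file, and only the index needs to be submitted in Search Console.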
More info about that here: Split up your large sitemaps - Google Search Console Help
Hope it helps.
Best luck,
Gaston