Google crawling 200-page site thousands of times/day. Why?
-
Hello all, I'm looking at something a bit wonky for one of the websites I manage. It's similar enough to other websites I manage (built on a template) that I'm surprised to see this issue occurring. The XML sitemap we submitted tells Google there are 229 pages on the site. Starting at the beginning of December, Google really ramped up the intensity of its crawling. At its high point, Google crawled 13,359 pages in a single day.
I mentioned I manage other similar sites, and for them this would be a very unusual spike. There are no features like infinite scroll that auto-generate content and would cause Google some grief.
So the follow-up questions to my "why?" are "how is this affecting my SEO efforts?" and "what do I do about it?". I've never encountered this before, but I think limiting my crawl budget would be treating the symptom instead of finding the cure. Any advice is appreciated. Thanks!
*edited for grammar.
-
I have a final update for everyone! We discovered the cause of the mysterious increase in crawling. One of our partners tested out a second version of the content on the website (yes, we have two complete sets of content for every page) by swapping out the first set with the second set. The second set caused Google to reevaluate the entire website, crawl it repeatedly thousands of times for two weeks, then stop.
The result of this refresh was a jump in the rankings. We were ranking on page one for about 15% of our targeted keywords and after the new content was inputted it jumped to 71%. Only time will tell if those new rankings will stick, but for now it looks pretty good.
-
Update: after about two weeks the crawl rate returned to normal. We haven't been able to identify a cause yet.
-
It is strange. It's definitely worth looking at access logs and analyzing crawler data there so you can see what pages are getting hit by the crawler just to be sure you understand the activity.
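As a rough illustration of that kind of log analysis, here is a minimal Python sketch. It assumes the server writes the common "combined" access log format, and the sample lines (IP addresses, dates, paths) are made up for the example. It simply counts requests whose user agent mentions Googlebot, per day and per URL. Keep in mind that the user agent string can be spoofed, so treat the counts as a first pass rather than proof.

```python
import re
from collections import Counter

# Assumed format: Apache/Nginx "combined" log lines, e.g.
# IP - - [day:time zone] "METHOD path PROTO" status size "referer" "user-agent"
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<day>[^:]+):[^\]]+\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+ '
    r'"[^"]*" "(?P<agent>[^"]*)"'
)

def googlebot_hits(lines):
    """Count requests whose user agent mentions Googlebot, per day and per path."""
    per_day = Counter()
    per_path = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if not match or "Googlebot" not in match.group("agent"):
            continue  # skip unparseable lines and non-Googlebot traffic
        per_day[match.group("day")] += 1
        per_path[match.group("path")] += 1
    return per_day, per_path

# Hypothetical sample data, only to show the shape of the input.
GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
SAMPLE_LOG = [
    f'66.249.66.1 - - [05/Dec/2020:10:00:01 +0000] "GET /about HTTP/1.1" 200 5120 "-" "{GOOGLEBOT_UA}"',
    f'66.249.66.1 - - [05/Dec/2020:10:00:07 +0000] "GET /page?x=1 HTTP/1.1" 200 4210 "-" "{GOOGLEBOT_UA}"',
    f'66.249.66.1 - - [06/Dec/2020:02:13:44 +0000] "GET /about HTTP/1.1" 200 5120 "-" "{GOOGLEBOT_UA}"',
    '203.0.113.9 - - [06/Dec/2020:02:14:00 +0000] "GET /about HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (Windows NT 10.0)"',
]

if __name__ == "__main__":
    per_day, per_path = googlebot_hits(SAMPLE_LOG)
    print("Hits per day:", dict(per_day))
    print("Most crawled paths:", per_path.most_common(5))
```

On a real log you would pass the open file instead of `SAMPLE_LOG`. If the most crawled paths turn out to be parameterized or unexpected URLs, that points at where the extra crawling is going.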
-
Well, I would be more than happy if Google visited my pages more often than once a day. We have around 100k original pages and we see them visiting 250k pages daily, with uplifts to 350k+, which I don't consider to be a bad thing. As long as you're sure that they see the right pages, I would say it's a good thing. The crawl rate really varies from day to day for any site; sometimes you get a high rate for a while and then it drops again once Google finds out that your site isn't creating that much fresh content anymore.
Curious about your idea with the sitemap priority; in my experience and to my knowledge, it doesn't change anything.
-
Yes I have, and yes there are pages that aren't listed in the sitemap and aren't supposed to be there. That's being corrected (we're considering experimenting with priority tags in the sitemap to see if it has an impact over just immediately blocking with robots.txt or meta robots). But if you factor in those pages, it still only amounts to 303 pages.
Weird, right?
-
Have you tried scanning the site with something like Screaming Frog to make sure there aren't pages that just aren't listed in the sitemap? I.e., tag or category pages, images, or other partial content pieces that are creating pages.
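If you export the list of URLs a crawler found, comparing it with the sitemap is a simple set difference. The sketch below is only an illustration: the `example.com` URLs and the function names are hypothetical, and it assumes a standard sitemap using the sitemaps.org namespace.

```python
import xml.etree.ElementTree as ET

# Namespace used by standard XML sitemaps.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

def sitemap_urls(xml_text):
    """Return the set of <loc> URLs listed in a sitemap document."""
    root = ET.fromstring(xml_text)
    return {
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    }

def unlisted_urls(crawled, xml_text):
    """Return crawled URLs that do not appear in the sitemap, sorted."""
    return sorted(set(crawled) - sitemap_urls(xml_text))

# Hypothetical sample data standing in for a real sitemap and crawl export.
SAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/about</loc></url>
</urlset>"""

SAMPLE_CRAWL = [
    "https://example.com/",
    "https://example.com/about",
    "https://example.com/tag/news",
]

if __name__ == "__main__":
    for url in unlisted_urls(SAMPLE_CRAWL, SAMPLE_SITEMAP):
        print("Not in sitemap:", url)
```

Anything this prints is a page that exists on the site but isn't declared in the sitemap, which is a reasonable starting list for deciding what to block or clean up.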