Increase in pages crawled per day
-
What does it mean when GWT abruptly jumps from 15k to 30k pages crawled per day?
I am used to seeing spikes, like a 10k average with 50k pages crawled a couple of times per month.
But in this case it moved from 15k to 30k per day 10 days ago and it's staying there. I know it's a good sign: the crawler is crawling more pages per day, so it's picking up changes more often. But I have no idea why it is doing it. What good signals usually drive Google's crawler to increase the number of pages crawled per day?
Does anyone know?
-
Nice find Ryan.
-
Agreed. Especially since Google's own Gary Illyes responded to the following question:
How long is the delay between making it mobile friendly and it being reflected in the search results?
Illyes says: “As soon as we discover it is mobile friendly, on a URL by URL basis, it will be updated.”
Sounds like when you went responsive they double-checked each URL to confirm. From: http://www.thesempost.com/googles-gary-illyes-qa-upcoming-mobile-ranking-signal-change/. Cheers!
-
I usually analyze backlinks with both GWT and Ahrefs, and Ahrefs doesn't show any abnormally high DA backlink either.
I agree the responsive change is the most probable candidate. I have a couple of other websites I want to turn responsive before April 21st, so that's an opportunity to test and see if that is the reason.
-
Ah, the responsive change could be a big part of it. You're probably getting crawls from the mobile crawler. GWT isn't the best source for recent backlinks, since it isn't always that responsive when reporting links. I'd actually look for spikes via referrers in Analytics. Still, it looks like the responsive redesign is a likely candidate for this, especially with Google's looming April 21st deadline.
-
Two things I forgot to mention:
- About 2 weeks ago we turned the website responsive. Could it be that Google's mobile crawler is increasing the number of crawled pages? I have to analyze the logs to see if the requests are coming from Google's mobile crawler.
- The total number of indexed pages didn't change, which makes me wonder if a rise in the number of crawled pages per day is all that relevant.
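The log check mentioned above could be sketched roughly as follows. This is a minimal sketch, not a definitive method: it assumes the server writes the common "combined" access log format, and it assumes that a Googlebot user agent containing the word "Mobile" is the smartphone crawler. The function name `count_googlebot_hits` is made up for illustration. User agent strings can be spoofed, so treat the counts as indicative only.

```python
import re
from collections import Counter

# Assumed layout (combined log format):
# ip - - [day/Mon/year:time zone] "request" status size "referer" "user agent"
LOG_LINE = re.compile(
    r'\S+ \S+ \S+ \[(?P<day>[^:]+):[^\]]+\] "[^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def count_googlebot_hits(lines):
    """Count Googlebot requests per (day, crawler type)."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        # Skip lines that do not parse or that are not from Googlebot.
        if not match or "Googlebot" not in match.group("agent"):
            continue
        # Heuristic (assumption): the smartphone crawler's agent contains "Mobile".
        kind = "mobile" if "Mobile" in match.group("agent") else "desktop"
        counts[(match.group("day"), kind)] += 1
    return counts
```

Comparing the per-day mobile and desktop counts before and after the responsive launch would show whether the extra crawling comes from the mobile crawler.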
-
Hi Ryan,
- GWT (Search Traffic->Search Queries) shows a 6% drop in impressions for brand-based searches (Google Trends shows a similar pattern).
- GWT is not showing any recent backlink with an abnormally high DA.
- We actually had a couple of unusually high traffic spikes from Facebook thanks to a couple of particularly successful posts, but we are talking about spikes of just 5k visits, and they both started after the rise in pages crawled per day.
If you have any other ideas they are more than welcome. I wish I could understand the source of that change so I could replicate it on other websites.
-
I am not sure I understand what you mean. That website has a total of 35k pages submitted to GWT through the sitemap, of which only 8k are indexed. The total number of indexed pages has always been slowly increasing over time: it moved from 6k to 8k in the last couple of months, slowly and with no spikes.
That's not the total number of pages served by the site, since dynamic search results pages amount to around 150k pages in total. We deliberately do not submit all of them in the sitemap, and GWT shows 70k pages as the total number of indexed pages.
I have analyzed Google crawler activity through server logs in the past. It picks a set of (apparently) random pages every night and crawls them. I have never actually analyzed what percentage of those pages are in the sitemap.
The internal link structure was built on purpose to favor ranking of the pages we consider more important.
The point is that we didn't change anything in the website structure recently. User-generated content has been lowering the duplicate page count slowly over time, without any recent spike. We have a PR campaign which is increasing backlinks at an average rate of around 3 links per week, and we didn't have any high DA backlinks appearing in the last few weeks.
So I am wondering what made Google's crawler start crawling many more pages per day.
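The unanswered question in the post above, what share of the crawled pages are actually in the sitemap, could be checked with something like the sketch below. It assumes the sitemap follows the standard sitemaps.org schema and that the crawled URLs have already been extracted from the logs as full URLs. The function names `sitemap_urls` and `sitemap_share` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace used by standard sitemaps.org sitemap files.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(sitemap_xml):
    """Return the set of <loc> URLs listed in a sitemap XML string."""
    root = ET.fromstring(sitemap_xml)
    return {loc.text.strip() for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)}


def sitemap_share(crawled_urls, sitemap_xml):
    """Fraction of distinct crawled URLs that also appear in the sitemap."""
    crawled = set(crawled_urls)
    if not crawled:
        return 0.0
    return len(crawled & sitemap_urls(sitemap_xml)) / len(crawled)
```

A low share would suggest the crawler is spending most of its time on pages outside the sitemap, such as the dynamic search results pages.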
-
Yes, I updated to parameters just before you posted.
-
When you say URL variables, do you mean query string variables like ?key=value?
That is really good advice. You can check in your GWT. If you let Google crawl and it runs into a loop, it will not index that section of your site. It would be costly for them.
-
I would also check that you have not got a spike of URL parameters becoming available. I recently had a similar issue, and although I had these set up in GWT, the crawler was actively wasting its time on them. Once I added them to robots.txt, the crawl level went back to 'normal'.
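One way to spot such a spike is to count how many crawled URLs carry each query string parameter. The sketch below assumes the crawled paths have already been pulled out of the server logs; `parameter_hits` is a hypothetical helper name.

```python
from collections import Counter
from urllib.parse import parse_qsl, urlsplit


def parameter_hits(crawled_paths):
    """Count how many crawled URLs contain each query string parameter."""
    counts = Counter()
    for path in crawled_paths:
        query = urlsplit(path).query
        # Use a set so a parameter repeated within one URL is counted once.
        names = {key for key, _ in parse_qsl(query, keep_blank_values=True)}
        for name in names:
            counts[name] += 1
    return counts
```

Running this on logs from before and after the jump would show whether a particular parameter started attracting crawler requests.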
-
There could be several factors... maybe your brand-based search is prompting Google to capture more of your site. Maybe you got a link from a very high authority site that prompts higher crawl volumes. Queries that prompt freshness related to your site could also spur Google on. It is a lot of guesswork, but it can be whittled down some by a close look at Analytics and perhaps tomorrow's OSE update (Fresh Web Explorer might provide some clues in the meantime). At least you're moving in the right direction. Cheers!
-
There are two variables in play and you are picking up on one.
If there are 1,000 pages on your website, then Google may index all 1,000 if it is aware of all the pages. As you indicated, it is also Google's decision how many of your pages to index.
The second factor, which is most likely the case in your situation, is that Google has only two ways to find your pages. One is a sitemap submitted in GWT that lists all of your known pages. Google would then have the choice to index all 1,000, since it would be aware of their existence. The other is following links, and it sounds like your website is relying on links. If you have 1,000 pages and a home page with one link leading to an about us page, then Google is only aware of two pages on your entire website. Your website has to have an internal link structure that Google can crawl.
Imagine your website as a tree root structure. For Google to get to every page and index it, it has to have clear, defined, and easy access. A website with a home page that links to page A, which links to page B, which links to page C, which links to page D, which links to 500 pages can easily lose those 500 pages if there is an obstruction between any of the pages that lead to page D, because Google can't crawl to page D to see all the pages linked from it.
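The link-chain argument above can be illustrated with a breadth-first walk over a site's internal links. This is only a sketch: the `links` mapping (page to the pages it links to) is an assumed input that a crawler or site export would have to supply, and `reachable_pages` is a hypothetical name.

```python
from collections import deque


def reachable_pages(links, start):
    """Return every page reachable from `start` by following internal links."""
    seen = {start}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        # Pages with no outgoing links simply contribute nothing.
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen
```

Any page that is known to exist but missing from the returned set can only be found through the sitemap or external links, and breaking a single link in a chain like home, A, B, C, D cuts off everything behind it.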