Crawl budget

ciznerguy

I am a believer in this concept, showing google less pages will increase their importance.

here is my question: I manage a website with millions of pages, high organic traffic (lower than before).

I do believe that too many pages are crawled. there are pages that I do not need google to crawl and followed. noindex follow does not save on the mentioned crawl budget. deleting those pages is not possible. any advice will be appreciated. If I disallow those pages I am missing on pages that help my important pages.

BlueprintMarketing

I just wrote a better reply so forgive me the file was deleted by hit post. But I do think this information will help you get acquainted much better with a crawl budget

One thing that may not incorporate your domain however is that is not centered a external link is a content delivery or CDN URL that is not rewritten to the traditional CDN.example.com Content delivery network URLs can look like

So you may see URLs like one shown here should be considered part of your domain and treated like a domain even if they look like this but these are just two examples of literally thousands CDN variants.

Examples for this would be the cost of incorporating your hostname to a HTTPS encrypted content delivery network can be very expensive.

CDN's https://varvy.com/pagespeed/content-delivery-networks.html

Crawl budget information

Tools for Crawl budget

https://www.deepcrawl.com/knowledge/events/key-takeaways-from-septembers-brightonseo/

I hope is more helpful,

Thomas

zv5l0VL.png CfJUjMf.jpg D5epQgb.png iXDlByk.png KXJFXFK.jpg

KristinaKledzik

Nope, having more or less external links will have no effect on the crawl budget spent on your site. Your crawl budget is only spent on yourdomain.com. I just meant that, Google's crawler will follow external links, but it won't until it's spent its crawl budget on your site.

Google isn't going to give you a metric showing your crawl budget, but you can assume how much it is by going to Google Search Console > Crawl > Crawl Stats. That'll show you how many pages Googlebot has crawled per day.

BeckyKey

I see, thank you.

I'm guessing if we have these external links across the site, it'll cause more harm than good for us if they use up some crawl budget on every page?

Is there anyway to find out what the crawl budget is?

KristinaKledzik

To be clear, Googlebot will find those external links and leave your site, but not until they've used their crawl budget up on your site.

max.favilli

Nope.

BeckyKey

I'd like to find out if this crawl budget applies to external sites - we link to sister companies in the footer.

Will Google bot find these external links and leave our site to go and crawl these external sites?

Thanks!

BlueprintMarketing

Using Google's parameter tools you can also reduce crawl budget issues.

https://www.google.com/webmasters/tools/crawl-url-parameters

http://www.blindfiveyearold.com/crawl-optimization

http://searchenginewatch.com/sew/news/2064349/crawl-index-rank-repeat-a-tactical-seo-framework

http://searchengineland.com/how-i-think-crawl-budget-works-sort-of-59768

max.favilli

What's the background of your conclusion? What's the logic?

I am asking because my understanding about crawl budget is that if you waste it you risk to 1) slow down recrawl frequency on a per page basis and 2) risk google crawler gives up crawling the website before to have crawled all pages.

Are you splitting your sitemap into sub sitemaps? That's a good way to spot groups/categories of pages being ignored by google crawler.

RyanPurkey

If you want to test this, identify a section of your site that you can disallow via robots.txt and then measure the corresponding changes. Proceed section by section based on results. There are so many variables at play that I don't think you'll get an answer that's anywhere as precise as testing your specific situation.

Welcome to the Q&A Forum

Browse the forum for helpful insights and fresh discussions about all things SEO.

Crawl budget

Got a burning SEO question?

Browse Questions

Explore more categories

Related Questions

Why doesn't my website crawl by Google?

Lazy Loading of Blog Posts and Crawl Depths

We 410'ed URLs to decrease URLs submitted and increase crawl rate, but dynamically generated sub URLs from pagination are showing as 404s. Should we 410 these sub URLs?

Google crawling different content--ever ok?

Can spiders crawl javascript navigation now?

Google Webmaster successfully fetched one of my webpages. Does that mean Google will crawl them or readable by bots?

Www vs. non-www differences in crawl errors in Webmaster tools...

Website Crawl problems