Crawl budget

ciznerguy

I am a believer in this concept: showing Google fewer pages will increase their importance.

Here is my question: I manage a website with millions of pages and high organic traffic (though lower than before).

I believe too many pages are being crawled. There are pages that I do not need Google to crawl and follow. Noindex, follow does not save any of the crawl budget mentioned above, and deleting those pages is not possible. If I disallow those pages, I lose pages that help my important pages. Any advice will be appreciated.

BlueprintMarketing

I had written a better reply, but it was lost when I hit post, so forgive me. I do think this information will help you get much better acquainted with crawl budget.

One thing that may not include your domain, yet is not considered an external link, is a content delivery network (CDN) URL that has not been rewritten to the traditional CDN.example.com form. Content delivery network URLs can look very different from your own hostname.

So URLs like the ones shown here should be considered part of your domain and treated as such, even if they do not look like it. These are just two examples of literally thousands of CDN variants.

One reason for this is that incorporating your hostname into an HTTPS-encrypted content delivery network can be very expensive.

CDNs: https://varvy.com/pagespeed/content-delivery-networks.html

Crawl budget information

Tools for Crawl budget

https://www.deepcrawl.com/knowledge/events/key-takeaways-from-septembers-brightonseo/

I hope this is more helpful,

Thomas


KristinaKledzik

Nope, having more or fewer external links will have no effect on the crawl budget spent on your site. Your crawl budget is only spent on yourdomain.com. I just meant that Google's crawler will follow external links, but not until it has spent its crawl budget on your site.

Google isn't going to give you a metric showing your crawl budget, but you can estimate it by going to Google Search Console > Crawl > Crawl Stats. That will show you how many pages Googlebot has crawled per day.
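If you have access to your server logs, you can cross-check the Crawl Stats numbers yourself. The following is a minimal sketch, assuming access logs in the common "combined" format; the function name and the sample log lines are made up for illustration. It counts requests per day whose user agent mentions Googlebot. User agents can be spoofed, so treat the counts as an approximation.

```python
import re
from collections import Counter

# Matches the date part of a combined-log timestamp, e.g. [10/Oct/2016:13:55:36 +0000]
DATE_RE = re.compile(r"\[(\d{2}/[A-Za-z]{3}/\d{4}):")


def googlebot_hits_per_day(log_lines):
    """Count requests per day whose log line mentions Googlebot."""
    hits = Counter()
    for line in log_lines:
        if "Googlebot" not in line:
            continue  # skip other crawlers and regular visitors
        match = DATE_RE.search(line)
        if match:
            hits[match.group(1)] += 1
    return dict(hits)


# Hypothetical sample lines; in practice, read them from your own log file.
GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
sample_log = [
    f'66.249.66.1 - - [10/Oct/2016:13:55:36 +0000] "GET /a HTTP/1.1" 200 512 "-" "{GOOGLEBOT_UA}"',
    f'66.249.66.1 - - [10/Oct/2016:14:02:10 +0000] "GET /b HTTP/1.1" 200 734 "-" "{GOOGLEBOT_UA}"',
    '203.0.113.7 - - [10/Oct/2016:14:05:00 +0000] "GET /a HTTP/1.1" 200 512 "-" "Mozilla/5.0"',
    f'66.249.66.1 - - [11/Oct/2016:09:00:00 +0000] "GET /c HTTP/1.1" 200 120 "-" "{GOOGLEBOT_UA}"',
]

print(googlebot_hits_per_day(sample_log))
```

Grouping the same counts by URL path instead of by date would show which sections are absorbing most of the crawl.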

BeckyKey

I see, thank you.

I'm guessing if we have these external links across the site, it'll cause more harm than good for us if they use up some crawl budget on every page?

Is there any way to find out what the crawl budget is?

KristinaKledzik

To be clear, Googlebot will find those external links and leave your site, but not until it has used up its crawl budget on your site.

max.favilli

Nope.

BeckyKey

I'd like to find out if this crawl budget applies to external sites. We link to sister companies in the footer.

Will Googlebot find these external links and leave our site to go and crawl these external sites?

Thanks!

BlueprintMarketing

Using Google's URL Parameters tool, you can also reduce crawl budget issues.

https://www.google.com/webmasters/tools/crawl-url-parameters

http://www.blindfiveyearold.com/crawl-optimization

http://searchenginewatch.com/sew/news/2064349/crawl-index-rank-repeat-a-tactical-seo-framework

http://searchengineland.com/how-i-think-crawl-budget-works-sort-of-59768

max.favilli

What's the background of your conclusion? What's the logic?

I am asking because my understanding of crawl budget is that if you waste it, you risk 1) slowing down the recrawl frequency on a per-page basis and 2) Google's crawler giving up on the website before it has crawled all pages.

Are you splitting your sitemap into sub-sitemaps? That's a good way to spot groups or categories of pages being ignored by Google's crawler.
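As a rough illustration of the sub-sitemap idea, here is a minimal sketch that groups URLs by their first path segment and builds one sitemap per group plus a sitemap index. The function names, the `sitemap-<section>.xml` file naming, and the example.com URLs are assumptions for the example, not anything from your site.

```python
from collections import defaultdict
from urllib.parse import urlparse
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def group_by_section(urls):
    """Group URLs by first path segment, e.g. /products/1 -> 'products'."""
    groups = defaultdict(list)
    for url in urls:
        path = urlparse(url).path.strip("/")
        section = path.split("/")[0] if path else "root"
        groups[section].append(url)
    return dict(groups)


def build_sitemap(urls):
    """Return the XML for one sub-sitemap listing the given URLs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        ET.SubElement(ET.SubElement(urlset, "url"), "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


def build_index(base_url, sections):
    """Return the XML for a sitemap index pointing at each sub-sitemap."""
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for section in sorted(sections):
        loc = ET.SubElement(ET.SubElement(index, "sitemap"), "loc")
        loc.text = f"{base_url}/sitemap-{section}.xml"  # hypothetical naming scheme
    return ET.tostring(index, encoding="unicode")


# Hypothetical URLs for illustration.
urls = [
    "https://www.example.com/products/1",
    "https://www.example.com/products/2",
    "https://www.example.com/blog/crawl-budget",
]
groups = group_by_section(urls)
for section, section_urls in groups.items():
    print(section, len(section_urls))
print(build_index("https://www.example.com", groups))
```

With each section submitted as its own sitemap, Search Console reports submitted versus indexed counts per sitemap, which makes a neglected section easier to see. The sitemaps protocol limits each file to 50,000 URLs, so a large section may need further splitting.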

RyanPurkey

If you want to test this, identify a section of your site that you can disallow via robots.txt and then measure the corresponding changes. Proceed section by section based on the results. There are so many variables at play that I don't think you'll get an answer that's anywhere near as precise as testing your specific situation.
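Before deploying a disallow rule for a test section, it can help to confirm which URLs it would actually block. This is a minimal sketch using Python's standard `urllib.robotparser`; the `/filters/` section, the `blocked_urls` helper, and the example.com URLs are hypothetical. The standard library parser may not match Googlebot's handling of wildcards, so the rule here is kept to a plain path prefix.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical draft rule: disallow one section for the test.
ROBOTS_TXT = """\
User-agent: *
Disallow: /filters/
"""


def blocked_urls(robots_txt, urls, user_agent="Googlebot"):
    """Return the URLs that the given robots.txt text disallows for user_agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(user_agent, url)]


# Hypothetical URLs for illustration.
candidate_urls = [
    "https://www.example.com/filters/color-red",
    "https://www.example.com/products/1",
]

print(blocked_urls(ROBOTS_TXT, candidate_urls))
```

Running the draft rules against a list of your important URLs first is a cheap way to make sure the test section does not accidentally include pages you want crawled.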
