Crawl budget

ciznerguy

I am a believer in this concept: showing Google fewer pages will increase their importance.

Here is my question: I manage a website with millions of pages and high organic traffic (though lower than before).

I believe that too many pages are being crawled. There are pages that I do not need Google to crawl and follow. "Noindex, follow" does not save any of the crawl budget mentioned above. Deleting those pages is not possible. Any advice will be appreciated. If I disallow those pages, I lose pages that help my important pages.

BlueprintMarketing

I had just written a better reply, so forgive me: it was deleted when I hit post. But I do think this information will help you get much better acquainted with crawl budget.

One thing that may not appear to be part of your domain, yet is not treated as an external link, is a content delivery network (CDN) URL that has not been rewritten to the traditional CDN.example.com form. CDN URLs can look quite different from that.

So URLs like the one shown here should be considered part of your domain and treated as such, even if they look like this. These are just two examples of literally thousands of CDN variants.

One reason for this is that incorporating your hostname into an HTTPS-encrypted content delivery network can be very expensive.

CDNs: https://varvy.com/pagespeed/content-delivery-networks.html

Crawl budget information

Tools for Crawl budget

https://www.deepcrawl.com/knowledge/events/key-takeaways-from-septembers-brightonseo/

I hope this is more helpful,

Thomas


KristinaKledzik

Nope, having more or fewer external links will have no effect on the crawl budget spent on your site. Your crawl budget is only spent on yourdomain.com. I just meant that Google's crawler will follow external links, but not until it has spent its crawl budget on your site.

Google isn't going to give you a metric showing your crawl budget, but you can estimate it by going to Google Search Console > Crawl > Crawl Stats. That'll show you how many pages Googlebot has crawled per day.
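If you also have access to your server logs, a rough cross-check on the Crawl Stats numbers is to count requests whose user agent mentions Googlebot, per day. This is only a sketch: it assumes the common "combined" access log format with the user agent as the last quoted field, the function name googlebot_hits_per_day is made up for illustration, and since user agents can be spoofed the counts are an estimate rather than an exact figure.

```python
import re
from collections import Counter

# Assumes the "combined" log format: the timestamp sits in [square brackets]
# and the user agent is the last double-quoted field on the line.
LOG_LINE = re.compile(
    r'\[(?P<day>\d{2}/\w{3}/\d{4}):[^\]]*\].*"(?P<agent>[^"]*)"\s*$'
)


def googlebot_hits_per_day(lines):
    """Count log lines per day whose user agent contains 'Googlebot'."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if match and "Googlebot" in match.group("agent"):
            counts[match.group("day")] += 1
    return dict(counts)


if __name__ == "__main__":
    # Hypothetical sample lines, only to show the expected input shape.
    sample = [
        '66.249.66.1 - - [10/Oct/2016:13:55:36 +0000] "GET /products/1 HTTP/1.1" '
        '200 2326 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; '
        '+http://www.google.com/bot.html)"',
        '203.0.113.7 - - [10/Oct/2016:13:56:01 +0000] "GET / HTTP/1.1" '
        '200 5120 "-" "Mozilla/5.0 (Windows NT 10.0)"',
    ]
    print(googlebot_hits_per_day(sample))
```

Comparing these daily counts with the number of URLs you actually want crawled gives a feel for how much of the crawl goes to pages you don't care about.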

BeckyKey

I see, thank you.

I'm guessing that if we have these external links across the site, they'll cause more harm than good for us if they use up some crawl budget on every page?

Is there any way to find out what the crawl budget is?

KristinaKledzik

To be clear, Googlebot will find those external links and leave your site, but not until it has used up its crawl budget on your site.

max.favilli

Nope.

BeckyKey

I'd like to find out if this crawl budget applies to external sites. We link to sister companies in the footer.

Will Googlebot find these external links and leave our site to go and crawl these external sites?

Thanks!

BlueprintMarketing

Using Google's URL Parameters tool, you can also reduce crawl budget issues.

https://www.google.com/webmasters/tools/crawl-url-parameters

http://www.blindfiveyearold.com/crawl-optimization

http://searchenginewatch.com/sew/news/2064349/crawl-index-rank-repeat-a-tactical-seo-framework

http://searchengineland.com/how-i-think-crawl-budget-works-sort-of-59768

max.favilli

What's the background of your conclusion? What's the logic?

I am asking because my understanding of crawl budget is that if you waste it, you risk 1) slowing down recrawl frequency on a per-page basis and 2) Google's crawler giving up on the website before it has crawled all pages.

Are you splitting your sitemap into sub-sitemaps? That's a good way to spot groups/categories of pages being ignored by Google's crawler.
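As a rough illustration of the sub-sitemap idea, here is a minimal Python sketch that groups URLs by their first path segment and writes one sitemap per group, plus a sitemap index. The grouping rule, the file naming, and the function names (group_urls_by_section, plan_sitemaps, and so on) are assumptions for the example; on a real site you would group by whatever categories make sense for you. The 50,000 URL cap per file comes from the sitemaps.org protocol.

```python
from collections import defaultdict
from urllib.parse import urlparse
from xml.sax.saxutils import escape

# The sitemap protocol allows at most 50,000 URLs per sitemap file.
MAX_URLS_PER_SITEMAP = 50_000
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def group_urls_by_section(urls):
    """Group URLs by first path segment, e.g. /products/... -> 'products'."""
    groups = defaultdict(list)
    for url in urls:
        path = urlparse(url).path.strip("/")
        section = path.split("/")[0] if path else "root"
        groups[section].append(url)
    return dict(groups)


def plan_sitemaps(urls, size=MAX_URLS_PER_SITEMAP):
    """Return {filename: [urls]}, one or more files per section."""
    plan = {}
    for section, section_urls in sorted(group_urls_by_section(urls).items()):
        for start in range(0, len(section_urls), size):
            number = start // size + 1
            filename = f"sitemap-{section}-{number}.xml"
            plan[filename] = section_urls[start:start + size]
    return plan


def build_sitemap(urls):
    """Render a single sitemap file as an XML string."""
    entries = "\n".join(f"  <url><loc>{escape(u)}</loc></url>" for u in urls)
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        f'<urlset xmlns="{SITEMAP_NS}">\n{entries}\n</urlset>\n'
    )


def build_sitemap_index(base_url, filenames):
    """Render the index file that points at every sub-sitemap."""
    base = base_url.rstrip("/")
    entries = "\n".join(
        f"  <sitemap><loc>{escape(base + '/' + name)}</loc></sitemap>"
        for name in filenames
    )
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        f'<sitemapindex xmlns="{SITEMAP_NS}">\n{entries}\n</sitemapindex>\n'
    )
```

Once each sub-sitemap is submitted separately in Search Console, comparing submitted versus indexed counts per file should show which groups of pages are being ignored.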

RyanPurkey

If you want to test this, identify a section of your site that you can disallow via robots.txt and then measure the corresponding changes. Proceed section by section based on results. There are so many variables at play that I don't think you'll get an answer that's anywhere near as precise as testing your specific situation.
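Before rolling out a disallow rule for a section, it can help to check which URLs the rule would actually block. A minimal sketch using Python's standard urllib.robotparser follows; the /filters/ section, the example URLs, and the blocked_urls helper are hypothetical. Note that this parser does plain prefix matching and does not implement Google's wildcard extensions, so treat it as a sanity check rather than an exact model of Googlebot.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical draft rule: block one section, leave everything else open.
ROBOTS_TXT = """\
User-agent: *
Disallow: /filters/
"""


def blocked_urls(robots_txt, urls, user_agent="Googlebot"):
    """Return the URLs that the given robots.txt text disallows."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [u for u in urls if not parser.can_fetch(user_agent, u)]


if __name__ == "__main__":
    candidates = [
        "https://www.example.com/filters/red",
        "https://www.example.com/products/1",
    ]
    for url in blocked_urls(ROBOTS_TXT, candidates):
        print("would be blocked:", url)
```

Running this over a sample of URLs from each section makes it easier to confirm that the test only touches the section you meant to disallow.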
