Crawl Budget and Faceted Navigation
-
Hi, we have an ecommerce website with faceted navigation for the various options available.
Google has 3.4 million webpages indexed, many of which are over 90% duplicates.
Due to the low domain authority (15/100), Google is only crawling around 4,500 webpages per day, which we would like to increase.
We know that, in order not to waste crawl budget, we should use robots.txt to disallow parameter URLs (i.e. ?option=, ?search=, etc.). This makes sense, as it would resolve many of the duplicate content issues and force Google to crawl only the main category pages, product pages, etc.
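To make the disallow idea above concrete, here is a minimal sketch of how wildcard-style robots.txt patterns would match the parameter URLs. The parameter names come from the post; the exact patterns, the `example.com` URLs, and the helper names are assumptions for illustration. It only models `Disallow` lines with `*` and a trailing `$`, not `Allow` rules or rule precedence.

```python
import re
from urllib.parse import urlsplit

# Hypothetical Disallow patterns for the parameters named in the post.
# Each parameter needs a "?" and an "&" variant, since it may not come first.
DISALLOW_PATTERNS = ["/*?option=", "/*&option=", "/*?search=", "/*&search="]


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern ('*' wildcard, optional trailing '$')."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape literal pieces and let each '*' match any run of characters.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))


def is_disallowed(url: str, patterns=DISALLOW_PATTERNS) -> bool:
    """Return True if the URL's path plus query matches any Disallow pattern."""
    parts = urlsplit(url)
    target = parts.path + ("?" + parts.query if parts.query else "")
    return any(pattern_to_regex(p).match(target) for p in patterns)


if __name__ == "__main__":
    for url in [
        "https://example.com/shoes",
        "https://example.com/shoes?option=red",
        "https://example.com/shoes?page=2&search=boots",
    ]:
        print(url, "->", "blocked" if is_disallowed(url) else "crawlable")
```

Running a list of real URLs through a check like this before editing robots.txt is a cheap way to confirm that category and product pages stay crawlable.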
However, having looked at Google Search Console, these pages are getting a significant amount of organic traffic on a monthly basis.
Is it worth disallowing these parameter URLs in robots.txt and hoping that this solves our crawl budget issues, thus helping Google index and rank the most important webpages in less time?
Or is there a better solution?
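One way to weigh the trade-off described above is to total the organic clicks per query parameter before blocking anything. The sketch below assumes a Search Console performance export reduced to `(url, clicks)` pairs; the sample rows are made up for illustration and `clicks_by_parameter` is a hypothetical helper name.

```python
from collections import Counter
from urllib.parse import parse_qsl, urlsplit


def clicks_by_parameter(rows):
    """Sum clicks per query-parameter name across (url, clicks) rows."""
    totals = Counter()
    for url, clicks in rows:
        query = urlsplit(url).query
        # Count each parameter name once per URL, even if it repeats.
        names = {name for name, _ in parse_qsl(query, keep_blank_values=True)}
        for name in names:
            totals[name] += clicks
    return totals


if __name__ == "__main__":
    # Illustrative rows only, not real traffic data.
    sample_rows = [
        ("https://example.com/c?option=red", 120),
        ("https://example.com/c?search=boots", 0),
        ("https://example.com/c?option=blue&search=x", 5),
        ("https://example.com/c", 300),
    ]
    for name, clicks in clicks_by_parameter(sample_rows).most_common():
        print(f"{name}: {clicks} clicks")
```

Parameters that show little or no traffic are the safer candidates for a disallow rule, while those that bring in visits may deserve a different treatment.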
Many thanks in advance.
Lee.
-
Hello, I have been in a similar situation. What I did was disallow the URLs with parameters in robots.txt and place the following two HTML tags on the pages with parameters only:
This expressly tells Google not to index these pages. I still have some errors, but I guess they will disappear in a few months.
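The two tags themselves are missing from the post as it appears here, so they are not reconstructed. As a rough sketch only, assuming one of them was a robots meta tag carrying `noindex` (an assumption, not the poster's confirmed markup), the check below reports whether a page's HTML contains such a directive. `RobotsMetaParser` and `has_noindex` are hypothetical names.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr_map = {key.lower(): (value or "") for key, value in attrs}
        if attr_map.get("name", "").lower() == "robots":
            # The content attribute is a comma-separated list of directives.
            for item in attr_map.get("content", "").split(","):
                self.directives.add(item.strip().lower())


def has_noindex(html: str) -> bool:
    """Return True if the HTML carries a robots meta tag with 'noindex'."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


if __name__ == "__main__":
    page = '<html><head><meta name="robots" content="noindex, follow"></head></html>'
    print(has_noindex(page))
```

Worth keeping in mind: a crawler that obeys a robots.txt disallow does not fetch the page, so it may never see a tag placed in that page's HTML.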
Regards
Related Questions
-
Important category pages that can and should be found in SERP but can not be reached by navigating on the webshop itself
Hi, On a webshop we are optimizing, the main navigation consists of the 5 main categories to which all of the products can be assigned. However, the main tabs in the navigation just activate a drop down with all of the subcategories. For example: the tab in the navigation is 'Garden equipment' and when you click on this tab, the drop down is shown with subcategories like 'Lawn mowers', 'Leaf blowers' and so on. Now, the page 'Garden equipment' is one of the main category pages and we want this page to rank of course. This shouldn't be a problem, since there is a separate URL for this page that can be indexed and that can be reached through internal links on the website. However, this page can not be reached when a visitor initially comes on the homepage of the webshop, since the tab in the navigation isn't clickable. This page will only be reached when a subcategory is selected, and then when the visitor goes back to the category page through the breadcrumb or through an internal link. Is it a problem that these important overview category pages can not be reached immediately? Thanks.
Intermediate & Advanced SEO | | Mat_C0 -
Crawling/indexing of near duplicate product pages
Hi, Hope someone can help me out here. We sell stones/gravel/sand/pebbles etc. for gardens. I will take one type of pebbles and the corresponding pages/URLs to illustrate my question: black beach pebbles.
Current situation: We have a 'top' product page for black beach pebbles on which you can find different quantities (ranging from 20 kg to 1600 kg). There is no search volume related to the different quantities. The 'top' page does not link to the pages for the different quantities. The content on the pages for the different quantities is not exactly the same (different price + slightly different content), but a lot of the content is the same. Most pages for the different quantities do not have internal links (about 95%), but the sitemap does contain all of these pages. Because the sitemap contains all these URLs, Google frequently crawls them (I checked the logfiles) and has indexed them.
Problems: Google spends its time crawling irrelevant pages. Our entire website is not that big, so these quantity URLs roughly double the total number of URLs. Having URLs in the sitemap that do not have an internal link is a problem on its own. All these pages are indexed, so all sorts of gravel/pebbles have near duplicates.
My solution: Remove these URLs from the sitemap, which will probably stop Google from regularly crawling these pages. Put a canonical on the quantity pages pointing to the top product page, which will hopefully remove the irrelevant (no search volume) near duplicates from the index.
My questions: To be able to see the canonical, Google will need to crawl these pages. Will Google still do that after they are removed from the sitemap? Do you agree that these pages are near duplicates and that it is best to remove them from the index? A few of these quantity pages (a few percent of them) do have internal links because of a sale campaign, so there will be some (not many) internal links pointing to non-canonical pages. Would that be a problem? Thanks a lot in advance for your help! Best!
Intermediate & Advanced SEO | | AMAGARD1 -
Realtor site with external links in navigation
I have a client with a realtor site that uses IDX for the listings feed. We have several external links going over to the IDX site for various live custom searches (ie: luxury listings, waterfront listings, etc...). We are getting a Moz spam ranking of 2/7 for both "Large Number of External Links" and "External Links in Navigation". Chances are, these are related. My question is this: (1) Being the score is only 2/7, should I bother with fixing this? (2) If I add a rel="nofollow" to all the site-wide links (in header, footer & menu) will this help? I couldn't find anything definitive in the Q&A search. Looking forward to any insights!!!
Intermediate & Advanced SEO | | lcallander1 -
Mega Menu Navigation Best Practice
First off, I'm a landscape/nature/travel photographer. I mainly sell prints of my work. I'm in the process of redesigning my website, and I'm trying to decide whether to keep the navigation extremely simple or leave the drop-down menu for galleries. Currently, my navigation is something like this:
Galleries
> Gallery for State or Country (example: California)
> Sub-region in State or Country (example: San Francisco)
Blog
Prints
About
Contact
Selling prints is the top priority of the website, as that's what runs the business. I have lots of blog content, and I'm starting to build some good travel advice, etc. but in reality, the galleries, which then filter down to individual pages for each photo with a cart system, are the most important. What I'm struggling to decide is whether to leave the sort of "mega menu" for the galleries, or to do away with them, and have the user go to the overall galleries page to navigate further into the site. Leaving the mega menu intact, the galleries page becomes a lot less important, and takes out a step to get to the shopping cart. However, I'm wondering if the number of galleries in the drop-down menu is giving TOO many choices up front as well. I also wonder how changing this will affect search. Any thoughts on which is better or is it really just a matter of preference?
Intermediate & Advanced SEO | | shannmg10 -
After Receiving a "Googlebot can't access your site" would this stop your site from being crawled?
Hi Everyone,
A few weeks ago I received a "Googlebot can't access your site..... connection failure rate is 7.8%" message from Webmaster Tools. I have since fixed the majority of these issues, but I've noticed that all pages except the main home page now have a page rank of N/A, while the home page still has a page rank of 5. Have these connectivity issues reduced the page ranks to N/A, or is it something else I'm missing? Thanks in advance.
Intermediate & Advanced SEO | | AMA-DataSet0 -
Are links to on-page content crawled / have any effect on page rank?
Let's say I have a really long article that begins with links to anchors on the same page, e.g., Chapter 1, Chapter 2, etc., allowing the user to scroll down to different content. There are also other links on this page that link to other pages. A few questions: When Googlebot arrives on the page, does it crawl links that point to anchors on the same page? When link juice is divided among all the links on the page, do these links count, so that page rank is lost? Thanks!
Intermediate & Advanced SEO | | anthematic0 -
How to prevent Google from crawling our product filter?
Hi All, We have a crawler problem on one of our sites, www.sneakerskoopjeonline.nl. On this site, visitors can specify criteria to filter available products. These filters are passed as HTTP GET arguments. The number of possible filter URLs is virtually limitless. In order to prevent duplicate content, or an insane number of pages in the search indices, our software automatically adds noindex, nofollow and noarchive directives to these filter result pages. However, we're unable to tell crawlers (Google in particular) to ignore these URLs. We've already changed the on-page filter HTML to JavaScript, hoping this would cause the crawler to ignore it. However, it seems that Googlebot executes the JavaScript and crawls the generated URLs anyway. What can we do to prevent Google from crawling all the filter options? Thanks in advance for the help. Kind regards, Gerwin
Intermediate & Advanced SEO | | footsteps0