Help: Crawl friendliness for a large site
-
After watching Rand's video I am trying to think of the best way to make my large site more crawl friendly.
Background
I have a large site with over 100k product SKUs, so when you get to a particular page of products there are tons of different refinements and options that help you sort the products. Most of these are set to noindex, follow, but I was wondering if I should also be nofollowing the internal links in order to keep bots out of those pages and send them to the pages that I want them to go to. Is this a good way to handle it?
Also, does anyone have good recommendations of links to posts that deal with helping the crawl friendliness of a large site?
Thanks!
-
Good point. If you don't want the filter pages crawled at all, it would be better to just block them via robots.txt. My preferred approach is to use query parameters for filters and canonicalize the filtered pages back to the original, unfiltered page.
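To make the query-parameter approach a bit more concrete, here is a rough Python sketch of how a filtered URL could be mapped back to its unfiltered version for the canonical tag. The parameter names in FILTER_PARAMS and the example.com URLs are made up for illustration; you would swap in whatever your store actually uses, and your platform may already have its own way of doing this.

```python
import html
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical filter/sort parameter names (an assumption, not from any real store).
FILTER_PARAMS = {"color", "size", "brand", "sort", "price_min", "price_max"}


def canonical_url(url: str) -> str:
    """Return the URL with filter parameters and any fragment removed."""
    parts = urlsplit(url)
    # Keep only the query parameters that are not filters (e.g. pagination).
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in FILTER_PARAMS
    ]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


def canonical_link_tag(url: str) -> str:
    """Build the <link rel="canonical"> tag to place in the page's <head>."""
    href = html.escape(canonical_url(url), quote=True)
    return f'<link rel="canonical" href="{href}">'


if __name__ == "__main__":
    print(canonical_link_tag("https://example.com/flowers?color=red&sort=price"))
```

Every filtered variation of the Flowers page would then point at the same unfiltered URL. One thing to keep in mind: if a filtered URL is blocked in robots.txt, crawlers never fetch it and so never see its canonical tag, so the two approaches are alternatives rather than something to stack on the same URLs.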
Another approach is to use AJAX to dynamically filter the page. This takes more programming overhead, but won't result in tons of extra pages being crawled and potentially indexed.
-
Nofollowing internal links is almost never a good idea. You're just wasting valuable link juice.
Google actually just recently came out with a good guide for how to handle ecommerce navigation with lots of product options: http://googlewebmastercentral.blogspot.com/2014/02/faceted-navigation-best-and-5-of-worst.html
Also, if you have a lot of categories in your store, try to show only the navigation that is relevant to the section of the store the user is in. For example, if the user is in the Flowers section, don't show a ton of links for Cellphones.