How far into a page will a spider crawl to look for text?
-
How far into a page will a spider crawl to look for text? I've heard a spider will only crawl the first 3kb, but can't find an authoritative source for that information.
-
Far, far more than 3 KB. About halfway through this blog post (http://www.finishjoomla.com/blog/41/does-source-code-ordering-still-matter-for-seo/) you'll find some references to sources on this same issue. They might be helpful for you.
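If you want to see where your own text sits relative to that rumored 3 KB mark, a rough sketch like the one below can help. It only measures the byte offset of a phrase in the raw HTML; it says nothing about what any crawler actually does. The function names, the sample markup, and the 3 KB default are all assumptions made up for illustration.

```python
def byte_offset_of(html: str, phrase: str, encoding: str = "utf-8") -> int:
    """Return the byte offset of the first occurrence of phrase, or -1 if absent."""
    return html.encode(encoding).find(phrase.encode(encoding))


def is_within_first_kb(html: str, phrase: str, limit_kb: int = 3) -> bool:
    """True if phrase starts before the limit_kb mark (3 KB is the rumored figure, not a confirmed one)."""
    offset = byte_offset_of(html, phrase)
    return offset != -1 and offset < limit_kb * 1024


# Hypothetical page: a bulky <head> pushes the body text well past 3 KB.
sample = (
    "<html><head>"
    + "<!-- padding -->" * 300
    + "</head><body><p>Main article text</p></body></html>"
)

offset = byte_offset_of(sample, "Main article text")
print(f"Phrase starts at byte {offset}")
print("Within first 3 KB:", is_within_first_kb(sample, "Main article text"))
```

For a real page you would pass in the HTML source you saved or fetched yourself instead of the sample string.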