Discrepency between # of pages and # of pages indexed
-
Here is some background:
- The site in question has approximately 10,000 pages and Google Webmaster shows that 10,000 urls(pages were submitted)
2) Only 5,500 pages appear in the Google index
3) Webmaster shows that approximately 200 pages could not be crawled for various reasons
4) SEOMOZ shows about 1,000 pages that have long URL's or Page Titles (which we are correcting)
5) No other errors are being reported in either Webmaster or SEO MOZ
6) This is a new site launched six weeks ago. Within two weeks of launching, Google had indexed all 10,000 pages and showed 9,800 in the index but over the last few weeks, the number of pages in the index kept dropping until it reached 5,500 where it has been stable for two weeks.
Any ideas of what the issue might be? Also, is there a way to download all of the pages that are being included in that index as this might help troubleshoot?
-
It's not exactly 3 clicks... if you're a PR 10 website it will take you quite a few clicks in before it gets "tired". Deep links are always a great idea.
-
I have also heard 3 clicks from a page with link juice. So if you have deep links to a page it can help carry pages deeper in. Do you agree?
-
Thank you to all for your advice. Good suggestions.
-
We do have different types of pages but Google is indexing all category pages but not all individual content pages. Based on the replies I have received, I suspect the issue can be helped by flattening the site architecture and links.
As an FYI, the site is a health care content site so no products are sold on the site. Revenue is from ads.
-
Great tip. I have seen this happen too (e.g. forum, blog, archive and content part of the website not indexed equally).
-
Do you have areas of your site that are distinctively different in type, such as category pages and individual item pages, or individual item pages and user submitted content?
What I'm getting at is trying to find if there's a certain type of page that Google isn't indexing. If you have distinct types of pages, you can create separate site maps (one for each type of content) and see if one type of content is being indexed better than another. It's more of a diagnostics tool that a solution, but I've found it helpful for sites of that size and larger in the past.
As other people have said, it's also a new site, so the lack of links could be hindering things as well.
-
Agreed!
-
Oh yes, Google is very big on balancing and allocation of resources. I don't think 10,000 will present a problem though as this number may be too common on ecommerce and content websites.
-
Very good advice in the replies. Everyone seems to have forgotten PageRank though. In Google's random surfer model it is assumed user will at some point abandon the website (after PageRank has been exhausted). This means if your site lacks raw link juice it may not have enough to go around through the whole site structure and it leaves some pages dry and unindexed. What can help is: Already mentioned flatter site architecture and unique content, but also direct links to pages not in index (including via social media) and more and stronger links towards home page which should ideally cascade down to the rest.
-
If you don't have many links to your site yet, I think that could reduce the number of pages that Google keeps in its main index. Google may allocate less resources to crawling your site if you have very little link juice, especially if deep pages on your site have no link juice coming in to them.
Another possibility is if some of the 10,000 pages are not unique content or duplicate content. Google could send a lot of your pages to its supplemental index if this is the case.
-
If you flatten out your site architecture a bit to where all pages are no more then 3 clicks deep, and provide a better HTML sitemap you will definitely see more pages indexed. It wont be all 10k, but it will be an improvement.
-
I appreciate the reply. The HTML site map does not show all 10,000 pages and some pages are likely more than 3 deep. I will try this and see what happens.
-
Google will not index your entire 10k page site just because you submitted the links in a site map. They will crawl your site and index many pages, but most likely you will never have your entire site indexed.
Cleaning up your crawl errors will help in getting your content indexed. A few other things you can do are:
-
provide a HTML sitemap on your website
-
ensure your site navigation is solid ( i.e. all pages are reachable, no island pages, the navigation can be seen in HMTL, etc)
-
ensure you do not have deep content. Google will often only go about 3 clicks deep. If you have buried content, it won't be indexed unless it is well linked.
-
if there are any particular pages you want to get indexed, you can link to them from your home page, or ask others to link to those pages from external sites.
-
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Should search pages be indexed?
Hey guys, I've always believed that search pages should be no-indexed but now I'm wondering if there is an argument to index them? Appreciate any thoughts!
Technical SEO | | RebekahVP0 -
Indexed pages
Just started a site audit and trying to determine the number of pages on a client site and whether there are more pages being indexed than actually exist. I've used four tools and got four very different answers... Google Search Console: 237 indexed pages Google search using site command: 468 results MOZ site crawl: 1013 unique URLs Screaming Frog: 183 page titles, 187 URIs (note this is a free licence, but should cut off at 500) Can anyone shed any light on why they differ so much? And where lies the truth?
Technical SEO | | muzzmoz1 -
My SEO friend says my website is not being indexed by Google considering the keywords he has placed in the page and URL what does that mean?
My SEO friend says my website is not being indexed by Google considering the keywords he has placed in the page and URL what does that mean? We have added some text in the pages with keywords thats related the page
Technical SEO | | AlexisWithers0 -
Website SEO Product Pages - Condense Product Pages
We are managing a website that has seen consistently dropping rankings over the last 2 years (http://www.independence-bunting.com/). Our long term strategy has been purely content-based and is of high quality, but isn’t seeing the desired results. It is an ecommerce site that has a lot of pages, most of which are category or product pages. Many of the product pages have duplicate or thin content, which we currently see as one of the primary reasons for the ranking drops.The website has many individual products which have the same fabric and size options, but have different designs. So it is difficult to write valuable content that differs between several products that have similar designs. Right now each of the different designs has its own product page. We have a dilemma, because our options are:A.Combine similar designs of the product into one product page where the customer must choose a design, a fabric, and a size before checking out. This way we can have valuable content and don’t have to duplicate that content on other pages or try to find more to say about something that there really isn’t anything else to say about. However, this process will remove between 50% and 70% of the pages on the website. We know number of indexed pages is important to search engines and if they suddenly see that half of our pages are gone, we may cause more negative effects despite the fact that we are in fact aiming to provide more value to the user, rather than less.B.Leave the product pages alone and try to write more valuable content for each product page, which will be difficult because there really isn’t that much more to say, or more valuable ways to say it. This is the “safe” option as it means that our negative potential impact is reduced but we won’t necessarily see much positive trending either. C.Test solution A on a small percentage of the product categories to see any impact over the next several months before making sitewide updates to the product pages if we see positive impact, or revert to the old way if we see negative impact.Any sound advice would be of incredible value at this point, as the work we are doing isn’t having the desired effects and we are seeing consistent dropping rankings at this point.Any information would be greatly appreciated. Thank you,
Technical SEO | | Ed-iOVA0 -
Should i index or noindex a contact page
Im wondering if i should noindex the contact page im doing SEO for a website just wondering if by noindexing the contact page would it help SEO or hurt SEO for that website
Technical SEO | | aronwp0 -
Should I delete a page or remove links on a penalized page?
Hello All, If I have a internal page that has low quality links point to it or a penality. Can I just remove the page, and start over versus trying to remove the links? Over time wouldn't this page disapear along with the penalty on that page? Kinda like pruning a tree? Cutting off the junk limbs so other could grow stronger, or to start new fresh ones. Example: www.domain.com Penalized Internal Page: (Say this page is penalized due to keyword stuffing, and has low quality links pointing to it like blog comments, or profiles) www.domain.com/penalized-internal-page.com Would it be effective to just delete this page (www.domain.com/penalized-internal-page.com) and start over with a new page. New Internal Page: www.domain.com/new-internal-page.com I would of course lose any good links point to that page, but it might be easier then trying to remove old back links. Thoughts? Thanks! Pete
Technical SEO | | Juratovic0 -
Page rank 2 for home page, 3 for service pages
Hey guys, I have noticed with one of our new sites, the home page is showing page rank two, whereas 2 of the internal service pages are showing as 3. I have checked with both open site explorer and yahoo back links and there are by far more links to the home page. All quality and relevant directory submissions and blog comments. The site is only 4 months old, I wonder if anyone can shed any light on the fact 2 of the lesser linked pages are showing higher PR? Thanks 🙂
Technical SEO | | Nextman0 -
301 redirecting some pages directly, and the rest to a single page
I've read through the Redirect guide here already but can't get this down in my .htaccess I want to redirect some pages specifically (/contactinfo.html to the new /contact.php) And I want all other pages (not all have equivalent pages on the new site) to redirect to my new (index.php) homepage. How can I set it up so that some specific pages redirect directly, and all others go to one page? I already have the specific oldpage.html -> newpage.php redirects in place, just need to figure out the broad one for everything else.
Technical SEO | | RyanWhitney150