Substantial difference between Number of Indexed Pages and Sitemap Pages
-
Hey there,
I am doing a website audit at the moment.
I've noticed substantial differences between the number of pages indexed (Search Console), the number of pages in the sitemap, and the number I get when I crawl the site with Screaming Frog (see below). Would those discrepancies concern you? The website and its rankings seem fine otherwise.
Total indexed: 2,360 (Search Console)
About 2,920 results (Google search "site:example.com")
Sitemap: 1,229 URLs
Screaming Frog Spider: 1,352 URLs
Cheers,
Jochen
-
Those discrepancies would not concern me, but there are some differences between all the things you list:
Total indexed: 2,360 Search Console - this is likely a reasonably accurate list of the number of pages you have indexed in Google. You could use a tool like URL Profiler to check index status of specific URLs.
About 2,920 results Google search "site:example.com" - site: search is less accurate and will likely return a different number each time you do it, even if it's just moments apart.
Sitemap: 1,229 URLs - these are URLs you added to a sitemap because they are priority pages you want to make sure Google has indexed and hopefully ranked. You control this number.
Screaming Frog Spider: 1,352 URLs - Screaming Frog is going to start on your homepage and crawl the site, attempting to discover as many URLs as possible. If you are not linking to a page, SF won't be able to crawl it. Google, on the other hand, may have old pages, old URL structures, or pages that were linked from an external website in its index, and it won't forget them.
A really important question is: how many pages do you have that you want to be indexed? Is Google's index bloated with pages that you want to keep out? Figure these things out, and then try to adjust your sitemaps, noindex, robots.txt as needed.
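One way to start figuring this out is to compare the sitemap against a crawl export and see which URLs appear in only one of them. Below is a minimal Python sketch of that idea, not a definitive tool. It assumes a standard sitemaps.org XML sitemap and that you already have the crawled URLs as a plain list (for example, exported from your crawler); the function names `sitemap_urls` and `compare_url_sets` and the sample URLs are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace used by standard sitemaps.org XML sitemaps.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(xml_text):
    """Return the set of <loc> values found in a sitemap XML string."""
    root = ET.fromstring(xml_text)
    return {
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    }


def compare_url_sets(in_sitemap, crawled):
    """Split URLs into sitemap-only, crawl-only, and shared groups."""
    return {
        "only_in_sitemap": sorted(in_sitemap - crawled),
        "only_in_crawl": sorted(crawled - in_sitemap),
        "in_both": sorted(in_sitemap & crawled),
    }


# Hypothetical sample data, for illustration only.
sample_sitemap = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/about</loc></url>
</urlset>"""

sample_crawl = {"https://example.com/", "https://example.com/old-page"}

report = compare_url_sets(sitemap_urls(sample_sitemap), sample_crawl)
for group, urls in report.items():
    print(group, len(urls), urls)
```

URLs that show up only in the crawl are candidates to add to the sitemap or to noindex; URLs that show up only in the sitemap may be pages you are not linking to internally.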
-
Thanks for your reply, Dmitrii.
We have excluded all query parameters in Search Console, so this shouldn't be an issue. What is also strange is that when I try to scrape the SERPs via a site:example.com search, Google only shows a fraction (about 700) of the 2,920 results.
Cheers,
Jochen
-
Hi there.
I think that as long as rankings are good (especially historically), there is no reason to worry, because Google indexes pages that wouldn't be in the sitemap, for example pages generated with query parameters (domain.com?x=value). Sometimes these pages do not really exist by themselves (like filters in online stores); they only exist "on the fly".
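To get a rough idea of how many URLs in a list are just parameter variants of the same page, you could strip the query string and group by what is left. This is only a sketch using Python's standard `urllib.parse`; the function names `strip_query` and `group_by_base` and the sample URLs are made up for illustration, and it assumes you already have a list of URLs from somewhere (a crawl export, for instance).

```python
from collections import defaultdict
from urllib.parse import urlsplit, urlunsplit


def strip_query(url):
    """Return the URL without its query string or fragment."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


def group_by_base(urls):
    """Group URLs by their base (query-free) form."""
    groups = defaultdict(list)
    for url in urls:
        groups[strip_query(url)].append(url)
    return dict(groups)


# Hypothetical sample URLs, for illustration only.
sample_urls = [
    "https://domain.com/shop",
    "https://domain.com/shop?x=value",
    "https://domain.com/shop?x=other&sort=price",
    "https://domain.com/about",
]

for base, variants in group_by_base(sample_urls).items():
    print(base, len(variants))
```

A base URL with many variants is a hint that filters or other parameters are producing extra indexable URLs.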
Hope this makes sense and helps