Webmaster Tools Indexed pages vs. Sitemap?
-
I'm looking at Google Webmaster Tools and noticing a few things. On most sites I look at, the number of indexed pages in the sitemaps report is usually less than 100% (e.g. 122 indexed out of 134 submitted), and the number of indexed pages in the Index Status report is usually higher. For example, one site shows over 1,000 pages indexed in the Index Status report, but the sitemap report says something like 122 indexed.
My question: Is the sitemap report always a subset of the URLs submitted in the sitemap? Will the number of pages indexed there always be less than or equal to the number of URLs referenced in the sitemap?
Also, if there is a big disparity between the URLs submitted in the sitemap and the indexed URLs (like 10x), is that concerning to anyone else?
-
Is there a reliable way to determine which pages have not been indexed?
-
Unfortunately not. The closest you'll get is selecting a long period of time in Analytics and then exporting all the pages that received organic search traffic. If you then cross-check those against the list of URLs on your site, you'll be left with a short list of candidates, but I would still check each of them in Google to make sure they aren't indexed. As I said, it's not the best way.
-
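Tom's cross-check can be sketched in a few lines of Python. This is only an illustration: the inline sitemap and the set of organic landing pages are made-up examples standing in for your real sitemap.xml and your Analytics export.

```python
import xml.etree.ElementTree as ET

# Hypothetical example sitemap; in practice you would read your real
# sitemap.xml and your Analytics landing-page export instead.
SITEMAP_XML = """<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/about</loc></url>
  <url><loc>https://example.com/contact</loc></url>
</urlset>"""

NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

def sitemap_urls(xml_text):
    """Extract all <loc> URLs from a sitemap document."""
    root = ET.fromstring(xml_text)
    return {loc.text.strip() for loc in root.findall(".//sm:loc", NS)}

# Landing pages that received organic search traffic (from an Analytics export).
organic_landing_pages = {"https://example.com/", "https://example.com/about"}

# Submitted URLs never seen in organic search: candidates for "not indexed".
# Still worth spot-checking each one in Google directly, as noted above.
submitted = sitemap_urls(SITEMAP_XML)
maybe_unindexed = sorted(submitted - organic_landing_pages)
print(maybe_unindexed)
```

Pages that simply get no search traffic will also land on this list even if they are indexed, which is why the manual spot-check in Google is still needed.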
Great answer by Tom already, but I want to add that images and other types of content, which are mostly not included in sitemaps by default, could also be among the indexed 'pages'.
-
There's no golden rule that your sitemap > indexed pages or vice versa.
If you have more URLs in your sitemap than you have indexed pages, you want to look at the pages that aren't indexed to see why that is the case. It could be that those pages have duplicate and/or thin content, so Google is ignoring them. A canonical tag might be instructing Google to ignore them. Or the pages might be outside the site navigation and more than four links/jumps away from the homepage or any other page on the site, making them hard to find.
Conversely, if you have lots more pages indexed than in your sitemap, it could be a navigation or URL duplication problem. Check whether any of the indexed pages are duplicate versions caused by things like dynamic URLs generated through on-site search or the site navigation. If the pages in your sitemap are the only physical pages you have created, and you know every single one has been submitted, then any other indexed URLs are unaccounted for. That may well be cause for concern, so check that nothing is being indexed multiple times.
Just a couple of scenarios, but I hope it helps.
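The duplicate-URL check described above can be sketched by grouping URLs under a normalised form, so variants created by dynamic parameters collapse together. This is a rough illustration: the list of "noise" parameters and the example URLs are assumptions you would adjust for your own site.

```python
from collections import defaultdict
from urllib.parse import urlsplit, parse_qsl, urlunsplit, urlencode

# Query parameters assumed to create duplicate URL variants
# (on-site search, sorting, tracking tags) -- adjust for your site.
NOISE_PARAMS = {"q", "sort", "page", "utm_source", "utm_medium"}

def normalise(url):
    """Strip noise parameters and trailing slashes to get a canonical-ish form."""
    parts = urlsplit(url)
    kept = [(k, v) for k, v in parse_qsl(parts.query) if k not in NOISE_PARAMS]
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path.rstrip("/"), urlencode(kept), "")
    )

# Hypothetical list of indexed URLs (e.g. from a crawl or a site: export).
indexed = [
    "https://example.com/shoes",
    "https://example.com/shoes/?sort=price",
    "https://example.com/shoes?utm_source=news",
]

groups = defaultdict(list)
for url in indexed:
    groups[normalise(url)].append(url)

# Any group with more than one member is a candidate duplicate cluster.
duplicates = {k: v for k, v in groups.items() if len(v) > 1}
```

Each cluster in `duplicates` points at one physical page being indexed under several URLs, which is exactly the "unaccounted for" inflation described above.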
Related Questions
-
Does the new Google Analytics Search Console Beta tool use API to pull more data?
So my client has been asking for definitive proof of why the search query data provided in the Google Analytics Search Console report does not exactly match the data presented directly in Search Console itself. The simple answer is that Search Console is limited to 1000 rows of data. However, our client is requesting a Google article/documentation of why the new Search Console beta tool has no row limit (hence much more data for a big website). I know that the Google Search Console API was available before Google announced the new Search Console beta tool in Google Analytics, and that this API could pull in more data than the 1000-row limit. However, is there any article available (preferably from Google) confirming that Google Analytics is pulling this Search Console data via the API? Thanks!
Reporting & Analytics | | RosemaryB0 -
How to set goal in Google Analytics that required specific page
So our company has a new page that has just been implemented (let's say "page x", not a landing page) and we want to see how many visitors convert into the goal (let's say "page y") by going through "page x". If I just set the goal destination to "/page y", the goal number that appears counts ALL the visitors who reach "page y" (whether or not they went through "page x"). So how do I set up the goal to only count visitors who reach "page y" through "page x"? Thank you
Reporting & Analytics | | ddspg0 -
Google is not indexing all URLs
My website has company and event profiles from 200 countries, so it has lots of URLs. Back in August 2014, Google used to crawl 90% of the URLs we submitted. Things went wrong when we shifted from http to https: we lost traffic. We are gaining it back slowly, but the main concern is that Google still has not indexed all submitted URLs. It has crawled merely 8% of all URLs submitted. The site address is businessvibes.com. Any help would be appreciated.
Reporting & Analytics | | irteam0 -
Submitting an 'HTTPS' sitemap.xml to Bing
I have been trying to submit my sitemap to Bing [via their Webmaster Tools] for well over a week and it continues to report 'pending'. My site is HTTPS and the sitemap is accepted by Google. I questioned Bing about this and got this response: "To set your expectations, our Sitemap fetchers use a different pipeline and because of this, we cannot crawl Sitemaps in HTTPS format. We require that you submit an HTTP version of sitemap in order for Bing to properly crawl the file. Please go ahead and delete the current Sitemap and resubmit a new one in HTTP." Currently I don't and can't have an HTTP version of my site and sitemap, and my developers are telling me that 3 hrs' worth of dev time will go into coming up with a work-around, which I'm not sure I want to invest in [I have more important things to concentrate my spend on!]. Has anyone been faced with this problem? Is there any quick/cheap alternative, or do I just accept that Bing won't crawl my site until they update their end?
Reporting & Analytics | | cityxplora.com0 -
Verifying Site Ownership & Setting Up Webmaster tools for clients who use Hubspot
We are a Hubspot partner agency. I'm trying to find the best route for managing Google's tools as an extra resource for insight, not the primary basis for marketing effort. I also want to explore AdWords in more depth. I'm finding a lot of our clients don't have Analytics, Webmaster Tools, or either in place. Can I verify site ownership to set up Webmaster Tools simply by having admin access to their Analytics account, or does that require ownership of the Analytics account? With Google merging things together these days, I'm not sure of the best approach to take. Usually clients have their site hosted somewhere, built on some platform, and then ADD a Hubspot blog along with the landing pages/CTAs and Hubspot tools on a subdomain hosted by Hubspot. Hubspot has tools in its website settings for adding Google Analytics (actually it's just a field to add code to the header area). If a client has Universal Analytics on their primary domain, do I still need to add a separate Analytics property for the subdomain and go through Hubspot's tools to install it there? Or just use the same code from their primary domain and add it to the Hubspot header? What is the best route? Any additional thoughts on this subject are welcome. With so much updating and changing coming from Google (and Hubspot as we implement 3.0 - COS), I'm trying to avoid wasted effort, outdated methods, etc. Thanks!
Reporting & Analytics | | rhgraves651 -
Have You Ever Gotten False or Odd Notifications from Google Webmaster Tools?
I just got an email for one of my client websites from Google Webmaster Tools letting me know that the Googlebot could not access the website. The notification indicated that there may be a robots.txt issue that prevented crawling, but that they would try to crawl it again. In this situation, we had just created a new Webmaster Tools account for the client, and there are now two GWT accounts for the website: one that is incorrectly set to http://clientwebsite.com and the other that is set to www.clientwebsite.com. One of them reports zero errors, and the other reports an odd crawl error. Has anyone encountered this type of thing before?
Reporting & Analytics | | williammarlow0 -
Why did my home page fall off of google rankings?
My home page at www.smt-associates.com has been ranked well for various keyword phrases for years. I've tried to optimize it for the search "Crystal Lake CPA Firm" and it always ranked 1-2. Now it doesn't even rank in the top 5 pages (actually, I don't know which page it falls on). I did an on-page report card and it has an A rating. So what is preventing Google from ranking my home page on page 1? There's not that much competition, so this should be an easy ranking for me. I don't know how long it has not been listed, but I did modify my site about 12-18 months ago with a new WP theme. Could the theme be the problem?
Reporting & Analytics | | smtcpa0 -
Increase number of pages crawled
Only one page is being crawled, how do I increase the number to include most of our site?
Reporting & Analytics | | NorthCoast0