Webmaster Tools Indexed pages vs. Sitemap?
-
I'm looking at Google Webmaster Tools and noticing a few things. On most sites I look at, the number of indexed pages in the Sitemaps report is less than 100% of what was submitted (e.g. 122 indexed out of 134 submitted), while the number of indexed pages in the Index Status report is usually higher. For example, one site shows over 1,000 pages indexed in the Index Status report, but the Sitemaps report shows only about 122 indexed.
My question: Is the indexed count in the Sitemaps report always a subset of the URLs submitted in the sitemap? In other words, will the number of pages indexed there always be lower than or equal to the number of URLs referenced in the sitemap?
Also, if there is a big disparity between the number of URLs submitted in the sitemap and the number of indexed URLs (like 10x), is that concerning to anyone else?
-
Unfortunately not. The closest you'll get is to select a long date range in Analytics and export all the pages that received organic search traffic. If you then cross-check those against the full list of URLs on your site, you're left with a short list of pages that may not be indexed. I would still check each of those in Google to confirm whether they're indexed. As I said, it's not the best way.
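To make the cross-check concrete, here is a minimal Python sketch. It assumes you already have two lists: the full URLs of your site and the landing pages exported from Analytics (usually bare paths). The function names and the example.com URLs are made up for illustration, and the result is only a list of candidates to check by hand in Google, not proof that a page is missing from the index.

```python
from urllib.parse import urlsplit


def path_of(url_or_path: str) -> str:
    """Reduce a full URL or a bare path to a comparable path (no query, no trailing slash)."""
    path = urlsplit(url_or_path.strip()).path or "/"
    return path.rstrip("/") or "/"


def pages_without_organic_traffic(site_urls, organic_landing_pages):
    """Return the site URLs whose path never appears in the organic traffic export."""
    seen = {path_of(page) for page in organic_landing_pages}
    return sorted(url for url in site_urls if path_of(url) not in seen)


if __name__ == "__main__":
    # Hypothetical data: your own URL list and an Analytics landing-page export.
    site_urls = [
        "https://example.com/",
        "https://example.com/about",
        "https://example.com/old-page/",
    ]
    organic_landing_pages = ["/", "/about/"]
    for url in pages_without_organic_traffic(site_urls, organic_landing_pages):
        print(url)
```

A page can be indexed and still receive no organic visits in the period, which is why the leftover list still needs a manual check.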
-
Is there a reliable way to determine which pages have not been indexed?
-
Great answer by Tom already, but I want to add that images and other types of content, which are mostly not included in sitemaps by default, could also be among the indexed 'pages'.
-
There's no golden rule that the number of URLs in your sitemap should be greater than the number of indexed pages, or vice versa.
If you have more URLs in your sitemap than you have indexed pages, you want to look at the pages that are not indexed to see why that is the case. It could be that those pages have duplicate and/or thin content, and so Google is ignoring them. A canonical tag might be instructing Google to ignore them. Or the pages might be missing from the site navigation and sit more than 4 links/jumps away from the homepage or another page on the site, making them hard to find.
Conversely, if you have a lot more pages indexed than you have in your sitemap, it could be a navigation or URL duplication problem. Check whether any of the indexed pages are duplicate versions caused by things like dynamic URLs generated through on-site search or the site navigation. If the pages in your sitemap are the only pages you have created, and you know every single one has been submitted, then any other indexed URLs are unaccounted for. That may well be cause for concern, so check that nothing is being indexed multiple times.
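The "more than 4 links away" check can be sketched as a breadth-first search over your internal links. This is only an illustration: it assumes you already have a link map (page to the pages it links to) from a crawler of your choice, and the names `click_depths`, `hard_to_reach` and the sample paths are hypothetical.

```python
from collections import deque


def click_depths(links, home):
    """Breadth-first search: fewest clicks needed to reach each page from the homepage."""
    depths = {home: 0}
    queue = deque([home])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


def hard_to_reach(links, home, sitemap_urls, max_depth=4):
    """Sitemap URLs deeper than max_depth clicks, or not reachable by links at all."""
    depths = click_depths(links, home)
    return sorted(u for u in sitemap_urls if depths.get(u, max_depth + 1) > max_depth)


if __name__ == "__main__":
    # Hypothetical link map, e.g. exported from a site crawler.
    links = {"/": ["/a"], "/a": ["/b"], "/b": ["/c"], "/c": ["/d"], "/d": ["/e"]}
    print(hard_to_reach(links, "/", ["/a", "/d", "/e", "/orphan"]))
```

Pages that come back from `hard_to_reach` are worth linking from somewhere closer to the homepage before assuming a content problem.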
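One rough way to spot duplicates caused by dynamic URLs is to group a list of indexed URLs by their address without the query string. The Python sketch below assumes you have such a list from somewhere (the example.com URLs are made up), and stripping the whole query string is just a heuristic: on sites where a parameter selects genuinely different content, the variants are not duplicates.

```python
from collections import defaultdict
from urllib.parse import urlsplit, urlunsplit


def group_url_variants(urls):
    """Group URLs that differ only by query string or fragment; keep groups with more than one URL."""
    groups = defaultdict(list)
    for url in urls:
        parts = urlsplit(url)
        # Rebuild the URL without query and fragment to use as the grouping key.
        base = urlunsplit((parts.scheme, parts.netloc.lower(), parts.path, "", ""))
        groups[base].append(url)
    return {base: variants for base, variants in groups.items() if len(variants) > 1}


if __name__ == "__main__":
    # Hypothetical list of indexed URLs.
    indexed = [
        "http://example.com/shoes",
        "http://example.com/shoes?sort=price",
        "http://example.com/shoes?sessionid=abc",
        "http://example.com/about",
    ]
    for base, variants in group_url_variants(indexed).items():
        print(base, len(variants))
```

Any base URL with several variants is a candidate for a canonical tag or parameter handling.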
Just a couple of scenarios, but I hope it helps.