Google indexing less url's then containded in my sitemap.xml
-
My sitemap.xml contains 3821 urls but Google (webmaster tools) indexes only 1544 urls. What may be the cause? There is no technical problem. Why does Google index less URLs then contained in my sitemap.xml?
-
Thank you for helping
-
Unless you have a SEO actively reviewing your site, it is quite normal for Google to index less pages then are offered in your sitemap.
How exactly was your sitemap created? Did you go by hand through your site's 3281 pages and add them to a sitemap? Or more likely, did you use a tool to create the sitemap? If you used a tool, how much knowledge do you have regarding how this tool works or its settings?
Just a few examples of URLs which may be included in your sitemap that Google would likely not index:
-
Your home page and other pages may have multiple URLs which lead to the same page. For example: www.mysite.com and www.mysite.com/index.html may be two URLs for the same page. Google will likely only index one of them.
-
You may have links to various URLs which contain parameters which Google will reduce to a single URL. For example: www.mysite.com/product_id=308&sort=asc&color=black, and another URL www.mysite.com/product_id=308&sort=desc&color=black. Both URLs lead to the same content sorted differently.
-
You may have duplicate content on your site. For example, you can sell chairs and list the same chair under multiple paths such as /furniture/wood/chair123 and /furniture/dining-room/chair123. Google will recognize these two pages are the same content presented under multiple URLs.
-
You may have submitted pages to your sitemap which are blocked via robots.txt or the "noindex" tag or are canonicalized to another page.
In order to better understand the root issue you need to examine a list of all URLs in your sitemap and compare that to a list of all indexed URLs. Determine which URLs Google has not indexed and research the reason for each one independently.
-
-
Are they index worthy?
Having them on your sitemap does not mean google wants them in its index
-
He just said it. Is this a new domain? Im in the same boat as you for some of my domains.
-
Yes, I understand this. But
In this situation Google first indexes all the URL's within my sitemap.xml uploaded in Google Webmaster tools. Now Google indexes less URL's, only 50%. What can be the cause if there are no technical problems?
-
Hi!
Google will only spend 'so much time' on any new domain. The more traffic and links and page authority you get, the more time Google will dedicate to crawling your website. You should also make sure that the site is not slow, as this will reduce the crawling speed even more! See Google page speed for tips on speeding up the load time of your site
Good Luck,
Sven Witteveen
Expand Online
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
How to check if an individual page is indexed by Google?
So my understanding is that you can use site: [page url without http] to check if a page is indexed by Google, is this 100% reliable though? Just recently Ive worked on a few pages that have not shown up when Ive checked them using site: but they do show up when using info: and also show their cached versions, also the rest of the site and pages above it (the url I was checking was quite deep) are indexed just fine. What does this mean? thank you p.s I do not have WMT or GA access for these sites
Technical SEO | | linklander0 -
What was the Google 'update' on 31st March?
Hi all. I looked back and saw that there was an update shown in 'Search Analytics' in Webmaster Tools a few weeks before the Mobile algorithm update. Not been able to find any mention of it and what it did so thought I'd check in here. ps. Also, this is a 90 day stretch and shows that our rankings have taken a hit since the mobile algorithm update. Interesting stuff (see image below) 4rJMU9e.jpg?1
Technical SEO | | RobFD0 -
WWW to Non-WWW = Less Indexing?
Hi all, About 10 months ago we changed all of our urls to redirect to a non-www vs. the www because it was creating both iterations and therefore duplicate content. We didn't change anything in Webmaster Tools and it looks like our indexing went down significantly. Is this a problem? How can I fix it? *It looks like GWT also went through an update at that time? klxJ7gl
Technical SEO | | Becky_Converge0 -
Pages to be indexed in Google
Hi, We have 70K posts in our site but Google has scanned 500K pages and these extra pages are category pages or User profile pages. Each category has a page and each user has a page. When we have 90K users so Google has indexed 90K pages of users alone. My question is. Should we leave it as they are or should we block them from being indexed? As we get unwanted landings to the pages and huge bounce rate. If we need to remove what needs to be done? Robots block or Noindex/Nofollow Regards
Technical SEO | | mtthompsons0 -
My sitemap in Google is coming back with an error
I submitted my xml sitemap to Google Webmaster tools. It is giving an error, not found. 404 Error. But I can't figure out why my site map is signaling a 404. Why? 😞
Technical SEO | | cschwartzel0 -
'No Follow' and 'Do Follow' links when using WordPress plugins
Hi all I hope someone can help me out with the following question in regards to 'no follow' and 'do follow' links in combination with WordPress plugins. Some plugins that deal with links i.e. link masking or SEO plugins do give you the option to 'not follow' links. Can someone speak from experience that this does actually work?? It's really quite stupid, but only occurred to me that when using the FireFox add on 'NoDoFollow' as well as looking at the SEOmoz link profile of course, 95% of my links are actually marked as FOLLOW, while the opposite should be the case. For example I mark about 90% of outgoing links as no follow within a link masking plugin. Well, why would WordPress plugins give you the option to mark links as no follow in the first place when they do in fact appear as follow for search engines and SEOmoz? Is this a WordPress thing or whatnot? Maybe they are in fact no follow, and the information supplied by SEO tools comes from the basic HTML structure analysis. I don't know... This really got me worried. Hope someone can shed a light. All the best and many thanks for your answers!
Technical SEO | | Hermski0 -
What's the best canonicalization method?
Hi there - is there a canonicalization method that is better than others? Our developers have used the
Technical SEO | | GBC0 -
Crawl Tool Producing Random URL's
For some reason SEOmoz's crawl tool is returning duplicate content URL's that don't exist on my website. It is returning pages like "mydomain.com/pages/pages/pages/pages/pages/pricing" Nothing like that exists as a URL on my website. Has anyone experienced something similar to this, know what's causing it, or know how I can fix it?
Technical SEO | | MyNet0