Google indexing fewer URLs than contained in my sitemap.xml
-
My sitemap.xml contains 3821 URLs, but Google (Webmaster Tools) indexes only 1544 of them. What may be the cause? There is no technical problem. Why does Google index fewer URLs than my sitemap.xml contains?
-
Thank you for helping
-
Unless you have an SEO actively reviewing your site, it is quite normal for Google to index fewer pages than are offered in your sitemap.
How exactly was your sitemap created? Did you go through your site's 3821 pages by hand and add them to a sitemap? Or, more likely, did you use a tool to create the sitemap? If you used a tool, how well do you know how it works and how it is configured?
Here are a few examples of URLs which may be included in your sitemap but which Google would likely not index:
-
Your home page and other pages may have multiple URLs which lead to the same page. For example: www.mysite.com and www.mysite.com/index.html may be two URLs for the same page. Google will likely only index one of them.
-
You may have links to various URLs which contain parameters, which Google will reduce to a single URL. For example: www.mysite.com/?product_id=308&sort=asc&color=black, and another URL www.mysite.com/?product_id=308&sort=desc&color=black. Both URLs lead to the same content sorted differently.
-
You may have duplicate content on your site. For example, you can sell chairs and list the same chair under multiple paths such as /furniture/wood/chair123 and /furniture/dining-room/chair123. Google will recognize these two pages are the same content presented under multiple URLs.
-
You may have included pages in your sitemap which are blocked via robots.txt, carry the "noindex" tag, or are canonicalized to another page.
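The first two examples above can be spotted from the URLs alone. Here is a minimal sketch, not a definitive tool, that groups sitemap URLs which would likely collapse into one page. The names `PRESENTATION_PARAMS`, `DEFAULT_DOCUMENTS`, `normalize` and `group_duplicates` are made up for illustration, and which parameters and file names count as "the same page" is an assumption you would have to adjust for your own site.

```python
from collections import defaultdict
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption: these parameters only change presentation, not content.
PRESENTATION_PARAMS = {"sort"}
# Assumption: these file names are served for the directory URL itself.
DEFAULT_DOCUMENTS = ("index.html", "index.htm", "index.php")


def normalize(url):
    """Reduce a URL to a rough key for 'probably the same page'."""
    parts = urlsplit(url)
    path = parts.path or "/"
    # Treat /index.html and friends as the bare directory URL.
    for name in DEFAULT_DOCUMENTS:
        if path.endswith("/" + name):
            path = path[: -len(name)]
            break
    # Drop presentation-only parameters and sort the rest for a stable key.
    kept = sorted(
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in PRESENTATION_PARAMS
    )
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(), path, urlencode(kept), "")
    )


def group_duplicates(urls):
    """Return only the groups where several URLs share one normalized key."""
    groups = defaultdict(list)
    for url in urls:
        groups[normalize(url)].append(url)
    return {key: members for key, members in groups.items() if len(members) > 1}
```

Feeding it the list of URLs from your sitemap shows which entries are probably competing for one slot in the index. It cannot catch the chair example, where the same content sits under two different paths; for that you would need to compare page content or canonical tags.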
To better understand the root issue, you need to examine a list of all URLs in your sitemap and compare it to a list of all indexed URLs. Determine which URLs Google has not indexed and research the reason for each one independently.
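That comparison can be sketched in a few lines. This is only an illustration under assumptions: it assumes you already have the indexed URLs as a plain list (how you obtain that list is up to you; nothing here queries Google), and the function names `sitemap_urls`, `not_indexed` and `blocked_by_robots` are hypothetical. It uses only the Python standard library.

```python
import xml.etree.ElementTree as ET
from urllib.robotparser import RobotFileParser

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(xml_text):
    """Extract every <loc> value from a sitemap.xml document."""
    root = ET.fromstring(xml_text)
    return {
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    }


def not_indexed(sitemap, indexed):
    """URLs that are in the sitemap but missing from the indexed list."""
    return sorted(set(sitemap) - set(indexed))


def blocked_by_robots(robots_txt, urls, user_agent="Googlebot"):
    """Subset of urls that the given robots.txt text disallows."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(user_agent, url)]
```

A typical run would read sitemap.xml and robots.txt from disk, call `not_indexed(sitemap_urls(xml_text), indexed_list)`, and then pass the result to `blocked_by_robots` to rule out one possible cause. The URLs that remain still need to be checked by hand for "noindex" tags, canonical tags and duplicate content.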
-
-
Are they index-worthy?
Having them in your sitemap does not mean Google wants them in its index.
-
He just said it. Is this a new domain? I'm in the same boat as you for some of my domains.
-
Yes, I understand this. But in my situation Google first indexed all the URLs in the sitemap.xml I uploaded in Google Webmaster Tools. Now Google indexes fewer URLs, only 50%. What can be the cause if there are no technical problems?
-
Hi!
Google will only spend 'so much time' on any new domain. The more traffic, links and page authority you get, the more time Google will dedicate to crawling your website. You should also make sure that the site is not slow, as this will reduce the crawling speed even more! See Google PageSpeed for tips on speeding up the load time of your site.
Good Luck,
Sven Witteveen
Expand Online