Sitemaps and Indexed Pages
-
Hi guys,
I created an XML sitemap and submitted it for my client last month.
Now the developer of the site has also been messing around with a few things.
I've noticed on my Moz site crawl that indexed pages have dropped significantly.
Before I put my foot in it, I need to figure out whether submitting the sitemap has caused this. Can a sitemap reduce the number of pages indexed?
Thanks
David.
-
Sorry - I missed the part about you looking specifically at the Moz crawler. While useful, it's a stand-in for what will actually be used for rankings - namely the actual crawls by the search engine crawlers themselves. I'd go straight to the source for that info if you're concerned there's an issue, rather than trusting just Mozbot. You can find the search engine crawlers' data in Google Search Console and Bing Webmaster Tools. Look for trends and patterns there, especially around the sitemap report.
The challenge with a Screaming Frog-generated sitemap is that it can only find what's linked. If the site has orphaned pages or an ineffective internal linking scheme, a crawl could easily miss pages. It's certainly better than no sitemap, but a map generated by the site's technology itself (usually the database) is safer.
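To make that concrete, here's a rough sketch of the idea, not a definitive implementation. It assumes you can export the full list of page URLs from the site's database and the list of URLs a crawl actually reached. The function names (`build_sitemap`, `sitemap_urls`, `orphaned_pages`) are made up for illustration and the example.com URLs are placeholders.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return sitemap XML (as a string) for an iterable of absolute URLs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in sorted(set(urls)):  # de-duplicate and keep a stable order
        loc = ET.SubElement(ET.SubElement(urlset, "url"), "loc")
        loc.text = url
    return ET.tostring(urlset, encoding="unicode")


def sitemap_urls(xml_text):
    """Extract the <loc> values from sitemap XML as a set."""
    root = ET.fromstring(xml_text)
    return {loc.text.strip() for loc in root.iter(f"{{{SITEMAP_NS}}}loc")}


def orphaned_pages(database_urls, crawled_urls):
    """URLs the site knows about that a link-following crawl never reached."""
    return sorted(set(database_urls) - set(crawled_urls))


if __name__ == "__main__":
    # Hypothetical data: what the database holds vs. what a crawl found.
    database_urls = [
        "https://example.com/",
        "https://example.com/products",
        "https://example.com/old-landing-page",
    ]
    crawled_urls = ["https://example.com/", "https://example.com/products"]

    print(build_sitemap(database_urls))
    print(orphaned_pages(database_urls, crawled_urls))
```

Anything `orphaned_pages` returns is a page that a crawl-built sitemap would leave out, which is why the database-driven list is the safer starting point.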
P.
-
Thanks Paul,
Yes, there has been a big cleanup of pages. There were over 80,000 to begin with. I managed to get that down to about 14k, but then last month Mozbot only crawled about 4,000 pages.
I was just a bit worried that the sitemap generated by Screaming Frog was incorrect and that this was the reason for the drop.
I was referring mainly to the Moz site crawl. I guess I was worried that Mozbot only followed the sitemap!
There were loads of filter URLs and all sorts going on, so it's a bit of a spider's web!
-
No - submitting a sitemap won't reduce the crawl of a site. The search engines will crawl the sitemap and add these pages to the index if they consider them worthy. But they'll still also crawl any other links/pages they can find in other ways and index those as well if they consider them worthy.
Note though - having the number of indexed pages drop is not necessarily a bad thing. If removing a large number of worthless/duplicate/canonicalised/no-indexed pages cleans up the site, that will also be reflected in fewer crawled pages - an indication that quality improvement work was effective.
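If you want to check whether the drop is the expected kind, one way is to compare a crawl export from before the cleanup with one from after. The sketch below is only an illustration under a few assumptions: you have both URL lists plus a list of pages you removed deliberately, and a "filter URL" is taken to mean any URL with a query string, which may not match how this particular site builds its filters. `explain_drop` is a hypothetical helper name.

```python
from urllib.parse import urlsplit


def explain_drop(before, after, removed_on_purpose):
    """Group URLs that vanished between two crawls by likely cause."""
    before, after = set(before), set(after)
    dropped = before - after

    # Pages that were deliberately removed, redirected or no-indexed.
    expected = dropped & set(removed_on_purpose)

    # Assumption: a URL with a query string is a filter/facet URL.
    filter_urls = {u for u in dropped - expected if urlsplit(u).query}

    # Whatever is left deserves a closer look.
    unexplained = dropped - expected - filter_urls

    return {
        "expected": sorted(expected),
        "filter_urls": sorted(filter_urls),
        "unexplained": sorted(unexplained),
    }


if __name__ == "__main__":
    # Hypothetical crawl exports; the URLs are placeholders.
    before = [
        "https://example.com/",
        "https://example.com/old-page",
        "https://example.com/shoes?colour=red",
        "https://example.com/about",
    ]
    after = ["https://example.com/"]
    removed = ["https://example.com/old-page"]

    for reason, urls in explain_drop(before, after, removed).items():
        print(reason, len(urls), urls)
```

If nearly everything lands in the first two groups, the lower count is most likely the cleanup doing its job. A large "unexplained" group is the signal to dig into the search engines' own reports.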
That help?
Paul