Moz Crawl shows over 100 times more pages than my site has?
-
The latest crawl stats are attached. My site has just over 300 pages.
Wondering what I have done wrong?
-
You are right, Keri: the total page count is higher, but it is still only 581.
-
I believe this image shows what's indexed, which is a subset of the sitemap you submitted. You may want to look at Google Index -> Index Status in GWT to see what it shows there.
-
latest Moz crawl
-
latest webmaster tools crawl
-
I will definitely be paying attention to those numbers, Keri. Webmaster Tools is showing the right number of pages (something over 300, with 90% of those indexed).
-
It's not going to be a penalty, but it'll be good to have a bit less of a load on your server (bots no longer crawling thousands of pages) and just have your real pages in the index.
Places to look for interesting changes in site metrics would be your organic traffic in analytics and taking a look at your Google Webmaster Tools account to see your impressions, pages crawled, etc.
-
Thanks Keri, I will update ASAP.
Could you let me know how big an issue this would be? (When you have the time, of course ;))
-
You're welcome! I may have opened a can of worms, however. That sitemap is generated by an automated tool (based on the footer at the bottom), so somehow it's finding that page 28 as well.
You may also want to ask the developer whether you should be indexing the categories in the blog archives. There are resources on Moz about the best way to set that up in WordPress, but I don't have them at my fingertips at the moment (I have a snuggly baby sleeping on my lap instead that's slowing me down a tad).
To answer your next question, after you figure out where the page 28 is being linked from and cure that, yes, you can do a one-time crawl from Research Tools. It won't overwrite your campaign info, but you can at least see if Moz is seeing thousands of pages or just a few hundred to see if stuff was fixed. Again, happy to provide more detail if/when you need it (and others will likely jump in with help on the thread, too).
I'd love to also see a little update a few weeks down the line of any changes you've noticed on your site metrics after getting this fixed.
-
You rock:)
-
And I found it. The sitemap at http://www.nineclouds.ca/sitemap includes a page /28, which is where the crawlers are finding the non-existent pages.
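A rough way to spot entries like that /28 is to pull every link out of the sitemap page and flag the ones whose last path segment is only digits. This is a minimal sketch, assuming the sitemap is an HTML page of links; the sample markup and the function names are made up for illustration and are not taken from the actual site.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def numeric_only_links(html):
    """Return links whose final path segment is purely numeric (e.g. /28)."""
    parser = LinkCollector()
    parser.feed(html)
    flagged = []
    for href in parser.links:
        last_segment = urlparse(href).path.rstrip("/").rsplit("/", 1)[-1]
        if last_segment.isdigit():
            flagged.append(href)
    return flagged


# Hypothetical sitemap fragment, not the real page.
sample = (
    '<ul><li><a href="/about">About</a></li>'
    '<li><a href="/28">28</a></li>'
    '<li><a href="/blog/my-post">Post</a></li></ul>'
)
print(numeric_only_links(sample))
```

Anything this flags still needs a manual look, since a legitimate paginated URL would also end in a number.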
-
If you look at http://www.nineclouds.ca/blog/page/23, you'll see a double arrow in the pagination at the right that goes to page 24, even though the last real page is page 21. Google has somehow found pages beyond 21 (I'm not sure how), and once a crawler lands on one of those, it keeps seeing the double-arrow link to the next page and follows it. The same happened with Rogerbot. I'm not sure where the bad originating link is (that is, which legitimate page on your site links to something past page 21), but that's the loop that's happening, and it's causing a ton of pages to be indexed. Get rid of those, and you'll also get rid of most of your errors.
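To see how far the loop has run, one option is to take a list of crawled URLs (for example, from a crawl export) and pick out the pagination pages numbered past the last real one. This is only a sketch: the `/blog/page/N` pattern matches the URL mentioned above, but the function name and the sample list are assumptions for illustration.

```python
import re

# Assumed pattern for paginated blog archive URLs such as /blog/page/23.
PAGE_RE = re.compile(r"/blog/page/(\d+)/?$")


def pages_past_end(urls, last_real_page):
    """Return pagination URLs whose page number exceeds the last real page."""
    flagged = []
    for url in urls:
        match = PAGE_RE.search(url)
        if match and int(match.group(1)) > last_real_page:
            flagged.append(url)
    return flagged


# Illustrative sample, not actual crawl data.
crawled = [
    "http://www.nineclouds.ca/blog/page/20",
    "http://www.nineclouds.ca/blog/page/21",
    "http://www.nineclouds.ca/blog/page/23",
    "http://www.nineclouds.ca/blog/page/24",
]
print(pages_past_end(crawled, last_real_page=21))
```

If the flagged list runs into the thousands, that points to the pagination loop rather than to real content.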
-
Not shy about that at all, thanks Keri.
Any help you can provide is greatly appreciated.
-
Hi Bill,
Using my admin powers, I took a peek at your account. I'm still trying to figure out where it's coming from, but you have thousands of empty pages of your blog indexed. I'll dig around a little more and see if I can figure out what's up.
If you're comfortable with sharing your URL here in a public forum, other people can come take a look too. Otherwise, I'm happy to send you a private message with part of what's up and give your developer a place to start looking.
-
Thanks Keri. I am the owner of the site, not the programmer, so I am looking up the terms you are using as I write this response. If I am using pagination, is there a way to keep Moz from crawling it? If I understand your question about the calendar correctly, I do have one as part of my blog that dates each post. Can I get the bot to not recognize this calendar?
-
My first guess would be that parameters or something similar are being crawled. Do you have pagination? Sorting ascending and descending? A calendar that's getting crawled through the year 2525?
Your next step would be to look into what those duplicate pages are and see if something is amiss that's generating a ton of URLs.
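One way to look into that is to collapse each crawled URL into a rough template (numbers replaced by a placeholder, query strings reduced to their parameter names) and count how many URLs share each template. This is a minimal sketch under the assumption that you have the crawled URLs as a plain list; the function names and sample URLs are hypothetical.

```python
import re
from collections import Counter
from urllib.parse import urlparse, parse_qsl


def url_template(url):
    """Reduce a URL to its path with digits masked plus sorted parameter names."""
    parts = urlparse(url)
    path = re.sub(r"\d+", "{n}", parts.path)
    params = sorted({key for key, _ in parse_qsl(parts.query, keep_blank_values=True)})
    return path + ("?" + "&".join(params) if params else "")


def top_templates(urls, limit=5):
    """Return the most common URL templates with their counts."""
    return Counter(url_template(u) for u in urls).most_common(limit)


# Illustrative URLs only.
urls = [
    "http://example.com/blog/page/22",
    "http://example.com/blog/page/23",
    "http://example.com/blog/page/24",
    "http://example.com/about",
    "http://example.com/products?sort=asc",
    "http://example.com/products?sort=desc",
]
for template, count in top_templates(urls):
    print(count, template)
```

A template that accounts for most of the crawl (pagination, sort parameters, calendar dates) is the likely source of the extra URLs.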