3,511 Pages Indexed and 3,331 Pages Blocked by Robots

PeaSoupDigital

Morning,

So I checked our site's index status in WMT (Google Webmaster Tools), and I'm being told that Google is indexing 3,511 pages and robots.txt is blocking 3,331. This seems odd, as we're only disallowing 24 pages in the robots.txt file. In light of this, I have the following queries:

Do these figures mean that Google is indexing 3,511 pages and blocking 3,331 other pages? Or does it mean that it's blocking 3,331 pages of the 3,511 indexed?
As there are only 24 URLs being disallowed in robots.txt, why are 3,331 pages being blocked? Will these be variations of the URLs we've disallowed?
Currently, we don't have a sitemap. I know, I know, it's pretty unforgivable but the old one didn't really work and the developers are working on the new one. Once submitted, will this help?
I think I know the answer to this, but is there any way to ascertain which pages are being blocked?

Thanks in advance!

Lewis

PeaSoupDigital

Hi,

No more links than a standard e-commerce site should have...

I'm chasing the sitemap as we speak.

Cheers,

MonicaOConnor

The blocked URLs are probably nofollow links throughout the site. Do you have a lot of links pointing outward from your pages?

Google is indexing 3,511 pages, of which 3,331 are blocked by robots.txt. I would check some of the internal/external links on those disallowed pages. I don't see how it could come to 3,331 blocked pages, but it couldn't hurt to start there.
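
One possible explanation for the gap between 24 rules and thousands of blocked pages is that a Disallow rule matches every URL that starts with its path, so one rule covering a search or basket path can catch many parameterised variations. Below is a minimal sketch of how you might check a list of URLs against a set of rules using Python's standard urllib.robotparser. The rules, the example.com URLs and the blocked_urls helper are all made up for illustration; they are not this site's real robots.txt.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules for illustration only, not the site's real robots.txt.
ROBOTS_TXT = """\
User-agent: *
Disallow: /search
Disallow: /basket/
"""

# Hypothetical URLs, e.g. taken from a crawl export or server logs.
URLS = [
    "https://www.example.com/",
    "https://www.example.com/search?q=shoes",
    "https://www.example.com/search?q=shoes&page=2",
    "https://www.example.com/basket/item/17",
    "https://www.example.com/products/shoes",
]


def blocked_urls(robots_txt, urls, user_agent="Googlebot"):
    """Return the URLs that the given rules disallow for user_agent."""
    parser = RobotFileParser()
    # parse() takes the robots.txt content as a list of lines,
    # so no network request is needed.
    parser.parse(robots_txt.splitlines())
    return [u for u in urls if not parser.can_fetch(user_agent, u)]


blocked = blocked_urls(ROBOTS_TXT, URLS)
for url in blocked:
    print("blocked:", url)
```

Here two rules block three of the five URLs. Note that urllib.robotparser only does simple prefix matching, whereas Google also understands the * and $ wildcards, so treat the result as a rough guide rather than an exact copy of what Googlebot does.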

Definitely get a sitemap submitted asap. It will help for sure.

Whittie

Excuse the short reply.

Add the sitemap to your robots.txt, and submit it in Google WMT.

You could just use a free sitemap generator while the new one is still in development.
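
For reference, adding the sitemap to robots.txt means adding a Sitemap line with the full URL of the sitemap file. Here is a rough sketch, assuming a placeholder sitemap at https://www.example.com/sitemap.xml and a made-up Disallow rule, that uses Python's urllib.robotparser to confirm the line is picked up. The site_maps() method needs Python 3.8 or later.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt; the Disallow rule and sitemap URL are placeholders.
ROBOTS_TXT = """\
User-agent: *
Disallow: /basket/

Sitemap: https://www.example.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# site_maps() returns a list of sitemap URLs, or None if there is no Sitemap line.
sitemaps = parser.site_maps()
print(sitemaps)
```

The Sitemap line is not tied to any User-agent group, so it can go anywhere in the file.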
