Why are new pages not being indexed, and old pages (now in robots.txt) remain in the index?

corp0803

I have a site that was recently restructured. Much of its content was reposted, which created new URLs for each page.

To avoid duplicates, all of the old pages were disallowed in robots.txt. That said, it has now been over a week - I know Google has recrawled the site - and when I search for term X, it is still the old page that ranks, with the new one nowhere to be seen. I'm assuming it's a cached version, but why are so many of the old pages still appearing in the index?

Furthermore, all "tags" pages (it's a Q&A site, like this one) were also added to robots.txt a few months ago, yet I think they are all still appearing in the index. Does anyone have any ideas about why this is happening, and how I can get my new pages indexed?

KeriMorgret

Is there a way for the search engines to find the new content? If you're blocking the old content from being crawled, even if there are redirects, it makes it that much harder for the new URLs to be found. Also, do you have a sitemap with the new URLs?
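If there is no sitemap yet, a minimal one listing the new URLs can be generated with a few lines of Python. This is only a sketch: the `build_sitemap` helper and the example.com URLs are made up for illustration, and a real sitemap would list the site's actual new URLs.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a sitemap XML string with one <url><loc> entry per URL."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical new URLs, for illustration only
new_urls = [
    "https://www.example.com/questions/new-page-one",
    "https://www.example.com/questions/new-page-two",
]

sitemap_xml = build_sitemap(new_urls)
print(sitemap_xml)
```

The output would typically be saved as sitemap.xml, with an XML declaration added at the top, and submitted through the search engine's webmaster tools.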

AlanMosley

Search engines don't take notice on the first crawl; they need to see the same result a few times before they act on it. I think this is to make sure that you mean what they see, and that it is not just a mistake.

I would not exclude any pages with robots.txt; it is a very crude way of doing things. If a page no longer exists, it will drop out of the index. If it does exist, you lose link juice through all the links that point to those pages. If you have to noindex a page, use the robots meta tag with "noindex,follow" so that links are still followed back out of the page.
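Here is a rough sketch of the difference, using only Python's standard library. The robots.txt rules, the example.com URLs, and the page HTML are all made up for illustration. The point is that a crawler obeying a Disallow rule never fetches the page, so it cannot see a noindex tag or follow any links on it, whereas a crawlable page with "noindex,follow" can be read and acted on.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser


class RobotsMetaParser(HTMLParser):
    """Collect directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and (attrs.get("name") or "").lower() == "robots":
            content = attrs.get("content") or ""
            for directive in content.split(","):
                if directive.strip():
                    self.directives.add(directive.strip().lower())


def meta_directives(html):
    """Return the set of robots meta directives found in an HTML string."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


# Hypothetical robots.txt that blocks the old section of the site
robots = RobotFileParser()
robots.parse(["User-agent: *", "Disallow: /old/"])

# Hypothetical old page carrying the meta tag suggested above
old_page_html = (
    '<html><head><meta name="robots" content="noindex, follow"></head>'
    "<body></body></html>"
)

# A compliant crawler may not fetch the old URL, so the tag is never read
blocked = not robots.can_fetch("*", "https://www.example.com/old/page")

# If the page were crawlable, these are the directives a crawler would see
directives = meta_directives(old_page_html)

print("old URL blocked by robots.txt:", blocked)
print("meta directives on the page:", sorted(directives))
```

In other words, the meta tag only helps once the matching Disallow rule is removed from robots.txt.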

