Old pages still in index
-
Hi Guys,
I've been working on a E-commerce site for a while now. Let me sum it up :
- February new site is launched
- Due to lack of resources we started 301's of old url's in March
- Added rel=canonical end of May because of huge index numbers (developers forgot!!)
- Added noindex and robots.txt on at least 1000 urls.
- Index numbers went down from 105.000 tot 55.000 for now, see screenshot (actual number in sitemap is 13.000)
Now when i do site:domain.com there are still old url's in the index while there is a 301 on the url since March!
I know this can take a while but I wonder how I can speed this up or am doing something wrong. Hope anyone can help because I simply don't know how the old url's can still be in the index.
-
Hi Dan,
Thanks for the answer!
Indexation is already back to 42.000 so slowly going back to normal
And thanks for the last tip, that's totally right. I just discovered that several pages had duplicate url's generated so by continually monitoring we'll fix it !
-
Hi There
To noindex pages there are a few methods;
-
use a meta noindex without robots.txt - I think that is why some may not be removed. The robots.txt block crawling so they can not see the noindex.
-
use a 301 redirect - this will eventually kill off the old pages, but it can definitely take a while.
-
canonical it to another page. and as Chris says, don't block the page or add extra directives. If you canonical the page (correctly), I find it usually drops out of the index fairly quickly after being crawled.
-
use the URL removal tool in webmaster tools + robots.txt or 404. So if you 404 a page or block it with robots.txt you can then go into webmaster tools and do a URL removal. This is NOT recommended though in most normal cases, as Google prefers this be for "emergencies".
The only method that removes pages within a day or two guaranteed is the URL removal tool.
I would also examine your site since it is new, for something that is causing additional pages to be generated and indexed. I see this a lot with ecommerce sites where they have lots of pagination, facets, sorting, etc and those can generate lots of other pages which get indexed.
Again, as Chris says, you want to be careful to not mix signals. Hope this all helps!
-Dan
-
-
Hi Chris,
Thanks for your answer.
I'm either using a 301 or noindex, not both of course.
Still have to check the server logs, thanks for that!
Another weird thing. While the old url is still in the index, when i check the cache date it's a week old. That's what i don't get. Cache date is a week old but Google still has the old url in the index.
-
It can take months for pages to fall out of Google's index have you looked at your log files to verify that googlebot is crawling those pages?. Things to keep in mind:
- If you 301 a page, the rel=canonical on that page will not be seen by the bot (no biggie in your case)
- If you 301 a page, a meta noindex will not be seen by the bot
- It is suggested not to use the robots.txt to no index a page that is being 301 redirected--as the redirect may not be seen by Google.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Why is my home page ranking much higher than my collection page?
Hi everyone, Why is my client's home page ranking high for a certain keyword phrase rather than a collection page I have which is well optimised for this keyword? The collection page is on the 10th SERPs page. I did see there were keywords used in the footer of page and the keyword was also used in some intro text on the home page so I removed the keyword from these two places nearly 2 weeks ago and requested google to reindex both the collection page and home page and I've not seen any improvement of the collection page's ranking in SERPs. I also changed the meta description and meta title as the ctr was poor but there wasn''t that many impressions either. It is a competitive keyword organically so maybe the collection page's authority is just not good enough compared to the competitors hence why they are choosing the home page as it has higher page authority however this still is not helpful to searchers who land on home page. Does anyone have any ideas of what else I can do to get google to rank the ocllection page higher for the keyword instead of home page?
Intermediate & Advanced SEO | | TZ19820 -
How many links to the same page can there be for each page?
I need to know if I can add more than 2 equal links on the same page, for example 1 link in the header, another in the body and one in the footer
Intermediate & Advanced SEO | | Jorgesep0 -
Best way to link to 1000 city landing pages from index page in a way that google follows/crawls these links (without building country pages)?
Currently we have direct links to the top 100 country and city landing pages on our index page of the root domain.
Intermediate & Advanced SEO | | lcourse
I would like to add in the index page for each country a link "more cities" which then loads dynamically (without reloading the page and without redirecting to another page) a list with links to all cities in this country.
I do not want to dillute "link juice" to my top 100 country and city landing pages on the index page.
I would still like google to be able to crawl and follow these links to cities that I load dynamically later. In this particular case typical site hiearchy of country pages with links to all cities is not an option. Any recommendations on how best to implement?0 -
Landing pages, are my pages competing?
If I have identified a keyword which generates income and when searched in google my homepage comes up ranked second, should I still create a landing page based on that keyword or will it compete with my homepage and cause it to rank lower?
Intermediate & Advanced SEO | | The_Great_Projects0 -
Should my back links go to home page or internal pages
Right now we rank on page 2 for many KWs, so should i now focus my attention on getting links to my home page to build domain authority or continue to direct links to the internal pages for specific KWs? I am about to write some articles for several good ranking sites and want to know whether to link my company name (same as domain name) or KW to the home page or use individual KWs to the internal pages - I am only allowed one link per article to my site. Thanks Ash
Intermediate & Advanced SEO | | AshShep10 -
Indexing a several millions pages new website
Hello everyone, I am currently working for a huge classified website who will be released in France in September 2013. The website will have up to 10 millions pages. I know the indexing of a website of such size should be done step by step and not in only one time to avoid a long sandbox risk and to have more control about it. Do you guys have any recommandations or good practices for such a task ? Maybe some personal experience you might have had ? The website will cover about 300 jobs : In all region (= 300 * 22 pages) In all departments (= 300 * 101 pages) In all cities (= 300 * 37 000 pages) Do you think it would be wiser to index couple of jobs by couple of jobs (for instance 10 jobs every week) or to index with levels of pages (for exemple, 1st step with jobs in region, 2nd step with jobs in departements, etc.) ? More generally speaking, how would you do in order to avoid penalties from Google and to index the whole site as fast as possible ? One more specification : we'll rely on a (big ?) press followup and on a linking job that still has to be determined yet. Thanks for your help ! Best Regards, Raphael
Intermediate & Advanced SEO | | Pureshore0 -
How to find all indexed pages in Google?
Hi, We have an ecommerce site with around 4000 real pages. But our index count is at 47,000 pages in Google Webmaster Tools. How can I get a list of all pages indexed of our domain? trying to locate the duplicate content. Doing a "site:www.mydomain.com" only returns up to 676 results... Any ideas? Thanks, Ben
Intermediate & Advanced SEO | | bjs20100 -
Category Pages - Canonical, Robots.txt, Changing Page Attributes
A site has category pages as such: www.domain.com/category.html, www.domain.com/category-page2.html, etc... This is producing duplicate meta descriptions (page titles have page numbers in them so they are not duplicate). Below are the options that we've been thinking about: a. Keep meta descriptions the same except for adding a page number (this would keep internal juice flowing to products that are listed on subsequent pages). All pages have unique product listings. b. Use canonical tags on subsequent pages and point them back to the main category page. c. Robots.txt on subsequent pages. d. ? Options b and c will orphan or french fry some of our product pages. Any help on this would be much appreciated. Thank you.
Intermediate & Advanced SEO | | Troyville0