Best way to permanently remove URLs from the Google index?
-
We have several subdomains we use for testing applications. Even if we block them with robots.txt, these subdomains still appear to get indexed (though they show as blocked by robots.txt).
I've claimed these subdomains and requested permanent removal, but it appears that after a certain time period (about 6 months?) Google will re-index them (and mark them as blocked by robots.txt).
What is the best way to permanently remove these from the index? We can't put them behind a login because our clients want to be able to view these applications without needing to log in.
What is the next best solution?
-
I agree with Paul. Google is re-indexing the pages because you have a few links pointing back to these subdomains. The best idea is to restrict Google's crawler with a noindex, nofollow meta tag and to remove the disallow instruction from robots.txt.
This way Google will neither index the page nor follow the links on it, and the page will be permanently removed from Google's index.
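For anyone who wants to confirm the tag is actually on their test pages, here is a minimal sketch in Python (standard library only). The sample page and the helper name `robots_directives` are made up for illustration; the only thing taken from the advice above is the `noindex, nofollow` robots meta tag itself.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives found in <meta name="robots" ...> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() != "robots":
            return
        # The content attribute is a comma-separated list of directives.
        for part in attr.get("content", "").split(","):
            part = part.strip().lower()
            if part:
                self.directives.add(part)


def robots_directives(html):
    """Return the set of robots meta directives present in an HTML string."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


# Hypothetical test page carrying the tag recommended above.
SAMPLE_PAGE = """<html><head>
<meta name="robots" content="noindex, nofollow">
<title>Test app</title>
</head><body>...</body></html>"""

print(sorted(robots_directives(SAMPLE_PAGE)))
```

In practice you would feed it the HTML you fetch from each subdomain and flag any page where `"noindex"` is missing from the returned set.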
-
Yup - Chris has the solution. The robots.txt disallow directive simply instructs the crawler not to crawl; it says nothing about removing URLs from the index. I'm betting there are other pages linking in to the subdomains, and the bots are following those links to find and index the URLs as the URL Removal requests expire.
Do note though that when you add the no-index meta-robots tag, you're going to need to remove the robots.txt disallow directive. Otherwise the crawlers won't make any attempt to crawl all the pages and so won't even discover most of the no-index requests.
Paul
[Edited to add - there's no reason you can't implement the no-index meta tags and then also request removal again via the Webmaster Tools removal tool. Kind of a "belt & suspenders" approach. The removal request will get it out quicker, and the meta no-index will do the job of keeping it out. Remember to do this in Bing Webmaster Tools as well.]
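To illustrate why the disallow has to go before the no-index tags can work, here is a rough Python sketch using the standard library's `urllib.robotparser`. The two robots.txt bodies and the `test.example.com` URL are placeholders, and Python's parser is only an approximation of how Googlebot reads the file, so treat it as a sanity check rather than a guarantee.

```python
from urllib.robotparser import RobotFileParser


def crawl_allowed(robots_txt, url, user_agent="Googlebot"):
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical robots.txt bodies for a test subdomain.
BLOCKING = "User-agent: *\nDisallow: /\n"   # crawler never sees the page
OPEN = "User-agent: *\nDisallow:\n"         # empty Disallow allows everything

URL = "https://test.example.com/app/"       # placeholder URL

print(crawl_allowed(BLOCKING, URL))  # blocked: the noindex tag is never read
print(crawl_allowed(OPEN, URL))      # allowed: the crawler can reach the tag
```

If the first case applies to your subdomains, the pages can stay listed as "blocked by robots.txt" because the crawler is never allowed in to read the no-index tag.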
-
Wouldn't a noindex meta tag on each page take care of it?