Meta robots on every page rather than robots.txt for blocking crawlers? How will pages get indexed if we block crawlers?
-
Hi all,
The usual suggestion to use the meta robots tag rather than the robots.txt file is meant to make sure pages do not get indexed when links to them are available elsewhere on the internet. I don't understand how pages can be indexed if the entire site is blocked. Even if links to the pages are available, will Google really index them? One of our sites has been blocked in robots.txt for years, and although links to its internal pages are available on the internet, those pages have not been indexed. So technically the robots.txt file is enough, right? Please clarify and correct me if I'm wrong.
Thanks
-
I agree with Gaston's approach right up to step 4. If you add the no-indexed pages back into a block in the robots.txt file, you'll end up back where you started: Google will still discover the no-indexed URLs elsewhere, the robots.txt block will stop it from seeing the no-index, and the URLs will likely start getting added to the index again.
No-indexed URLs must not be blocked in robots.txt. Those two processes are mutually exclusive.
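To make that reasoning concrete, here is a minimal sketch in Python. It is a simplified model of the behaviour described above, not Google's actual logic; the `Page` class and both function names are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Page:
    """Hypothetical description of one URL, for illustration only."""
    blocked_in_robots_txt: bool
    has_noindex_tag: bool
    linked_from_elsewhere: bool


def crawler_sees_noindex(page: Page) -> bool:
    # The noindex tag lives in the page's HTML, so a crawler can only
    # read it if robots.txt lets it fetch the page in the first place.
    return page.has_noindex_tag and not page.blocked_in_robots_txt


def may_end_up_indexed(page: Page) -> bool:
    # Simplified assumption: a noindex that the crawler can read keeps
    # the URL out of the index.
    if crawler_sees_noindex(page):
        return False
    # Blocked pages are never fetched, but their URLs can still be
    # discovered through links elsewhere and may be indexed anyway.
    if page.blocked_in_robots_txt:
        return page.linked_from_elsewhere
    # Crawlable and no noindex: nothing asks the engine to skip it.
    return True


# Step 4 of the approach: noindex in place, then blocked again.
blocked_again = Page(blocked_in_robots_txt=True,
                     has_noindex_tag=True,
                     linked_from_elsewhere=True)
print(crawler_sees_noindex(blocked_again))  # False
print(may_end_up_indexed(blocked_again))    # True
```

In this model, the only combination that reliably keeps a linked page out of the index is a noindex tag on a page that is not blocked.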
-
Hi there,
TL;DR: the solution for deindexing pages and keeping them out of the index:
- Allow the site to be crawled (in robots.txt)
- Apply the meta robots tag: noindex,follow
- Wait some weeks for the pages to be completely deindexed
- Block the entire site/section with robots.txt
Robots.txt and the robots meta tag can have the same effect, but to understand them they must be analyzed separately.
-
Robots.txt: here you tell bots where they can go BEFORE they crawl any of the website. This is just a signal, not a directive, because robots can choose to ignore what's in the file. You can block anything from the entire site to an entire section or just specific pages. More info: Robots.txt official page and a really cool and complete guide to robots.txt
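As a rough illustration of those three levels of blocking, the sketch below runs sample robots.txt files through Python's standard `urllib.robotparser`. The `is_allowed` helper, the paths, and the `example.com` URLs are assumptions made up for this example, and the standard-library parser uses simple prefix matching, so real crawlers may interpret some rules differently.

```python
from urllib.robotparser import RobotFileParser

# Three hypothetical robots.txt files: whole site, one section, one page.
BLOCK_SITE = """User-agent: *
Disallow: /
"""

BLOCK_SECTION = """User-agent: *
Disallow: /private/
"""

BLOCK_PAGE = """User-agent: *
Disallow: /drafts/preview.html
"""


def is_allowed(robots_txt: str, url: str, user_agent: str = "*") -> bool:
    """Return True if a compliant bot may fetch `url` under these rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


print(is_allowed(BLOCK_SITE, "https://example.com/any/page.html"))       # False
print(is_allowed(BLOCK_SECTION, "https://example.com/private/a.html"))   # False
print(is_allowed(BLOCK_SECTION, "https://example.com/public/a.html"))    # True
print(is_allowed(BLOCK_PAGE, "https://example.com/drafts/preview.html")) # False
print(is_allowed(BLOCK_PAGE, "https://example.com/drafts/other.html"))   # True
```

Note that this only answers "may a well-behaved bot fetch this URL?". It says nothing about whether the URL ends up in a search index.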
-
Robots meta tag: with it you have more signals available. The most used are noindex, nofollow and follow, because of the usual issues with indexing. More info: Robots.txt official page, Google developers, Meta Robots directive - Moz and a complete guide to meta robots tag - YOAST.
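For a quick check of which of these signals a page actually carries, something like the following sketch could be used. It relies only on Python's standard `html.parser`; the class name, function name, and sample HTML are made up for illustration, and it only looks at `<meta name="robots">`, not bot-specific tags or HTTP headers.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives found in <meta name="robots"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        if (attributes.get("name") or "").lower() != "robots":
            return
        content = attributes.get("content") or ""
        # Directives are comma-separated, e.g. "noindex,follow".
        for token in content.split(","):
            token = token.strip().lower()
            if token:
                self.directives.add(token)


def robots_directives(html: str) -> set:
    """Return the set of meta robots directives present in `html`."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


sample = '<html><head><meta name="robots" content="noindex,follow"></head></html>'
print("noindex" in robots_directives(sample))  # True
```

A crawler can only read these directives after fetching the page, which is why the tag has no effect on URLs that robots.txt blocks.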
Hope this is what you wanted.
Best of luck
GR.