Robots.txt vs noindex
-
I recently started working on a site that has thousands of member pages that are currently robots.txt'd out.
Most pages of the site have 1 to 6 links to these member pages, accumulating into what I regard as something of a link juice cul-de-sac.
The pages themselves have little to no unique content or other relevant search play, and for other reasons we still want them kept out of search.
Wouldn't it be better to "noindex, follow" these pages and remove the robots.txt block from this URL type? At least that way Google could crawl these pages and pass the link juice on to still other pages vs flushing it into a black hole.
BTW, the site is currently dealing with a hit from Panda 4.0 last month.
Thanks! Best... Darcy
-
If you add the meta "noindex, follow" tag, it will keep the pages out of the SERPs but allow PageRank to flow through them to other pages.
See this interview with Matt Cutts for more info: http://www.stonetemple.com/articles/interview-matt-cutts.shtml
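One practical point on the switch: a crawler can only read a meta robots tag on a page it is allowed to fetch, so the robots.txt block has to come off for the tag to have any effect. Here is a minimal sketch of checking that in Python with the standard library's urllib.robotparser. The /members/ path, the example.com URL, and both rule sets are assumptions for illustration, not the actual site's configuration.

```python
from urllib.robotparser import RobotFileParser


def is_crawlable(robots_txt: str, url: str, user_agent: str = "Googlebot") -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical rules: the current block, and the file after the block is removed.
BLOCKED = "User-agent: *\nDisallow: /members/\n"
UNBLOCKED = "User-agent: *\nDisallow:\n"

member_url = "https://example.com/members/12345"  # hypothetical member page

print(is_crawlable(BLOCKED, member_url))    # False: the noindex tag would never be seen
print(is_crawlable(UNBLOCKED, member_url))  # True: the page can be crawled and the tag read
```

Note that Python's parser does plain prefix matching, so this is only a rough check and not a reproduction of how Googlebot interprets wildcards.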
-
Hi Saijo,
Thanks for the response. Do you think that would yield the benefit I'm looking for of recapturing that lost link juice?
Do you think there'd be any downside to the switcheroo from robots.txt to noindex, follow?
Best... Darcy
-
Since you said "The pages themselves have little to no unique content or other relevant search play and for other reasons still want them kept out of search," I would use meta robots "noindex, follow".
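To spot-check that a member page actually carries the tag after the change, something along these lines could work. It is a rough sketch using Python's built-in html.parser; the sample markup is a made-up page head, and the class and function names are hypothetical.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            # Split "noindex, follow" into individual lowercase directives.
            self.directives.update(
                part.strip().lower() for part in content.split(",") if part.strip()
            )


def robots_directives(html: str) -> set:
    """Return the set of meta robots directives found in an HTML document."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


# Hypothetical member page markup.
page = '<html><head><meta name="robots" content="noindex, follow"></head><body></body></html>'
print(sorted(robots_directives(page)))  # ['follow', 'noindex']
```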
-
Hi Lesley,
Thanks for the thoughts. I don't see this as a real option for a number of reasons, including but not limited to that there are 50,000 profiles, most with very little information. The members of this site are 95% busy professionals who aren't trying to advance their career via their profile. So, there'd be some privacy concern and the potential for tens of thousands of low content/highly templated pages. Not really a search dream come true!
Also, converting it into a system where different levels of profile completeness are acknowledged would not really resonate with this community nor would it be near the top of our engineering priorities.
What I really want to get clear on is how best to keep them search invisible while not losing link value into a robots.txt'd black hole. Really just looking for confirmation if, with those goals, "noindex, follow" and remove from robots is the way to go. I'm pretty sure it is, but would like to hear more about that.
Thanks... Darcy
-
I think what I am going to say is going to sound like it is going against the grain, but it really isn't. I have noticed in some places that if you want an active community, you reward your members. Look at how Moz does their forum: they don't really noindex the pages, but once you hit a certain point they pseudo-drop the nofollow off your profile link (it could be argued whether they really do). But the point is to reward your members who are active. I would set up an automatic noindex tag in the header that checks the user's post count. Then you can noindex all of the spammers and have prominent members shown in search. If it were me, that is how I would do it. I have a PA of 49 on my profile in one forum I frequent; I have seen the stats, and it is regularly an entry page to the forum. Another member has a 64 on a 93 domain, and his is used a lot more than mine for entry as well. Think of it this way: if someone is googling my name, the second result (http://screencast.com/t/jIx7a4hcWV) is Moz's forum. Second-place search results still get a lot of clicks.
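As a rough illustration of the post-count idea, here is a small Python sketch that picks the meta robots tag for a profile page. The function name and the cutoff of 50 posts are assumptions made up for the example; the right threshold would depend on the community.

```python
def robots_meta_tag(post_count: int, min_posts: int = 50) -> str:
    """Return the meta robots tag for a member profile page.

    min_posts is a hypothetical cutoff, not a recommended value.
    """
    if post_count >= min_posts:
        # Active members: let the profile be indexed.
        return '<meta name="robots" content="index, follow">'
    # Everyone else: keep the profile out of search but let links be followed.
    return '<meta name="robots" content="noindex, follow">'


print(robots_meta_tag(3))    # <meta name="robots" content="noindex, follow">
print(robots_meta_tag(120))  # <meta name="robots" content="index, follow">
```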