Should comments and feeds be disallowed in robots.txt?

workathomecareers

Hi

My robots file is currently set up as listed below.

From an SEO point of view is it good to disallow feeds, rss and comments?

I feel allowing comments would be a good thing because they're new content that may rank in the search engines: the comments left on my blog often refer to questions or companies folks are searching for more information on, and new comments are added regularly.

What's your take? I'm also concerned about /page being blocked. I'm not sure how that benefits my blog from an SEO point of view either. I look forward to your feedback.

Thanks.

Eddy

User-agent: Googlebot
Crawl-delay: 10
Allow: /*

User-agent: *
Crawl-delay: 10
Disallow: /wp-
Disallow: /feed/
Disallow: /trackback/
Disallow: /rss/
Disallow: /comments/feed/
Disallow: /page/
Disallow: /date/
Disallow: /comments/

# Allow Everything
Allow: /*

FedeEinhorn

If I were going to disallow something I would go with noindex tags. The robots file is perfect with just those 2 lines.

Then, there are some plugins that will help you avoid SEO issues, like SEO by Yoast. Personally, I like to set tags, categories, and archive pages to noindex,follow, and that's it. But again, that's noindex,follow with a robots meta tag on the page, not in robots.txt. SEO by Yoast makes that about as easy as it can be, with just a few configuration steps.

Give it a try; you can always disable plugins.
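If you want to confirm that a page really carries the noindex,follow tag once a plugin is configured, a rough check could look like the sketch below. It uses only Python's standard library; the sample HTML and the names RobotsMetaParser and robots_directives are made up for illustration, and this is not how Yoast itself works.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values can be None when the attribute has no value.
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() != "robots":
            return
        for part in attr.get("content", "").split(","):
            if part.strip():
                self.directives.add(part.strip().lower())


def robots_directives(html):
    """Return the set of robots meta directives found in an HTML string."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


# Hypothetical <head> of a tag archive page (an assumption, not a real page).
SAMPLE_HTML = (
    '<html><head><meta name="robots" content="noindex, follow">'
    "</head><body></body></html>"
)

if __name__ == "__main__":
    print(sorted(robots_directives(SAMPLE_HTML)))
```

In practice you would feed it the HTML of one of your own category or tag pages instead of the sample string.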

Wish you the best!

DaveSottimano

WordPress is a funny platform: you would think that there isn't much to disallow, but there probably is quite a bit. I agree with Federico: you should allow comments, feed, and rss.

I'm not going to make blind assumptions here, so you should check your log files to see what's being constantly crawled. Feel free to read this: http://moz.com/blog/server-log-essentials-for-seo.
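As a starting point for that log check, here is a minimal sketch that counts Googlebot requests per top-level path. It assumes your raw logs are in the common Apache "combined" format and it trusts the user-agent string, which can be spoofed; the sample lines and the name googlebot_hits are invented for illustration.

```python
import re
from collections import Counter

# Pulls the request path and user-agent out of a combined-format log line.
LINE_RE = re.compile(
    r'"(?:GET|HEAD|POST) (?P<path>\S+) [^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def googlebot_hits(lines):
    """Count requests whose user-agent mentions Googlebot, by first path segment."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.search(line)
        if not match or "Googlebot" not in match.group("agent"):
            continue
        path = match.group("path").split("?", 1)[0]  # drop any query string
        segment = "/" + path.lstrip("/").split("/", 1)[0]
        counts[segment] += 1
    return counts


# Made-up log lines, only to show the expected input shape.
GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
SAMPLE_LOG = [
    f'66.249.66.1 - - [10/Oct/2013:13:55:36 -0700] "GET /feed/ HTTP/1.1" 200 2326 "-" "{GOOGLEBOT_UA}"',
    f'66.249.66.1 - - [10/Oct/2013:13:56:01 -0700] "GET /page/2/ HTTP/1.1" 200 5120 "-" "{GOOGLEBOT_UA}"',
    f'66.249.66.1 - - [10/Oct/2013:13:57:12 -0700] "GET /page/3/?x=1 HTTP/1.1" 200 5090 "-" "{GOOGLEBOT_UA}"',
    '203.0.113.7 - - [10/Oct/2013:13:58:40 -0700] "GET /feed/ HTTP/1.1" 200 2326 "-" "Mozilla/5.0 (Windows NT 6.1) Firefox/24.0"',
]

if __name__ == "__main__":
    for segment, hits in googlebot_hits(SAMPLE_LOG).most_common():
        print(segment, hits)
```

Sections that show up with a lot of hits but little value are the ones worth a closer look before you decide to block or noindex anything.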

FYI - This is a big job. Shout if you need help.

P.S. HostGator's cPanel will allow you to archive raw server logs. Make sure you check that option from now on or they'll be overwritten!

workathomecareers

Thanks for the info!

I contacted HostGator to fix the robots file because it had been blocking Google's bot for some time. So that's the robots file they uploaded.

Yes, I use WordPress, and apparently some stupid plugin had originally blocked Google before HostGator fixed the robots file yesterday.

So to confirm, you don't think anything else should be disallowed except for the /wp-admin directory? With the feeds, comments, etc., there aren't any SEO concerns, like duplicate content or anything else that may work against me, that should be blocked?

Is this safe to assume?

Thanks again!

Eddy

FedeEinhorn

Who wrote that robots.txt?

You shouldn't disallow the comments, the feed, or almost anything else.

I notice you are using WordPress, so if you just want to avoid the admin being indexed (which it isn't going to be anyway, as Google does not have access), your robots.txt should look like this:

User-agent: *
Disallow: /wp-admin/

That's it.
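If you want to sanity-check what a two-line file like that blocks before uploading it, Python's standard urllib.robotparser can evaluate it locally. This is only a sketch: the sample paths are hypothetical ones chosen to mirror the rules discussed in this thread, and Python's parser does simple prefix matching, so it may not agree with Googlebot on wildcard rules.

```python
from urllib.robotparser import RobotFileParser

# The minimal file suggested above, one rule per line.
ROBOTS_LINES = [
    "User-agent: *",
    "Disallow: /wp-admin/",
]


def allowed_paths(lines, agent, paths):
    """Return {path: bool} saying whether `agent` may fetch each path."""
    parser = RobotFileParser()
    parser.parse(lines)
    return {path: parser.can_fetch(agent, path) for path in paths}


# Hypothetical paths (assumptions, not taken from the actual site).
SAMPLE_PATHS = ["/feed/", "/comments/feed/", "/page/2/", "/wp-admin/options.php"]

if __name__ == "__main__":
    results = allowed_paths(ROBOTS_LINES, "Googlebot", SAMPLE_PATHS)
    for path, ok in results.items():
        print(f"{path}: {'allowed' if ok else 'blocked'}")
```

With this file, only paths under /wp-admin/ should come back as blocked; feeds, comments, and paginated pages stay crawlable.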
