How to Remove /feed URLs from Google's Index
-
Hey everyone, I have an issue with RSS /feed URLs being indexed by Google for some of our WordPress sites. Have a look at this Google query, and click to show omitted search results. You'll see we have 500+ /feed URLs indexed by Google for our many category pages, etc. Here is one of the example URLs: http://www.howdesign.com/design-creativity/fonts-typography/letterforms/attachment/gilhelveticatrade/feed/. Based on the content of the XML page, it looks like WordPress is generating these:
<generator>http://wordpress.org/?v=3.5.2</generator>
Any idea how to get them out of Google's index without 301 redirecting them? We need the WordPress-generated RSS feeds to work for various uses.
My first two thoughts: work with our development team to see if we can get a "noindex" meta robots tag on the pages, but they are dynamically generated, so I'm not sure that will be possible. Or perhaps we can add a "feed" parameter to the GWT "URL Parameters" section, but I don't want to stop Google from crawling these again. I figure I need Google to crawl them, see some code that tells it to drop the pages from its index, and THEN stop crawling them.
I don't think the "Remove URL" feature in GWT will work, since that tool only removes URLs from the search results, not the actual Google index.
FWIW, this site is using the Yoast plugin. We set every page type to "noindex" except for the homepage, Posts, Pages and Categories. We have other sites on Yoast that do not have any /feed URLs indexed by Google at all.
Side note, the /robots.txt file was previously blocking crawling of the /feed URLs on this site, which is why you'll see that note in the Google SERPs when you click on the query link given in the first paragraph.
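Since a robots.txt block keeps Google from crawling a URL at all, it also keeps Google from ever seeing a "noindex" on it, so it seems worth confirming the feed URLs are crawlable before relying on one. Here is a minimal sketch in Python (standard library only). The function name `is_crawlable`, the sample rules, and the example.com URLs are made up for illustration, and Python's parser does plain prefix matching, so it won't mirror Google's wildcard handling.

```python
from urllib.robotparser import RobotFileParser


def is_crawlable(robots_txt, url, user_agent="Googlebot"):
    """Return True if the given robots.txt text allows user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is needed here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical rules, similar in spirit to a site that blocks its top-level feed.
SAMPLE_RULES = "User-agent: *\nDisallow: /feed/\n"

if __name__ == "__main__":
    for test_url in ("http://www.example.com/feed/",
                     "http://www.example.com/category/feed/"):
        print(test_url, is_crawlable(SAMPLE_RULES, test_url))
```

Note that the nested /category/feed/ URL is not caught by a plain `Disallow: /feed/` prefix rule, so each URL pattern needs to be checked on its own.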
-
I tried many different .htaccess rules (such as those recommended here), but they didn't work. I had to resort to the outdated Meta Robots plugin by Yoast, which can add the "noindex" directive to the HTTP header of /feed/ URLs. But at least it's a solution: http://wordpress.org/plugins/robots-meta/. Hopefully this helps someone else.
-
I believe I found the solution: add an X-Robots-Tag to the HTTP header of the various feed URLs. But I need some help writing the code to place in my .htaccess file. Any takers?
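Whatever ends up setting the header (a plugin, server config, or something else), it helps to confirm that a feed URL actually returns it. Below is a rough Python sketch using only the standard library. The function names `has_noindex` and `fetch_x_robots` are my own, and the URL in the usage section is just the example from the original question. It sends a HEAD request, which I'm assuming the server answers with the same headers as a normal GET.

```python
from urllib.request import Request, urlopen


def has_noindex(x_robots_values):
    """Return True if any X-Robots-Tag header value carries a noindex directive."""
    for value in x_robots_values:
        # Values can look like "noindex, follow" or "googlebot: noindex",
        # so treat both ":" and "," as separators between tokens.
        for part in value.lower().replace(":", ",").split(","):
            # "none" is shorthand for "noindex, nofollow".
            if part.strip() in ("noindex", "none"):
                return True
    return False


def fetch_x_robots(url):
    """Send a HEAD request and return every X-Robots-Tag header value found."""
    request = Request(url, method="HEAD")
    with urlopen(request, timeout=10) as response:
        return response.headers.get_all("X-Robots-Tag") or []


if __name__ == "__main__":
    # Example feed URL taken from the question above.
    feed_url = ("http://www.howdesign.com/design-creativity/fonts-typography/"
                "letterforms/attachment/gilhelveticatrade/feed/")
    values = fetch_x_robots(feed_url)
    print("X-Robots-Tag values:", values)
    print("noindex present:", has_noindex(values))
```

If the header shows up here but the URLs stay indexed, the robots.txt block mentioned earlier is the first thing I'd rule out, since Google has to recrawl the URL to see the header.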