How to Remove /feed URLs from Google's Index
-
Hey everyone, I have an issue with RSS /feed URLs being indexed by Google for some of our Wordpress sites. Have a look at this Google query, and click to show omitted search results. You'll see we have 500+ /feed URLs indexed by Google, for our many category pages/etc. Here is one of the example URLs: http://www.howdesign.com/design-creativity/fonts-typography/letterforms/attachment/gilhelveticatrade/feed/. Based on this content/code of the XML page, it looks like Wordpress is generating these:
<generator>http://wordpress.org/?v=3.5.2</generator>
Any idea how to get them out of Google's index without 301 redirecting them? We need the Wordpress-generated RSS feeds to work for various uses.
My first two thoughts are trying to work with our Development team to see if we can get a "noindex" meta robots tag on the pages, by they are dynamically-generated pages...so I'm not sure if that will be possible. Or, perhaps we can add a "feed" paramater to GWT "URL Parameters" section...but I don't want to limit Google from crawling these again...I figure I need Google to crawl them and see some code that says to get the pages out of their index...and THEN not crawl the pages anymore.
I don't think the "Remove URL" feature in GWT will work, since that tool only removes URLs from the search results, not the actual Google index.
FWIW, this site is using the Yoast plugin. We set every page type to "noindex" except for the homepage, Posts, Pages and Categories. We have other sites on Yoast that do not have any /feed URLs indexed by Google at all.
Side note, the /robots.txt file was previously blocking crawling of the /feed URLs on this site, which is why you'll see that note in the Google SERPs when you click on the query link given in the first paragraph.
-
I tried many different htaccess file codings (such as recommended here), but they didn't work. Had to succumb to using the outdated Meta Robots plugin by Yoast, which can add the "noindex" code to the http header of /feed/ URLs. But, at least it's a solution: http://wordpress.org/plugins/robots-meta/. Hopefully this helps someone else.
-
I believe I found the solution: implement an x-robots-tag into the HTTP header of the various feed URLs. But, I need some help with creating the code to place in my .htaccess file. Any takers?
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Confused about repeated occurences of URL/essayorg/topic/ showing up as 404 errors in our site logs
Working on a Wordpress website, https://thedoctorwithin.comScanning the site’s 404 errors, I’m seeing a lot of searches for URL/essayorg/topic, coming from Bingbot, as well as other spiders (Google, OpensiteExlorer). We get at least 200 of these irrelevant requests per week. Seems like each topic that follows /essayorg/ is unique. Some include typos: /dissitation/Haven't done a verification to make sure the spiders are who they say they are, yet.Almost seems like there are many links ‘in the wild’ intended for Essay.Org that are being directed towards the site I’m working on.I've considered redirecting any requests for URL/essayorg/ to our sitemap… figuring that might encourage further spidering of actual site content. Is redirection to our sitemap xml file a good idea, or might doing so have unintended consequences? Interested in suggestions about why this might be occurring. Thank you.
Technical SEO | | linkjuiced0 -
Will a Robots.txt 'disallow' of a directory, keep Google from seeing 301 redirects for pages/files within the directory?
Hi- I have a client that had thousands of dynamic php pages indexed by Google that shouldn't have been. He has since blocked these php pages via robots.txt disallow. Unfortunately, many of those php pages were linked to by high quality sites mulitiple times (instead of the static urls) before he put up the php 'disallow'. If we create 301 redirects for some of these php URLs that area still showing high value backlinks and send them to the correct static URLs, will Google even see these 301 redirects and pass link value to the proper static URLs? Or will the robots.txt keep Google away and we lose all these high quality backlinks? I guess the same question applies if we use the canonical tag instead of the 301. Will the robots.txt keep Google from seeing the canonical tags on the php pages? Thanks very much, V
Technical SEO | | Voodak0 -
Weird problems with google's rich snippet markup
Once upon a time, our site was ranking well and had all the markups showing up in the results. We than lost some of our rankings due to dropped links and not so well kept maintenance. Now, we are gaining up the rankings again, but the markups don't show up in the organic search results. When we Google site:oursite.com, the markups show up, but not in the organic search. There are no manual actions against our site. any idea why this would happen?
Technical SEO | | s-s0 -
Sitemap issue? 404's & 500's are regenerating?
I am using the WordPress SEO plugin by Yoast to generate a sitemap on http://www.atozqualityfencing.com. Last month, I had an associate create redirects for over 200 404 errors. She did this via the .htaccess file. Today, there are the same amount of 404s along with a number of 503 errors. This new Wordpress website was constructed on a subdirectory and made live by simply entering some code into the .htaccess file in order to direct browsers to the content we wanted live. In other words, the content actually resides in a subdirectory titled "newsite" but is shown live on the main url. Can you tell me why we are having these 404 & 503 errors? I have no idea where to begin looking.
Technical SEO | | JanetJ0 -
Image Indexing Issue by Google
Hello All,My URL is: www.thesalebox.comI have Submitted my image Sitemap in google webmaster tool on 10th Oct 2013,Still google could not indexing any of my web images,Please refer my sitemap - www.thesalebox.com/AppliancesHomeEntertainment.xml and www.thesalebox.com/Hardware.xmland my webmaster status and image indexing status are below, Can you please help me, why my images are not indexing in google yet? is there any issue? please give me suggestions?Thanks!
Technical SEO | | CommercePundit0 -
How Does Google's "index" find the location of pages in the "page directory" to return?
This is my understanding of how Google's search works, and I am unsure about one thing in specific: Google continuously crawls websites and stores each page it finds (let's call it "page directory") Google's "page directory" is a cache so it isn't the "live" version of the page Google has separate storage called "the index" which contains all the keywords searched. These keywords in "the index" point to the pages in the "page directory" that contain the same keywords. When someone searches a keyword, that keyword is accessed in the "index" and returns all relevant pages in the "page directory" These returned pages are given ranks based on the algorithm The one part I'm unsure of is how Google's "index" knows the location of relevant pages in the "page directory". The keyword entries in the "index" point to the "page directory" somehow. I'm thinking each page has a url in the "page directory", and the entries in the "index" contain these urls. Since Google's "page directory" is a cache, would the urls be the same as the live website (and would the keywords in the "index" point to these urls)? For example if webpage is found at wwww.website.com/page1, would the "page directory" store this page under that url in Google's cache? The reason I want to discuss this is to know the effects of changing a pages url by understanding how the search process works better.
Technical SEO | | reidsteven750 -
I cannot find a way to implement to the 2 Link method as shown in this post: http://searchengineland.com/the-definitive-guide-to-google-authorship-markup-123218
Did Google stop offering the 2 link method of verification for Authorship? See this post below: http://searchengineland.com/the-definitive-guide-to-google-authorship-markup-123218 And see this: http://www.seomoz.org/blog/using-passive-link-building-to-build-links-with-no-budget In both articles the authors talk about how to set up Authorship snippets for posts on blogs where they have no bio page and no email verification just by linking directly from the content to their Google+ profile and then by linking the from the the Google+ profile page (in the Contributor to section) to the blog home page. But this does not work no matter how many ways I trie it. Did Google stop offering this method?
Technical SEO | | jeff.interactive0 -
How to keep a URL social equity during a URL structure/name change?
We are in the process of making significant URL name/structure change to one of our property and we want to keep the social equity (likes, share, +1, tweets) from the old to the new URL. We have been trying many different option without success. We are running our social "button" in an iframe. Thanks
Technical SEO | | OlivierChateau0