Panda Updates - robots.txt or noindex?
-
Hi,
I have a site that I believe has been impacted by the recent Panda updates. Assuming that Google has crawled and indexed several thousand pages that are essentially the same and the site has now passed the threshold to be picked out by the Panda update, what is the best way to proceed?
Is it enough to block the pages from being crawled in the future using robots.txt, or would I need to remove the pages from the index using the meta noindex tag? Of course if I block the URLs with robots.txt then Googlebot won't be able to access the page in order to see the noindex tag.
Anyone have and previous experiences of doing something similar?
Thanks very much.
-
This is a good read. http://www.seomoz.org/blog/duplicate-content-in-a-post-panda-world I think you should be careful with robot.txt because blocking access to the bot will not cause them to remove the content from their index. They will simply include a message saying not quite sure what's on this page.. I would use noindex to clear out the index first before attempting robot.txt exclusion.
-
Yes, both because if a page is linked to on another site google with spider that other site and follow your link without hitting the robots.txt and the page could get indexed if there is not a noindex on it.
-
Indeed try both.
Irving +1
-
both. block the lowest quality lowest traffic pages with nodindex and block the folder in robots.txt
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Does anyone know of a Google update in the past few days?
Have seen a fairly substantial drop in Google search console, I'm still looking into it comparing things, but does anyone know if there's been a Google updates within the past few days? Or has anyone else noticed anything? Thanks
Intermediate & Advanced SEO | | seoman100 -
A 302 Redirect Question | Events Page Updates
Hello Moz World, I have a client that has a TON, like close to a thousand pages that have a 302 redirect set up. After further investigation, I found that every month they update their events page & Demo Request page, and the old events pages still exist but, get a 302 redirect to the updated page. From what I gather, this is a default mechanism set up by the hosting provider. My questions; is this an example of when to use a Rel=canonical? Also, is there a method for integrating this without having to go into every page and integrate the code snippet? And Lastly, How should I go about ensuring this doesn't happen in the future? Thanks ahead of time, you guys rock! B/R Will H.
Intermediate & Advanced SEO | | MarketingChimp100 -
Question about Syntax in Robots.txt
So if I want to block any URL from being indexed that contains a particular parameter what is the best way to put this in the robots.txt file? Currently I have-
Intermediate & Advanced SEO | | DRSearchEngOpt
Disallow: /attachment_id Where "attachment_id" is the parameter. Problem is I still see these URL's indexed and this has been in the robots now for over a month. I am wondering if I should just do Disallow: attachment_id or Disallow: attachment_id= but figured I would ask you guys first. Thanks!0 -
When you add 10.000 pages that have no real intention to rank in the SERP, should you: "follow,noindex" or disallow the whole directory through robots? What is your opinion?
I just want a second opinion 🙂 The customer don't want to loose any internal linkvalue by vaporizing link value though a big amount of internal links. What would you do?
Intermediate & Advanced SEO | | Zanox0 -
Penguin 2.0 update, ranking dropped. Advice needed!
Hello After another penguin 2.0 update the website i've been working on dropped in rankings,some of keywords that i ranked in #1 are now on second and third page, you can see this screenshot here http://screencast.com/t/MramoXgTr 95% of my competitors were not even effected with this update at all, most of them don't even optimize their website for SEO, rather they use paid directories. First thing i did is analyzed my backing profile using OSE, to my surprise i found a lot of low quality domains pointing to my pages with a keyword in anchor text. A lot of them blog commenting and low quality article directories. Since i don't have control over these links and i cant remove them i used Disavow tool to do the job. For the past 3 months, i've been doing a lot of hight quality link building; such as
Intermediate & Advanced SEO | | KentR
press releases once in 2 months, squidoo lens and hubpages 3 posts a week for each keyword, youtube video, in fact my youtube video still ranks in #3 for high competitive search, i was involved in social media, posting tweets every week and Facebook posts. I really hope that someone can help me here with a good advice on getting my rankings back here's my website, let me know what do you think about it. Thank You0 -
Could you use a robots.txt file to disalow a duplicate content page from being crawled?
A website has duplicate content pages to make it easier for users to find the information from a couple spots in the site navigation. Site owner would like to keep it this way without hurting SEO. I've thought of using the robots.txt file to disallow search engines from crawling one of the pages. Would you think this is a workable/acceptable solution?
Intermediate & Advanced SEO | | gregelwell0 -
How long will Google take to read my robots.txt after updating?
I updated www.egrecia.es/robots.txt two weeks ago and I still haven't solved Duplicate Title and Content on the website. The Google SERP doesn't show those urls any more but SEOMOZ Crawl Errors nor Google Webmaster Tools recognize the change. How long will it take?
Intermediate & Advanced SEO | | Tintanus0 -
Robots.txt 404 problem
I've just set up a wordpress site with a hosting company who only allow you to install your wordpress site in http://www.myurl.com/folder as opposed to the root folder. I now have the problem that the robots.txt file only works in http://www.myurl./com/folder/robots.txt Of course google is looking for it at http://www.myurl.com/robots.txt and returning a 404 error. How can I get around this? Is there a way to tell google in webmaster tools to use a different path to locate it? I'm stumped?
Intermediate & Advanced SEO | | SamCUK0