Could large number of "not selected" pages cause a penalty?
-
My site was penalized for specific pages in the UK On July 28 (corresponding with a Panda update).
I cleaned up my website and wrote to Google and they responded that "no manual spam actions had been taken".
The only other thing I can think of is that we suffered an automatic penalty.
I am having problems with my sitemap and it is indexing many error pages, empty pages, etc... According to our index status we have 2,679,794 not selected pages and 36,168 total indexed.
Could this have been what caused the error?
(If you have any articles to back up your answers that would be greatly appreciate)
Thanks!
-
Canonical tag to what? Themselves? Or the page they should be? Are these pages unique by some URL variables only? If so, you can instruct Google to ignore specific get variables to resolve this issue but you would also want to fix your sitemap woes: http://support.google.com/webmasters/bin/answer.py?hl=en&answer=1235687
This is where it gets sticky, these pages are certainly not helping and not being indexed, Google Webmaster tools shows us that, but if you have this problem, how many other technical problems could the site have?
We can be almost certain you have some kind of panda filter but to diagnose it further we would need a link and access to analytics to determine what has gone wrong and provide more detailed guidance to resolve the issues.
This could be a red herring and your problem could be elsewhere but with no examples we can only give very general responses. If this was my site I would certainly look to identify the most likely issues and work through this in a pragmatic way to eliminate possible issues and look at other potentials.
My advice would be to have the site analysed by someone with distinct experience with Panda penalties who can give you specific feedback on the problems and provide guidance to resolve them.
If the URL is sensitive and can't be shared here, I can offer this service and am in the UK. I am sure can several other users at SEOMoz can also help. I know Marie Haynes offers this service as I am sure Ryan Kent could help also.
Shout if you have any questions or can provide more details (or a url).
-
Hi,
Thanks for the detailed answer.
We have many duplicate pages, but they all have canonical tags on them... shouldn't that be solving the problem. Would pages with the canonical tag be showing up here?
-
Yes, this can definitely cause problems. In fact this is a common footprint in sites hit by the panda updates.
It sound like you have some sort of canonical issue on the site: Multiple copies of each page are being crawled. Google is finding lots of copies of the same thing, crawling them but deciding that they are not sufficiently unique/useful to keep in the index. I've been working on a number of sites hit with the same issue and clean up can be a real pain.
The best starting point for reading is probably this article here on SEOmoz : http://www.seomoz.org/learn-seo/duplicate-content . That article includes some useful links on how to diagnose and solve the issues as well, so be sure to check out all the linked resources.
-
Hey Sarah
There are always a lot of moving parts when it comes to penalties but the very fact that you lost traffic on a known panda date really points towards this being a Panda style of penalty. Panda, is an algorithmic penalty so you will not receive any kind of notification in Webmaster Tools and likewise, a re-inclusion request will not help, you have to fix the problem to resolve the issues.
The not selected pages are likely a big part of your problem. Google classes not selected pages as follows:
"Not selected: Pages that are not indexed because they are substantially similar to other pages, or that have been redirected to another URL. More information."
If you have the best part of 3 million of these pages that are 'substantially similar' to other pages then there is every change that this is a very big part of your problem.
Obviously, there are a lot of moving parts to this. This sounds highly likely this is part of your problem and just think how this looks to Google. 2.6 million pages that are duplicated. It is a low quality signal, a possible attempt at manipulation or god knows what else but what we do know, is that is unlikely to be a strong result for any search users so those pages have been dropped.
What to do?
Well, firstly, fix your site map and sort out these duplication problems. It's hard to give specifics without a link to the site in question but just sort this out. Apply the noindex tag dynamically if needs be, remove these duplicates from the sitemap, heck, remove the sitemap alltogether for a while if needs be till it is fixed. Just sort out these issues one way or another.
Happy to give more help here if I can but would need a link or some such to advise better.
Resources
You asked for some links but I am not completely sure what to provide here without a link but let me have a shot and provide some general points:
1. Good General Panda Overview from Dr. Pete
http://www.seomoz.org/blog/fat-pandas-and-thin-content
2. An overview of canonicalisation form Google
http://support.google.com/webmasters/bin/answer.py?hl=en&answer=139066
3. A way to diagnose and hopefully recover from Panda from John Doherty at distilled.
http://www.distilled.net/blog/seo/beating-the-panda-diagnosing-and-rescuing-a-clients-traffic/
4. Index Status Overview from Google
http://support.google.com/webmasters/bin/answer.py?hl=en&answer=2642366
Summary
You have a serious problem here but hopefully one that can be resolved. Panda is a primarily focused at on page issues and this is an absolute doozy of an on page issue so sort it out and you should see a recovery. Keep in mind you have 75 times more problem pages than actual content pages at the moment in your site map so this may be the biggest case I have ever seen so I would be very keen to see how you get on and what happens when you resolve these issues as I am sure would the wider SEOMoz community.
Hope this helps & please fire over any questions.
Marcus
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
When serving a 410 for page gone, should I serve an error page?
I'm removing a bunch of old & rubbish pages and was going to serve 410 to tell google they're gone (my understanding is it'll get them out of the index a bit quicker than a 404). I should still serve an error page though, right? Similar to a 404. That doesn't muddy the "gone" message that I'm giving Google? There's no need to 410 and die?
Intermediate & Advanced SEO | | HSDOnline0 -
Best Practices for Creating Back Links from "Thought Leader" Content
What is the best way to use articles from a "thought leader" to build high-quality links to my website? I have heard that it is possible to pay bloggers to post business articles that link back to a website. That assuming these blogs have domain authority this is a good technique to improve ranking. Is this in fact true, and if so where would I find blogs to post our content. The purpose would be to promote real estate brokerage website. Any suggestions? Is this possible, advisable, best use of quality content? Alternatively, where else can we post engaging content to create links back to our site? Social media? The nature of the content would be such topics as how to find the best value in Manhattan office of loft space rentals, etcera. Thanks, Alan
Intermediate & Advanced SEO | | Kingalan10 -
Recommend Layout Page (home, categories or section, individual page)
Hello Could you please share with me your advice and recommendations on how to design a SEO layout (H1, Image, body text, etc). I need to give instructions to our website designer. I would like to see some examples. We are going to work with wordpress and visual composer. I really appreciate your help and time Andy
Intermediate & Advanced SEO | | GHSCostaRica0 -
Difference in Number of URLS in "Crawl, Sitemaps" & "Index Status" in Webmaster Tools, NORMAL?
Greetings MOZ Community: Webmaster Tools under "Index Status" shows 850 URLs indexed for our website (www.nyc-officespace-leader.com). The number of URLs indexed jumped by around 175 around June 10th, shortly after we launched a new version of our website. No new URLs were added to the site upgrade. Under Webmaster Tools under "Crawl, Site maps", it shows 637 pages submitted and 599 indexed. Prior to June 6th there was not a significant difference in the number of pages shown between the "Index Status" and "Crawl. Site Maps". Now there is a differential of 175. The 850 URLs in "Index Status" is equal to the number of URLs in the MOZ domain crawl report I ran yesterday. Since this differential developed, ranking has declined sharply. Perhaps I am hit by the new version of Panda, but Google indexing junk pages (if that is in fact happening) could have something to do with it. Is this differential between the number of URLs shown in "Index Status" and "Crawl, Sitemaps" normal? I am attaching Images of the two screens from Webmaster Tools as well as the MOZ crawl to illustrate what has occurred. My developer seems stumped by this. He has submitted a removal request for the 175 URLs to Google, but they remain in the index. Any suggestions? Thanks,
Intermediate & Advanced SEO | | Kingalan1
Alan0 -
Subcategories within "New Arrivals" section - duplicate content?
Hi there, My client runs an e-commerce store selling shoes that features a section called "New Arrivals" with subcategories, such as "shoes," "wedges," "boots," "sandals," etc. There are already main subcategories on the site that target these terms. These are specifically pages for "New Arrivals - Boots," etc. The shoes listed on each new arrivals subcategory page are also listed in the main subcategory page. Given that there is not really any search volume for "Brand + new arrivals in boots," but lots of search volume for "Brand + boots," what is the proper way to handle these new arrivals subcategory pages? Should each subcategory have a rel=canonical tag pointing to the main subcategory? Should they be de-indexed? Should I keep them all indexed but try to make the content as unique as possible? Thank you!
Intermediate & Advanced SEO | | FPD_NYC0 -
"No Index" Extensions
Hi there, We run an e-commerce website and we are aware of our duplicate page content/title problems. We know about the "rel canonical" tag and the "no index" tag but I am more interested in the latter. We use a CMS called Magento. Now, Magento has an extension that allows you to use the "no follow" and "no index" tag on products. Google has indexed many of our pages and I wanted to know if applying the "no index" tag on duplicate pages will instruct Google to remove the duplicate url's it has already indexed. I know the tag will tell Google not to index a page but what if I apply it to a product already indexed?
Intermediate & Advanced SEO | | iBags0 -
When you add 10.000 pages that have no real intention to rank in the SERP, should you: "follow,noindex" or disallow the whole directory through robots? What is your opinion?
I just want a second opinion 🙂 The customer don't want to loose any internal linkvalue by vaporizing link value though a big amount of internal links. What would you do?
Intermediate & Advanced SEO | | Zanox0 -
Pagination: rel="next" rel="prev" in ?
With Google releasing that instructional on proper pagination I finally hunkered down and put in a site change request. I wanted the rel="next" and rel="prev" implemented… and it took two weeks for the guy to get it done. Brutal and painful. When I looked at the source it turned out he put it in the body above the pagination links… which is not what I wanted. I wanted them in the . Before I respond to get it properly implemented I want a few opinions - is it okay to have the rel="next" in the body? Or is it pretty much mandatory to put it in the head? (Normally, if I had full control over this site, I would just do it myself in 2 minutes… unfortunately I don't have that luxury with this site)
Intermediate & Advanced SEO | | BeTheBoss1