Can too many "noindex" pages compared to "index" pages be a problem?
-
Hello,
I have a question for you: our website virtualsheetmusic.com includes thousands of product pages, and due to Panda penalties in the past, we have no-indexed most of the product pages hoping in a sort of recovery (not yet seen though!). So, currently we have about 4,000 "index" page compared to about 80,000 "noindex" pages.
Now, we plan to add additional 100,000 new product pages from a new publisher to offer our customers more music choice, and these new pages will still be marked as "noindex, follow".
At the end of the integration process, we will end up having something like 180,000 "noindex, follow" pages compared to about 4,000 "index, follow" pages.
Here is my question: can this huge discrepancy between 180,000 "noindex" pages and 4,000 "index" pages be a problem? Can this kind of scenario have or cause any negative effect on our current natural SEs profile? or is this something that doesn't actually matter?
Any thoughts on this issue are very welcome.
Thank you!
Fabrizio
-
Julian, we sell digital sheet music and the additional 100,000 are products from Alfred music publishing company. Of course they will not be "high quality pages", but they are product pages, each one offering a piece of music. We are an e-commerce website, how can we avoid having product pages?! But of course, as Wesley said above, we can improve each product page quality content by giving more/custom information for each product, increasing user reviews, etc.
Other suggestions?
-
Thank you Wesley, yes, I think you are right. Our business is suffering really too much without traffic coming from the "noindex" pages, and after many months we still don't see recovery. I think the best approach would be probably to keep the pages in the index and differentiate them as much as we can.
Thank you!
-
Panda is probably the worst penalty to have. Very few site ever recover, even though site owner have spent a lot of time, effort and money trying to solve it. e.g. http://searchengineland.com/google-panda-two-years-later-losers-still-losing-one-real-recovery-149491
In this video, about 12.43 - matt cutts is clear, if you think its low quality 404 it, in other delete it.
May I ask why you want to keep these 180,000 pages live? And why are you planning to add another 100,000 pages? Surely they cant be high quality pages?
-
Fabrizo, as far as I know Google Panda is now part of the standard Google algorithm and it won't be a periodic event anymore. Penguin still is though.
If your product pages are duplicate content according to Google try and see if you can do something about that instead of no-indexing it. Is there no way you can update the products so they display a more prominent description? I understand that manually it's not a possibility because there are way too much products for that to be an option.
I did notice that on a lot of your product pages you have a standard text: "This item includes: PDF (digital sheet music to print), Scorch files (for online playing, transposition and printing), Videos, MIDI and Mp3 audio files (including <a title="This item includes Mp3 music accompaniment files.">Mp3 music accompaniment files</a>)*
Genre: classical
Skill Level: medium"Since this is basicly the only text on a lot of pages I think it's a big part of the problem. Maybe you can change this text so it looks different for every product?
Try tools like http://www.plagspotter.com/ to find the duplicate content and see which solution is best for your specific problem.
I hope i helped and if you need more help let me know
-
I understand what you mean and I agree with you in general, but specifically to our own website, I have no idea who put that link on that page, which is by the way a "nofollow" link. We never built links, all our incoming links are either natural and/or links from our own affiliates. I don't see much of "that stuff" on our back-link profile... am I in error?
Anyhow, yes, we are aware the situation is quite complex. Thank you again.
-
I actually looked at the competitors ranking #3 and #4 for the phrase "download sheet music" since your ranking 5th. Either way, its not a matter of too much or too little. It's how much of the link profile is authentic vs how much is made up of stuff like this....
http://www.dionneco.com/2011/02/love-is-a-parallax/
that's what I meant by fake links.
I think what you may be missing is how complex the situation really is. There's a lot more to be considered than a number in Open Site Explorer - which is actually only a portions of what's really out there.
You may also want to look at changes you can make on-site. I'm a firm believer that proper HTML, accessibility, UX and all that really matter.
-
Thank you Takeshi, I think you got the problem right. The "crawling" side of the issue is something I was thinking about too!
We are actually working on every aspect of our website to improve its content because we have suffered by Panda a lot in the past two years, so here is the strategy we begun to take since March:
1. "noindexing" most of our thin or almost-duplicate content to get it removed from the index
2. Improve our best content and differentiate it as much as we can with compelling content (this takes a long time!)
3. Consolidating similar pages with the use of canonical tags.
In order to tackle the "slower crawling" problem you have highlighted here, do you think that would be probably better for us to stop engines to crawl those pages altogether via robots.txt once they have been removed? Would that solve the crawl issue? I could do that at least with these new 100,000 new product pages we plan to add!
Thank you!
-
Wesley, that's because of being penalized by Panda several times in the past... so we are trying the "clean-up" strategy with the hope to be "de-penalized" by Panda at the next related algorithm update. Looks like we had too many "thin" or "almost duplicate" pages... that's why we removed so many pages from the index! But if we don't see improvements in the coming 1-2 months, I guess we'll put the product pages in the index because our business is suffering a big deal!
-
Colin, what do you mean with "fake links" exactly? Our link profile looks actually in better shape than our main competitors:
virtualsheetmusic.com (our site): links: 614,013 root domains: 2,233
sheetmusicplus.com (competitor): links: 5,322,596 root domains: 6,149 (worse than our profile!)
musicnotes.com (competitor): links: 6,527,429 root domains: 2,914 (much worse than our profile!)
Am I missing anything?
-
The discrepancy between noindexed/indexed pages is not in itself a problem. However having all those pages will present a challenge to Google, in terms of crawling. Even though the pages won't be indexed, Google will need to spend some of your limited crawl budget crawling all those pages.
Also, to recover from Panda it's necessary to not only noindex duplicate content, but improve your indexed content. That means things like consolidating similar pages into one page, writing unique content for your pages, and getting unique user-generated content such as reviews.
-
Why would you want to no-index your product pages? They seem like the kind of pages you want to get found on.
There shouldn't be a problem between the amount of index pages VS no-index pages except you won't get found on the no-index ones. Product pages tend to be the kind of pages that you REALLY want to get found on.
I think you should rethink your strategy to recover from the penalties.
Try to find out where exactly the penalties came from and fix the errors in that area of our website. -
Can't say I've been in that situation, but search engines seem to interpret that tag as an on/off situation. and I think you probably know that your problems aren't related to or able to be solved by robots meta tags.
You need less fake links. OSE finds well over half a million links from 3K root domains to your site. Look at your competitors - a few thousand links from a handful of domains.
It's a shame because it seems like the internet wanted to make you the authority naturally - You've got a handful of really solid links coming in. If you could shed the spam somehow you'd be doing a lot better.
So yea, stating the obvious, I know. best of luck to you and hope the site recovers!
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Does redirecting from a "bad" domain "infect" the new domain?
Hi all, So a complicated question that requires a little background. I bought unseenjapan.com to serve as a legitimate news site about a year ago. Social media and content growth has been good. Unfortunately, one thing I didn't realize when I bought this domain was that it used to be a porn site. I've managed to muck out some of the damage already - primarily, I got major vendors like Macafee and OpenDNS to remove the "porn" categorization, which has unblocked the site at most schools & locations w/ public wifi. The sticky bit, however, is Google. Google has the domain filtered under SafeSearch, which means we're losing - and will continue to lose - a ton of organic traffic. I'm trying to figure out how to deal with this, and appeal the decision. Unfortunately, Google's Reconsideration Request form currently doesn't work unless your site has an existing manual action against it (mine does not). I've also heard such requests, even if I did figure out how to make them, often just get ignored for months on end. Now, I have a back up plan. I've registered unseen-japan.com, and I could just move my domain over to the new domain if I can't get this issue resolved. It would allow me to be on a domain with a clean history while not having to change my brand. But if I do that, and I set up 301 redirects from the former domain, will it simply cause the new domain to be perceived as an "adult" domain by Google? I.e., will the former URL's bad reputation carry over to the new one? I haven't made a decision one way or the other yet, so any insights are appreciated.
Intermediate & Advanced SEO | | gaiaslastlaugh0 -
HTTP Pages Indexed as HTTPS
My site used to be entirely HTTPS. I switched months ago so that all links in the pages that the public has access to are now http only. But I see now that when I do a site:www.qjamba.com, the results include many pages with https in the beginning (including the home page!), which is not what I want. I can redirect to http but that doesn't remove https from the indexing, right? How do I solve this problem? sample of results: Qjamba: Free Local and Online Coupons, coupon codes ... **<cite class="_Rm">https://www.qjamba.com/</cite>**One and Done savings. Printable coupons and coupon codes for thousands of local and online merchants. No signups, just click and save. Chicnova online coupons and shopping - Qjamba **<cite class="_Rm">https://www.qjamba.com/online-savings/Chicnova</cite>**Online Coupons and Shopping Savings for Chicnova. Coupon codes for online discounts on Apparel & Accessories products. Singlehop online coupons and shopping - Qjamba <cite class="_Rm">https://www.qjamba.com/online-savings/singlehop</cite>Online Coupons and Shopping Savings for Singlehop. Coupon codes for online discounts on Business & Industrial, Service products. Automotix online coupons and shopping - Qjamba <cite class="_Rm">https://www.qjamba.com/online-savings/automotix</cite>Online Coupons and Shopping Savings for Automotix. Coupon codes for online discounts on Vehicles & Parts products. Online Hockey Savings: Free Local Fast | Qjamba **<cite class="_Rm">www.qjamba.com/online-shopping/hockey</cite>**Find big online savings at popular and specialty stores on Hockey, and more. Hitcase online coupons and shopping - Qjamba **<cite class="_Rm">www.qjamba.com/online-savings/hitcase</cite>**Online Coupons and Shopping Savings for Hitcase. Coupon codes for online discounts on Electronics, Cameras & Optics products. Avanquest online coupons and shopping - Qjamba <cite class="_Rm">https://www.qjamba.com/online-savings/avanquest</cite>Online Coupons and Shopping Savings for Avanquest. Coupon codes for online discounts on Software products.
Intermediate & Advanced SEO | | friendoffood0 -
Google is indexing the wrong pages
I have been having problems with Google indexing my website since mid May. I haven't made any changes to my website which is wordpress. I have a page with the title 'Peterborough Cathedral wedding', I search Google for 'wedding Peteborough Cathedral', this is not a competitive search phrase and I'd expect to find my blog post on page one. Instead, half way down page 4 I find Google has indexed www.weddingphotojournalist.co.uk/blog with the title 'wedding photojournalist | Portfolio', what google has indexed is a link to the blog post and not the blog post itself. I repeated this for several other blog posts and keywords and found similar results, most of which don't make any sense at all - A search for 'Menorca wedding photography' used to bring up one of my posts at the top of page one. Now it brings up a post titled 'La Mare wedding photography Jersey" which happens to have a link to the Menorca post at the bottom of the page. A search for 'Broadoaks country house weddng photography' brings up 'weddingphotojournalist | portfolio' which has a link to the Broadoaks post. a search for 'Blake Hall wedding photography' does exactly the same. In this case Google is linking to www.weddingphotojournalist.blog again, this is a page of recent blog posts. Could this be a problem with my sitemap? Or the Yoast SEO plugin? or a problem with my wordpress theme? Or is Google just a bit confused?
Intermediate & Advanced SEO | | weddingphotojournalist0 -
New Web Page Not Indexed
Quick question with probably a straightforward answer... We created a new page on our site 4 days ago, it was in fact a mini-site page though I don't think that makes a difference... To date, the page is not indexed and when I use 'Fetch as Google' in WT I get a 'Not Found' fetch status... I have also used the'Submit URL' in WT which seemed to work ok... We have even resorted to 'pinging' using Pinglar and Ping-O-Matic though we have done this cautiously! I know social media is probably the answer but we have been trying to hold back on that tactic as the page relates to a product that hasn't quite launched yet and we do not want to cause any issues with the vendor! That said, I think we might have to look at sharing the page socially unless anyone has any other ideas? Many thanks Andy
Intermediate & Advanced SEO | | TomKing0 -
Certain Product Pages Not Indexing
Hey All, We discovered an issue where new product pages on our site were not getting indexed because a "noindex" tag was inadvertently being added to section when those pages were created. We removed the noindex tag in late April and some of the pages that had not been previously indexed are now showing up, but others are still not getting indexed and I'd appreciate some help on why this could be. Here is an example of a page that was not in the index but is now showing after removal of noindex: http://www.cloud9living.com/san-diego/gaslamp-quarter-food-tour And here is an example of a page that is still not showing in the index: http://www.cloud9living.com/atlanta/race-a-ferrari UPDATE: The above page is now showing after I manually submitted it in WMT. I had previously submitted another page like a month ago and it was still not indexing so I thought the manual submission was a dead end. However, it just so happens that the above URL just had its Page Title and H1 updated to something more specific and less duplicative so I am currently running a test to see if that's the problem with these pages not indexing. Will update this soon. Any suggestions? Thanks!
Intermediate & Advanced SEO | | GManSEO0 -
Too many links on home page?
I run a forum that currently has 351 links on its home page (and p.1, 2 etc.) due to all the tags that appear under feed items. Is it a viable solution to get this number down to simply nofollow all of these tag links? How else can I get this number of links down recorded by Google down, or is it not something I should be especially worried about?
Intermediate & Advanced SEO | | staingurus0 -
How long till pages drop out of the index
In your experience how long does it normally take for 301-redirected pages to drop out of Google's index?
Intermediate & Advanced SEO | | bjalc20110 -
How do I index these parameter generated pages?
Hey guys, I've got an issue with a site I'm working on. A big chunk of the content (roughly 500 pages) is delivered using parameters on a dynamically generated page. For example: www.domain.com/specs/product?=example - where "example' is the product name Currently there is no way to get to these pages unless you enter the product name into the search box and access it from there. Correct me if I'm wrong, but unless we find some other way to link to these pages they're basically invisible to search engines, right? What I'm struggling with is a method to get them indexed without doing something like creating a directory map type page of all of the links on it, which I guess wouldn't be a terrible idea as long as it was done well. I've not encountered a situation like this before. Does anyone have any recommendations?
Intermediate & Advanced SEO | | CodyWheeler0