Adding Orphaned Pages to the Google Index
-
Hey folks,
How do you think Google will treat adding 300K orphaned pages to a 4.5-million-page site? The URLs would resolve, but there would be no on-site navigation to those pages; Google would only know about them through sitemap.xml files.
These pages are super low competition.
The plot thickens: what we are really after is getting 150k real pages back on the site. These pages do have crawlable paths on the site, but in order to do that (for technical reasons) we need to push the other 300k orphaned pages live as well (it's an all-or-nothing deal).
a) Do you think Google will have a problem with this, or will it just decide not to index some or most of these pages since they are orphaned?
b) If these pages will just fall out of the index or not get included, and have no chance of ever accumulating PR anyway since they are not linked to, would it make sense to just noindex them?
c) Should we not submit sitemap.xml files at all, take our 150k, ignore these 300k, and hope Google ignores them as well since they are orphaned?
d) If Google is OK with this, maybe we should submit the sitemap.xml files and keep an eye on the pages. Maybe they will rank and bring us a bit of traffic, but we don't want to do that if it could cause an issue with Google.
Thanks for your opinions, and special thanks if you have any hard evidence either way.
-
It's not a strategy; it's due to technical limitations on the dev side. I agree though, thanks.
So, I put this question to a very advanced SEO guru, and he said the pages could be seen as doorways, present some risk, and advised against it. That, combined with the probability that they will get dropped from Google's index anyway and the fact that Google says it wants pages to be part of the site's architecture, has me leaning towards noindexing all of them and maybe experimenting with allowing 1,000 to get indexed to see what happens with them.
Thanks for your input folks
-
I'd go back to the drawing board and rework your strategy.
Do you need additional sites? 150K orphaned pages that you want indexed sounds spammy, or like poor site architecture, to me.
-
Yikes, I didn't know the site was that big. Still, if you're afraid of how Google would "react" to those orphaned pages, I'd still test small, regardless of how large your overall site is.
-
Yeah, 1,000 is probably a big enough sample.
10,000 seems like a lot, I guess, but not when you've got a site with 4.5 million pages.
-
Yeah, submitting sitemap.xml files for 300k pages that are not part of the site seems a bit obnoxious.
-
We definitely want the 150k in the index, since they are legitimate pages and are linked to on the site. It's the 300k orphaned ones we have to take along as a package deal that I am worried about. That seems like too many orphaned pages for Google.
-
That's a good idea. 10,000 is still a lot. You could even test fewer than 10,000 pages. Why not try 1,000?
-
Hmmm. I am leaning towards the following solution, since I would rather be on the cautious side. Does this make sense?
a) We noindex these 300k orphaned pages and do not submit sitemap.xml files for them.
b) We experiment with, say, 10,000 pages: we allow only those to get indexed and submit sitemap.xml files for them.
c) We closely monitor their indexing and ranking performance so we can determine whether these pages are even worth opening up to Google and taking any risk.
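For anyone wanting to try steps a) and b), here is a rough Python sketch of how the split could be scripted. It is only an illustration under stated assumptions: the function names (`split_sample`, `build_sitemap`, `robots_header`) and the example.com URLs are hypothetical, and it assumes the noindex is served via the `X-Robots-Tag` HTTP header rather than a meta tag. How the header actually gets attached depends on your server stack.

```python
import random
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def split_sample(urls, sample_size, seed=0):
    """Pick a random test sample to leave indexable; everything else gets noindex.

    Hypothetical helper: returns (indexable, noindexed), both in original order.
    """
    rng = random.Random(seed)  # fixed seed so the same sample is chosen on every run
    chosen = set(rng.sample(urls, min(sample_size, len(urls))))
    indexable = [u for u in urls if u in chosen]
    noindexed = [u for u in urls if u not in chosen]
    return indexable, noindexed


def build_sitemap(urls):
    """Build a minimal sitemap.xml string containing only <loc> entries.

    The sitemap protocol caps a single file at 50,000 URLs, so a
    10,000-page test sample fits in one file.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for u in urls:
        url_el = ET.SubElement(urlset, "url")
        ET.SubElement(url_el, "loc").text = u
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


def robots_header(url, indexable_set):
    """Return the extra response headers for a page (assumed header-based noindex)."""
    if url in indexable_set:
        return {}
    return {"X-Robots-Tag": "noindex"}
```

Usage would look something like `indexable, noindexed = split_sample(orphan_urls, 10_000)`, then writing `build_sitemap(indexable)` to a file and submitting only that file. Keeping the seed fixed matters for step c): the monitored sample should stay the same between deployments, otherwise the indexing numbers are not comparable.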
-
In my opinion, add the 150k pages to the sitemap along with the 300k pages and let Google index all of them. Once they are all indexed, you can make a call on de-indexing the 150k pages based on their traction.
-
I have no hard evidence, but if it were my site, I would do option C but keep an eye on what happens, and if I noticed anything strange happening, I would implement option B. But if option C makes you nervous, I see no reason you couldn't or shouldn't noindex them right off the bat.
That's merely one person's opinion, however.