For large sites, best practices for pages hidden behind internal search?
-
If a website has 1M+ pages, with most of them being hidden behind an internal search, what's the best way to get pages included in an engine's index?
Does a direct clickpath to those pages need to exist from the homepage or other major hub pages on the site?
Is submitting an XML sitemap enough?
-
Hello Vlevit,
You could do several things. I recommend giving Google your product feed, which should accomplish your goals. Another possible solution would be to make those search pages noindex,follow so they don't end up getting indexed, but Google can still use them for discovery.
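To illustrate the noindex,follow idea above, here is a minimal sketch in Python that picks a robots meta value per URL. The path segments "search" and "catalogsearch" and the function names are hypothetical; substitute whatever paths your own internal search uses.

```python
from urllib.parse import urlsplit

# Hypothetical first path segments that serve internal search results.
SEARCH_SEGMENTS = {"search", "catalogsearch"}


def robots_directive(url: str) -> str:
    """Return the robots meta content for a URL.

    Internal search result pages get "noindex, follow" so they stay out of
    the index while their links can still be followed for discovery.
    """
    path = urlsplit(url).path
    first_segment = path.strip("/").split("/", 1)[0]
    if first_segment in SEARCH_SEGMENTS:
        return "noindex, follow"
    return "index, follow"


def robots_meta_tag(url: str) -> str:
    """Build the <meta> tag to place in the page's <head>."""
    return f'<meta name="robots" content="{robots_directive(url)}">'
```

A template could call robots_meta_tag(request_url) when rendering the head of each page, so product pages stay indexable while search result pages do not.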
Thanks for explaining the situation.
Below is more on submitting product feeds. It is for Google Product Search, but I would imagine the "link" field where you put the URL to your product detail page will help those pages get indexed in the standard results:
http://support.google.com/merchants/bin/answer.py?hl=en&answer=188494#USEverett
-
Everett, thanks for your reply. I understand the problems of showing internal search pages. I'm not looking to have internal search results indexed, just the pages that the results link to. We're in eCommerce.
I was under the impression that there was a clever way to have the individual product pages indexed without establishing a direct click path, but best practices recommend otherwise.
Question answered. Thanks all for your help.
-
Hello Vlevit,
If you can be more specific, we may be able to be of more help. Google doesn't want you to show internal search result pages, but if this is a different type of situation there may be an exception. Are these search result pages, product pages, category pages, or content pages? Is it an eCommerce site, a community, or a content site?
Generally speaking, 1M+ pages with no links going into them and content that is either sparse/thin or partially/fully duplicated on other similar pages (like a search for widgets and a search for green widgets showing overlapping content) is exactly the type of thing that will get you in hot water, and it could affect even the rankings of your home page.
Do you feel like your question has been answered or would you like to be more specific about your site and goals?
Cheers,
Everett
-
This is what I was assuming, but I was wondering if there was a clever way around creating direct click paths to those pages while still maintaining their importance to the site. Thanks for the info.
-
Make sure they are part of the actual structure of your website, not just part of search. That means you have to have links pointing at them. You will also want to make sure that those pages have value.
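One way to check whether pages really are part of the site structure is to walk the internal link graph from the homepage and see which pages are never reached. The sketch below assumes you already have a crawl export as a dict mapping each page to the pages it links to; the function names and sample URLs are made up for illustration.

```python
from collections import deque


def click_depths(links: dict[str, list[str]], start: str) -> dict[str, int]:
    """Breadth-first search: fewest clicks needed to reach each page from start."""
    depths = {start: 0}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


def orphan_pages(links: dict[str, list[str]], start: str, all_pages: list[str]) -> list[str]:
    """Pages that exist but cannot be reached by following links from start."""
    reachable = click_depths(links, start)
    return sorted(set(all_pages) - set(reachable))


# Hypothetical example: /p/2 exists but nothing links to it.
links = {
    "/": ["/widgets"],
    "/widgets": ["/widgets/green"],
    "/widgets/green": ["/p/1"],
}
all_pages = ["/", "/widgets", "/widgets/green", "/p/1", "/p/2"]
print(orphan_pages(links, "/", all_pages))
```

Pages that show up as orphans here are the ones reachable only through internal search, and those are the ones that need links pointing at them.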
-
Hi vlevit,
The best practice is to have a direct click path from the index page, something like: index -> category (filter) -> subcategory (filter) -> page/product. In some cases, XML sitemaps can also help with indexing.
BUT beware of overly large XML sitemaps: create more than one sitemap and group the URLs where possible.
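As a rough sketch of splitting a large URL list into several sitemaps, the Python below chunks URLs into files of at most 50,000 entries (the per-file limit in the sitemaps.org protocol) and builds a sitemap index that points at them. The file names and the build_sitemaps helper are assumptions for illustration, and this version groups only by count; grouping by category or section would work the same way with a different chunking step.

```python
from xml.sax.saxutils import escape

MAX_URLS_PER_SITEMAP = 50_000  # per-file limit in the sitemaps.org protocol
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def chunk(urls, size=MAX_URLS_PER_SITEMAP):
    """Yield consecutive slices of urls, each with at most size entries."""
    for i in range(0, len(urls), size):
        yield urls[i:i + size]


def sitemap_xml(urls):
    """Render one <urlset> sitemap document."""
    entries = "".join(f"  <url><loc>{escape(u)}</loc></url>\n" for u in urls)
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        f'<urlset xmlns="{SITEMAP_NS}">\n{entries}</urlset>\n'
    )


def sitemap_index_xml(sitemap_urls):
    """Render a <sitemapindex> document listing the individual sitemaps."""
    entries = "".join(
        f"  <sitemap><loc>{escape(u)}</loc></sitemap>\n" for u in sitemap_urls
    )
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        f'<sitemapindex xmlns="{SITEMAP_NS}">\n{entries}</sitemapindex>\n'
    )


def build_sitemaps(urls, base_url, size=MAX_URLS_PER_SITEMAP):
    """Return {file name: XML text} for all sitemaps plus the index (hypothetical naming)."""
    files = {}
    for n, part in enumerate(chunk(urls, size), start=1):
        files[f"sitemap-{n}.xml"] = sitemap_xml(part)
    locations = [f"{base_url}/{name}" for name in files]
    files["sitemap-index.xml"] = sitemap_index_xml(locations)
    return files
```

Only the index file then needs to be submitted; the engines discover the individual sitemaps from it.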
A few very good resources can be found at the following links:
http://www.seomoz.org/ugc/solving-new-content-indexation-issues-for-large-b2b-websites
http://www.seomoz.org/qa/view/29009/sitemaps-management-for-big-sites-tens-of-millions-of-pages
I hope it helps,
Istvan