For large sites, best practices for pages hidden behind internal search?
-
If a website has 1M+ pages, with most of them being hidden behind an internal search, what's the best way to get pages included in an engine's index?
Does a direct clickpath to those pages need to exist from the homepage or other major hub pages on the site?
Is submitting an XML sitemap enough?
-
Hello Vlevit,
You could do several things. I recommend giving Google your product feed, which should accomplish your goals. Another possible solution would be to make those search pages noindex,follow so they don't end up getting indexed, but Google can still use them for discovery.
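As a rough sketch of the noindex,follow idea: the robots meta tag could be chosen per URL so that internal search result pages are kept out of the index while their links can still be crawled. The `/search` path prefix and the `robots_meta` helper below are assumptions for illustration only, not something taken from the site in question.

```python
from urllib.parse import urlparse

# Hypothetical: assumes internal search result pages live under /search.
SEARCH_PATH_PREFIX = "/search"


def robots_meta(url):
    """Return a robots meta tag for the given URL.

    Search result pages get noindex,follow (not indexed, links still
    followed); every other page gets index,follow.
    """
    path = urlparse(url).path
    if path.startswith(SEARCH_PATH_PREFIX):
        content = "noindex,follow"
    else:
        content = "index,follow"
    return f'<meta name="robots" content="{content}">'


print(robots_meta("https://www.example.com/search?q=green+widgets"))
print(robots_meta("https://www.example.com/product/123"))
```

The tag would go in the `<head>` of each rendered page; how it gets there depends on the site's templating, which is not covered here.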
Thanks for explaining the situation.
Below is more on submitting product feeds. It is for Google Product Search, but I would imagine the "link" field where you put the URL to your product detail page will help those pages get indexed in the standard results:
http://support.google.com/merchants/bin/answer.py?hl=en&answer=188494#USEverett
-
Everett, thanks for your reply. I understand the problems of showing internal search pages. I'm not looking to have internal search results being indexed, just the pages that the results link to. We're in eCommerce.
I was under the impression that there was a clever way to have the individual product pages indexed without establishing a direct click path, but best practices recommend otherwise.
Question answered. Thanks all for your help.
-
Hello Vlevit,
If you can be more specific, we may be able to be of more help. Google doesn't want you to show internal search result pages, but if this is a different type of situation there may be an exception. Are these search result pages, product pages, category pages, content pages? Is it an eCommerce site, a community, a content site?
Generally speaking, 1M+ pages with no links going into them and content that is either sparse/thin or partially/fully duplicated on other similar pages (like a search for widgets and a search for green widgets showing overlapping content) is exactly the type of thing that will get you in hot water, and it could affect even the rankings of your home page.
Do you feel like your question has been answered or would you like to be more specific about your site and goals?
Cheers,
Everett
-
This is what I was assuming, but was wondering if there was a clever way around creating direct click paths to those pages, while still maintaining their importance to the site. Thanks for the info.
-
Make sure they are part of the actual structure of your website, not just part of search. Meaning, you have to have links pointing at them. You will also want to make sure that those pages have value.
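One way to check whether pages are really part of the site structure is to walk the internal link graph from the homepage and list any known pages that are never reached. The sketch below assumes the link graph is already available as a dictionary (for example, from a crawl export); the URLs and the `click_depths` and `orphan_pages` helpers are hypothetical.

```python
from collections import deque


def click_depths(links, home):
    """Breadth-first search from the homepage.

    Returns {url: number of clicks needed to reach it from home}.
    """
    depths = {home: 0}
    queue = deque([home])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


def orphan_pages(links, home, all_pages):
    """Pages that exist but cannot be reached by following links from home."""
    reachable = click_depths(links, home)
    return sorted(set(all_pages) - set(reachable))


# Hypothetical link graph: page -> pages it links to.
site_links = {
    "/": ["/widgets/"],
    "/widgets/": ["/widgets/green/", "/product/1"],
    "/widgets/green/": ["/product/2"],
}
known_pages = ["/", "/widgets/", "/widgets/green/",
               "/product/1", "/product/2", "/product/3"]

print(click_depths(site_links, "/"))
print(orphan_pages(site_links, "/", known_pages))
```

Here `/product/3` has no inbound link, so it is only discoverable through search or a sitemap, which is the situation described in the question.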
-
Hi vlevit,
Best practice is to have a direct click path from the index page, something like: index -> category (filter) -> subcategory (filter) -> page/product. In some cases, XML sitemaps can also help with indexing.
BUT beware of XML sitemaps that are too large. Create more than one sitemap and group the URLs logically where possible.
A few very good resources can be found at the following links:
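To illustrate the multiple-sitemap suggestion, here is a minimal sketch that splits a URL list into several sitemap files and builds a sitemap index pointing at them. The 50,000 URLs per file figure is the limit from the sitemaps.org protocol; the example.com URLs, file names, and helper functions are assumptions for illustration.

```python
from xml.sax.saxutils import escape

MAX_URLS_PER_SITEMAP = 50_000  # per-file limit in the sitemaps.org protocol
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def chunk(urls, size=MAX_URLS_PER_SITEMAP):
    """Yield successive slices of at most `size` URLs."""
    for start in range(0, len(urls), size):
        yield urls[start:start + size]


def build_sitemap(urls):
    """Return the XML text of one sitemap file."""
    entries = "".join(f"  <url><loc>{escape(u)}</loc></url>\n" for u in urls)
    return ('<?xml version="1.0" encoding="UTF-8"?>\n'
            f'<urlset xmlns="{SITEMAP_NS}">\n'
            f"{entries}</urlset>\n")


def build_sitemap_index(sitemap_urls):
    """Return the XML text of a sitemap index listing each sitemap file."""
    entries = "".join(
        f"  <sitemap><loc>{escape(u)}</loc></sitemap>\n" for u in sitemap_urls
    )
    return ('<?xml version="1.0" encoding="UTF-8"?>\n'
            f'<sitemapindex xmlns="{SITEMAP_NS}">\n'
            f"{entries}</sitemapindex>\n")


# Hypothetical product URLs; a real site would pull these from its database.
product_urls = [f"https://www.example.com/product/{i}" for i in range(5)]
files = [build_sitemap(part) for part in chunk(product_urls, size=2)]
index = build_sitemap_index(
    f"https://www.example.com/sitemap-products-{n}.xml"
    for n in range(1, len(files) + 1)
)
print(index)
```

Grouping by section (products, categories, content) rather than by arbitrary chunks makes it easier to see which parts of the site are being indexed, since indexation can then be compared per sitemap.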
http://www.seomoz.org/ugc/solving-new-content-indexation-issues-for-large-b2b-websites
http://www.seomoz.org/qa/view/29009/sitemaps-management-for-big-sites-tens-of-millions-of-pages
I hope it helps,
Istvan