For large sites, best practices for pages hidden behind internal search?
-
If a website has 1M+ pages, with most of them being hidden behind an internal search, what's the best way to get pages included in an engine's index?
Does a direct clickpath to those pages need to exist from the homepage or other major hub pages on the site?
Is submitting an XML sitemap enough?
-
Hello Vlevit,
You could do several things. I recommend giving Google your product feed, which should accomplish your goals. Another possible solution would be to make those search pages noindex,follow so they don't end up getting indexed, but Google can still use them for discovery.
Thanks for explaining the situation.
Below is more on submitting product feeds. It is for Google Product Search, but I would imagine the "link" field where you put the URL to your product detail page will help those pages get indexed in the standard results:
http://support.google.com/merchants/bin/answer.py?hl=en&answer=188494#US
Everett
-
Everett, thanks for your reply. I understand the problems with showing internal search pages. I'm not looking to have internal search results indexed, just the pages that the results link to. We're in eCommerce.
I was under the impression that there was a clever way to have the individual product pages indexed without establishing a direct click path, but best practices recommend otherwise.
Question answered. Thanks all for your help.
-
Hello Vlevit,
If you can be more specific, we may be able to be of more help. Google doesn't want you to show internal search result pages, but if this is a different type of situation there may be an exception. Are these search result pages, product pages, category pages, content pages? Is it an eCommerce site, a community, a content site?
Generally speaking, 1M+ pages with no links pointing to them and content that is either sparse/thin or partially/fully duplicated on other similar pages (like a search for widgets and a search for green widgets showing overlapping content) is exactly the type of thing that will get you in hot water, and it could affect even the rankings of your home page.
Do you feel like your question has been answered or would you like to be more specific about your site and goals?
Cheers,
Everett
-
This is what I was assuming, but was wondering if there was a clever way around creating direct click paths to those pages, while still maintaining their importance to the site. Thanks for the info.
-
Make sure they are part of the actual structure of your website, not just part of search. That means you have to have links pointing at them. You will also want to make sure that those pages have value.
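One way to check whether pages are really part of the site structure is to treat the site as a link graph and look for pages that cannot be reached by following links from the homepage. The following is only a minimal sketch in Python. It assumes you already have a crawl export as a dictionary mapping each page to its outgoing internal links; the `links` and `all_pages` data and the function names are hypothetical, not taken from any particular tool.

```python
from collections import deque

def click_depths(links, start):
    """Breadth-first search: number of clicks needed to reach each page from start."""
    depths = {start: 0}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths

def orphan_pages(links, start, all_pages):
    """Pages known to exist (e.g. from a product database) that no link path reaches."""
    reachable = click_depths(links, start)
    return sorted(set(all_pages) - set(reachable))

# Hypothetical example data: /product/3 exists but is only reachable via search.
links = {
    "/": ["/category/widgets"],
    "/category/widgets": ["/product/1", "/product/2"],
    "/product/1": [],
}
all_pages = ["/", "/category/widgets", "/product/1", "/product/2", "/product/3"]

print(click_depths(links, "/"))
print(orphan_pages(links, "/", all_pages))
```

Pages returned by `orphan_pages` are the ones that exist only behind internal search and would need a link from a category or hub page.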
-
Hi vlevit,
Best practice is to have a direct click path from the index page, something like: index -> category (filter) -> subcategory (filter) -> page/product. In some cases XML sitemaps can also help with indexing.
BUT beware of overly large XML sitemaps: create more than one sitemap and group the URLs logically where possible.
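To illustrate splitting a large URL list into several sitemaps: the sitemaps.org protocol caps each sitemap file at 50,000 URLs, and a sitemap index file can point to the individual files. Below is a minimal Python sketch under those assumptions. The example.com URLs, file names, and function names are hypothetical, and a real setup would probably group URLs by category rather than by position in the list.

```python
from xml.sax.saxutils import escape

MAX_URLS_PER_SITEMAP = 50000  # per-file URL limit in the sitemaps.org protocol
NAMESPACE = "http://www.sitemaps.org/schemas/sitemap/0.9"

def chunk(urls, size=MAX_URLS_PER_SITEMAP):
    """Yield consecutive slices of at most `size` URLs."""
    for i in range(0, len(urls), size):
        yield urls[i:i + size]

def build_sitemap(urls):
    """Return the XML text of one sitemap file listing the given URLs."""
    entries = "".join(f"  <url><loc>{escape(u)}</loc></url>\n" for u in urls)
    return ('<?xml version="1.0" encoding="UTF-8"?>\n'
            f'<urlset xmlns="{NAMESPACE}">\n'
            f"{entries}</urlset>\n")

def build_index(sitemap_urls):
    """Return the XML text of a sitemap index pointing at each sitemap file."""
    entries = "".join(
        f"  <sitemap><loc>{escape(u)}</loc></sitemap>\n" for u in sitemap_urls
    )
    return ('<?xml version="1.0" encoding="UTF-8"?>\n'
            f'<sitemapindex xmlns="{NAMESPACE}">\n'
            f"{entries}</sitemapindex>\n")

# Hypothetical example: 120,000 product URLs split across three sitemap files.
urls = [f"https://example.com/product/{i}" for i in range(120000)]
parts = list(chunk(urls))
sitemaps = [build_sitemap(part) for part in parts]
index = build_index(
    [f"https://example.com/sitemap-{n}.xml" for n in range(1, len(parts) + 1)]
)
print(len(parts), "sitemap files")
```

Only the index file then needs to be submitted to the search engine; the individual sitemap files are discovered through it.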
A few very good resources can be found under the next links:
http://www.seomoz.org/ugc/solving-new-content-indexation-issues-for-large-b2b-websites
http://www.seomoz.org/qa/view/29009/sitemaps-management-for-big-sites-tens-of-millions-of-pages
I hope it helps,
Istvan