How do the Quoras of this world index their content?
-
I am helping a client index lots and lots of pages, more than one million pages. They can be seen as questions on Quora. In the Quora case, users are often looking for the answer on a specific question, nothing else.
On Quora there is a structure setup on the homepage to let the spiders in. But I think mostly it is done with a lot of sitemaps and internal linking in relevancy terms and nothing else... Correct? Or am I missing something?
I am going to index about a million question and answers, just like Quora. Now I have a hard time dealing with structuring these questions without just doing it for the search engines. Because nobody cares about structuring these questions. The user is interested in related questions and/or popular questions, so I want to structure them in that way too.
This way every question page will be in the sitemap, but not all questions will have links from other question pages linking to them. These questions are super longtail and the idea is that when somebody searches this exact question we can supply them with the answer (onpage will be perfectly optimised for people searching this question). Competition is super low because it is all unique user generated content.
I think best is just to put them in sitemaps and use an internal linking algorithm to make the popular and related questions rank better. I could even make sure every question has at least one other page linking to it, thoughts?
Moz, do you think when publishing one million pages with quality Q/A pages, this strategy is enough to index them and to rank for the question searches? Or do I need to design a structure around it so it will all be crawled and each question will also receive at least one link from a "category" page.
-
Wow, that is insane right?
https://www.quora.com/sitemap/questions?page_id=50
I wonder how long this carries on.
-
Quora don't seem to have a XML sitemap but a HTML one :
[https://www.quora.com/robots.txt](https://www.quora.com/robots.txt) refers to [https://www.quora.com/sitemap](https://www.quora.com/sitemap)
-
Yes there are many challenges and external linking is definitely one of them.
What do you think about sitemaps to get this longtail indexed? I think that a lot can be indexed by submitting the sitemaps.
-
There are many challenges to building a really large site. Most of them are related to building the site, but one that often kills the success of the site is the ability to get the pages into the index and keep them there. This requires a steady flow of spiders into the deepest pages of the site. If you don't have continuous and repetitive spider flow the pages will be indexed, but then forgotten, before the spiders return.
An effective way to get deep spidering is have powerful links permanently connected to many deep hub pages throughout the site. This produces a flow of spiders into the site and forces them to chew their way out, indexing pages as they go. These links must be powerful or the spiders will index a couple of pages and die. These links must be permanent because if they are removed the flow of spiders will stop and pages in the index will be forgotten.
The goal of the hub pages is to create spider webs through the site that allow spiders to index all of the pages on short link paths, rather than requiring the spiders to crawl through long tunnels of many consecutive links to get everything indexed.
Lots of people can build a big site, but only some of those people have the resources to get the powerful, permanent links that are required to get the pages indexed and keep them in the index. You can't rely on internal links alone for the powerful, permanent links because most spiders that enter any site come from external sources rather than spontaneously springing up deep in the bowels of your website.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Sitemap and content question
This is our primary sitemap https://www.samhillbands.com/sitemaps/sitemap.xml We have a about 750 location based URL's that aren't currently linked anywhere on the site. https://www.samhillbands.com/sitemaps/locations.xml Google is indexing most of the URL because we submitted the locations sitemap directly for indexing. Thoughts on that? Should we just create a page that contains all of the location links and make it live on the site? Should we remove the locations sitemap from separate indexing...because of duplicate content? #
Intermediate & Advanced SEO | | brianvestSitemap Type Processed Issues Items Submitted Indexed --- --- --- --- --- --- --- --- --- 1 /sitemaps/locations.xml Sitemap May 10, 2016 - Web 771 648 2 /sitemaps/sitemap.xml Sitemap index May 8, 2016 - Web 862 730
0 -
Question about Indexing of /?limit=all
Hi, i've got your SEO Suite Ultimate installed on my site (www.customlogocases.com). I've got a relatively new magento site (around 1 year). We have recently been doing some pr/seo for the category pages, for example /custom-ipad-cases/ But when I search on google, it seems that google has indexed the /custom-ipad-cases/?limit=all This /?limit=all page is one without any links, and only has a PA of 1. Whereas the standard /custom-ipad-cases/ without the /? query has a much higher pa of 20, and a couple of links pointing towards it. So therefore I would want this particular page to be the one that google indexes. And along the same logic, this page really should be able to achieve higher rankings than the /?limit=all page. Is my thinking here correct? Should I disallow all the /? now, even though these are the ones that are indexed, and the others currently are not. I'd be happy to take the hit while it figures it out, because the higher PA pages are what I ultimately am getting links to... Thoughts?
Intermediate & Advanced SEO | | RobAus0 -
Blog Content In different language not indexed - HELP PLEASE!
I have an ecommerce site in English and a blog that is in Malay language. We have started the blog 3 weeks ago with about 20-30 articles written. Ecommerce is using MAgento CMS and Blog is wordpress. URL Structure: Ecommerce: www.example.com Blog: www.example.com/blog Blog category: www.example.com/blog/category/ However, google is indexing all pages including blog category but not individual post that is in Malay language. What could be the issue here? PLEASE help me!
Intermediate & Advanced SEO | | WayneRooney0 -
I'm updating content that is out of date. What is the best way to handle if I want to keep old content as well?
So here is the situation. I'm working on a site that offers "Best Of" Top 10 list type content. They have a list that ranks very well but is out of date. They'd like to create a new list for 2014, but have the old list exist. Ideally the new list would replace the old list in search results. Here's what I'm thinking, but let me know if you think theres a better way to handle this: Put a "View New List" banner on the old page Make sure all internal links point to the new page Rel=canonical tag on the old list pointing to the new list Does this seem like a reasonable way to handle this?
Intermediate & Advanced SEO | | jim_shook0 -
Best practice for expandable content
We are in the middle of having new pages added to our website. On our website we will have a information section containing various details about a product, this information will be several paragraphs long. we were wanting to show the first paragraph and have a read more button to show the rest of the content that is hidden. Whats googles view on this, is this bad for seo?
Intermediate & Advanced SEO | | Alexogilvie0 -
Opinion on Duplicate Content Scenario
So there are 2 pest control companies owned by the same person - Sovereign and Southern. (The two companies serve different markets) They have two different website URLs, but the website code is actually all the same....the code is hosted in one place....it just uses an if/else structure with dynamic php which determines whether the user sees the Sovereign site or the Southern site....know what I am saying? Here are the two sites: www.sovereignpestcontrol.com and www.southernpestcontrol.com. This is a duplicate content SEO nightmare, right?
Intermediate & Advanced SEO | | MeridianGroup0 -
Duplicate content on the same page--is this an issue?
We are transitioning to responsive design and some of our pages will not scale properly, so we were thinking of adding the same content twice to the same URL (one would be simple text -- for mobile and the other would include the images, etc for the desktop version), and content would change based on size of the screen. I'm not looking for another technical solution (I know google specifies that you can dynamically serve different content based on user agent)--I am wondering if any one knows if having the same exact content appear twice on the same URL will cause a problem with SEO (any historical tests or experience would be great). Thank you in advance.
Intermediate & Advanced SEO | | nicole.healthline0 -
Diagnosing duplicate content issues
We recently made some updates to our site, one of which involved launching a bunch of new pages. Shortly afterwards we saw a significant drop in organic traffic. Some of the new pages list similar content as previously existed on our site, but in different orders. So our question is, what's the best way to diagnose whether this was the cause of our ranking drop? My current thought is to block the new directories via robots.txt for a couple days and see if traffic improves. Is this a good approach? Any other suggestions?
Intermediate & Advanced SEO | | jamesti0