How do we ensure our new dynamic site gets indexed?
-
Just wondering if you can point me in the right direction. We're building a 'dynamically generated' website, so basically, pages don’t technically exist until the visitor types in the URL (or clicks an on page link), the pages are then created on the fly for the visitor.
The major concern I’ve got is that Google won’t be able to index the site, as the pages don't exist until they're 'visited', and to top it off, they're rendered in JSPX, which makes things tricky to ensure the bots can view the content
We’re going to build/submit a sitemap.xml to signpost the site for Googlebot but are there any other options/resources/best practices Mozzers could recommend for ensuring our new dynamic website gets indexed?
-
Hi Ryan,
Mirroring what Alan said, if the links are html text links - and they should be - then you will reduce your crawling problem with Google.
If you must use javascript links, make sure to duplicate them using
<noscript>tags so that Google will follow them.</p> <p><a href="http://support.google.com/webmasters/bin/answer.py?hl=en&answer=66355">http://support.google.com/webmasters/bin/answer.py?hl=en&answer=66355</a></p> <p>But be careful, Google doesn't treat <noscript> links like regular html links. At best, it's a poor alternative.</p> <p>Google derives so many signals from HTML links (anchor text, page rank, context, etc) that it's almost essential for a search engine friendly site to include them.</p> <p>The Beginners Guide to SEO has a relevant chapter on the basics of Search Engine Friendly Design and Development:</p> <p><a href="http://www.seomoz.org/beginners-guide-to-seo/basics-of-search-engine-friendly-design-and-development">http://www.seomoz.org/beginners-guide-to-seo/basics-of-search-engine-friendly-design-and-development</a></p> <p>Best of luck!</p></noscript>
-
Definitely want to get it right before launch. It's not going anywhere until it is absolutely ready!
-
The project this reminds me of took six months to complete and the 301's alone were a full time job.
Get it right the first time... you do not want to restructure like this on a large dynamic site.
I must say the project worked out but I got all my grey hair the day we threw the switch...
-
When I say its costly to rewrite 200,000+ URLS I mean it. Correcting mistakes here can cost big dollars.
In this case it wascostly to the tune of $60,000+ in costs and loss, however the bottle of bubbly at the end of the six month project was tasty.
Point being is to do it right the first time.
As I said before your best bet is documentation. Large dynamic sites generate large dynamic problems very quickly if not watched closely.
-
Thank you Khem, very helpful replies.
-
One more thing, I missed. Internal linking, make sure each of the page is linked with some text link. But avoid over linking. don't try to link all the pages from home page. Generally we links all the categories, pages from footer or site-wide links
-
Okay, lets do it step by step.
First, if it's a product website, create a separate feed for products and submit the sitemap with Google.
if not, that may you would have separate news/articles/videos sections, create separate xml sitemap for each section and submit with Google
If not, make sure to have only search engine friendly URLs, who says rewriting 200,000+ pages is costly, compare this cost with the business you'll loose when all your products would be listed in Google. So, make sure to rewrite all the dynamic URLs, if you feel that Google might face problem in crawling your website's URLs
Second, study webmaster tool's data very carefully for warnings, errors, so that you can figure out the issues which Google might have been facing while visits your websites.
Avoid duplicate entries of products, generally we don't pay attention to these things, and show same products on different pages in different categories. Google will filter all those duplicate pages, and can even penalize your website because of the duplicate content issue.
Third, keep promoting, but avoid grey/black hat techniques, there is no shortcut to the success. you'll have to spend time and money.
-
It's definitely something we're taking a very close look at. Another thing not mentioned is the use of canonical tags to head off duplicate content issues, which I'll be ensuring is implemented.
My next mugshot might have significantly grayer hair after this is all done...
-
Thanks very much for the replies.
I'll ensure proper cross linking from navigation, on pages themselves and submit a full XML sitemap, along with the social media options suggested. My other concern is that the content itself won't be visible to Googlebot due to the site being largely javascript driven, but that's something I'm working with the developers to resolve.
-
As you can tell from the response above indexation is not what you should be worried about.
Dynamic content is not fool proof. The mistakes are costly and you never want to be involved rewriting 200,000+ pages of dynamic rats nest.
Sorting abilities can cause dynamic urls and duplicate content.
Structure changes or practice changes can cause crawl errors. I looked at a report for a client early today that had 3000+ errors today compared to 20 last week. This was all due to a request made by the owner to the developer.
When enough attention is not paid to this stuff it causes real issues.
The best advice I can offer is to make sure you have a best practices document that must be followed by all developers.
-
Make sure every page you would like to be crawled is linked to in any matter. You can create natural links to them, e.g. from your navigation or in text links, or you can put them in a sitemap.
You can also link to these pages from websites like facebook, twitter to have fast crawling.
Tell Google in your robots.txt that it can access your website and make sure non of the pages you would like to be indexed carry the noindex-value in the robots meta-tag.
Good luck!
-
any link, but i should correct what i said, they will be crawled, not necessary indexwed
-
Thanks for the reply Alan, do you mean links from the sitemap?
-
If you have links to the pages they will be indexed, dynamic of static it does not matter
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Indexing Issue of Dynamic Pages
Hi All, I have a query for which i am struggling to find out the answer. I unable to retrieve URL using "site:" query on Google SERP. However, when i enter the direct URL or with "info:" query then a snippet appears. I am not able to understand why google is not showing URL with "site:" query. Whether the page is indexed or not? Or it's soon going to be deindexed. Secondly, I would like to mention that this is a dynamic URL. The index file which we are using to generate this URL is not available to Google Bot. For instance, There are two different URL's. http://www.abc.com/browse/ --- It's a parent page.
Technical SEO | | SameerBhatia
http://www.abc.com/browse/?q=123 --- This is the URL, generated at run time using browse index file. Google unable to crawl index file of browse page as it is unable to run independently until some value will get passed in the parameter and is not indexed by Google. Earlier the dynamic URL's were indexed and was showing up in Google for "site:" query but now it is not showing up. Can anyone help me what is happening here? Please advise. Thanks0 -
Inner pages of a directory site wont index
I have a business directory site thats been around a long time but has always been split into two parts, a subdomain and the main domain. The subdomain has been used for listings for years but just recently Ive opened up the main domain and started adding listings there. The problem is that none of the listing pages seem to be betting indexed in Google. The main domain is indexed as is the category page and all its pages below that eg /category/travel but the actual business listing pages below that will not index. I can however get them to index if I request Google to crawl them in search console. A few other things: I have nothing blocked in the robots.txt file The site has a DA over 50 and a decent amount of backlinks There is a sitemap setup also any ideas?
Technical SEO | | linklander0 -
Is site: a reliable method for getting full list of indexed pages?
The site:domain.com search seems to show less pages than it used to (Google and Bing). It doesn't relate to a specific site but all sites. For example, I will get "page 1 of about 3,000 results" but by the time I've paged through the results it will end and change to "page 24 of 201 results". In that example If I look in GSC it shows 1,932 indexed. Should I now accept the "pages" listed in site: is an unreliable metric?
Technical SEO | | bjalc20112 -
New site luanched- rankings plummeted
Hi all, we just launched a new version of our site and have seen huge drops in traffic to the homepage. Url structures were changed and now include /us at the end and a lot of the content was consolidated to make the homepage a little cleaner. We kept title tags and meta data the same. Rankings have declined significantly. Does anyone have any suggestions or advice?
Technical SEO | | perfectsearch711 -
New Site maintaining rank on old URL's
Hi I have a new website going live which has a different page names etc i.e. the old site had pages that are ranking called aboutus.html and the new site is called about.php What is the best approach to maintain the rank and also on orphaned pages Many Thanks
Technical SEO | | ocelot0 -
Indexation question
Hi Guys, i have a small problem with our development website. Our development website is website.dev.website.nl This page shouldn't be indexed bij Google but unfortunately it is. What can i do to deindex it and ask google not to index this website. In the robots.txt or are there better ways to do this? Kind regards Ruud
Technical SEO | | RuudHeijnen0 -
301 an old site to a newer site...
Hi First, to be upfront - these are not my websites, I'm asking because they are trying to compete in my niche. Here's the details, then the questions... There is a website that is a few months old with about 200 indexed pages and about 20 links, call this newsite.com There is a website that is a few years old with over 10,000 indexed pages and over 20,000 links, call this oldsite.com newsite.com acquired oldsite.com and set a 301 redirect so every page of oldsite.com is re-directed to the front page of newsite.com newsite.com & oldsite.com are on the same topic, the 301 occurred in the past week. Now, oldsite.com is out of the SERPs and newsite.com is pretty much ranking in the same spot (top 10) for the main term. Here are my questions; 1. The 10,000 pages on oldsite.com had plenty of internal links - they no longer exists, so I imagine when the dust settles - it will be like oldsite.com is a one page site that re-diretcts to newsite.com ... How long will a ranking boost last for? 2. With the re-direct setup to completely forget about the structure and content of oldsite.com, it's clear to me that it was setup to pass the 'Link Juice' from oldsite.com to newsite.com ... Do the major SE's see this as a form of SPAM (manipulating the rankings), or do they see it as a good way to combine two or more websites? 3. Does this work? Is everybody doing it? Should I be doing it? ... or are there better ways for me to combat this type of competition (eg we could make a lot of great content for the money spent buying oldsite.com - but we certainly wouldn't get such an immediate increase to traffic)?
Technical SEO | | RR5000