How was cdn.seomoz.org configured?

mcglynn

The SEOmoz CDN appears to have a "pull zone" that is set to the root of the domain, such that any static file can be addressed from either subdomain:

http://www.seomoz.org/q/moz_nav_assets/images/logo.png

http://cdn.seomoz.org/q/moz_nav_assets/images/logo.png

The risk of this configuration is that web pages (not just images/CSS/JS) also get cached and served by the CDN. I won't put the URL here for fear of Google indexing it, but if you replace the 'www' in the URL below with 'cdn', you'll see a cached copy of the original:

http://www.seomoz.org/ugc/the-greatest-attribution-ever-graphed

The worst-case scenario is that the homepage gets indexed. But this doesn't happen here:

http://cdn.seomoz.org/

That URL issues a 301 redirect back to the canonical www subdomain. As it should.

Here's my question: how was that done?

I ask because maxcdn.com can't do it. If you set a "pull zone" to your entire domain, they'll cache your homepage and everything else. Googlebot has a field day with that; it will reindex your entire site off the CDN.

Maybe the SEOmoz CDN provider (CloudFront) allows specific URLs to be blocked? Or do you detect the CloudFront IPs and serve them a 301 (which they'd proxy out to anyone requesting cdn.seomoz.org)?

One solution is to create a pull zone that points to a folder, like example.com/images... but this doesn't help a complex site that has cacheable content in multiple places (do you WordPress users really store ALL your static content under /wp-content/?).

Or, as suggested above, dynamically detect requests from the CDN's proxy servers, and give them a 301 for any HTML-page request. This gets complex quickly, and is both prone to breakage and very difficult to regression-test.
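To make that second option concrete, here is a minimal sketch of the decision logic on the origin server. It is not how SEOmoz actually did it (I don't know that). It assumes the CDN's fetchers can be recognized by a User-Agent string containing "cloudfront", and that static assets can be recognized by file extension; the function names, the extension list, and the marker string are all made up for illustration.

```python
from posixpath import splitext
from typing import Optional, Tuple

# Canonical origin taken from the URLs in the question.
CANONICAL_ORIGIN = "http://www.seomoz.org"

# Hypothetical list of extensions the CDN is allowed to cache.
STATIC_EXTENSIONS = {".png", ".jpg", ".jpeg", ".gif", ".css", ".js", ".ico", ".svg"}

# Assumption: the CDN's fetchers identify themselves in the User-Agent header.
CDN_USER_AGENT_MARKER = "cloudfront"


def is_cdn_request(headers: dict) -> bool:
    """Guess whether the request came from the CDN's proxy servers."""
    user_agent = headers.get("User-Agent", "")
    return CDN_USER_AGENT_MARKER in user_agent.lower()


def cdn_redirect(path: str, headers: dict) -> Optional[Tuple[int, str]]:
    """Return (301, location) if the CDN asked for a non-static URL, else None."""
    if not is_cdn_request(headers):
        return None  # ordinary visitor: serve the page as usual
    bare_path = path.split("?", 1)[0]  # ignore any query string
    extension = splitext(bare_path)[1].lower()
    if extension in STATIC_EXTENSIONS:
        return None  # static asset: let the CDN cache it
    # HTML page (or anything else): send the CDN back to the www host,
    # so it proxies the redirect out to whoever requested the cdn URL.
    return 301, CANONICAL_ORIGIN + path
```

With this in place, a CDN fetch for "/" would get a 301 to http://www.seomoz.org/, while a fetch for /q/moz_nav_assets/images/logo.png would be served normally. The fragile part is exactly what the post says: the detection rule and the extension list both have to stay in sync with the CDN and the site.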

Properly retrofitting a complex site to use a CDN, without creating a half-dozen new CDN subdomains, does not appear to be easy.

DiamondJewelryEmpire

It's an SEOmoz secret...
