Once considered only for major high-traffic sites, Content Distribution Networks (CDNs) have been gaining traction since early 2011, as several hosting services have begun bundling CDN services with their hosting packages for smaller clients. Until recently, CDNs were out of reach for most small operations: the base level of service needed to get a CDN provider to notice you ran to hundreds of dollars per month, far more than most small to mid-sized organizations could afford. It’s great that smaller businesses now have access to CDNs, but a CDN is one more piece in the already complex puzzle that is web hosting. Some clarification is in order as to exactly what a CDN is, why you’d want to use one, how it works, and, finally, whether or not you should implement one.
What is a Content Distribution Network (CDN)?
Let’s start with the “What.” A Content Distribution Network is an extra layer in the delivery process for getting your website from your server to your users. A CDN can deliver the graphics, videos and other files to your website’s visitors while your website’s normal server handles the rest of the website content. The CDN stores a cached version (basically a copy) of the media files for your website. You may already be familiar with caching in your browser (if you are, congratulations! You know more than probably 90% of the people on the internet!), and this is similar in many ways. Your browser stores files that it thinks it’s going to use multiple times so that you don’t have to download them from the internet every single time you access a web page. Certain graphics, script files, and other resources that go into building a web page are kept on your machine, and when a website says that they’re needed again (for a new page, for example), your browser says “hey, I’ve already got this” and plugs in the stored version. A CDN does the same thing, but on the server’s side rather than the user’s. If the server thinks that a particular file is going to be used a lot, it can copy that file to the CDN server and direct traffic there instead.
Why use a CDN?
Using a CDN has a few key benefits (the “Why”):
- CDN servers are usually geographically distributed. That means that if my web server is in Texas and my user is in New York, the file doesn’t have to come all the way from Texas; it can come from a closer server in, say, Maryland or New Jersey. This shaves a few milliseconds off the time it takes to load each file, and across the many files that make up a page, that adds up. Half a second may not seem like a lot, but site-speed statistics suggest that shaving even half a second off your page’s load time can noticeably increase usage.
- It reduces the load on your web server. CDNs are built on extremely robust infrastructure – they’re designed to be extremely good at serving static files. By contrast, your web server probably also needs to be good at assembling dynamic web pages from content in a database, and possibly running the actual database itself. Having the CDN store and serve static files (i.e. files that aren’t built “on the fly” from changing data) means that the web server can concentrate on doing its work of building the web pages.
- Likewise, CDNs are built to be fast. Because CDN providers operate at a much larger scale than a typical hosting account, your site gets the benefit of their faster servers and faster connections.
How does a CDN work?
The “How” part is pretty technical, but the simplified version goes something like this: when a user visits a website, the various pieces of the site (images, scripts, stylesheets, static pages, etc.) get sent by the site’s server to the user’s browser. At the same time, a copy of each piece is stored on the CDN server. Future requests for that same resource are then handled by the CDN server for a while. This could be 10 minutes, an hour, 12 hours, whatever, depending on how long you want to keep the files stored. For a low-traffic site, using a CDN won’t make much difference: if you’re only getting one visitor an hour, there is very little repeat traffic for the stored copies to absorb, so it’s not going to help much. For a higher-traffic site, though, it can mean the difference between the site surviving the traffic and crashing. Anyone who has experienced the Digg effect will be familiar with this situation: some piece of your content “goes viral” or is posted on a major social sharing network. Your hosting setup is not configured to handle the traffic. Your site goes down.
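To make the “stored for a while” idea concrete, here is a minimal Python sketch of a cache that keeps each file for a fixed time and only goes back to the website’s own server when its copy is missing or has expired. This is a toy model, not how any particular CDN is implemented. It assumes the common “pull” arrangement, where the CDN fetches a file from your server the first time someone asks for it. The names `EdgeCache`, `fetch_from_origin`, `ttl_seconds` and `origin_hits` are made up for this illustration.

```python
import time
from typing import Callable, Dict, Tuple


class EdgeCache:
    """Toy stand-in for a CDN server: keeps copies of files for a fixed time."""

    def __init__(
        self,
        fetch_from_origin: Callable[[str], bytes],  # hypothetical: asks your web server for a file
        ttl_seconds: float,                         # how long a stored copy stays valid
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch_from_origin
        self._ttl = ttl_seconds
        self._clock = clock
        self._store: Dict[str, Tuple[bytes, float]] = {}  # path -> (file body, expiry time)
        self.origin_hits = 0  # how many times the web server had to do the work

    def get(self, path: str) -> bytes:
        now = self._clock()
        cached = self._store.get(path)
        if cached is not None and cached[1] > now:
            # Copy is still fresh: serve it without touching the web server.
            return cached[0]
        # Copy is missing or expired: fetch it again and store it for next time.
        body = self._fetch(path)
        self.origin_hits += 1
        self._store[path] = (body, now + self._ttl)
        return body
```

With `ttl_seconds=600`, a thousand requests for the same logo within ten minutes would reach the web server only once. With one request an hour, every request would reach it, which is why a very low-traffic site sees little benefit.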
Server-side caching is one way to handle this issue, but it has limitations. If you’re technologically faint of heart, feel free to skip this paragraph, as it will read like Greek to you. But if you’re a website owner and you’ve earned some chops the hard way (by having to learn), or you’re feeling like stretching your knowledge a bit, this may make some sense to you. Server-side caching allows a server to take a snapshot of a dynamically created page (or portion of a page) and store it as a flat file. This means that instead of having to hit the database for each page load (which is processor intensive), it just serves up a flat page (processor light). This is good, but you still run into hard limits on how much bandwidth your server can use within its own network. A media-heavy site will only be able to stream videos to so many users at a time. A CDN takes it one step further by distributing the responsibility for serving the media across a network.
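For the curious, server-side caching as described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the way WordPress, Drupal or any specific caching plugin does it. `PageCache`, `render` and `invalidate` are hypothetical names, and `render` stands in for the expensive “build the page from the database” step.

```python
from typing import Callable, Dict


class PageCache:
    """Keeps a snapshot of each built page so the database is hit only once per page."""

    def __init__(self, render: Callable[[str], str]) -> None:
        self._render = render            # hypothetical: the slow, database-backed page builder
        self._pages: Dict[str, str] = {} # url -> stored snapshot of the finished page
        self.renders = 0                 # how many times a page was actually built

    def get(self, url: str) -> str:
        if url not in self._pages:
            # First request for this page: build it once and keep the result.
            self._pages[url] = self._render(url)
            self.renders += 1
        # Every later request is served from the stored snapshot.
        return self._pages[url]

    def invalidate(self, url: str) -> None:
        # Throw the snapshot away when the content changes (e.g. a new comment).
        self._pages.pop(url, None)
```

Note that the snapshot still has to be sent out over your server’s own network connection, which is the limitation a CDN addresses.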
Should we add a CDN to our hosting setup?
The short version is that a CDN will typically make your site load faster for users, though how much faster depends on a lot of variables. As with most things web related, there are some catches, so this is where I will discuss whether a specific site SHOULD use a CDN. For starters, it costs money to get set up: both for the CDN itself and for the time a developer spends modifying your site.
If your site is a small piece of brochureware, or has very low traffic (a few dozen visitors a day) and is not media intensive (i.e. it doesn’t have lots of big graphics, music files or videos), a CDN probably will not do you much good. You’re probably paying less than $10/month for hosting, and the cost of setting your site up for a CDN alone would be several times more than your annual hosting bill.
If your site uses a complex CMS (like WordPress or Drupal), does not require instantaneous updates (instant comments, message boards, etc.), and is hosted on a server that isn’t quite powerful enough for it, a CDN can significantly improve performance, but probably not much more than server-side caching would unless you have a lot of traffic (hundreds or thousands of visitors a day).
If your site serves a lot of large media and/or has a lot of traffic, it probably is a good candidate for hooking into a CDN.
If your site deals primarily with content that changes quickly (e.g. an active blog where people comment a lot, or a message board with a lot of traffic), special care should be taken in setting up a CDN. The reason for the extra care is that pages cached in a CDN will not reflect updates made since they were last cached. So if your CDN cache is set to expire every hour, it can be up to an hour before people see your updates. This isn’t a problem for all of the content on your pages: graphics, script files and style sheets can still safely be cached to a CDN, but active pages cannot. This requires some fine tuning, but it can make a very big difference to site loading time.
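One common way to express this kind of fine tuning is through the standard HTTP `Cache-Control` header, which caches (including many CDNs, depending on how they are configured) can use to decide what to store and for how long. The Python sketch below is a rough illustration only: the function name `cache_control_for`, the list of file extensions and the one-hour default are assumptions made for this example, and your CDN’s own settings may work differently.

```python
from pathlib import PurePosixPath

# Illustrative list of file types treated as static; adjust for your own site.
STATIC_EXTENSIONS = {".css", ".js", ".png", ".jpg", ".jpeg", ".gif", ".svg", ".mp3", ".mp4"}


def cache_control_for(path: str, static_ttl_seconds: int = 3600) -> str:
    """Pick a Cache-Control value: static files may be cached, active pages may not."""
    # Ignore any query string (e.g. "?v=2") before looking at the file extension.
    suffix = PurePosixPath(path.split("?", 1)[0]).suffix.lower()
    if suffix in STATIC_EXTENSIONS:
        # Graphics, scripts and style sheets: safe to keep for a while.
        return f"public, max-age={static_ttl_seconds}"
    # Anything else is treated as an active page: ask caches not to store it.
    return "no-store"
```

For example, `cache_control_for("/css/site.css")` gives `"public, max-age=3600"`, while `cache_control_for("/blog/my-post/")` gives `"no-store"`, so comments on the post show up right away.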
Of the hosts that we work with regularly, Rackspace, Media Temple and Dreamhost have all started offering integrated CDNs, and I’m sure others will follow soon too. Each offering is a little different, and all of them have pros and cons. There’s also nothing to say that they’re the best option for your site: it’s possible to integrate with CDNs other than the one your hosting provider offers. Every site has its own idiosyncrasies, so it’s impossible to make a recommendation without a huge list of qualifiers that are beyond the scope of this article. Contact your web developer (or us if you don’t have one!) to talk about whether integrating with a CDN would make sense for your site.
Note: this article glosses over a lot of deeply technical stuff. It’s not meant to be an exhaustive review of the details of CDNs, merely a primer on the concept for less technical site stakeholders.