Why you probably shouldn't use cookies to store session data

Tuesday August 12, 2008

Browser cookies have a very specific purpose: they are meant to store tiny bits of data and to pass that data to the web server on every request. Because they must be uploaded to the server along with every single request the browser makes, browsers limit cookies to a maximum size of 4 kilobytes per domain.

Given this storage limitation and the fact that even a 4KB cookie is absolutely guaranteed to increase client/server latency, you’d think web developers would avoid packing cookies full of data, right? Sadly, web developers are a silly bunch and are prone to sheeplike behavior.

Standard practice up until about 2007 was to store session data on the server, typically in a file, a database, or a distributed memory cache. Cookies were used only to store the session keys — short strings that uniquely identified a user and allowed the server to retrieve session data from the session store as necessary. This had little effect on client/server latency since the cookies were small, and it was reasonably secure since users couldn’t tamper with the actual contents of their session data; the best a malicious user could hope to do was to sniff out another user’s session id and impersonate them, which could be easily prevented using SSL.

Then, in 2007, Ruby on Rails added support for using cookies themselves as session stores. Rails wasn’t the first web framework to try to shove a bunch of crap into cookies, but it was possibly the most popular one to do so. Naturally this feature was enabled by default, which meant that web developers around the world instantly began to abuse the hell out of it.

The justification for the feature — and for all the added complexity involved in verifying that users haven’t tampered with the data inside their cookies — was that “cookie-based sessions are dramatically faster than the alternatives”¹. This is undoubtedly true if you invite your users to come to your data center and plug their computers into the local network before using your website. But if, as I suspect is often the case, your users are using your website via DSL, Cable, or — God forbid — dialup, this is almost certainly a naughty naughty lie.

Performance problems

According to research carried out by the Yahoo! Exceptional Performance team, a 1000-byte cookie adds 16ms to the response time of a single request made over a broadband DSL connection². Now imagine a typical web page that includes one external CSS file, two external JavaScript files, and five images, all hosted on the same domain. That’s a total of nine requests, which means that little 1000-byte cookie has added 144ms to the response time of the page (and that’s for broadband users, a best-case scenario).

“Pish posh!” you say. “144ms is nothing! The snap of a finger! The blink of an eye!” Nevertheless, Amazon found that a 100ms increase in response time resulted in a 1% decrease in sales, and Google found that a 500ms increase resulted in a 20% decrease in traffic³. In my day job at Yahoo! Search I’ve seen these findings verified again and again. The tiniest change in response time can have significant effects on abandonment, user satisfaction, and overall traffic.

Security problems

When you store session data in cookies, you must be absolutely certain that users can’t tamper with the data in any way. There’s no way to keep users from altering the data in a cookie; it’s ridiculously easy. So, in order to ensure that your website doesn’t accept cookies containing altered data, you need to either encrypt the cookie values or sign them with a hash that allows you to verify their integrity. This introduces more complexity into your application and increases the number of possible attack vectors that malicious users can exploit.

Most web frameworks that support cookie-based session stores try to hide this complexity from the developer by performing the cookie validation themselves. However, in order to encrypt or sign cookies securely, you need a secret key (like a password) known only to the server; otherwise, an attacker who knows the hash inputs could easily reproduce the hash and forge a valid cookie signature.

Obviously the web framework can’t choose this secret key for you, so it’s up to the developer to do this. Many web apps that use cookie-based sessions come with pre-configured secret keys. As you might have guessed, this results in epic failure of the worst kind since web developers are lazy and often don’t bother changing the default key. Some folks on the Rails Core mailing list recently discovered this and are currently flipping out about it as if it were some kind of surprise or something.

Do the right thing

Server-based session stores are the industry standard for a very good reason: they work well and they’re relatively foolproof compared to the alternatives. They do put some extra load on the server, and sharing session data among multiple frontend servers can be a tricky problem (although there are plenty of solutions), but it’s almost always better than resorting to a cookie-based session store.

As with most techniques — even techniques with as many downsides as cookie-based session storage — there are some rare cases when it actually makes more sense than the alternative. But these cases are very rare, and cookie-based session storage certainly should not be the default in any web framework.

¹ Rails Trac – Changeset 6184

² YUI Blog – When the Cookie Crumbles

³ Make Data Useful – presentation by Greg Linden at Stanford