Happenings

Decompression Bombs Part 2

The piece on decompression bombs was not supposed to be a panic piece, as seems to have been implied; rather, it was an informative one about a hidden danger in handling compressed files and, in my view, a neat little trick.

To respond to some questions about it: someone asked how you would create the compressed file in the first place since compressing that much data would have to be done in memory, causing you the same problem. Very astute! The answer is that you don’t: the uncompressed data never exists. You need to know enough about your compression algorithm to construct the compressed file directly, writing the output without any real input. This is not that tricky for most formats. The fact of the matter is that you can download decompression bombs quite freely.
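To make the memory point concrete, here is a minimal sketch in Python. It is not the direct construction described above (that needs format-specific knowledge); it swaps in plain streaming compression, which already shows that the full uncompressed data never has to sit in memory. The function name `write_repetitive_stream` and the sizes used are my own assumptions for illustration.

```python
import os
import tempfile
import zlib


def write_repetitive_stream(path, total_bytes, chunk_size=1 << 20):
    """Write a gzip file holding `total_bytes` zero bytes.

    Only one chunk of uncompressed data exists in memory at any time.
    Returns the number of compressed bytes written.
    """
    chunk = bytes(chunk_size)  # one reusable block of zeros
    # wbits=31 asks zlib for a gzip container rather than a raw zlib stream.
    compressor = zlib.compressobj(9, zlib.DEFLATED, 31)
    written = 0
    remaining = total_bytes
    with open(path, "wb") as out:
        while remaining > 0:
            n = min(chunk_size, remaining)
            written += out.write(compressor.compress(chunk[:n]))
            remaining -= n
        written += out.write(compressor.flush())
    return written


if __name__ == "__main__":
    target = os.path.join(tempfile.mkdtemp(), "zeros.gz")
    total = 64 * 1024 * 1024  # a deliberately modest 64 MiB
    size = write_repetitive_stream(target, total)
    print(f"{total} bytes in, {size} bytes out")
```

Peak memory here is roughly one chunk plus the compressor's internal state, whatever total size you ask for.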

In response to a comment by David Russell, and to illustrate some points more clearly, I’m going to discuss some more concrete examples. First of all, David raises the issue that this just means “you restart”. Naively, yes, it does. A restart might not mean much to you, the home user, but it does to quite a few companies. Servers going down means a tangible loss in revenue and respect in the marketplace.

Moreover, denial of service attacks are used to blackmail companies. Mostly these come in the form of botnet attacks, but there’s no reason why a weak company couldn’t be taken down by decompression attacks. Frankly, without adequate defences, a decompression attack is far more effective than malicious pings.

Here are a few scenarios for you:

  1. You are an admin for a company with a reasonable IT infrastructure. Being sensible, your mail server scans all incoming attachments for problems. You’ve also put the servers into a replication configuration so that if one server goes down then your mail queues redirect. Someone sends you a decompression bomb; your mail server goes down pretty quickly because the scan tries to handle it in memory. For every minute the servers are down, you lose X pounds. Depending on how much you’ve thought about this, X can be a large number. Your redundant servers can’t easily help because they’ll start scanning the file and falling over. You have a few solutions: you can switch off scanning (obviously dangerous). Maybe you can switch off replication, though that generally means isolating that queue and dropping the mail on the floor while you figure out what has happened, potentially costing more money.
  2. You are an image storage/processing company who let your users upload images (like, for example, Flickr). People upload whatever photos they like and let you display them. Because you want to save a considerable amount of bandwidth, you accept GZip encoded transfer of images. This saves you Y million a year (a conservative estimate for a large site). Someone sends you an image decompression bomb, which expands to some large amount on your servers. Your resources get drained while you either try storing it or sort it out. What can you do? Maybe you take the one-time hit in writing it to disk; poor strategy if someone is using this as part of an effective distributed denial of service attack on your site. Maybe you just kill the GZip upload which, as we know, will cost you at least Y million a year; not a good idea if you’re looking to get promoted any time soon.

There are solutions to these scenarios (which I would like to leave as an exercise for my readers, comments anyone?), and I’m sure there are also more tricky scenarios for the really devious (again, I’d love to hear something quite sneaky). The point is that if you accept any content which is compressed, you should be aware of how this could affect you.
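To get the ball rolling, one possible solution is to decompress in small steps and give up once the output passes a limit you choose. The sketch below is only one way of doing it, using Python's `zlib`; the names `bounded_gunzip` and `DecompressionLimitExceeded` and the default limits are assumptions of mine, not anything from a real mail scanner or upload service.

```python
import zlib


class DecompressionLimitExceeded(Exception):
    """Raised when a stream expands past the configured limit."""


def bounded_gunzip(data, max_output=10 * 1024 * 1024, chunk_size=64 * 1024):
    """Decompress gzip or zlib `data`, refusing to produce much more than `max_output` bytes.

    Output is produced at most `chunk_size` bytes at a time, so the limit can
    be overshot by no more than one chunk before the exception is raised.
    """
    # MAX_WBITS | 32 lets zlib auto-detect a gzip or zlib header.
    decompressor = zlib.decompressobj(zlib.MAX_WBITS | 32)
    output = bytearray()
    pending = data
    while True:
        piece = decompressor.decompress(pending, chunk_size)
        output += piece
        if len(output) > max_output:
            raise DecompressionLimitExceeded(
                f"stream expanded past {max_output} bytes"
            )
        if decompressor.eof:
            break
        # Input that was held back because of the chunk limit.
        pending = decompressor.unconsumed_tail
        if not piece and not pending:
            raise ValueError("compressed stream ended early")
    return bytes(output)
```

The key design choice is that the limit is enforced on what actually comes out, not on any size the sender claims, so a bomb costs you at most `max_output` plus one chunk of memory before it is rejected.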

Three Years

Solitude, as of today, has been running for three full years. Bloody hell. I’ve been going over a few random archive posts and it really is amazing how much the tone and style have changed over that time. I look back to early posts and see just how much my focus and enthusiasm have changed. I’m finding it much harder to post these days, particularly about technical issues, which is bizarre since I understand technology and code far better now than ever. Half of the stuff I think about doing doesn’t seem as worthy of a post these days: I assume most of the people who read this would already know what I’m talking about and don’t need it again, and I certainly wouldn’t be writing it for myself.

As has become tradition on this site, last year I made predictions about what to expect on Solitude and got it hugely wrong. The Solitude sister project (code named “P”) failed to appear because I did literally nothing towards it. The post rate was light in the first half of the year, but got worse in the second half rather than better. Hooray!

Predictions for this year? More code. I was far too apathetic towards coding last year. Having done so much to finish my degree, then doing it as a full time job for the rest of the year, the last thing I wanted to do was come home and code more. However, in that time, the list of projects that I wanted to work on grew hugely (there are over 50 things on the list now). I hope that I’ll have time for at least some of them. Also, I will make an attempt to work on “P”. I think it would be good for me, a very different kind of website to Solitude. Of course, given past predictions, none of that will happen.

To anyone who has ever read this site, or who continues to do so despite its shortcomings, thanks. Hope you stick around for another year.

Decompression Bombs

From random discussions I’ve had with people over the last few months, it seems to me that not too many people really know about a particularly nasty and cunning form of computer attack: the decompression bomb.

First, think about compressing data. Most of us have used an application like WinZip or some similar tool to make our files smaller, albeit less accessible temporarily. These programs take advantage of how signal to noise ratios work in most representations of data; that is, the way we try to represent information is almost always inefficient. There are usually techniques available for removing the parts that aren’t so important, and letting important (or common) information take less space.

For example, the algorithm behind most ZIP programs is DEFLATE, which combines LZ77 compression (a technique that replaces repeated substrings with short back-references to earlier occurrences) with Huffman coding (which gives common symbols shorter codes). Obviously, I’ve completely glossed over how it works, but if you want to know, the information is out there (any computing scientist worth their salt should have at least an idea of how this sort of thing works).

Using a randomly selected document of, say, English text, you would expect a maximum compression ratio of 10:1. That is, given a 100k file, you would expect a zip file of around 10k. That’s pretty good compression! The language being used and the specific text will cause the compression ratio to vary greatly, but the principles remain the same.

The thing is, if you carefully construct an example document, you can get a compression ratio much higher. How much higher? MUCH, MUCH higher. For example, if you created a PNG image containing just one colour repeated over and over then you could easily get a 1000:1 ratio. For a text document containing 1 character repeated over and over, it’s possible to shrink 100GB to about 6KB. Think about that: it is a huge difference, roughly 1.7e7:1.
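If you want to see the effect for yourself at a harmless scale, the following Python sketch compares highly repetitive data against random data using `zlib`. The helper name `compression_ratio` and the sample sizes are my own choices for illustration.

```python
import os
import zlib


def compression_ratio(data):
    """Return uncompressed size divided by compressed size (zlib, level 9)."""
    return len(data) / len(zlib.compress(data, 9))


if __name__ == "__main__":
    repetitive = b"a" * 10_000_000       # one character, repeated
    random_bytes = os.urandom(1_000_000)  # essentially incompressible
    print(f"repetitive: {compression_ratio(repetitive):.0f}:1")
    print(f"random:     {compression_ratio(random_bytes):.2f}:1")
```

One caveat: zlib's DEFLATE has a ceiling of roughly 1000:1 for a single pass, so ratios in the millions need a different format or compressed data nested inside compressed data.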

That’s all well and good as an interesting experiment, but what does it mean for an average user? Imagine I had constructed one of those zip files that had shrunk 100GB down to 6KB and I sent you that file. If you trusted me, you might try to open it. Therein lies the problem: while you can readily accept the zipped file, the chances that you have the 100GB of free memory (including virtual memory) to accommodate the decompressed file are bloody slim. When you try to open one of these files, your computer will quickly become overwhelmed and stop responding; all of the free memory having been used up, it can’t do anything else. You effectively suffer a denial of service attack.

That is what we call a decompression bomb.

There is another factor that could cause problems for people who are careful when opening files: well-meaning programs can open them anyway. If the file arrives on your system (either by explicit downloading or by, say, a mail program fetching it), it’s likely that anti-virus software installed on your system would then want to check if the file contained any viruses. To do this, it pretty much has to decompress the file in memory, leading to the same problem. Oh dear.

Most modern anti-virus software has some defences against decompression bombs, but they can still cause significant system lock-ups while figuring it out. Perhaps more evil is compressed web content, whether images or GZip encoded HTML. No modern browser has a strong defence against decompression bombs. Relatively small files (100MB decompressed) are usually handled quite well, in that the browser doesn’t crash completely, but go much bigger and most systems will run into trouble. Because browsers, sensibly, accept GZip encoding by default, any URL can hide a problem.
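As an illustration of the sort of cheap defence a scanner could apply before decompressing anything, here is a sketch that reads the sizes a ZIP archive declares for its members and rejects suspicious ones. I don't know how any particular anti-virus product does this; the function name `declared_sizes_ok` and both thresholds are assumptions made up for the example.

```python
import io
import zipfile


def declared_sizes_ok(zip_bytes, max_total=100 * 1024 * 1024, max_ratio=200):
    """Check a ZIP archive's declared sizes without extracting anything.

    Returns False if the members claim to expand past `max_total` bytes in
    total, or if any single member claims a ratio above `max_ratio`.
    """
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        total = 0
        for info in archive.infolist():
            total += info.file_size
            if total > max_total:
                return False
            if info.compress_size and info.file_size / info.compress_size > max_ratio:
                return False
    return True
```

The declared sizes come from the archive's own headers, which an attacker controls, so this is only a first filter: anything that passes should still be extracted with a hard cap on the output.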

Thankfully, problems don’t arise much in practice because there is little to be gained from this activity: if you take out someone’s system, you’ve annoyed them but you can rarely turn this to your advantage, i.e. you can’t illicitly install spyware.

If you want to see some more figures or examples (at your own risk), then the AERAsec decompression bomb page is a great start. It’s where I got a few of my figures from (so thanks for that!), and has a link to some real examples you can try.

Now, Decompression Bombs Part 2 outlines some real-life examples and answers a few questions!

Bus Avoidance

Around my first year of university, I noticed that people would go out of their way to avoid sitting next to me on the bus. Only once every other seat on the bus, including the few that face backwards that everyone hates, was filled would I be graced with human company. At first, this was great. I’m not much of a morning person so this afforded me the luxury of stretching out a bit more and getting some sleep on the way in.

After a while, though, I got curious as to why people were avoiding me. Maybe I looked deranged (no jokes, please, we’re all above that), maybe I smelled (again… grow up), maybe it was something else. What had changed since I went to university? It didn’t take long to figure out that I now had long hair when previously I did not.

A few years later, I cut my hair right down and, sure enough, I seemed to be treated like everyone else on the bus. Maybe I’m not the first person people sit next to, that honour usually goes to someone prettier and more female than myself, crazy as that may sound, but I’m usually not the last either. Hooray.

Recently, however, I’ve noticed that it’s happening again and I don’t have long hair as an excuse. In fact, it’s worse: they’ve started standing instead of sitting down. That’s quite worrying.

I figured out exactly what the problem is, though, by watching when it happens. It only happens when I’m watching an episode of Family Guy or listening to the Ricky Gervais podcast. Aha! The seemingly random laughter and, on one occasion, crying (from said laughter) of a stranger makes them not want to sit down: they think I am a lunatic, when in fact I am just well entertained. Not that there’s much difference.

Take from that what you will, maybe the same is happening to you, maybe you just want the seat next to you to be empty in the morning. Now you know how.

Derren Brown: Heist

Since Channel 4 seem incapable of advertising anything I’ve ever been waiting for them to screen (for example, the American Gothic re-runs), I shall take it upon myself to point out that the newest Derren Brown special is on in 20 minutes. Only spotted because of sheer fluke.

It’s called The Heist, and the boy wonder is planning on convincing people to rob a bank. You just know he’ll succeed, but it’s the frightening ease with which he does things that makes him so entertaining.

Go! Watch!

Update: Terrifying. Definitely worth a watch on the next repeat, or when the torrents appear in about an hour’s time.

Update 2: Might I suggest that the huge number of people arriving on this post looking for a torrent try somewhere else that might have it, like UKNova.