If you do anything in the computer security or forensics world, then you probably view Facebook as a hive of scum and villainy. As a major social network, it attracts all sorts of criminal elements. Pedophiles use Facebook. Terrorists use Facebook. Drug dealers use Facebook. It’s like the only people not using Facebook are teens.
Social networks are split into two camps. On one side are the open forums. Everything is accessible and anyone can see content without needing special access. Twitter, Reddit, and most news sites fall into this category. While some content can be made private, most is public.
On the other side are the walled gardens. These are social networks where people on the outside can barely see anything inside. Facebook and Apple are the two big examples. As someone who isn’t on Facebook, I’ve never actually seen FarmVille. And I cannot see most user profiles or “wall” pages without logging in and connecting to users. It’s that “connecting” part that is a problem for law enforcement. The last thing you want to do is tip off a suspect by friending them, just to gain access to their shared information.
One Little Change
Earlier this week, Facebook made a subtle but important change to their service. Specifically, they changed their picture filenames. This, in turn, directly impacts online forensics. Since I’ve been tracking changes at Facebook for years, I’ve managed to put together a pretty good timeline.
Before July 2012, Facebook filenames used a five-number pattern: aa_bb_cc_dd_ee_n.jpg. (For example, 1234_5678_91011_12345_1234_b.jpg.) aa is the photo id, bb is the album id, cc is the profile id of the user who uploaded the picture. The dd and ee fields are random and designed to mitigate guessing a picture’s id. (The dd field may have some other purpose, but I never figured it out.) The final character, n, indicates the size for auto-scaling. Changing the final character to ‘o’ returns the original-size picture, ‘b’ is big, ‘q’ is 180px wide, etc.
Given a Facebook filename in this format, an analyst can quickly identify the URLs to the picture, album, and user’s profile.
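As a rough illustration of that five-number layout, here is a minimal Python sketch that splits a filename into its fields and swaps the size character. The regular expression, the `FiveNumberName` type, and both function names are my own assumptions for illustration, not anything Facebook publishes. The sketch deliberately stops short of building picture, album, or profile URLs, since the URL templates are not spelled out here.

```python
import re
from typing import NamedTuple, Optional

# Assumed pattern for the pre-July-2012 format: aa_bb_cc_dd_ee_n.jpg
FIVE_NUMBER = re.compile(r"^(\d+)_(\d+)_(\d+)_(\d+)_(\d+)_([a-z])\.jpg$")


class FiveNumberName(NamedTuple):
    photo_id: str    # aa
    album_id: str    # bb
    profile_id: str  # cc: the user who uploaded the picture
    dd: str          # random / unknown purpose
    ee: str          # random
    size: str        # n: auto-scaling size code ('o', 'b', 'q', ...)


def parse_five_number(filename: str) -> Optional[FiveNumberName]:
    """Return the parsed fields, or None if the name does not match."""
    match = FIVE_NUMBER.match(filename)
    return FiveNumberName(*match.groups()) if match else None


def with_size(filename: str, size: str) -> str:
    """Rebuild the filename with a different size character."""
    parsed = parse_five_number(filename)
    if parsed is None:
        raise ValueError(f"not a five-number filename: {filename!r}")
    return "_".join(parsed[:5]) + f"_{size}.jpg"


if __name__ == "__main__":
    print(parse_five_number("1234_5678_91011_12345_1234_b.jpg"))
    print(with_size("1234_5678_91011_12345_1234_b.jpg", "o"))
```

A filename in the later three-number format simply fails to match, so the parser returns None rather than guessing.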
In February 2012, Facebook started testing a new filenaming system. This system was fully deployed in July 2012. This new filename format uses a three-number pattern: aa_bb_ee_n.jpg. (E.g., 1234_567891011121_23456_b.jpg.) aa and bb are still the photo and album ids. The ee field is random and designed to prevent someone from guessing a picture’s id. The final character, n, still indicates the size for auto-scaling.
Given a Facebook filename in this three-number format, an analyst can still quickly identify the URL to the picture and wall page. If the picture’s wall page is public, then it displays the user’s account name, the image, and all comments related to the picture.
In October 2012, Facebook began to test an Akamai EdgeControl cache with cryptographic signatures. Akamai provides a last-mile content delivery system for distributing the network load. The cryptographic checksum prevents tampering with the URL. This means that real-time processing instructions in the URL, such as ‘/c92.0.403.403/’ for cropping or the size determination (e.g., ‘n’ or ‘o’), cannot be altered by the analyst. Any change will return a ‘Content not found’ message.
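I do not know how Akamai or Facebook actually compute their signatures. The Python sketch below only illustrates the general idea of a signed URL, using an HMAC as a stand-in technique. The key, the `sig` parameter name, and the `sign_path` and `serve` functions are all hypothetical.

```python
import hashlib
import hmac

# Hypothetical secret known only to the server side (an assumption).
SECRET_KEY = b"hypothetical-server-side-key"


def _signature(path: str) -> str:
    """HMAC-SHA256 over the path, including crop and size instructions."""
    return hmac.new(SECRET_KEY, path.encode("utf-8"), hashlib.sha256).hexdigest()


def sign_path(path: str) -> str:
    """Append a signature so the path cannot be edited unnoticed."""
    return f"{path}?sig={_signature(path)}"


def serve(url: str) -> str:
    """Reject any URL whose path no longer matches its signature."""
    path, _, sig = url.partition("?sig=")
    if not hmac.compare_digest(sig, _signature(path)):
        return "Content not found"
    return "OK"


if __name__ == "__main__":
    signed = sign_path("/c92.0.403.403/1234_567891011121_23456_n.jpg")
    print(serve(signed))                               # unmodified URL
    print(serve(signed.replace("_n.jpg", "_o.jpg")))   # size code altered
    print(serve(signed.replace("/c92.0.403.403", ""))) # crop removed
```

Because the crop and size instructions are part of the signed path, changing either one invalidates the signature. That matches the behavior described above, whatever the real mechanism is.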
The caching and anti-tamper system was deployed on 27-Dec-2012. However, the filenames still mapped to non-Akamai URLs for directly accessing the content at Facebook. In addition, relatively few pictures were served through the Akamai caching service.
All of this changed on 24-Nov-2014. (I may be off by a few days for the actual deployment.) That’s when Facebook changed filenames again and began to distribute pictures almost exclusively through Akamai with anti-tamper URLs. Technically, the filename still looks like the three-number format: aa_bb_ee_n.jpg. However, they changed the aa photo id number in the filename. As a result, filenames that predate 2014-11-24 can no longer be used to find the direct URL at Facebook. Pictures uploaded after 2014-11-24 may, in rare cases, be mapped to direct URLs at Facebook. But most of the time, they are only available from Akamai. Given only the Facebook filename, you can no longer find the URL to the picture hosted at Facebook. You can still find the wall page with the picture (if it is public), but not the direct URL to the picture itself.
For example, 1000526_539054152803549_1177659804_n.jpg is the filename of a picture that was uploaded to FotoForensics over a year ago. The direct URL to the picture was ‘https://scontent-a-fna.xx.fbcdn.net/hphotos-xfa1/1000526_539054152803549_1177659804_o.jpg’. Prior to 24-Nov-2014, this would return the image, but today it returns ‘Content not found’.
However, from this filename I can identify the wall page’s URL: https://www.facebook.com/photo.php?fbid=539054152803549. (It’s gross, so I’m not hyperlinking to it.) According to the wall page, the picture’s new filename is 1014640_539054152803549_1177659804_o.jpg — the first number changed from 1000526 to 1014640.
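The mapping from filename to wall page in this example can be sketched in a few lines of Python. The regular expression and the `wall_page_url` function name are my own assumptions. The URL template is taken from the example above, where the second number in the filename appears as the fbid parameter.

```python
import re

# Assumed pattern for the three-number format: aa_bb_ee_n.jpg
THREE_NUMBER = re.compile(r"^(\d+)_(\d+)_(\d+)_([a-z])\.jpg$")


def wall_page_url(filename: str) -> str:
    """Build the wall page URL from the second number in the filename."""
    match = THREE_NUMBER.match(filename)
    if match is None:
        raise ValueError(f"not a three-number filename: {filename!r}")
    second_field = match.group(2)
    return f"https://www.facebook.com/photo.php?fbid={second_field}"


if __name__ == "__main__":
    old_name = "1000526_539054152803549_1177659804_n.jpg"
    new_name = "1014640_539054152803549_1177659804_o.jpg"
    print(wall_page_url(old_name))
    print(wall_page_url(new_name))
```

Both the old and the new filename resolve to the same wall page, because only the first number changed.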
‘Good news’ is relative
The impact on forensics and investigators is significant. If you have a filename that matches the three-number format, then you can trace the filename to Facebook. But if the file was acquired before 2014-11-24, then you cannot find the direct URL at Facebook in order to confirm that the file came from there. (If they could see the picture at Facebook, I suspect that law enforcement would have an easier time getting a warrant. Without the confirmation, it should be a little harder.) Similarly, some media outlets try to validate sources. They used to be able to confirm that a picture came from Facebook by tracing the filename to a URL. But today, they cannot positively confirm it unless the wall page is public.
In addition, anyone who was hotlinking to a picture at Facebook should have noticed that the link is now broken. In effect, Facebook just raised the walls a little higher around their private garden.
If someone sees a filename from Facebook, then it can no longer be traced back to the user unless the wall page is public. And if the URL contains an anti-tampering field (my example Facebook filename above did not have this field), then nobody can uncrop the image without more knowledge about where the picture is stored at Facebook. This stops people from snooping, law enforcement from tracking images without a warrant, and external web sites from hotlinking.
And the bad news?
Privacy advocates may be very pleased with this change. However, I think all of the privacy benefits are a side effect of something much more detrimental. Since I have no insider knowledge about Facebook, I can only speculate about the cause behind this naming change. And I suspect that the cause is very anti-privacy.
Now we have Facebook, a giant company that can only collect information at Facebook, teaming up with Akamai, a giant company that can cross-collect information from a third of the Internet. It used to be that Facebook could only track you at third-party sites if those sites had a link to Facebook. I previously showed how Facebook uses links at Home Depot to track users who visit that online home improvement store. But now, sites do not even need to have a link to Facebook.
Let’s trace how this entire thing works now. You visit a web site that is not a Facebook affiliate and has no link to Facebook. But it does have a small ad that is hosted at Akamai. As your browser downloads the picture for the ad from Akamai, your browser (via the HTTP referer [sic] field) provides information about what site you are visiting. Akamai can even drop a cookie into your browser, just in case you change network addresses. (While not essential, the cookie simplifies following mobile devices.) Later, you go to some site that has a “Like us on Facebook” link with code hosted at Facebook and an image from Akamai. Now Akamai can put it all together and provide it to Facebook. They know the sites you visit, when you visited them, and what your interests are outside of Facebook. They can tie this together with Facebook information, so they further know your likes, friends, interests, etc.
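To make the correlation step concrete, here is a minimal Python sketch of how any content host could group requests by cookie and referer. This is speculation in code form, not a description of what Akamai or Facebook actually run. The log entries, cookie values, example domains, and the `sites_by_cookie` function are all made up for illustration.

```python
from collections import defaultdict
from urllib.parse import urlparse


def sites_by_cookie(requests):
    """Group the referring sites seen for each cookie, in first-seen order.

    `requests` is an iterable of (cookie_id, referer_url) pairs, such as
    a content host might log when serving an ad or a button image.
    """
    profile = defaultdict(list)
    for cookie_id, referer in requests:
        host = urlparse(referer).hostname
        if not host:
            continue  # no usable referer on this request
        if host not in profile[cookie_id]:
            profile[cookie_id].append(host)
    return dict(profile)


if __name__ == "__main__":
    # Hypothetical log entries: the same cookie shows up on two sites.
    log = [
        ("cookie-123", "https://shop.example/garden-tools"),
        ("cookie-456", "https://shop.example/paint"),
        ("cookie-123", "https://news.example/story?id=7"),
        ("cookie-123", "https://shop.example/checkout"),
    ]
    for cookie_id, hosts in sites_by_cookie(log).items():
        print(cookie_id, hosts)
```

Nothing here requires a link to Facebook on the visited site. The ad image request alone carries both the cookie and the referer.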
Moreover, the list of Akamai customers is huge! Best Buy, NPR, MySpace, McAfee… Facebook can now see into the walled gardens at Apple and Microsoft, since both of them are Akamai clients. The Department of Defense is listed as an Akamai customer… I wonder if Facebook can identify DoD employees? The same goes for the Australian Government National Security (another Akamai client).
’Tis the season
But let’s go back to pictures. Why would Facebook change their filenames? The only reason that makes sense to me is that they intentionally want to break links for anyone hotlinking to their site. They are effectively drawing a line in the sand and saying “this is the baseline” for all new data collected.
Finally, I couldn’t help but notice that they rolled all of this out days before Thanksgiving and the start of the holiday shopping season. This year, an estimated 37% of shoppers are expected to shop online, and nearly all of them will trigger at least one Akamai or Facebook tracker.
Ho ho ho…