This post was syndicated from, and written by, The Hacker Factor Blog, where the original post appears.
There’s a question that I often receive regarding photos: Where was this picture taken? Basically, someone has a photo and wants to identify the location. This comes up in legal cases, media requests, and just odd photos found online. (With news outlets, they usually follow it up with “and when was it taken?”) Tracking a photo to a location is usually a very difficult problem. Unfortunately, there are no generic or automated solutions.
However, just because it is a hard problem does not mean it is impossible. (Sometimes it is impossible, but not always.) Usually it just takes time and a dedication to tracking down clues.
The easy way
When people think about identifying where a photo was taken, they immediately think about embedded GPS coordinates. And the truth is, if GPS information exists in the picture’s metadata, then that is a great place to begin.
Unfortunately, very very very few pictures contain GPS information. At FotoForensics, we’re getting close to a half-million unique picture uploads, and only about 1% of them contain GPS metadata. There are reasons that GPS information is so hard to find:
- Unavailable. GPS data is almost exclusively associated with smartphones. Very few point-and-shoot cameras have built-in GPS.
- Disabled. For devices with GPS chips, there is usually an option to disable geo-stamping photos. Some devices default to “off” and are never turned on, while others may default to “on” but have users intentionally turn it off. There’s also the GPS system itself; lots of people turn off GPS on their smartphones because it drains the battery. If your phone’s GPS is disabled then your camera will not include GPS information in the picture.
There are other ways for a device to geolocate without using GPS. Some smartphones can get a rough estimate using nearby wireless access point identifiers (SSIDs) or by finding nearby cell towers. But to the camera’s function that looks up GPS information, this is all the same. If your device cannot geolocate then there will not be a location recorded with the picture.
- Stripped. Processing a picture with a graphics program, or uploading it to an online service like Facebook or Twitter, can (and usually will) alter or remove metadata. This includes removing GPS information. Even if the data was there at the beginning, it is not there anymore.
Of course, even if the GPS information is present, it does not mean it is accurate. I’m sure that people with smartphones have noticed the accuracy issue. When you first turn on the mapping program, it will draw a huge circle on the map. The circle may span a couple of miles. It does not mean that you are in the center of the circle; it’s indicating that you are “somewhere” in that circle — you could be near the center or somewhere along the edge. After a few minutes, the device has time to synchronize and better narrow down the region — denoted with a smaller circle. Eventually it may become a dot that identifies your location to within a few feet.
With GPS metadata, there are fields for location and accuracy. Unfortunately, most mobile devices only fill out the location data and not the accuracy information. This means that the extremely precise GPS location stored in the metadata may be off by a mile. Even if the GPS location pinpoints a house, you cannot be certain that the photo was taken in that house — it could have been captured a half-mile away.
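To make the location and accuracy fields concrete, here is a minimal Python sketch. It assumes that some metadata tool has already extracted the GPS tags into a dictionary; that dictionary layout and both function names are hypothetical. The tag names themselves (GPSLatitude, GPSLatitudeRef, GPSLongitude, GPSLongitudeRef, GPSHPositioningError) come from the EXIF specification, and the sample coordinates are made up.

```python
from fractions import Fraction

def dms_to_decimal(dms, ref):
    """Convert (degrees, minutes, seconds) plus a hemisphere letter to signed decimal degrees."""
    degrees, minutes, seconds = (Fraction(v) for v in dms)
    value = degrees + minutes / 60 + seconds / 3600
    # South and West are stored as positive numbers with a separate reference letter.
    if ref in ("S", "W"):
        value = -value
    return float(value)

def describe_gps(tags):
    """Return (lat, lon, error_m), or None if there is no position at all.

    error_m is None when the device did not record an accuracy field,
    which is the common case described above.
    """
    if "GPSLatitude" not in tags or "GPSLongitude" not in tags:
        return None
    lat = dms_to_decimal(tags["GPSLatitude"], tags.get("GPSLatitudeRef", "N"))
    lon = dms_to_decimal(tags["GPSLongitude"], tags.get("GPSLongitudeRef", "E"))
    error = tags.get("GPSHPositioningError")
    return lat, lon, (float(error) if error is not None else None)

# Hypothetical tags: a precise-looking position with no accuracy information.
sample = {
    "GPSLatitude": (53, 30, 0), "GPSLatitudeRef": "N",
    "GPSLongitude": (2, 15, 0), "GPSLongitudeRef": "W",
}
```

With this sample, describe_gps returns a position but None for the error, so nothing in the metadata says whether the coordinates are good to a few feet or only to a mile.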
Another place to look is in metadata annotations. If the picture came from a media outlet, then there’s probably metadata that identifies “where” the photo was taken, even if it is just a city name. Unfortunately, most online news sites resave images prior to publishing, and that can strip out these annotations.
GPS information and annotations in metadata are nice when they exist. Unfortunately, they may not exist. And even if they are present, they may still not be very accurate or reliable. That means geolocating a photo must rely on the photo’s content. There are different clues in the photo’s content that may help identify the location. Some of these may be very precise (geolocation) while others may help you narrow down a region (geo-fencing), country, or at least rule out some parts of the world.
The easiest photos are the ones with unique and notable landmarks: statues, distinct buildings, street signs… Even photos of mountain ranges or generic streets may be enough to find the location. If the camera was fairly close to the subject, then you can probably identify the photographer’s position to within a few feet. A long distance shot may narrow it down to an area.
For very notable objects, such as scenic views, distinct statues, or elements seen at tourist stops, you may be able to find the location by uploading the picture to TinEye or Google Image Search. If other people have photographed the same object from about the same position, then these image search engines may be able to identify other photos from the same spot.
In my opinion, TinEye is better at finding similar photos, but Google may annotate the search results with a text name or description. In either case, you will probably need to visit the resulting web pages in order to see if any page mentions where the photographer was located. (Knowing that the photo’s content shows “New York City” is not the same as geolocating a photographer who was standing at the foot of the Statue of Liberty.)
Different cities and countries have different building styles. If you can identify the style, then you may be able to identify where the photo was taken. There have been a few advances in this research area (for example, PDF). Unfortunately, as far as I know, there are no public image search engines that do this type of matching.
Usually, you just happen to find someone who recognizes the style and can help narrow down a location. (That’s one of the benefits of turning a photo over to a large social group like Reddit: there is likely someone who will recognize something.) However, even this can be somewhat inaccurate. For example, neighboring countries (e.g., Poland and Germany) can have similar architectural styles. In California, there’s a city called Solvang that looks like Denmark. Many large American cities have a “Chinatown” that uses Chinese architecture, and China has rebuilt cities from countries like France and Italy.
If you cannot identify a city or a country, then you can probably still identify regions to exclude. For example, do you see any text in the photo? If the street signs are only in English, then you are probably not looking at any Asian, African, or Middle Eastern countries. (Non-English-speaking countries either do not use English letters or include multiple languages on the signs.)
Currency can be another great clue. If I see Mexican pesos, then I’m thinking Mexico. Sure, it could be a Spanish-language classroom in the United States, but then other clues would tip you off that it’s a classroom. (Like maybe, desks?) It could also be someone from Mexico who lives in Canada and has decorated his home with trinkets from his homeland. But unless you have a reason to suspect another country, a best-guess is to use what you see. If everything looks like Mexico, then it’s probably Mexico.
Exclusion cannot tell you where a photo was taken. However, it can help identify where the photo was not taken. (Photo showing a tropical beach? It’s probably not the South Pole or Northern Europe.)
To give you an example of geolocation, consider this photo that was recently trending at FotoForensics:
My question is: where was this photo taken? Or more specifically, where was the photographer standing and what direction was the photographer facing?
Sure, you could go to the forum where the picture was being discussed and the city is identified, but let’s assume that you do not have that information. (And anyway, the forum does not tell you the exact location where the photographer was standing or the direction the camera is facing.) In real life, you may have nothing more than a photo; assume that you just have this photo and nothing else. Also, let’s assume that you are like me and you do not know the area and do not recognize the street.
Here’s how I walked through it to identify the location (your approach may be different):
- Metadata. First, let’s go for the easy clues and start with the metadata. Maybe we will get lucky and find GPS coordinates or a textual description. Unfortunately, this picture has no informative metadata. (It’s been stripped, but it was still worth the time to look.)
- Search. Using TinEye and Google Image Search turned up no useful results.
- License Plates. Someday I hope to have a database of license plate formats (colors, layouts, etc.), but I do not have that today. However, I know that long, rectangular, and yellow (with or without the blue strip on the left) is European. So I can immediately rule out Africa, Asia, Australia, North America, and South America. (While the cars could have been shipped to another country, we go with what is most likely.)
- English. All of the text is in English. European and English-only? That points to the British Isles: somewhere like England, Ireland, or Scotland. It’s not the European mainland. (This is geo-fencing: narrowing down a location to a region or area.)
- Bank. Now I can start looking up text. I see an HSBC ATM machine. I know that HSBC is a bank and it’s found in the British Isles. (While HSBC is found in lots of other countries, it does not exclude my current geo-fenced area.)
- Store. I do not know what “Waitrose” is, but I can type the word into Google. It turns out, Waitrose is a grocery store in England. That narrows down my search to one of about 300 locations. (I know, 300 seems like a lot, but it’s smaller than “anywhere in the world.”)
- Web. The Waitrose corporate website allows you to select a branch. (There are 339 of them right now.) Each branch page contains a small picture of the location. Non-programmers will need to go one-by-one and look at each picture. Fortunately, I’m a programmer. It took me a few minutes to write a small script to harvest all of their store pictures. I thought I would use these thumbnail images to rule out locations. (No red brick. No black awning. Not on a corner…) Instead, I got lucky:
The green advertisement on the wall in the photo is blue in the thumbnail, and the HSBC ATM is missing, but it’s the same location. According to the corporate web site, this is Waitrose Wilmslow.
- Address. Unfortunately, the corporate web site does not provide a numerical street address or GPS location. All they say is: “Church Street, Wilmslow, Cheshire, SK9 1AY”. (Not being from England, this looks to me like a description and not a mailing address.) Fortunately, I can type this into Google Maps and find the street. Using Google Street View, I can find the address: 4 Church Street, Wilmslow, England, UK.
The street view shows me the exact location. The photographer had to be standing in the street, facing North. (Not where the mouse has highlighted the road — the photographer was standing a little to the right.) Even if he was using a telephoto lens, he would still need to be somewhere down the street, facing North.
Now we have answered the questions. We know where the photographer was standing and the direction the camera was facing.
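The thumbnail-harvesting step in the walkthrough above could look something like the following Python sketch. This is not the script I actually used. It assumes you already have a list of branch page URLs (the example.com addresses in the test data are placeholders, not the real site layout), and it only collects image URLs so that you can review the pictures yourself. The class and function names are made up for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen

class ImageCollector(HTMLParser):
    """Collect the src attribute of every <img> tag on a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.image_urls = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                # Resolve relative paths against the page they came from.
                self.image_urls.append(urljoin(self.base_url, src))

def find_images(html, base_url):
    """Return absolute URLs for all images referenced in an HTML string."""
    collector = ImageCollector(base_url)
    collector.feed(html)
    return collector.image_urls

def harvest(branch_urls):
    """Fetch each branch page and map its URL to the images found on it.

    Requires network access; branch_urls is assumed to be supplied by the caller.
    """
    results = {}
    for url in branch_urls:
        with urlopen(url) as response:
            html = response.read().decode("utf-8", errors="replace")
        results[url] = find_images(html, url)
    return results
```

From there, it is a matter of paging through the collected pictures and discarding branches that obviously do not match.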
Armed with this information, there are a few other things I can now tell about this photo. For example, the Google Street View shows that there are cameras everywhere. You can even see one in the photo above the “Waitrose” sign. If this photo were showing a crime, then there are cameras that recorded the photographer.
Looking at the shadows, we can see that they fall to the North (toward the store) and not to the left or right. So this was likely taken in the middle of the day. And is that the photographer reflected in the car’s mirror?
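Shadows that fall due North mean the sun was roughly due South, which happens near local solar noon rather than at exactly 12:00 on the clock. As a rough sketch, mean solar noon can be estimated from longitude alone: the sun moves 15 degrees per hour, or 4 minutes per degree. The function below is a simplification that ignores the equation of time (which can shift the answer by up to about a quarter of an hour) and daylight saving time. The longitude of about 2.23 degrees West for Wilmslow is an approximate figure assumed here for illustration.

```python
def mean_solar_noon_utc(longitude_deg):
    """Approximate solar noon in UTC as (hour, minute).

    longitude_deg is in degrees East, so locations west of Greenwich are negative.
    """
    # Each degree of longitude east makes solar noon arrive 4 minutes earlier.
    minutes = round(12 * 60 - 4 * longitude_deg) % (24 * 60)
    return divmod(minutes, 60)

# Assumed approximate longitude for Wilmslow (not taken from the photo).
wilmslow_noon = mean_solar_noon_utc(-2.23)
```

So "the middle of the day" here means a window around a few minutes past noon UTC, plus an hour on the clock if British Summer Time was in effect.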
The corporate web site’s thumbnail was timestamped November 2010 and it lacked the ATM. The Google Street View is timestamped (lower-left) September 2012 and it shows the ATM. So sometime between November 2010 and September 2012, the ATM was installed. This means that the photo was taken sometime after November 2010. If I contacted Waitrose, then I suspect we could narrow down the date based on the advertisements that are visible. While we probably would not find the exact date, I believe that we could narrow it down to a month or less. Together with the camera information (assuming at least one camera on the street still has the pictures available), we can even identify the exact moment — and possibly even watch the photographer come and go.
With Google Street View, we can even tell a little more about the building. For example, watching the building while moving down the street permits us to see the framed advertisement change. It is a scrolling billboard. The green advertisement in the photo, the blue advertisement in the corporate thumbnail, and the picture seen in the Google Street View could all be part of the same scrolling ad series.
Using Bing’s street view of the same address (requires Internet Explorer), there is one image that shows part of the green banner scrolling into place. So it is part of the rotation cycle. Unfortunately, Bing doesn’t display any date information related to the street view. However… In the photo’s upper-left corner is a yellow and black sign. This same sign is seen in the Google Street View, but it is not present in the Bing street view. If we knew when that black-and-yellow sign appeared, then we could further narrow down the date range.
(If we cheat, then we can look at the forum. The posting was made on 21-November-2013, so the date range is November 2010 to 21-November-2013. The person claims to have taken the photo “a few weeks ago”, so that would be October or early November 2013.)
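The date reasoning above is really just an intersection of windows: each clue gives an "earliest possible" date, a "latest possible" date, or both. Here is a small Python sketch of that bookkeeping. The function name is made up, and since the thumbnail is only dated to the month, the first of November 2010 is used as a conservative lower bound (an assumption on my part).

```python
from datetime import date

def narrow(constraints):
    """Intersect (earliest, latest) date windows; either bound may be None for 'unknown'."""
    earliest, latest = None, None
    for lo, hi in constraints:
        # Keep the most recent lower bound and the oldest upper bound.
        if lo is not None and (earliest is None or lo > earliest):
            earliest = lo
        if hi is not None and (latest is None or hi < latest):
            latest = hi
    if earliest and latest and earliest > latest:
        raise ValueError("constraints contradict each other")
    return earliest, latest

clues = [
    (date(2010, 11, 1), None),   # ATM absent in the November 2010 thumbnail, present in the photo
    (None, date(2013, 11, 21)),  # the photo cannot be newer than the forum post
]
window = narrow(clues)
```

Every new clue, such as the date when the black-and-yellow sign went up or when a particular advertisement ran, becomes one more entry in the list and can only shrink the window.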
Needles and Haystacks
The good news is that many pictures can be geolocated to a specific location. However, there is no generic or automated solution. Right now, every photo is a unique challenge, and some may be very time-consuming.
(And for the people who really want to know: I think the license plates are real. It’s hard to tell from the photo due to multiple resaves, but the UK permits people to look up the vehicles based on the plate and manufacturer. Both license plates exist and they match the vehicles.)