This post was syndicated from: The Hacker Factor Blog and was written by: The Hacker Factor Blog. Original post: at The Hacker Factor Blog
Over the last few months, I have been enhancing my network defenses at FotoForensics. It is no longer enough to just block bad actors. I want to know who is attacking me, what their approach is, and anything else I can learn about them. I want to “know my enemy”.
To gain more knowledge, I’ve begun to incorporate DNS blacklist data. There are dozens of servers out there that use DNS to query various non-network databases. Rather than resolving a hostname to a network address, they encode an address in the hostname and return a code that denotes the type of threat.
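As a rough illustration of that lookup, here is a minimal Python sketch. It assumes the common DNSBL convention of reversing the IPv4 octets and appending the list's zone name. The zone `dnsbl.example.org` and the address `192.0.2.99` are placeholders, not services or hosts from this post, and the meaning of each returned code differs from list to list, so check the list's own documentation before acting on a result.

```python
import ipaddress
import socket


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the hostname for a DNSBL query: reversed IPv4 octets + zone."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on a malformed address
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


def dnsbl_lookup(ip: str, zone: str):
    """Return the answer address (often 127.0.0.x) if listed, else None.

    Assumption: any resolver failure is treated as "not listed". A real
    deployment would want to tell NXDOMAIN apart from timeouts.
    """
    try:
        return socket.gethostbyname(dnsbl_query_name(ip, zone))
    except OSError:
        return None


if __name__ == "__main__":
    # Hypothetical zone and documentation-range address, for illustration only.
    print(dnsbl_query_name("192.0.2.99", "dnsbl.example.org"))
```

The returned value is not a real network address. It is a code, and each list publishes its own table mapping codes to threat types.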
This becomes extremely useful when identifying a network attacker. Is the attack personal, or is it a known bot that is attacking everyone on the Internet? Is it an isolated system, or part of a larger network? Is it coming directly, or hiding behind an anonymous proxy network? Today, FotoForensics just identifies attacks. Tomorrow it will identify attackers.
Reputation at Stake
Beyond DNS lookup systems, there are some reputation-based query services. While the DNS blocklist (dnsbl) services are generally good, the web-based reputation systems vary in quality. Many of them seem to be so inaccurate or misleading as to be borderline snake oil.
Back in 2012, I wrote about Websense and how their system really failed to work. This time, I decided to look at Trend Micro.
The Trend Micro Site Safety system takes a URL and tells you whether it is known hostile or not. It even classifies the type of service, such as shopping, social, or adult content.
To test this system, I used FotoForensics as the URL.
According to them, my server has never been evaluated before and they would evaluate it now. I looked at my logs and, yes, I saw them query my site:
22.214.171.124 - - [05/Oct/2014:06:35:45 -0500] "GET / HTTP/1.1" 200 139 "-" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0)"
126.96.36.199 - - [05/Oct/2014:06:52:05 -0500] "GET / HTTP/1.1" 200 139 "-" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0)"
The first thing I noticed is that it never crawled my site. It only looked at the opening page. They downloaded no images, ignored the CSS, and didn’t look anywhere else for text to analyze. This is the same mistake that Websense made.
The second thing I noticed was the lie: They say that they have never looked at my site before. Let’s look back at my site’s web logs. The site was publicly announced on February 9, 2012.
188.8.131.52 - - [10/Feb/2012:03:28:56 -0600] "GET / HTTP/1.0" 200 1865 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"
184.108.40.206 - - [10/Feb/2012:05:55:32 -0600] "GET / HTTP/1.0" 200 1865 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"
220.127.116.11 - - [10/Feb/2012:12:16:30 -0600] "GET / HTTP/1.0" 200 1865 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"
18.104.22.168 - - [10/Feb/2012:12:16:34 -0600] "GET / HTTP/1.0" 200 1865 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"
22.214.171.124 - - [10/Feb/2012:23:21:51 -0600] "GET / HTTP/1.0" 200 1865 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"
126.96.36.199 - - [11/Feb/2012:06:51:29 -0600] "GET / HTTP/1.0" 200 1865 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"
188.8.131.52 - - [14/Feb/2012:02:00:48 -0600] "GET / HTTP/1.0" 200 1865 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"
184.108.40.206 - - [14/Feb/2012:03:24:48 -0600] "GET / HTTP/1.0" 200 1865 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"
220.127.116.11 - - [14/Feb/2012:03:25:51 -0600] "GET / HTTP/1.0" 200 1865 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"
All of these network addresses are Trend Micro. According to my logs, they request the “/” URI on my site often, usually multiple times per day. There were 1,073 visits in 2012, 4,443 visits in 2013, and 2,463 (so far) in 2014. So I certainly do not believe them when their page says that they have never visited the site before.
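Per-year tallies like these can be pulled from an access log with a few lines of Python. The sketch below is only an illustration, not the tooling used for the numbers above. It assumes Apache-style combined log lines with plain ASCII quotes and hyphens, and the function name `visits_per_year` and the documentation-range addresses in the usage example are hypothetical.

```python
import re
from collections import Counter

# Start of an Apache-style access log line:
# client address, two ignored fields, [timestamp], "METHOD URI PROTOCOL"
LOG_PATTERN = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ '
    r'\[(?P<day>\d{2})/(?P<month>\w{3})/(?P<year>\d{4}):[^\]]+\] '
    r'"(?P<method>\S+) (?P<uri>\S+) [^"]*"'
)


def visits_per_year(lines, addresses, uri="/"):
    """Count requests for `uri` per year from any client in `addresses`."""
    counts = Counter()
    for line in lines:
        match = LOG_PATTERN.match(line)
        if match is None:
            continue  # skip lines that do not look like access log entries
        if match.group("ip") in addresses and match.group("uri") == uri:
            counts[int(match.group("year"))] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical sample lines; the addresses are placeholders.
    sample = [
        '192.0.2.10 - - [10/Feb/2012:03:28:56 -0600] "GET / HTTP/1.0" 200 1865 "-" "Mozilla/4.0"',
        '192.0.2.10 - - [05/Oct/2014:06:35:45 -0500] "GET / HTTP/1.1" 200 139 "-" "Mozilla/4.0"',
    ]
    print(visits_per_year(sample, {"192.0.2.10"}))
```

In practice the `addresses` set would hold whichever client addresses have been attributed to the scanner, and `lines` would be an open log file.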
Since they clearly scanned my site, I wanted to see the results. I rechecked their service an hour later, a few hours later, and over a day later. They still report it as being untested.
Trend Micro first got my attention back in 2012, when they were automatically banned for uploading porn to FotoForensics. What happened: someone using Trend Micro’s reputation software uploaded porn to FotoForensics. A few seconds later, the reputation system checked the URL and ended up uploading the same porn, resulting in an automatic ban.
This second upload happened because Trend Micro was checking a GET request. And in this case, the GET was used to upload a URL containing porn to my site.
Besides performing a double upload, these duplicate requests can cause other problems. You know all of those online shopping sites that say “Do not reload or your credit card may be charged a second time”? Trend Micro might cause that second click since they resubmit URLs.
I also noticed that Trend Micro’s reputation checker would occasionally upload URL-based attacks to my site. It appears that some bad guys may be using the reputation checker to proxy attacks. This way, if the attack succeeds then the bad guy can get in. And if it is noticed, then the sysadmins would blame Trend Micro. (These attacks stand out in my logs because Trend Micro uploads a URL that nobody else uploads.)
I tried to report these issues to Trend Micro. Back in 2012, I had a short email exchange with them that focused on the porn uploads. (The reply basically disregarded the concern.) Later, in 2013, I provided more detail about the proxy abuses to a Trend Micro employee during a trip to California. But nothing was ever done about it.
One of the biggest issues that I have with Trend Micro’s reputation system is the order of events.
In my logs, I can see a user visiting a unique URL. Shortly after the visit, Trend Micro visits the same URL. This means that the user was allowed to visit the site before the reputation was checked. If the site happens to be hostile or contains undesirable content, then the user would be alerted after the fact. If the site hosts malware, then you’re infected long before Trend Micro would alert you of the risk.
To put it bluntly: if Trend Micro is going to be warning users based on a site’s reputation, then shouldn’t the warning come before the user is granted access to the site?
Quality of Reporting
Trend Micro permits users to submit URLs and see the site’s reputation. I decided to spot-check their system. I submitted some friendly sites, some known to be hostile, and some that host adult content… Here are some of the results:
- Trend Micro: “Safe.”
- Symantec: “Safe.”
- Google: “Safe.”
- Craigslist: “Safe.” Really? Craigslist? Perhaps they didn’t scan some of the escort listings. I gave it the URL to Craigslist “Washington, DC personals”. (I had searched Google for ‘craigslist escorts’ and this was the first result. Thanks DC!) This URL is anything but family friendly. There are subjects like “Sub Bottom Seeking Top/Master for Use and Domination” and “Sunday Closeted Discreet Afternoon Delight”. The Trend Micro result? “Safe”. They classified it as “Shopping” and not adult content.
- Super T[redacted] (a known porn site). Trend Micro said “Safe” but classified it as “Pornography”. This is a great result and gives me hope for Trend Micro’s system.
- Singapore Girls. “Safe” and “Adult / Mature Content”. Most other services classify this as a porn site. I wonder where they draw the line between pornography and mature content…
- Backpage: “Safe, Newsgroups, Shopping, Travel”. I then uploaded the link to their escorts page, which has some explicit nudity and lots of adult content. The result? Safe, Newsgroups, Shopping, Travel. Not even an “18”, “Adult”, or “Mature Content” rating.
- Pinkbike.com: “Safe, Sports”. I’ll agree with this. It’s a forum for bicycle enthusiasts.
- 4chan: “Dangerous, Disease Vector”. According to Trend Micro, “Dangerous” means the site hosts malware or phishing. 4chan is not a phishing site, and I have never seen malware hosted at 4chan. While 4chan users may not be friendly, the site does not host malware. This strikes me as a gross misclassification. It should probably be “Safe, 18+, Adult Content, Social Network”.
- >_: One of 4chan’s channels goes by the name “/b/”. If you see someone talking about “/b/”, then you know they are talking about 4chan’s random topics channel. Outside of 4chan, other forums have their own symbolic names. If you don’t know the site represented by “>_”, then that’s your problem — I’m not going to list the name here. This is a German picture sharing site that has more porn than 4chan. While 4chan has some channels that are close to family friendly, the >_ site is totally not safe for work. Trend Micro says “Safe, Blog/Web Communications”. They say nothing about adult content; if Trend Micro thinks 4chan is “dangerous”, then >_ should be at least as dangerous.
With Filename Ballistics, I have been able to map out a lot of sites that use custom filename formats. A side effect is that I know which sites host mostly family friendly content, and which do not. I submitted enough URLs to Trend Micro’s reputation system that they began to prompt me with a captcha to make sure I’m human. The good news is that they knew most of the sites. The bad news is that they missed a lot of the sites that host adult content and malware.
For a comparison, I tested a few of these sites against McAfee’s SiteAdvisor. McAfee actually scanned FotoForensics and reported “This link is safe. We tested it and didn’t find any significant security issues.” They classified it as “Media Sharing”. (I’ll agree with that.) McAfee also reports that 4chan is suspicious and a “parked domain” (WTF?), they reported that >_ wasn’t tested, Craigslist is a safe (no malware) site with “Forum/Bulletin Boards”, and Singapore Girls is “Safe: Provocative Attire”.
Other Lookup Tools
Trend Micro also has an IP address reputation tool. This is supposed to be used to identify whether an address is associated with known spammers, malware, or other online attacks.
At FotoForensics, I’ve been actively detecting, and in some cases blocking, hostile network addresses. I use a combination of a custom intrusion detection system and known DNS Blacklist services. This has dramatically cut down on the number of attacks and other abuses against my site.
I uploaded a couple of network addresses to Trend Micro in order to see how they assigned reputations:
- 18.104.22.168. The Honeynet Project reports “Suspicious behavior, Comment spammer”. Tornevall.org reports “Service abuse (spam, web attacks)”. My own system identified this as a TOR node. Trend Micro reports: “Unlisted in the spam sender list”.
- 22.214.171.124. This is another Suspicious behavior, Comment spammer, TOR node. Trend Micro fails to identify it.
- 126.96.36.199. Honeynet Project says “Search engine, Link crawler”. My system identifies it as Googlebot. Trend Micro says “Unlisted in the spam sender list”. This looks like their generic message for “not in any of our databases”.
Keep in mind, tracking TOR nodes is really easy. If you run a TOR client, then your client automatically downloads the TOR node list. For me, I just start up a client, download the list, and kill the client without ever using TOR. This gives me a large list of TOR nodes and I can just focus on the exit nodes.
At minimum, every TOR node should be associated with “Suspicious behavior”. This is not because users using TOR are suspicious. Rather, it is because there are too many attackers who use TOR for anonymity. As a hosting site, I don’t know who is coming from TOR and the odds of it being an attacker or spammer are very high compared to non-TOR users.
- 188.8.131.52. This system actively attacked my site. Trend Micro reports “Bad: DUL”. As they define it, “the Dynamic User List (DUL) includes IP addresses in dynamic ranges identified by ISPs. Most legitimate mail sources have static IP addresses.”
A Dynamic User List (aka Dialup User List) contains known subnets that are dynamically assigned by ISPs to their customers. These lists are highly controversial. On one hand, they force ISPs to regulate all email and take responsibility for reducing spam. On the other hand, this approach ends up blacklisting a quarter of the Internet. In effect, users must send email through their ISP’s system and users are treated as spammers before ever sending an email.
If this sounds like the Net Neutrality debate, then you’re right — it is. Net neutrality means that ISPs cannot filter spam; all packets are equal, including spam. Without neutrality, ISPs are forced to help mitigate spam.
The good news is, Trend Micro noticed that this was an untrustworthy address. The bad news is that they classified it because it was a dynamic address and not because they actually noticed it doing anything hostile.
- 184.108.40.206. This system scanned my site for vulnerabilities. The Honeynet Project says “Suspicious behavior, Comment spammer”. Tornevall.org reports “Service abuse (spam, web attacks)”. Trend Micro? “Unlisted in the spam sender list”.
Alright… so maybe Trend Micro is only looking for spammers… Let’s submit some IP addresses from spam that I recently received.
- 220.127.116.11. Other systems report: Suspicious behavior, Comment spammer, Known proxy, Service abuse (spam, web attacks), Network attack. Trend Micro reports “Bad: DUL”. Again, Trend Micro flagged it because it is a dynamic address and not because they noticed it doing anything bad.
- 18.104.22.168. None of the services, including Trend Micro, identify it. As far as I can tell, anything from 22.214.171.124/18 is spam. This range currently accounts for a solid 30% of all spam to my own honeypot addresses.
- 126.96.36.199. The blacklist services that I use found nothing. JP SURBL identifies it as a spammer’s address. Trend Micro reports it as “Bad: QIL”. According to them, the Quick IP Lookup Database (QIL) stores known addresses used by botnets.
Botnets typically consist of a wide range of compromised computers that work together as a unit. If you map botnet nodes to geolocations, you will usually see them all over and not limited to a specific subnet. In contrast to botnets, many spammers acquire large sequential ranges of network addresses (subnets) from unsuspecting (or uncaring) hosting sites. They use the addresses for sending spam. While these spam systems may be remotely controlled, the tight cluster is usually not managed like a botnet.
According to my own spam collection, this specific address is part of a spam network that spans multiple subnets (188.8.131.52 – 184.108.40.206 and 220.127.116.11/22). Each subnet runs a different mailing campaign and they are constantly sending spam. This same spammer recently added another subnet: 18.104.22.168/18, which includes the previous 22.214.171.124 address.
This spammer has been active for months, and Trend Micro flags it. However, they identify it as “Bad: QIL”. As far as I can tell, this address is not part of a botnet (as identified by QIL). Instead, this spammer signs up with various hosting sites and spams like crazy until getting booted out.
- 126.96.36.199. Most services, including Trend Micro, missed this. Barracuda Central was the only one to identify this as a spammer.
- 188.8.131.52. This spammer is listed in the uceprotect.net blacklist, but Trend Micro does not identify it.
- 184.108.40.206. I caught this address searching my site for email addresses and sending me spam. It is both a harvester and a spammer. This network address is listed in 7 different spam blacklists. Trend Micro reports it as a botnet (Bad: QIL), but it’s actually part of a sequential set of network addresses (a subnet) used by a spammer.
- 220.127.116.11. This address is detected by tor.dnsbl.sectoor.de, but not Trend Micro.
I certainly cannot fault Trend Micro for failing to identify most of the addresses used for spam. Of the 27 public spam blacklists and the three other blacklists that I checked, all of them had high false-negative rates and failed to associate most addresses with spam. Every blacklist seems to have its own tiny, independent set of addresses and there is little overlap. However, Trend Micro’s subset seems to be significantly smaller than those of other blacklists. The best ones based on my own testing are b.barracudacentral.org and dnsbl.sorbs.net.
In general, Trend Micro seems correct about the addresses that it flags as hostile — as long as you ignore the cause. Unfortunately, Trend Micro doesn’t know many addresses. You should expect a high false-positive rate due to generic inclusions, and a high false-negative rate (Type-II error) due to their small database. Moreover, many of their matches appear to be caught by generic rules that classify large swaths of addresses as hostile without any proof, rather than any actual detection. This means Trend Micro has a high degree of Type-III errors: correct by coincidence. Finally, Trend Micro seems to classify any network of spam systems as a “botnet”, even when they are not botnets.
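The botnet-versus-subnet distinction can be checked roughly in code. The following Python sketch measures how concentrated a set of spam source addresses is within a single subnet: scattered sources fit the botnet picture, while a tight cluster suggests a spammer working from a sequential range. The /24 prefix length, the 0.8 threshold, and both function names are assumptions chosen for illustration, and this heuristic is not the method used for the classifications above.

```python
import ipaddress
from collections import Counter


def subnet_concentration(addresses, prefix_len=24):
    """Return (busiest_subnet, share_of_addresses_in_it) for IPv4 addresses."""
    if not addresses:
        raise ValueError("need at least one address")
    counts = Counter(
        # strict=False lets a host address stand in for its enclosing network
        ipaddress.ip_network(f"{addr}/{prefix_len}", strict=False)
        for addr in addresses
    )
    network, hits = counts.most_common(1)[0]
    return network, hits / len(addresses)


def looks_like_sequential_range(addresses, prefix_len=24, threshold=0.8):
    """Heuristic: True if most sources fall inside one subnet.

    The threshold is an arbitrary assumption, not a measured value.
    """
    _, share = subnet_concentration(addresses, prefix_len)
    return share >= threshold


if __name__ == "__main__":
    # Documentation-range placeholders, not addresses from this post.
    sources = ["192.0.2.1", "192.0.2.17", "192.0.2.45", "192.0.2.200",
               "192.0.2.201", "198.51.100.3"]
    print(subnet_concentration(sources))
    print(looks_like_sequential_range(sources))
```

A single prefix length is a blunt instrument. A spammer spread over several adjacent subnets would need the same check repeated at shorter prefixes such as /22 or /18.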
About Trend Micro
Trend Micro offers many products. Their anti-virus software is as good as any. But that’s not saying much. As Trend Micro’s CEO Eva Chen said in 2008, “I’ve been feeling that the anti-virus industry sucks. If you have 5.5 million new viruses out there how can you claim this industry is doing the right job?”
Virtually the entire AV industry is reactive. First they detect an outbreak, then they evaluate the problem and create a detector. This means that the attackers always have the advantage: they can create new malware faster than the AV companies can detect and react. The move toward a reputation-based system as a predictor is a good preventative approach.
Unfortunately, Trend Micro seems to have seriously dropped the ball. Their system lies about whether a site has been checked and they validate the site after the user has accessed it. They usually report incorrect information, and when they are right, it seems due more to coincidence than accuracy.
I do believe that a profiling system integrated with a real-time detector is a great proactive alternative. However, Trend Micro’s offerings lack the necessary accuracy, dependability, and reputation. As I reflect on Eva Chen’s statement, I can only conclude one thing: it’s been six years, and they still suck.