Up until now, we’ve focused on the theory of ad fraud, or how we at Adjust think about the problem before we take action.
However, we haven’t yet established how these ideas are embedded in the real world, and what kind of solutions they’ve helped to create.
Now that we understand why each step of fraud prevention is so important, let’s apply our knowledge to the challenges ad fraud presents. In this final article in our fraud theory series, we look at three methods of ad fraud and how we applied our model to overcome each of them. Read on to discover more.
Click Injection was first detected when we saw that some clicks were impossibly close in time to the installs attributed to them. In ‘Click-To-Install-Time’ (CTIT) charts, this showed up as a huge spike at the very start of the distribution, and alerted researchers to the possibility that there might be ‘Spoofed Attributions’ within the data set. At that time, it was unknown how those clicks were generated or where they came from.
Some in the industry turned this detection of the fraud type into a filter that catches “impossible” CTITs: any install within a few seconds of a click would be rejected.
This solution is simple to apply but, unfortunately, it didn’t cover the entire problem.
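To make the limitation concrete, here is a minimal sketch of such a minimum-CTIT filter. The ten-second threshold and the function name are assumptions for illustration, not values used by any particular provider.

```python
from datetime import datetime, timedelta

# Hypothetical threshold: the approach is only described as "a few seconds".
MIN_CTIT = timedelta(seconds=10)


def rejected_by_min_ctit(click_time: datetime,
                         first_open_time: datetime,
                         min_ctit: timedelta = MIN_CTIT) -> bool:
    """Reject an attribution when the first app open follows the click too quickly."""
    return first_open_time - click_time < min_ctit


click = datetime(2024, 1, 1, 12, 0, 0)

# A first open three seconds after the click is caught ...
caught = rejected_by_min_ctit(click, click + timedelta(seconds=3))

# ... but the same click slips through if the user opens the app five minutes later.
missed = rejected_by_min_ctit(click, click + timedelta(minutes=5))

print(caught, missed)  # prints: True False
```

The second case is the gap: an injected click stays injected even when the user waits before opening the app, so a threshold on CTIT alone cannot catch it.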
Adjust dug deeper, working backward to find a more foolproof filtering system. During our investigation, we found the function within Android that fraudsters use to generate these ‘injected clicks’: essentially, they exploit the notifications that apps receive when another app is installed on the same device.
This discovery allowed us to identify better timestamps to filter on than the time of the first app open. Instead of only filtering click injections when users open the app quickly after install, we found a way to block up to five times as many spoofed attributions. You can read more about it in this article.
Further research even led to a joint project with Google that gave us access to a definitive yes/no filterable timestamp: the click on the “download” button.
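As a minimal sketch of that yes/no rule, assume each attribution candidate carries both the reported ad click time and the time the user tapped “download” in the store. The class and field names below are hypothetical and chosen only for this illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class AttributionCandidate:
    # Both field names are hypothetical, chosen for this sketch.
    ad_click_time: datetime        # when the ad click was reported
    download_click_time: datetime  # when the user tapped "download" in the store


def is_injected(candidate: AttributionCandidate) -> bool:
    """A click reported after the download tap cannot have caused the install."""
    return candidate.ad_click_time > candidate.download_click_time


download = datetime(2024, 1, 1, 12, 0, 0)
genuine = AttributionCandidate(download - timedelta(minutes=2), download)
injected = AttributionCandidate(download + timedelta(seconds=20), download)

print(is_injected(genuine), is_injected(injected))  # prints: False True
```

Unlike a CTIT threshold, this comparison does not depend on how long the user waits before opening the app.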
The original approach taken by the market (setting an arbitrary minimum CTIT) is a good example of bailing water out of your boat instead of fixing the leak: it was an attempt to stop the fraud type without dealing with the actual method used to create it. While some systems went one way, we went deeper.
The most pervasive method that fraudsters use to claim the attributions of random users is to spam attribution providers like Adjust with a large number of clicks.
This way, they can randomly match device IDs (which are siphoned off from ad exchanges and other sources) or the fingerprints of legitimate installs. The latter are an easy target, as iOS has very little variance between devices. Furthermore, saturating a market’s available IP addresses (the only other distinct attribute) isn’t too hard to do.
Click Spamming comes in many different forms, including stacked ads, background clicking on devices or straight up server-to-server click catalogs. Softer forms include sending views as clicks, or so-called “pre-caching”, where ads are clicked before they are shown. In the end, they all share the same trait: the user didn’t actually have any intention to interact with the advertisement and has no interest in downloading the app shown.
The main problem of filtering these ‘Spoofed Attributions’ is that the clicks themselves aren’t easy to recognize. Fraudsters can control any of their parameters, and can easily tweak their requests to appear like genuine ad engagements.
As we previously discussed, a good filter should leverage a logical fact that isn’t controlled by the attacker. Luckily, there is one thing in this scenario that attackers do not control by the very nature of the exploit: the time the user actually installs the app.
When looking at spoofed clicks vs. real clicks, we see that real clicks show a strong correlation between the time a user was sent to the store and the time they actually open the app for the first time. Usually this happens within the first hour (on average, in 80% of cases). For spoofed clicks we see no such correlation: users opening the app appear to be randomly distributed across the attribution window.
A filter has to look at the statistical distribution at the most granular campaign segmentation available, and see if clicks show any correlation to app installs. A big problem arises when fraud detection tools show injected and spammed clicks in the same diagram and leave it to the user to decide what fraud looks like.
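One way to sketch such a check is to compute, per campaign segment, the share of installs whose first open falls within an hour of the click, and to flag segments where that share is far below what genuine traffic shows. The cutoff, the minimum sample size, and the segment names are assumptions for illustration; a production filter would likely use a proper statistical test rather than a single cutoff.

```python
from collections import defaultdict
from datetime import timedelta

FIRST_HOUR = timedelta(hours=1)
MIN_FIRST_HOUR_SHARE = 0.5   # hypothetical cutoff, well below the ~80% seen for real clicks
MIN_INSTALLS = 20            # hypothetical: skip segments too small to judge


def first_hour_share(ctits):
    """Fraction of click-to-install times that fall within the first hour."""
    return sum(1 for c in ctits if c <= FIRST_HOUR) / len(ctits)


def suspicious_segments(installs):
    """installs: iterable of (segment, click_to_install_time) pairs."""
    by_segment = defaultdict(list)
    for segment, ctit in installs:
        by_segment[segment].append(ctit)
    return sorted(
        segment
        for segment, ctits in by_segment.items()
        if len(ctits) >= MIN_INSTALLS
        and first_hour_share(ctits) < MIN_FIRST_HOUR_SHARE
    )


# Toy data: one segment clustered near the click, one spread evenly over a week.
clustered = [("campaign_a", timedelta(minutes=3 * i)) for i in range(1, 25)]
spread = [("campaign_b", timedelta(hours=7 * i)) for i in range(1, 25)]

flagged = suspicious_segments(clustered + spread)
print(flagged)  # prints: ['campaign_b']
```

Working per segment matters because a spammed sub-campaign can hide inside an aggregate that otherwise looks healthy.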
By removing injected clicks from the equation, and penalizing high-frequency click spam, filtering can then be accomplished with reliability.
The key point here is to identify each method individually, and not to try to filter the type in general. Once we understand that there is a difference between injected clicks, low-frequency click spamming and high-frequency click spamming, filtering each method becomes much simpler and doesn’t rely on app publishers to decide what to consider genuine or fraudulent.
When looking at the other type of fraud, ‘Spoofed Users’, we see a variety of methods used to create fake app installs and conversion events.
On Android, simulating a phone in the cloud is very straightforward, as it was intended to be possible right from the conception of the OS. On the other hand, iOS makes simulating devices quite difficult, and so creating a large number of app activities is typically done the old-fashioned way.
Enter device farms. Imagine a factory, with dozens of workers sitting in front of rows and rows of iPhones, and you get the idea.
At CPIs of $2, it pays to have people tap and install over and over and over again.
So how do you distinguish these real devices, with real people in front of them, from the users you actually want to acquire?
Many fraud prevention systems would be able to flag that those users simply do not retain and never purchase anything. However, the problem here is that most real users never do either - after all, day 1 retention for most app verticals is rarely above 30%. As long as these device farms are mixed in with real traffic, it’s very hard to definitively tell what’s real and what’s fake.
When looking closer at the routine of these fraudsters, we can see that they constantly have to reset their device IDs to be counted as a new install. If we can find a way to persist certain information they cannot easily delete, without transmitting any PII, we can make it much harder for the attackers to win. For example, on iOS the Adjust SDK requires a full device reset before a device counts as another install, a process that takes over 15 minutes - vastly increasing turnaround times.
Another marker we can look at is the IP address used to send SDK requests. Without any masking or VPN, the requests simply show up as originating from countries like Vietnam or Thailand, which is easily filtered. Moving traffic via proxies or VPNs to more profitable markets like the US leaves a trace in the form of IP addresses that are often registered to data centers. Those IPs are often found on commercially available lists that can be used to deny attribution. Using domestic IP addresses isn’t impossible, but it is much slower and much more expensive, making this method of fraud less attractive.
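A minimal sketch of the data center check, using Python’s standard `ipaddress` module. The ranges below are placeholders taken from the address blocks reserved for documentation, and the function name is hypothetical; a real deployment would load a maintained list of data center ranges instead.

```python
import ipaddress

# Placeholder ranges from the blocks reserved for documentation (RFC 5737);
# a real deployment would load a maintained data center list instead.
DATACENTER_RANGES = [
    ipaddress.ip_network("198.51.100.0/24"),
    ipaddress.ip_network("203.0.113.0/24"),
]


def is_datacenter_ip(ip: str, ranges=DATACENTER_RANGES) -> bool:
    """True when the address falls inside any listed data center range."""
    address = ipaddress.ip_address(ip)
    return any(address in network for network in ranges)


print(is_datacenter_ip("203.0.113.42"), is_datacenter_ip("192.0.2.7"))  # prints: True False
```

An SDK request whose source address matches such a range could then be denied attribution, as described above.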
These aren’t the only examples of fraud methods we’ve uncovered. SDK Spoofing (the latest method of ad fraud) was dealt with in a similar manner, and we’re always on the lookout for what’s to come next, as there is always a bigger case to deal with.