What is SDK spoofing?
SDK spoofing (also known as replay attacks) is a form of mobile performance fraud that consumes an advertiser’s budget by generating legitimate-looking installs without any real installs occurring. Fraudsters use a real device without the device’s user actually installing the app. This type of fraud evolved quickly and dramatically over the course of 2017. SDK spoofing has become harder to spot than fake installs generated in emulators or install farms, as the devices used in this scheme are real, and therefore normally active and geographically spread out.
How it all began
The perpetrators’ main approach here was to break open the SSL-encrypted communication between a tracking SDK and its backend servers, typically by performing a man-in-the-middle (MITM) attack. The most popular approach is to use proxy software (e.g. Charles Proxy), which enables this type of attack at the press of a button.
After completing the MITM attack, fraudsters generate a series of test installs for the app they want to defraud. Since they can read the URLs of all server-side connections in clear text, they can learn which URL calls represent specific actions within the app, such as a first open, repeated opens, and even different in-app events like purchases, level-ups or anything else being tracked. They also research which parts of these URLs are static and which are dynamic, keeping the static parts (things like shared secrets and event tokens) and experimenting with the dynamic parts, which include advertising identifiers and other data specific to the device and the particular circumstances.
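The analysis described above is mechanical once two intercepted calls can be compared side by side. The sketch below illustrates the idea: any parameter whose value is identical across calls is probably static, everything else is dynamic. The URL format and parameter names here are hypothetical, not any real tracking vendor’s.

```python
from urllib.parse import urlparse, parse_qsl

# Two hypothetical intercepted install-tracking URLs (not a real vendor's format).
url_a = "https://track.example.com/install?app_token=abc123&event=first_open&idfa=AAAA-1111&ts=1500000000"
url_b = "https://track.example.com/install?app_token=abc123&event=first_open&idfa=BBBB-2222&ts=1500000042"

def params(url):
    return dict(parse_qsl(urlparse(url).query))

a, b = params(url_a), params(url_b)

# Parameters with identical values across calls are likely static (shared
# secrets, event tokens); the rest are dynamic (device identifiers, timestamps).
static = {k: v for k, v in a.items() if b.get(k) == v}
dynamic = sorted(k for k in a if k not in static)

print(static)   # {'app_token': 'abc123', 'event': 'first_open'}
print(dynamic)  # ['idfa', 'ts']
```

With more intercepted calls, the same comparison narrows down which dynamic fields the attacker must learn to fabricate.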
Now, thanks to callbacks and near real-time communication detailing the success of installs and events, the perpetrators can test their setup by simply creating a click and a matching install session. If the install doesn’t go through, there is a mistake in their URL logic. If it is successfully tracked, they know they’ve nailed the logic. It’s simple trial and error; with only a couple dozen variables in play, the process becomes easier the longer the experiment runs.
Once an install is successfully tracked, the fraudsters will have figured out a URL setup that allows them to create installs from thin air.
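To make the scheme above concrete: once the static parts are known, fabricating a “new” install is just a matter of filling in fresh dynamic values. The sketch below is a toy illustration under those assumptions; the endpoint, tokens and identifier format are all made up.

```python
import random
import string
from urllib.parse import urlencode

# Hypothetical static parts learned from intercepted traffic (not real values).
STATIC = {"app_token": "abc123", "event": "first_open"}

def fake_advertising_id():
    """Randomized hex identifier standing in for a device's advertising ID."""
    block = lambda n: "".join(random.choices("0123456789ABCDEF", k=n))
    return "-".join(block(n) for n in (8, 4, 4, 4, 12))

def spoofed_install_url(ts):
    # Keep the static parts fixed; vary only the dynamic ones.
    query = {**STATIC, "idfa": fake_advertising_id(), "ts": ts}
    return "https://track.example.com/install?" + urlencode(query)

url = spoofed_install_url(ts=1500000000)
print(url)  # a fresh-looking "install" that no user ever performed
```

In the early stage of the scheme, randomized values like these are exactly what gave the fraud away: the fabricated identifiers and timestamps often didn’t hold up against real-world device data.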
It’s important to note that during the early stages of this fraud scheme’s rise to notoriety, the level of sophistication and understanding of a URL structure was low; therefore, spoofing attempts were more easily spotted and blocked. Calls would come from data centers or VPNs, and data was oftentimes nonsensical or the URL parameters were filled with data that did not match the intended purpose.
How SDK spoofing attacks became more sophisticated
At the very first signs of this type of fraud, we began recording and researching, and immediately took defensive steps. The easiest and fastest short-term action was to release hotfixes to our attribution, removing spoofed install data based on faulty use of our parameter structure and data that did not match its intended purpose (as outlined above).

We’ve been fighting fraud for a while, and we know that fraudsters continuously improve their methods with every filter we release. So, in parallel to our hotfixes, we worked hard on a longer-term solution, one that would thwart any further attempts to deceive and defraud customers (more on this below). Just as we expected, the fraudsters eventually figured out why their fake calls were being blocked, and they did indeed step up their game. After a couple of these cycles, we lost the advantage of being able to identify the faulty traffic through mismatches between the traffic data and the transported data.
This is when the real evolution happened. The fraudsters had a lot to lose, so they really raised the bar in terms of their level of sophistication. Fraudulent device data started to match data from real-device traffic and became consistent over a multitude of device-based parameters (and, later, all device-based parameters). How was this possible, if everything was fake?
The simple yet stunning answer was (and still is) that not everything is fake anymore. The fraudsters started to collect real device data, either through their own apps or by leveraging any app they have control over. The intent of their data collection is, of course, malicious, but that does not mean the app being exploited for data is purely malicious, or would even be identified as malicious. The perpetrator’s app might have a very real purpose, or it might be someone else’s legit app that the perpetrators simply have access to by means of having their SDK integrated within it.
This could be any type of SDK, from monetization SDKs to any closed-source SDK where the information being collected isn’t transparent. Regardless of the specific circumstances, the fraudsters have access to an app that is being used by a (for them, ideally) large number of users.
Having a source (or even multiple sources) that generates real device data makes the fraudsters' task a lot simpler. They no longer need to randomize or curate troves of data, because now they have access to the real thing. This has made it incredibly hard on the anti-fraud side to research and identify these spoofing attempts.
To make matters worse, this giant leap in the evolution of fraudsters went hand in hand with a second and equally impactful step in the sophistication of SDK spoofing. The URLs were no longer called from data centers or tunneled through VPNs. Instead, they were proxied directly through the app the perpetrator had access to, on the device of an unsuspecting user.
For the non-techies out there, this means the fraudster’s server runs a script that automatically creates a URL that will trigger us (or any attribution company) to track an install or event. Instead of sending this URL directly to our servers (or through an anonymizing network, as they used to), the fraudsters now send it to the app (the one they have access to) on a user’s device. The app then executes the URL on the user’s device.
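The relay described above can be sketched roughly as follows. Every name here is hypothetical and no real network calls are made; the point is simply that the URL is fabricated server-side but executed device-side.

```python
# Sketch of the relay described above. All names are hypothetical;
# no real network calls are made here.

def build_spoofed_url(device_data):
    """Runs on the fraudster's server: fabricate an install-tracking URL
    using device data previously collected from the victim app."""
    return ("https://track.example.com/install"
            f"?app_token=abc123&idfa={device_data['idfa']}")

def push_to_victim_app(url, outbox):
    """Stand-in for the command channel to the compromised SDK: the server
    hands the URL to the app rather than calling the tracker itself."""
    outbox.append(url)

def victim_app_tick(outbox, executed):
    """Runs inside the app on a real user's device: execute whatever URL the
    server pushed, so the request originates from genuine hardware."""
    while outbox:
        executed.append(outbox.pop(0))  # in reality: an HTTPS GET from the device

outbox, executed = [], []
push_to_victim_app(build_spoofed_url({"idfa": "AAAA-1111"}), outbox)
victim_app_tick(outbox, executed)
print(executed[0])  # the attribution server sees a request from a real device
```

This split is what defeats simple network-level checks: the IP address, carrier and device data all belong to a real, active user.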
You can see how this method makes it look like the connection came from a real device (and, for that matter, a device that matches the transported data) because it does! The connection is real, the device data is real, the device is real. It’s bad enough that there’s no interaction between the user and any advertising for the advertised app, but the even bigger issue is that there’s no actual install.
How we solved SDK spoofing
As a result of this drastic and quickly-evolving scheme, we attributed these fake installs to equally fake ad engagements, which resulted in some of our clients being defrauded. Releasing hotfixes to stop this threat became increasingly difficult. In some radical cases, we had to manually research hundreds of thousands of data points to prove that these installs were in fact fake, giving our clients a chance to recuperate their lost budgets. Throughout this time, though, we worked on a solution that would stop this fraud scheme dead in its tracks.
We considered several solutions, like certificate pinning or creating a checksum hash for each app and SDK integration. We also evaluated building our own encryption method, but all of these ideas fell flat due to the potential negative impact on our clients’ apps’ CX/UX and the potential risk to tracking quality in general. For instance, certificate pinning can lead to major obstructions over time as certificates become outdated or are deprecated for various reasons (e.g. Comodo’s security breach of 2011). As plenty of apps stop receiving updates after a period of time, and sometimes whole development teams are reassigned or disbanded, the risk is that certificates become outdated and tracking stops completely for these apps. Another consideration was that pinning certificates would break all the common testing suites that clients and networks use for tracking tests. Finally, the client (in this case the mobile device) can decide not to use the pinned certificate, so while certificate pinning is a great way to secure client-server communication against a MITM attack, we determined it was not sufficient to secure a server against a hostile client.
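The pinning failure mode mentioned above is easy to demonstrate. In a minimal sketch (illustrative only, not a real TLS implementation), the app ships a fingerprint of the expected server certificate and rejects anything else, which is exactly what breaks once the certificate rotates and the app is never updated.

```python
import hashlib

# Minimal sketch of a certificate-pinning check (illustrative only).
# PINNED is the SHA-256 fingerprint of the certificate shipped inside the app.
PINNED = hashlib.sha256(b"server-cert-v1").hexdigest()

def connection_allowed(presented_cert_der: bytes) -> bool:
    """Accept the TLS connection only if the server's certificate matches the pin."""
    return hashlib.sha256(presented_cert_der).hexdigest() == PINNED

assert connection_allowed(b"server-cert-v1")      # original certificate: accepted
assert not connection_allowed(b"server-cert-v2")  # rotated certificate: blocked
```

The second case is the problem: once the server certificate rotates, an app release that never ships an update can no longer connect at all, and tracking stops entirely for that release.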
In the end, we decided to create a signature hash to sign SDK communication packages. This method ensures that replay attacks do not work, as we introduced a new dynamic parameter to the URL which cannot be guessed or stolen and is only ever used once. To achieve a reasonably secure hash and an equally reasonable user experience for our clients, we opted for an additional shared secret, generated in the dashboard for each app the client wants to secure. Marketers can also renew secrets and use different ones for different version releases of their app. This allows them to deprecate signature versions over time, ensuring that attribution for the newest releases is based on the highest security standard, while older releases can be removed from attribution entirely.
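The principle behind such a scheme can be sketched with a standard HMAC over the payload plus a one-time value. To be clear, the exact fields, hash function and wire format below are assumptions for illustration, not the actual production algorithm; the point is that a signature keyed on a shared secret, combined with a never-reused nonce, defeats both forgery and replay.

```python
import hashlib
import hmac
import secrets

# Illustrative sketch of signed SDK traffic with a one-time value, in the
# spirit of the solution described above. The exact scheme is an assumption.
APP_SECRET = b"shared-secret-from-dashboard"  # generated per app in the dashboard

def sign_payload(payload: str) -> dict:
    nonce = secrets.token_hex(16)  # fresh per request, never reused
    mac = hmac.new(APP_SECRET, f"{nonce}:{payload}".encode(), hashlib.sha256)
    return {"payload": payload, "nonce": nonce, "signature": mac.hexdigest()}

seen_nonces = set()  # server-side replay cache

def verify(message: dict) -> bool:
    if message["nonce"] in seen_nonces:
        return False  # replay: this exact request was already accepted once
    expected = hmac.new(APP_SECRET,
                        f"{message['nonce']}:{message['payload']}".encode(),
                        hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, message["signature"]):
        return False  # forged or tampered: the signer did not know the secret
    seen_nonces.add(message["nonce"])
    return True

msg = sign_payload("install?app_token=abc123&idfa=AAAA-1111")
assert verify(msg)      # first delivery is accepted
assert not verify(msg)  # replaying the identical request is rejected
```

Because the nonce is consumed on first use and the signature cannot be recomputed without the dashboard secret, a captured request is worthless to replay, which is precisely the property that kills this fraud scheme.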
This solution became available with the release of SDK version 4.12 and is now available to all clients regardless of their use of our paid Fraud Prevention Suite.
SDK spoofing is just one of the ways fraudsters can sabotage your campaigns. To learn more about the types of ad fraud that exist in mobile marketing, get our mobile fraud guide here.