The power of integration: How Smule uses Adjust to enrich their data
Posted Nov 14, 2018
During this year’s second Mobile Spree conference, Smule’s Director of Data Engineering Andrew Platt outlined how the App Store’s #1 singing app uses Adjust to make data more meaningful.
At the conference, we heard several great talks, from building trusted partnerships to navigating user acquisition. Among our expert line-up was Andrew, who explained how Adjust enables his team to side-step common issues felt throughout the industry while realizing Smule’s true potential. Scroll on for a summary of his talk.
Enriching data through integration
Early in his talk, Andrew pointed out that a data science team can run into a number of issues if its data isn’t fully integrated. Integration essentially means combining data from more than one source in order to make it more meaningful.
An unfortunate truth behind this is that most data scientists spend only 20% of their workday on real data analysis. The majority of their time is instead spent acquiring, cleaning and organizing masses of data - which seems like an awful waste of their expertise. However, Smule responded to this issue by using Adjust as a tool for data integration.
“Even though more and more data is available to you every day, it’s not always in a usable form. We spend a lot of time reconciling why we get two different numbers from two different systems. To me, the problem is one of data integration, and [today] I’m going to go through a number of ways we’ve leveraged the Adjust data platform. I don’t see it as an attribution system, I see it as a system to connect other things together.”
Drawing on his experience with Adjust’s toolset, Andrew provided three examples of how integrated data has been key to Smule’s success, by:
- Empowering UAs to become more self-sufficient
- Enhancing and integrating data
- Improving data quality
Let’s take a closer look at the benefits discussed by Andrew and, most importantly, how they were achieved.
Self-sufficiency: Empowering UAs
When Smule began receiving global callbacks from Adjust, this data wasn’t initially a priority for Andrew’s UA team. Since then, however, it has become a routine part of their daily workload.
“We take the data pipeline from Adjust. [...] Now that we have that, the UA team has their own desktop version, they create their own dashboard, they have their own project on the server [...] We added it as a sort of side function, because at that point it’s really not integrated, but it’s become something they rely on.”
Within minutes, Andrew’s team can see how each campaign is currently performing, empowering them to be more self-sufficient. But with the UA team’s access to real-time callbacks, how did integration lead to enriched data for his team to analyze?
As the App Store's #1 singing app, spotting any streaming issues as early as possible is incredibly important to Smule. Whether it stems from an Android phone without the correct video or audio codec, a limited bandwidth or otherwise, Smule realized that geodata was needed for every event - not just every session - in order to accurately identify the problem.
But this posed a scalability issue. Fortunately, by using Adjust’s dynamic callback parameters, Smule is able to get the geodata it needs to quickly identify a problem’s source.
“Through Adjust’s SDK, we send the events through Adjust. We put on a unique ID, and we send that ID to our internal system using dynamic callback parameters. That allows us to match data on the backend. ... For all those events that fall within a 10-second range, we can interpolate and assign the IP information, the geo and carrier information. That way, we can troubleshoot much more quickly when we see certain events.”
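The backend matching Andrew describes could look roughly like the sketch below. It is not Smule's actual pipeline: the field names (`match_id`, `ts`, `country`, `carrier`) are hypothetical, and it assumes the unique ID identifies a device or session, that timestamps are in seconds, and that "interpolating" means copying geo and carrier data from the nearest callback record within 10 seconds.

```python
from bisect import bisect_left
from collections import defaultdict

WINDOW_SECONDS = 10  # the "10-second range" mentioned in the talk


def index_callbacks(callbacks):
    """Group callback records by their (hypothetical) match_id, sorted by time."""
    by_id = defaultdict(list)
    for cb in callbacks:
        by_id[cb["match_id"]].append(cb)
    for records in by_id.values():
        records.sort(key=lambda cb: cb["ts"])
    return by_id


def enrich(event, by_id, window=WINDOW_SECONDS):
    """Copy geo/carrier from the nearest callback within `window` seconds."""
    records = by_id.get(event["match_id"], [])
    times = [cb["ts"] for cb in records]
    pos = bisect_left(times, event["ts"])
    # Only the callbacks just before and just after the event can be nearest.
    candidates = records[max(pos - 1, 0):pos + 1]
    nearest = min(
        candidates,
        key=lambda cb: abs(cb["ts"] - event["ts"]),
        default=None,
    )
    if nearest is None or abs(nearest["ts"] - event["ts"]) > window:
        # No callback close enough: leave the geo fields empty.
        return {**event, "country": None, "carrier": None}
    return {**event, "country": nearest["country"], "carrier": nearest["carrier"]}


# Made-up sample data for illustration only.
callbacks = [
    {"match_id": "u1", "ts": 100, "country": "DE", "carrier": "A"},
    {"match_id": "u1", "ts": 200, "country": "US", "carrier": "B"},
]
by_id = index_callbacks(callbacks)
near = enrich({"match_id": "u1", "ts": 105, "name": "stream_error"}, by_id)
far = enrich({"match_id": "u1", "ts": 150, "name": "stream_error"}, by_id)
```

Here `near` picks up the geodata from the callback five seconds earlier, while `far` falls outside the window and stays unenriched. In a real system this join would more likely run in a data warehouse than in application code.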
As well as enhancing their data with dynamic callback parameters, Andrew explained how he was able to use integrated data to improve data quality when trying to spot ad fraud. In particular, he showed how Smule has used Adjust’s timestamps - which show the specific time of various user actions - to detect fraudulent installs.
“We use Adjust’s fraud filter but that system, rightly so, is very deterministic. And sometimes you’ll see patterns in the data that look like something, but you can’t actually say for sure it’s fraud. ...You should, on those callbacks, use all the timestamps you can get, and … if you plot a distribution of the time between click and install, you’d see a pattern for most networks.”
By using this data, Andrew was able to see that a particular network’s installs were likely to be fraudulent because their time to install differed from every other network.
“Having this data means that you can spot patterns to enhance what you’re already getting through the fraud protection system at Adjust. So that timestamp is a useful metric.”
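As a rough illustration of the click-to-install comparison, the sketch below groups installs by network and flags any network whose median time to install is far from that of the others. The field names (`network`, `clicked_at`, `installed_at`) and the `ratio` threshold are assumptions made for the example, not values from the talk, and a flagged network is a prompt for closer inspection rather than proof of fraud.

```python
from collections import defaultdict
from statistics import median


def click_to_install_seconds(installs):
    """Collect click-to-install deltas (in seconds) per network."""
    by_network = defaultdict(list)
    for row in installs:
        by_network[row["network"]].append(row["installed_at"] - row["clicked_at"])
    return by_network


def flag_outlier_networks(installs, ratio=5.0):
    """Flag networks whose median delta differs from the rest by `ratio` times."""
    medians = {
        net: median(deltas)
        for net, deltas in click_to_install_seconds(installs).items()
    }
    flagged = []
    for net, value in medians.items():
        others = [m for other, m in medians.items() if other != net]
        if not others:
            continue  # nothing to compare a single network against
        baseline = median(others)
        if baseline > 0 and (value > baseline * ratio or value < baseline / ratio):
            flagged.append(net)
    return sorted(flagged)


# Made-up sample data: network "c" installs within seconds of the click.
sample = (
    [{"network": "a", "clicked_at": 0, "installed_at": d} for d in (60, 90, 120)]
    + [{"network": "b", "clicked_at": 0, "installed_at": d} for d in (70, 100, 110)]
    + [{"network": "c", "clicked_at": 0, "installed_at": d} for d in (2, 3, 4)]
)
suspicious = flag_outlier_networks(sample)
```

A median comparison is a deliberately crude stand-in for plotting the full distribution, which is what Andrew recommends; it simply shows how the callback timestamps can be turned into a per-network signal.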
As the talk came to a close, Andrew once again underlined the importance of usable data. After all, the quantity of data is a meaningless statistic if it can’t be appropriately analyzed. With this in mind, note that - among other things - integrated data offers a greater understanding of user activity, fraud, and data accuracy.