App Marketing

Demystifying Cohort KPIs, part I: Understanding retention and sessions

Nikola Chochkov
Data Scientist

Cohort analysis is an extremely useful tool for understanding what happens to the behaviour of users in your app when you’ve made a change or an optimization, whether that’s on the product side or in your marketing outreach. A proper analysis allows you to quickly and clearly identify trends and changes in the way users react and interact with your product. It can reveal changes that a standard analysis would hide, saving your skin simply by looking at your data in a different way.

When you’re practicing cohort analysis, you need a systematic way of creating cohorts and a set of metrics (or KPIs) that you can use to compare your cohorts. The KPIs we offer on our cohort analysis are central to understanding your app’s performance in terms of user retention, revenue, event conversion, and further key aspects of the health of your products.

This article is the first in a series explaining our cohort KPIs and how to interpret them. Over the next few weeks, we’d like to shine a light on cohort analysis and how to maximize your app’s performance with your cohort data.


Understanding Adjust cohorts

The first step to a successful analysis is establishing the questions that you’d like to answer. Do you want to compare the performance of acquisition channels? Monitor the results of changes to your ad targeting or creatives? More broadly, follow the behaviour over time of your users based on their install dates? These are questions that are typically answered with data pulled from your cohorts.

For the sake of comparison, in a “standard” analysis, we track everything that happens in an app, and aggregate this based on which day it happened, and sometimes based on other segmentation criteria. For example, a standard analysis of revenue flows would tell you how many purchases were made on September 28th. In a cohort analysis, we also take into account the date-of-install for each user, and then compute metrics based on day-after-install, looking at purchases made on the second or third day after install, for example.

We might still very well segment based on dates, but it’s only one of multiple segmentations we might do. You can segment users into cohorts based on their acquisition channel or date of install, and then apply the following parameters:

  • Install-date range: the specific time period for installs
  • Country: where the user was first active
  • Platform: Android, iOS, etc.
  • Device type: phone, tablet, etc.

An acquisition channel, in this context, is anything that you’ve tagged with an adjust tracker URL. The trackers will automatically assign a user and any conversion to a source, like an ad campaign, a website, a post or anything else where you can fit a URL. These trackers are subdivided into up to four levels: “network”, “campaign”, “adgroup”, and “creative”. Choosing tracker segmentation lets you drill down to the nitty-gritty for more effective comparison.


Understanding Metrics

In this article we’ll look at user retention-related KPIs, or the KPIs that you can apply to answer questions about how many users stuck around after installing your app. If you’re curious, you can check out the docs on our Cohort API here.

So, say that we’ve used the segmentation above to define a cohort. The first thing we might do to describe this cohort is to ask how big it is. That is, how many users are there in this segmentation, or more accurately, in this cohort?


Cohort Size

The cohort size is the base of many of the calculations and KPIs that follow. It’s therefore crucial to understand how this is calculated.

Suppose today is Monday, and that we want to look at the users who installed an app last week (ending with yesterday, Sunday). To make this a little easier, we’ll even name them. Let’s say Alice installed our app on Tuesday, Bob on Friday and Charley on Sunday (the last day).

Remember, in a cohort analysis, we look at behaviour on a given day after install. The first thing we do is set each user’s install date as day zero. Any metric we compute after this is based on how many days have passed for each user since their install.

When we talk about cohort size, we mean the cohort size at a particular day-after-install. In our example, the size of our cohort at day 0 after install would be the full number of users: 3. The number remains 3 at day 1 after install, but at day 2 after install we exclude Charley, since he only installed yesterday.

Thus, although his install day falls inside the time range for our cohort (last week), Charley still hasn’t had 2 days-after-install, and so he won’t be counted in the behaviour analysis for users who are 2 days-after-install. Similarly, Bob will be excluded from the analysis of users who are 4 days-after-install, because he installed only 3 days ago.

Here is the full table:

Segment \ Day After Install   0  1  2  3  4  5  6  7
Last week’s users             3  3  2  2  1  1  1  0
Cohort Size for last week's cohort


Note that since our first install for last week was on Tuesday, as of Monday this week no user could contribute to analysis of the 7 days-after-install behaviour for that cohort. This is why you see the 0 value for the cohort size on 7 days-after-install in the table above.
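This counting rule is easy to sketch in code. Below is a minimal Python illustration using hypothetical install dates chosen so the weekdays match the example (Adjust computes cohort sizes for you; this just mirrors the logic):

```python
from datetime import date

# Hypothetical install dates for last week's cohort
# (chosen so the weekdays match the article's example).
installs = {
    "Alice": date(2021, 9, 21),    # Tuesday
    "Bob": date(2021, 9, 24),      # Friday
    "Charley": date(2021, 9, 26),  # Sunday
}

today = date(2021, 9, 27)  # the Monday on which we run the analysis

def cohort_size(installs, day_after_install, as_of):
    """Count users who have already reached this day-after-install."""
    return sum(
        1 for install_date in installs.values()
        if (as_of - install_date).days >= day_after_install
    )

sizes = [cohort_size(installs, dai, today) for dai in range(8)]
print(sizes)  # [3, 3, 2, 2, 1, 1, 1, 0]
```

Running the same code a day later (with `as_of` advanced) would shift the counts to the right, which is exactly why cohort-size values grow over time.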


Retained Users

So we’ve established the cohort size, i.e. the potential number of users of the app. But how many of them have actually been using the app on a given day? To put it differently, how often do they return to the app, or how many users have we retained?

This is what’s referred to as retention. Let’s look at the retention situation in our example cohort. First, let’s segment it by day of the week, to make it clearer.

DOW \ DAI   0   1   2   3   4   5   6   7
Mon         0   0   0   0   0   0   0   0
Tue         1   0   1   0   1   0   0   na
Wed         0   0   0   0   0   0   na  na
Thu         0   0   0   0   0   na  na  na
Fri         1   1   1   0   na  na  na  na
Sat         0   0   0   na  na  na  na  na
Sun         1   0   na  na  na  na  na  na
Retained Users segmented by Install Day


Charley is the only user who installed on Sunday, just as Bob is the only user who installed on Friday. Charley didn’t return to the app on the first day after install, while Bob did. Again, assuming that all three of our users came from the same tracker, the tracker segmentation will resemble this:

Segment \ DAI       0  1  2  3  4  5  6  7
Last week’s users   3  1  2  0  1  0  0  0
Retained Users for last week's cohort


Note that (as expected) at day 0 after install, this table shows all 3 installs in the cohort. Clearly, if they installed the app, they’ve been active on that day.
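The same per-user bookkeeping can be sketched in a few lines of Python. The activity sets below are hypothetical, transcribed from the table above (day 0 is the install day itself):

```python
# For each user, the days-after-install on which they opened the app,
# transcribed from the article's retained-users table.
active_days = {
    "Alice": {0, 2, 4},  # installed Tuesday
    "Bob": {0, 1, 2},    # installed Friday
    "Charley": {0},      # installed Sunday
}

def retained_users(active_days, day_after_install):
    """Count users who were active on a given day-after-install."""
    return sum(1 for days in active_days.values() if day_after_install in days)

retained = [retained_users(active_days, dai) for dai in range(8)]
print(retained)  # [3, 1, 2, 0, 1, 0, 0, 0]
```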


Retention Rate

Now that we’ve established the cohort size and retained users in our cohort, let’s estimate the retention rate: the percentage of users who were retained on a given day-after-install.

We’d simply derive a table by dividing the above two tables:

Segment \ DAI       0    1    2    3    4    5    6    7
Last week’s users   3/3  1/3  2/2  0/2  1/1  0/1  0/1  0/0
User Retention Rate for last week's cohort


As expected, we have a 100% retention rate on the install day. For each day-after-install, we divide retained users by the cohort size for that specific day-after-install.
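As a quick sketch, dividing the two rows in Python looks like this, with the empty day-7 column (0/0) reported as None rather than triggering a division by zero:

```python
cohort_sizes = [3, 3, 2, 2, 1, 1, 1, 0]  # cohort-size row from above
retained     = [3, 1, 2, 0, 1, 0, 0, 0]  # retained-users row from above

def retention_rate(retained_count, size):
    # Day 7 has no eligible users yet (0/0), so report None
    # instead of dividing by zero.
    return round(retained_count / size, 2) if size else None

rates = [retention_rate(r, s) for r, s in zip(retained, cohort_sizes)]
print(rates)  # [1.0, 0.33, 1.0, 0.0, 1.0, 0.0, 0.0, None]
```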


Sessions

Now that we know when our users returned to the app, let’s find how many sessions they triggered by day-after-install.

DOW \ DAI   0   1   2   3   4   5   6   7
Mon         0   0   0   0   0   0   0   0
Tue         2   0   1   0   3   0   0   na
Wed         0   0   0   0   0   0   na  na
Thu         0   0   0   0   0   na  na  na
Fri         4   1   2   0   na  na  na  na
Sat         0   0   0   na  na  na  na  na
Sun         1   0   na  na  na  na  na  na
Sessions segmented by Install Day


Bob, for example, installed on Friday and triggered 4 app sessions on the same day, 1 on the next day and 2 yesterday. Sum the rows up again to arrive at the total sessions per day from this cohort.

Segment \ DAI       0  1  3  0  3  0  0  0
Last week’s users   7  1  3  0  3  0  0  0
Sessions for last week's cohort


Note that our cohort is incomplete (i.e. some users in this cohort don’t contribute to all days-after-install), so this data will likely change if we look at the same cohort again later. As time passes, more and more users complete the cohort. If we wait another seven days, all users will have completed the timespan in question.
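The row-summing above can be sketched in Python as follows, with per-user session rows transcribed from the table ("na" cells are simply columns a user hasn't reached yet):

```python
# Hypothetical per-user session counts by day-after-install,
# transcribed from the sessions table; "na" cells are absent here.
sessions_by_user = {
    "Alice":   [2, 0, 1, 0, 3, 0, 0],  # installed Tuesday: 7 columns so far
    "Bob":     [4, 1, 2, 0],           # installed Friday: 4 columns so far
    "Charley": [1, 0],                 # installed Sunday: 2 columns so far
}

def total_sessions(sessions_by_user, day_after_install):
    """Sum sessions across users who have reached this day-after-install."""
    return sum(
        row[day_after_install]
        for row in sessions_by_user.values()
        if day_after_install < len(row)
    )

totals = [total_sessions(sessions_by_user, dai) for dai in range(8)]
print(totals)  # [7, 1, 3, 0, 3, 0, 0, 0]
```

As users reach later days-after-install, their rows simply grow longer, and the right-hand columns of the totals fill in.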


Sessions Per User

To estimate how many sessions users have on each day-after-install, we divide the last table by the Retained Users table.

Segment \ DAI       0    1    2    3    4    5    6    7
Last week’s users   7/3  1/1  3/2  0/0  3/1  0/0  0/0  0/0
Sessions Per User for last week's cohort


Note that we’re dividing by the retained users this time, as opposed to the cohort size. Depending on the metric, it sometimes makes more sense to divide by the total cohort size, and sometimes by the number of retained users. In the case of sessions-per-user, an inactive user could not have triggered any sessions.

Had we divided the number of sessions by the cohort size, we would have conflated the sessions-per-user metric with retention rate: if the retention rate in a given cohort was low, the sessions per user would also have been dragged down. Dividing by the number of retained users instead keeps the two metrics separate.
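A minimal sketch of this division in Python, using the two rows from the tables above; days with no retained users yield None rather than a misleading zero:

```python
sessions = [7, 1, 3, 0, 3, 0, 0, 0]  # sessions row from above
retained = [3, 1, 2, 0, 1, 0, 0, 0]  # retained-users row from above

def sessions_per_user(session_count, retained_count):
    # Divide by retained users, not cohort size, so a day with
    # no active users (0/0) yields no value instead of skewing the metric.
    return round(session_count / retained_count, 2) if retained_count else None

per_user = [sessions_per_user(s, r) for s, r in zip(sessions, retained)]
print(per_user)  # [2.33, 1.0, 1.5, None, 3.0, None, None, None]
```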

Again, this situation will probably change throughout the course of the week as more and more of the users complete the cohort and add contributions to the right hand columns.


Moving forward

In this post, we wanted to cover the basics of cohort analysis and some initial metrics. There’s a lot more ground to cover in cohort analysis. In the next parts of the Demystifying Cohort KPIs blog series we’ll cover revenue KPIs, event-based KPIs and lots more. You can go straight to part II now: Constructing crystal-clear revenue analysis with cohorts.