In this tutorial, we will cover what active user analytics is, why it’s important, and a step-by-step example of how to build website performance analysis in SQL.
What is active user analysis?
Active user analysis allows the entire business to monitor the usage of the company's product. This includes tracking signups from free PLG products and paying customer users, understanding if those users return, the number of weekly (or daily/monthly) repeat users, and the total number of users and system usage. This information comes directly from the product analytics tool.
Why active user analysis is important?
Active user analysis is important across the entire business and the lifecycle of the user. Marketing teams track new user signups and trial signups to determine if their efforts are leading to prospects engaging with the product. Identifying new users who become repeat users is key for marketing to identify promising leads which can then be handed over to sales for greater engagement.
Sales teams use these repeat users to target their initial sales outreach and hopefully generate qualified leads. They can then track users as they start their trial (if appropriate) and when they finish the sale, check the new user signups to confirm product adoption.
Additionally, the account management team can follow enterprise account signups as they work to drive adoption and additional revenue. Checking for repeat enterprise use will be a major metric to track when measuring adoption. By tracking the enterprise use, an account team can determine if an account is decreasing its use and becoming a churn risk.
Finally, the product team can track all of these to understand how the product is being used and identify friction that may cause trouble with sales leads or adoption. In addition, understanding the trends in the number of users and usage can help forecast future capacity.
Data needed for active user analysis
In order to analyze user activity, the following data is needed.
- Product Measurement
- Information on
- Number of users
- User signups
- Repeat users
- Source
- Heap
- Amplitude
- Segment
- Snowplow
How to perform active user analysis in SQL?
In this section, we will show how to build the active user analysis using SQL for weekly user metrics. We will assume the data comes from Heap, but you might be getting this data from Pendo, Amplitude, or other product measurement sources. We will need to clean this dataset, calculate the information weekly for multiple metrics and join the metric data back together.
Free (PLG) User Metrics
The key metrics here are new free PLG users and users who come back to use the product multiple times. In particular, we are interested in users who come back at least a week after they first joined (trial engaged customers or product qualified leads). We can generate these from the daily active users data in Heap.
To calculate the new users each week (and determine if they come back a week later), we first aggregate each user’s weekly activity. In addition, as we are looking at truly new users, not new users on an enterprise account, we will first exclude those users with an organization_type of ‘ENTERPRISE’.
WITH dailyplg AS (
SELECT DATETRUNC(week, date) AS week,
username,
hoursactive
FROM daily_active_users
WHERE organization_type != ‘ENTERPRISE’),
weeklyplguser AS (
SELECT week,
username,
SUM(hoursactive) AS hoursactive
FROM dailyplg
GROUP BY week, username),
weeklyplgrepeat AS (
SELECT week,
username,
lag(week, -1) OVER(PARTITION BY username ORDER BY week) AS repeat_week
FROM weeklyplguser),
newplguser AS (
SELECT username
MIN(week) AS week,
MIN(repeat_week) AS repeat_week
FROM weeklyplgrepeat
GROUP BY username),
weeklyplgrepeatindicator AS (
SELECT week,
username,
CASE WHEN repeat_weak IS NOT NULL THEN 1 ELSE 0 AS repeat_user
FROM newplguser)
SELECT week,
COUNT(username) AS new_plg_users,
SUM(repeat_user) AS new_plg_users_will_engage
FROM weeklyplgrepeatindicator
GROUP BY week
Similarly, if we only want to know how many users are repeat users each week, we can build off of the CTE weekly repeat by aggregating the number of users by repeat_week.
SELECT repeat_week AS week,
COUNT(username) AS weekly_returning_plg_users
FROM weeklyplgrepeat
GROUP BY repeat_week
Customer User Metrics
This same analysis can be performed for customers by changing the organization_type to select enterprise by changing the first WHERE clause to
WHERE organization_type = ‘ENTERPRISE’
Total Usage
In addition to the breakdowns by free and paying users, it is useful to track the total number of users and hours used each week. These can also be calculated from the Heap active daily users table.
WITH dailyusage AS (
SELECT DATETRUNC(week, date) AS week,
username,
hoursactive
FROM daily_active_users),
weeklyuser AS (
SELECT week,
username,
SUM(hoursactive) AS hoursactive
FROM dailyusage
GROUP BY week, username)
SELECT week,
COUNT(username) AS weekly_users,
SUM(hoursactive) AS weekly_usage
FROM weeklyuser
GROUP BY week
Final Metrics
All of these blocks can be put together and joined on the week column to create the full usage metrics by week.
WITH dailyplg AS (
SELECT DATETRUNC(week, date) AS week,
username,
hoursactive
FROM daily_active_users
WHERE organization_type != ‘ENTERPRISE’),
weeklyplguser AS (
SELECT week,
username,
SUM(hoursactive) AS hoursactive
FROM dailyplg
GROUP BY week, username),
weeklyplgrepeat AS (
SELECT week,
username,
lag(week, -1) OVER(PARTITION BY username ORDER BY week) AS repeat_week
FROM weeklyplguser),
newplguser AS (
SELECT username
MIN(week) AS week,
MIN(repeat_week) AS repeat_week
FROM weeklyplgrepeat
GROUP BY username),
weeklyplgrepeatindicator AS (
SELECT week,
username,
CASE WHEN repeat_weak IS NOT NULL THEN 1 ELSE 0 AS repeat_user
FROM newplguser),
weeklyplgusage AS (
SELECT week,
COUNT(username) AS new_plg_users,
SUM(repeat_user) AS new_plg_users_will_engage
FROM weeklyplgrepeatindicator
GROUP BY week),
weeklyplgrepeats AS (
SELECT repeat_week AS week,
COUNT(username) AS weekly_returning_plg_users
FROM weeklyplgrepeat
GROUP BY repeat_week),
dailyent AS (
SELECT DATETRUNC(week, date) AS week,
username,
hoursactive
FROM daily_active_users
WHERE organization_type = ‘ENTERPRISE’),
weeklyentuser AS (
SELECT week,
username,
SUM(hoursactive) AS hoursactive
FROM dailyent
GROUP BY week, username),
weeklyentrepeat AS (
SELECT week,
username,
lag(week, -1) OVER(PARTITION BY username ORDER BY week) AS repeat_week
FROM weeklyentuser),
newentuser AS (
SELECT username
MIN(week) AS week,
MIN(repeat_week) AS repeat_week
FROM weeklyentrepeat
GROUP BY username),
weeklyentrepeatindicator AS (
SELECT week,
username,
CASE WHEN repeat_weak IS NOT NULL THEN 1 ELSE 0 AS repeat_user
FROM newentuser),
weeklyentusage AS (
SELECT week,
COUNT(username) AS new_ent_users,
SUM(repeat_user) AS new_ent_users_will_engage
FROM weeklyentrepeatindicator
GROUP BY week),
weeklyentrepeats AS (
SELECT repeat_week AS week,
COUNT(username) AS weekly_returning_ent_users
FROM weeklyentrepeat
GROUP BY repeat_week),
dailyusage AS (
SELECT DATETRUNC(week, date) AS week,
username,
hoursactive
FROM daily_active_users),
weeklyuser AS (
SELECT week,
username,
SUM(hoursactive) AS hoursactive
FROM dailyusage
GROUP BY week, username),
weeklyusage AS (
SELECT week,
COUNT(username) AS weekly_users,
SUM(hoursactive) AS weekly_usage
FROM weeklyuser
GROUP BY week)
SELECT week,
weekly.weekly_users AS weekly_users,
weekly.weekly_usage AS weekly_usage,
plg.new_plg_users AS new_plg_users,
plg.new_plg_users_will_engage AS new_plg_users_will_engage,
plgrepeat.weekly_returning_plg_users AS weekly_returning_plg_users,
ent.new_ent_users AS new_ent_users,
ent.new_ent_users_will_engage AS new_ent_users_will_engage,
entrepeat.weekly_returning_ent_users AS weekly_returning_ent_users
FROM weeklyusage AS weekly
JOIN weeklyplgusage AS plg
ON weekly.week = plg.week
JOIN weeklyplgrepeats AS plgrepeat
ON weekly.week = plgrepeat.week
JOIN weeklyentusage AS ent
ON weekly.week = ent.week
JOIN weeklyentrepeats AS entrepeat
ON weekly.week = entrepeat.week
Conclusion
Active user analysis is key for executives from marketing, sales, account management, and product management to understand how the business is functioning and how users are using the product. This information is necessary to ensure that the sales pipeline remains full, new customers continue to be converted and existing customers continue to renew.