In this tutorial we will cover what a time series aggregate is, why it is important, a step-by-step example using SQL.
A time series aggregate is the aggregate (sum, average) of data points over a specific period of time. This is very important when looking at sales data or assigning value to a key performance indicator.
In more detail, when businesses talk about customer lifetime value (CLV or CLTV) rarely do they actually mean over the entire lifetime of a customer. When calculating the future value for a customer, rarely can you use an infinite amount of time. Thus, the future value is usually calculated either for a fixed time period or with a discounting factor that grows the further into the future you calculate.
Similarly, when calculating the current value, only the total value over a certain time period is considered. This is necessary to calculate the present value of customers with differing years and buying behavior. Otherwise, customers with decades-long histories will generally have significantly higher values than those who have only been customers for a few years.
In addition, when considering CLTV, companies are often interested in their current best customers. An absolute calculation of lifetime value would rank a customer that purchased $10,000 worth of product 10 years ago as more valuable than a customer who purchased $1,000 this year.
However, usually only the recent approximation of a customer’s value is meaningful to the business. Therefore, CLTV is calculated over a reasonably short period of time, ranging from a month to a year. To do this calculation, it is easiest to aggregate the value over a moving or rolling time window.
In this tutorial we will be using SQL to calculate the Lifetime Value of a customer using their buying behaviors (how much they purchased) over a period of time.
Simple rolling or moving calculations can be performed using SQL window functions, but these functions only allow windows to be defined over a set number of rows. Unless there is a known number of rows that is consistent across all customers and time periods, these calculations will not be correct.
Instead, intermediate calculations will be needed to define the appropriate time windows for calculations. These can be done as views, subqueries, or as common table expressions (CTEs). In this case, CTEs will be used to define the four-week and fifty-two-week CLTV for each customer in the AdventureWorks internet sales data.
By following the steps above you have combined the power of a Time Series Aggregate and SQL to find the CLTV from data directly in your cloud data warehouse. Please feel free to reach out to a member of our team if you have any additional questions.