This tutorial explains how to aggregate time series data in pandas using the gold and silver price data from rdatasets.

Packages

The documentation for each package used in this tutorial is linked below:

Open up a Jupyter notebook and import the following:


import statsmodels.api as sm
import pandas as pd

Create Initial Dataset

The data is from rdatasets imported using the Python package statsmodels.


df = sm.datasets.get_rdataset('GoldSilver', 'AER').data.reset_index().rename(columns={'index': 'date'})
df.info()

df['date'] = pd.to_datetime(df.date)

Time series aggregation

The pandas function resample can be used to create aggregations on specified windows. Here, a weekly aggregate of the daily gold and silver price data will be created.

First, a datetime index needs to be created from the date column.


df.set_index('date', inplace=True)

The parameter origin is used to specify the day of week to start the aggregation on. January 1, 1978 is a Sunday, so each of these aggregations will start on Sunday and aggregate over the subsequent week.


weekly_resample = df.resample('7D', origin=pd.to_datetime('1978-01-01'))
aggregated_df = weekly_resample.agg(['min', 'mean', 'max', 'std'])
aggregated_df.columns = ['_'.join(col).strip() + '_week' for col in aggregated_df.columns.values]

aggregated_df.head(20)

	gold_min_week	gold_mean_week	gold_max_week	gold_std_week	silver_min_week	silver_mean_week	silver_max_week	silver_std_week
date								
1977-12-25	100.00	100.000	100.00	NaN	223.42	223.420	223.42	NaN
1978-01-01	100.00	100.000	100.00	0.000000	223.42	226.604	229.84	2.596657
1978-01-08	100.88	101.586	102.62	0.796825	227.89	230.050	232.55	1.846334
1978-01-15	100.87	101.832	103.09	0.796976	227.96	229.686	232.49	1.755130
1978-01-22	102.49	103.306	103.62	0.475742	230.37	232.798	234.50	1.777630
1978-01-29	101.96	102.578	103.49	0.636530	226.91	229.116	231.61	1.936835
1978-02-05	101.90	102.572	103.15	0.505193	224.98	227.506	229.90	1.875361
1978-02-12	103.92	104.936	105.87	0.850312	228.16	231.506	234.01	2.412370
1978-02-19	105.14	105.956	107.23	0.822028	229.81	232.392	234.55	1.784550
1978-02-26	106.20	107.076	107.60	0.538451	229.83	231.092	232.22	1.043681
1978-03-05	108.89	109.916	111.03	0.791789	235.33	244.862	248.98	5.619695
1978-03-12	106.05	108.042	110.39	1.638740	248.09	251.940	254.95	2.875239
1978-03-19	102.65	104.458	105.53	1.355865	240.71	243.672	246.05	2.578317
1978-03-26	104.66	106.180	107.29	1.119710	246.81	249.628	253.55	2.764710
1978-04-02	103.33	103.968	105.27	0.771894	241.89	243.780	246.38	1.809627
1978-04-09	102.99	104.120	105.07	0.806908	241.23	243.816	247.27	2.410940
1978-04-16	99.11	99.996	100.93	0.852397	230.53	232.784	235.15	2.045527
1978-04-23	97.45	98.234	99.41	0.856756	225.99	228.588	230.49	1.848194
1978-04-30	98.38	99.778	100.92	1.143119	228.87	232.358	234.71	2.439912
1978-05-07	100.11	100.874	101.51	0.644306	232.08	233.124	234.55	0.905003

No-code/low-code data prep and visualization

Request Demo
Try for Free