This tutorial explains how to create a moving average in pandas using the gold and silver price data from rdatasets.
Packages
The documentation for each package used in this tutorial is linked below:
Open up a Jupyter notebook and import the following:
import statsmodels.api as sm
import pandas as pd
Create initial dataset
The data is from rdatasets imported using the Python package statsmodels.
df = sm.datasets.get_rdataset('GoldSilver', 'AER').data.reset_index().rename(columns={'index': 'date'})
df.info()
df['date'] = pd.to_datetime(df.date)
Moving Average
To create a moving average, a rolling window first needs to be created using the pandas function rolling. Then any aggregation function, sum, mean, std, etc.
df.sort_values('date', inplace=True)
df['silver_moving_average_5'] = df['silver'].rolling(5).mean()
your dataframe should look like this:
date gold silver silver_moving_average_5
5 1978-01-06 100.00 227.19 226.604
6 1978-01-09 101.23 229.62 227.844
7 1978-01-10 100.95 228.97 227.670
8 1978-01-11 102.25 231.22 228.998
9 1978-01-12 100.88 227.89 228.978
Alternatively, a rolling window could be created and multiple aggregations applied to it.
rolling_5 = df.rolling(5)
df['silver_moving_average_5_two'] = rolling_5['silver'].mean()
df['silver_moving_sum_5'] = rolling_5['silver'].sum()
date gold silver silver_moving_average_5 silver_moving_average_5_two silver_moving_sum_5
5 1978-01-06 100.00 227.19 226.604 226.604 1133.02
6 1978-01-09 101.23 229.62 227.844 227.844 1139.22
7 1978-01-10 100.95 228.97 227.670 227.670 1138.35
8 1978-01-11 102.25 231.22 228.998 228.998 1144.99
9 1978-01-12 100.88 227.89 228.978 228.978 1144.89
This creates a moving average based on the last five observations, but rolling can also take an offset to specify the time to use in the rolling window. For example, 5D can be used as an offset for 5 days.
First, a datetime index must be created from date.
df.set_index('date', inplace=True)
rolling_5d = df.rolling('5D')
df['silver_moving_average_5D'] = rolling_5d['silver'].mean()
df['silver_moving_sum_5D'] = rolling_5d['silver'].sum()
gold silver silver_moving_average_5 silver_moving_average_5_two silver_moving_sum_5 silver_moving_average_5D silver_moving_sum_5D
date
1978-01-06 100.00 227.19 226.604 226.604 1133.02 226.604000 1133.02
1978-01-09 101.23 229.62 227.844 227.844 1139.22 228.266667 684.80
1978-01-10 100.95 228.97 227.670 227.670 1138.35 228.593333 685.78
1978-01-11 102.25 231.22 228.998 228.998 1144.99 229.936667 689.81
1978-01-12 100.88 227.89 228.978 228.978 1144.89 229.425000 917.70