This tutorial explains how to use the MinMax scaler from scikit-learn. This scaler normalizes each feature using only its minimum and maximum values, transforming the feature to a value between 0 and 1.
This tutorial will use data for flights in and out of NYC in 2013.
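To make the scaling rule concrete, here is a minimal pure-Python sketch of the arithmetic: each value has the feature minimum subtracted and is then divided by the feature range. The helper name `min_max_scale` and the sample distances are made up for illustration and are not part of scikit-learn.

```python
def min_max_scale(values):
    """Rescale a list of numbers to [0, 1] using only its min and max."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Constant feature: return zeros to avoid dividing by zero
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Made-up flight distances, for illustration only
distances = [200, 500, 1100, 2200]
scaled = min_max_scale(distances)
print(scaled)  # → [0.0, 0.15, 0.45, 1.0]
```

The smallest value maps to 0, the largest to 1, and everything else falls in between in proportion to its position in the range.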
This tutorial uses:
Open up a new Jupyter Notebook and import the following:
The data comes from Rdatasets and is imported using the Python package statsmodels.
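One way to load the data is sketched below. It assumes the table is the `flights` dataset from the R package `nycflights13`, which is a guess based on the description above rather than something stated in the tutorial. The helper name `load_flights` is hypothetical.

```python
def load_flights():
    """Download the flights table from Rdatasets as a pandas DataFrame."""
    # Imported lazily so the function can be defined before statsmodels is needed
    import statsmodels.api as sm

    # Assumption: dataset "flights" in the R package "nycflights13"
    dataset = sm.datasets.get_rdataset("flights", "nycflights13")
    return dataset.data
```

Calling `flights = load_flights()` needs network access. Afterwards, `flights.isnull().sum()` gives a per-column count of missing values.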
The null values are caused by flights that were cancelled or diverted. As this model will predict arrival delay, these flights can be excluded from this analysis.
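A minimal sketch of that filtering step, using a small made-up table in place of the real data. The column names (`carrier`, `distance`, `arr_delay`) are assumptions about the dataset's layout.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the flights table; names and values are illustrative
flights = pd.DataFrame({
    "carrier": ["UA", "AA", "B6", "UA"],
    "distance": [1400, 1089, 1576, 719],
    "arr_delay": [11.0, np.nan, -18.0, np.nan],
})

# Keep only flights that have a recorded arrival delay
flights_clean = flights.dropna(subset=["arr_delay"]).reset_index(drop=True)
print(len(flights), "->", len(flights_clean))
```

Restricting `dropna` to the target column avoids discarding rows that are only missing values in columns the model does not use.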
We convert the categorical features to numerical ones using the leave-one-out encoder from category_encoders. This leaves a single numeric feature in place of each existing categorical feature, which is needed to apply the scaler to all features in the training data.
We apply the MinMax scaler from scikit-learn.
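A minimal sketch of this step, assuming the training features are already numeric. The toy values below stand in for the encoded training data and are made up.

```python
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# Toy numeric training features; values are illustrative
X_train = pd.DataFrame({
    "carrier": [5.5, 12.0, -3.0, 8.0],
    "distance": [200.0, 500.0, 1100.0, 2200.0],
})

# Learn each column's min and max, then rescale the column to [0, 1]
scaler = MinMaxScaler()
X_train_scaled = pd.DataFrame(
    scaler.fit_transform(X_train),
    columns=X_train.columns,
    index=X_train.index,
)
print(X_train_scaled)
```

Wrapping the result in a DataFrame keeps the column names, since `fit_transform` returns a plain NumPy array by default.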
That should return a table resembling this:
Scale the test set using the scaler fitted on the training data. The scaled test set can now be passed into the predict or predict_proba functions of a trained model.
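The sketch below illustrates scaling a test set with a scaler fitted only on training data. The data are made up, and the single `distance` column is an assumed name.

```python
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# Toy data; values are illustrative
X_train = pd.DataFrame({"distance": [200.0, 500.0, 1100.0, 2200.0]})
X_test = pd.DataFrame({"distance": [700.0, 2700.0]})

# Fit on the training data only
scaler = MinMaxScaler().fit(X_train)

# Use transform (not fit_transform) so the training min and max are reused
X_test_scaled = pd.DataFrame(
    scaler.transform(X_test),
    columns=X_test.columns,
    index=X_test.index,
)
print(X_test_scaled)
```

Because the minimum and maximum come from the training data, test values outside the training range land outside [0, 1]. Here the 2700 flight scales to roughly 1.25.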