This tutorial explains how to remove low-variance features in scikit-learn. It works with an OpenML dataset of 10108 observations and 69 columns, where the goal is to predict who pays for internet.
This tutorial uses:
The data comes from OpenML and is imported using the Python package sklearn.datasets.
Split the data into target and features.
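As a minimal sketch of this split, the snippet below uses a small hand-made DataFrame rather than the OpenML data. The column names, including the target name pays_for_internet, are hypothetical placeholders and are not taken from the real dataset.

```python
import pandas as pd

# Toy stand-in for the OpenML data; all column names here are hypothetical.
df = pd.DataFrame({
    "age": [25, 32, 47, 51],
    "gender": ["F", "M", "F", "M"],
    "pays_for_internet": ["yes", "no", "no", "yes"],
})

target_column = "pays_for_internet"  # assumed target name, for illustration only

# The target is a single column; the features are everything else.
y = df[target_column]
X = df.drop(columns=[target_column])

print(X.shape, y.shape)
```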
Encode the categorical variables prior to feature selection.
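One possible way to do the encoding is scikit-learn's OrdinalEncoder, sketched below on toy data. This is an assumption: the encoder used in the original tutorial is not stated, and the column names are hypothetical.

```python
import pandas as pd
from sklearn.preprocessing import OrdinalEncoder

# Toy categorical features; names and values are hypothetical.
X = pd.DataFrame({
    "country": ["US", "US", "UK", "US"],
    "browser": ["a", "b", "a", "c"],
})

# OrdinalEncoder maps each category to an integer code (as a float),
# assigning codes in sorted order of the category labels.
encoder = OrdinalEncoder()
X_encoded = pd.DataFrame(
    encoder.fit_transform(X),
    columns=X.columns,
    index=X.index,
)

print(X_encoded)
```

The choice of encoder matters for this technique, because the variance of an encoded column depends on the numeric codes assigned. One-hot encoding is a common alternative; it yields binary columns whose variance is p(1 - p), where p is the share of rows in that category.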
Start with 63 features.
Select those features with a variance greater than 0.0025.
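The selection step can be sketched with scikit-learn's VarianceThreshold. The example below runs on a toy, already-numeric DataFrame with hypothetical column names, so the resulting shapes do not match the OpenML data.

```python
import pandas as pd
from sklearn.feature_selection import VarianceThreshold

# Toy numeric features; column names are hypothetical.
X_encoded = pd.DataFrame({
    "constant": [1.0, 1.0, 1.0, 1.0, 1.0, 1.0],           # variance 0
    "tiny_spread": [0.00, 0.01, 0.00, 0.01, 0.00, 0.01],  # variance far below 0.0025
    "binary": [0, 1, 0, 1, 1, 0],                         # variance 0.25
    "count": [3, 7, 1, 9, 4, 6],                          # large variance
})

# Keep only features whose training-set variance exceeds the threshold.
selector = VarianceThreshold(threshold=0.0025)
X_selected = selector.fit_transform(X_encoded)

print("Variances:", selector.variances_)
print("Shape before:", X_encoded.shape)
print("Shape after:", X_selected.shape)
```

VarianceThreshold looks only at the features and never at the target, so it can be fit without y. The variance depends on each feature's scale, so a single threshold such as 0.0025 is most meaningful when the features are on comparable scales.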
The get_support method returns a boolean mask over the input features, which can be used to generate the list of features that were kept.
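A short sketch of that lookup, again on toy data with hypothetical column names:

```python
import pandas as pd
from sklearn.feature_selection import VarianceThreshold

# Toy numeric features; column names are hypothetical.
X_encoded = pd.DataFrame({
    "constant": [1.0, 1.0, 1.0, 1.0, 1.0, 1.0],
    "tiny_spread": [0.00, 0.01, 0.00, 0.01, 0.00, 0.01],
    "binary": [0, 1, 0, 1, 1, 0],
    "count": [3, 7, 1, 9, 4, 6],
})

selector = VarianceThreshold(threshold=0.0025).fit(X_encoded)

# get_support() returns one boolean per input column: True means kept.
mask = selector.get_support()
kept = list(X_encoded.columns[mask])
dropped = list(X_encoded.columns[~mask])

print("Kept:", kept)
print("Dropped:", dropped)
```

Passing indices=True to get_support returns the integer positions of the kept columns instead of a boolean mask.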