This tutorial explains how to generate feature importance plots with pyrasgo without needing to build machine learning models yourself. The feature importance is calculated from SHAP values from catboost.
During this tutorial you will calculate the SHAP feature importance when predicting arrival delay for flights in and out of NYC in 2013.
This tutorial uses:
Open a new Jupyter notebook and import the following:
If you haven't done so already, head over to https://docs.rasgoml.com/rasgo-docs/onboarding/initial-setup and follow the steps outlined there to create your free account. This account gives you free access to the Rasgo API, which will calculate dataframe profiles, generate feature importance scores, and produce feature explainability for your analysis. In addition, this account allows you to maintain access to your analysis and share it with your colleagues.
The data is from rdatasets imported using the Python package statsmodels.
As this model will predict arrival delay, the null values in the target are caused by flights that were cancelled or diverted. These can be excluded from this analysis.
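One way to drop those rows is pandas' `dropna` restricted to the target column. The sketch below runs on a tiny made-up frame so it works without downloading anything; the column name `arr_delay` and the sample values are assumptions for illustration, not real flight records.

```python
import numpy as np
import pandas as pd

# Tiny stand-in for the flights data (made-up values, assumed column names)
flights = pd.DataFrame({
    "carrier": ["UA", "AA", "B6", "DL"],
    "dep_delay": [2.0, 4.0, np.nan, -6.0],
    "arr_delay": [11.0, np.nan, np.nan, -25.0],
})

# Count the rows with a missing target before dropping them
n_missing = int(flights["arr_delay"].isna().sum())

# Keep only flights with a recorded arrival delay
flights_clean = flights.dropna(subset=["arr_delay"]).reset_index(drop=True)

print(f"Dropped {n_missing} of {len(flights)} rows with no arrival delay")
```

Passing `subset` keeps rows that are missing values in other columns, so only flights without an arrival delay are removed.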
Remove variables that are not of interest to this analysis with the exclude_columns parameter.
This will open another browser window showing the feature importance and return the raw values in response.