dabl.plot.plot_classification_categorical

dabl.plot.plot_classification_categorical(X, target_col, types=None, kind='auto', hue_order=None, **kwargs)

Plots for categorical features in classification.

Creates plots of categorical variable distributions for each target class. Relevant features are identified via mutual information.
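To illustrate what ranking features by mutual information means, here is a minimal pure-Python sketch. It is not dabl's implementation. The helper name mutual_information and the toy data are assumptions made for this example only. It estimates the mutual information between each categorical feature and the target from raw counts, then sorts features from most to least informative.

```python
import math
from collections import Counter


def mutual_information(feature, target):
    """Mutual information (in nats) between two equal-length discrete sequences.

    Hypothetical helper for illustration; not part of dabl.
    """
    n = len(feature)
    joint = Counter(zip(feature, target))
    count_x = Counter(feature)
    count_y = Counter(target)
    mi = 0.0
    for (x, y), n_xy in joint.items():
        # p(x, y) * log(p(x, y) / (p(x) * p(y))), written in terms of counts
        mi += (n_xy / n) * math.log(n_xy * n / (count_x[x] * count_y[y]))
    return mi


# Toy data (made up): "color" determines the target, "size" is independent of it.
target = ["yes", "yes", "no", "no"]
features = {
    "color": ["red", "red", "blue", "blue"],
    "size": ["S", "L", "S", "L"],
}

scores = {name: mutual_information(values, target) for name, values in features.items()}
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)
```

In this toy data, "color" separates the classes perfectly and "size" carries no information about the target, so "color" is ranked first.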

For high cardinality categorical variables (variables with many categories) only the most frequent categories are shown.
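The idea of showing only the most frequent categories can be sketched as follows. This is an illustration under stated assumptions, not dabl's code: the helper keep_most_frequent and the cutoff max_categories are hypothetical, and the number of categories dabl actually keeps is not specified here.

```python
from collections import Counter


def keep_most_frequent(values, max_categories=3):
    """Keep only values whose category is among the most frequent ones.

    Hypothetical helper for illustration; not part of dabl.
    """
    top = {cat for cat, _ in Counter(values).most_common(max_categories)}
    return [v for v in values if v in top]


# Toy data (made up): four categories, two of them rare.
cities = ["Paris", "Paris", "Paris", "Rome", "Rome", "Oslo", "Lima"]
shown = keep_most_frequent(cities, max_categories=2)
print(shown)
```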

Parameters
X : dataframe

Input data including features and target

target_col : str or int

Identifier of the target column in X

types : dataframe of types, optional

Output of detect_types on X. Can be used to avoid recomputing the types.

kind : string, default ‘auto’

Kind of plot to show. Options are ‘count’, ‘proportion’, ‘mosaic’ and ‘auto’. ‘count’ shows raw class counts within categories (can be hard to read with imbalanced classes). ‘proportion’ shows class proportions within categories (can be misleading with imbalanced categories). ‘mosaic’ shows both aspects, but can be a bit busy. ‘auto’ uses mosaic plots for binary classification and counts otherwise.
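The difference between the count and proportion views can be seen on the underlying numbers. The following is a small pure-Python sketch of the quantities each view displays, not of dabl's plotting code. The helpers class_counts and class_proportions and the toy data are assumptions made for this example.

```python
from collections import Counter, defaultdict


def class_counts(categories, classes):
    """Count how often each class occurs within each category."""
    counts = defaultdict(Counter)
    for cat, cls in zip(categories, classes):
        counts[cat][cls] += 1
    return counts


def class_proportions(counts):
    """Normalize the per-category class counts so they sum to 1."""
    return {
        cat: {cls: n / sum(per_class.values()) for cls, n in per_class.items()}
        for cat, per_class in counts.items()
    }


# Toy data (made up): category "a" has 8 samples, category "b" only 2.
categories = ["a"] * 8 + ["b"] * 2
classes = ["pos"] * 4 + ["neg"] * 4 + ["pos"] * 2

counts = class_counts(categories, classes)
proportions = class_proportions(counts)
print(dict(counts))
print(proportions)
```

Here category "b" is 100% "pos" in the proportion view, but that rests on only two samples. The count view makes the small sample visible, and a mosaic plot conveys both pieces of information at once.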