Missing Data Imputation

Feature-engine’s missing data imputers replace missing values either with parameters estimated from the data or with arbitrary values defined by the user.

The following table summarizes each imputer’s functionality:

| Transformer | Numerical variables | Categorical variables | Description |
|---|---|---|---|
| MeanMedianImputer() | √ | × | Replaces missing values with the mean or median |
| ArbitraryNumberImputer() | √ | × | Replaces missing values with an arbitrary value |
| EndTailImputer() | √ | × | Replaces missing values with a value at the end of the distribution |
| CategoricalImputer() | × | √ | Replaces missing values with the most frequent category or an arbitrary string |
| RandomSampleImputer() | √ | √ | Replaces missing values with random value extractions from the variable |
| AddMissingIndicator() | √ | √ | Adds a binary variable to flag missing observations |
| DropMissingData() | √ | √ | Removes observations with missing data from the dataset |