Categorical Encoding
Feature-engine’s categorical encoders replace the categories of the variable with estimated or arbitrary numbers.
Summary of the characteristics of Feature-engine’s encoders
| Transformer | Regression | Classification | Multi-class | Description |
|---|---|---|---|---|
| | √ | √ | √ | Adds dummy variables to represent each category |
| | √ | √ | √ | Replaces categories with an integer |
| | √ | √ | √ | Replaces categories with their count or frequency |
| | √ | √ | x | Replaces categories with the target mean value |
| | x | √ | x | Replaces categories with the weight of the evidence |
| | √ | √ | √ | Replaces categories with the predictions of a decision tree |
| | √ | √ | √ | Groups infrequent categories into a single one |
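To make the descriptions in the table more concrete, here is a minimal plain-Python sketch of what some of these encodings compute. It is not Feature-engine's implementation. The toy data, the variable names, the 0.2 rarity threshold, and the "Rare" label are all assumptions chosen for illustration.

```python
from collections import Counter, defaultdict

# Toy data (assumed for illustration): one categorical variable and a binary target.
colors = ["red", "blue", "red", "green", "red", "blue"]
target = [1, 0, 1, 0, 0, 1]

# Integer encoding: number the categories in order of first appearance.
ordinal_map = {cat: i for i, cat in enumerate(dict.fromkeys(colors))}

# Count and frequency encoding: how often each category occurs.
counts = Counter(colors)
count_map = dict(counts)
frequency_map = {cat: n / len(colors) for cat, n in counts.items()}

# Target mean encoding: average target value observed for each category.
target_sums = defaultdict(float)
for cat, y in zip(colors, target):
    target_sums[cat] += y
mean_map = {cat: target_sums[cat] / counts[cat] for cat in counts}

# Grouping infrequent categories: anything below the (assumed) 0.2 frequency
# threshold is replaced by a single "Rare" label.
grouped = [c if frequency_map[c] >= 0.2 else "Rare" for c in colors]

# Apply one of the mappings to the original values.
encoded_with_counts = [count_map[c] for c in colors]
```

In practice each mapping is learned from the training data and then reused unchanged on new data, which is why the sketch builds the dictionaries first and applies them in a separate step.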
Feature-engine’s categorical encoders encode only variables of type categorical or
object by default. From version 1.1.0, you can set the parameter ignore_format to True
to make the transformers also accept numerical variables as input.
Other categorical encoding libraries
For additional categorical encoding transformations, visit the open-source package Category encoders.