.. _power: .. currentmodule:: feature_engine.transformation PowerTransformer ================ The :class:`PowerTransformer()` applies power or exponential transformations to numerical variables. Let's load the house prices dataset and separate it into train and test sets (more details about the dataset :ref:`here `). .. code:: python import numpy as np import pandas as pd import matplotlib.pyplot as plt from sklearn.model_selection import train_test_split from feature_engine import transformation as vt # Load dataset data = data = pd.read_csv('houseprice.csv') # Separate into train and test sets X_train, X_test, y_train, y_test = train_test_split( data.drop(['Id', 'SalePrice'], axis=1), data['SalePrice'], test_size=0.3, random_state=0) Now we want to apply the square root to 2 variables in the dataframe: .. code:: python # set up the variable transformer tf = vt.PowerTransformer(variables = ['LotArea', 'GrLivArea'], exp=0.5) # fit the transformer tf.fit(X_train) The transformer does not learn any parameters. So we can go ahead and transform the variables: .. code:: python # transform the data train_t= tf.transform(X_train) test_t= tf.transform(X_test) Finally, we can plot the original variable distribution: .. code:: python # un-transformed variable X_train['LotArea'].hist(bins=50) .. image:: ../../images/lotarearaw.png And now the distribution after the transformation: .. code:: python # transformed variable train_t['LotArea'].hist(bins=50) .. image:: ../../images/lotareapower.png More details ^^^^^^^^^^^^ You can find more details about the :class:`PowerTransformer()` here: - `Jupyter notebook `_ For more details about this and other feature engineering methods check out these resources: - `Feature engineering for machine learning `_, online course. - `Python Feature Engineering Cookbook `_, book.