Version 1.6.X
=============

Version 1.6.2
-------------

Deployed: 18th September 2023

Contributors
~~~~~~~~~~~~

- `Giorgio Segalla `_
- `David Cortes `_
- `Kyle Gilde `_
- `Darigov Research `_
- `Soledad Galli `_

New functionality
~~~~~~~~~~~~~~~~~

- `MatchVariables()` can now also match the **dtypes** of the variables (`Kyle Gilde `_)
- `DatetimeFeatures()` and `DatetimeSubtraction()` can now specify the format of the datetime variables (`Soledad Galli `_)
- Add `inverse_transform` method to `YeoJohnsonTransformer()` (`Giorgio Segalla `_)

Bug fixes
~~~~~~~~~

These bugs were introduced by the latest releases of pandas, Scikit-learn and SciPy.

- Fix failing test for `YeoJohnsonTransformer()` (`Soledad Galli `_)
- Fix failing test for `RareLabelEncoder()` (`Soledad Galli `_)
- Fix failing test for `DatetimeFeatures()` (`Soledad Galli `_)
- Fix failing test for many encoders: removed `downcast=infer` as it will be deprecated (`Soledad Galli `_)
- Fix version-related failing style checks (`Soledad Galli `_)
- Fix version-related failing type checks (`Soledad Galli `_)
- Fix version-related failing doc checks (`Soledad Galli `_)
- Fix future warning in categorical imputation (`Soledad Galli `_)

Code improvements
~~~~~~~~~~~~~~~~~

- Routine in `DatetimeFeatures()` no longer enters our check for `utc=True` when working with different timezones (`Soledad Galli `_)
- Improve performance in `OneHotEncoder()` (`Soledad Galli `_)
- Add check for duplicated variable names in dataframe (`David Cortes `_)

Documentation
~~~~~~~~~~~~~

- Fix various typos in user guide (`Soledad Galli `_)
- Update readthedocs.yml file (`Soledad Galli `_)
- Add link to license in Readme (`Darigov Research `_)

Version 1.6.1
-------------

Deployed: 8th June 2023

Contributors
~~~~~~~~~~~~

- `dlaprins `_
- `Claudio Salvatore Arcidiacono `_
- `Morgan Sell `_
- `Gleb Levitski `_
- `Soledad Galli `_

In this release, we make Feature-engine compatible with pandas 2.0, extend the functionality of some
transformers, and fix bugs introduced in the previous release.

Thank you so much to all contributors, to `Gleb Levitski `_ and `Claudio Salvatore Arcidiacono `_ for helping
with reviews, and to those of you who created issues flagging bugs or requesting new functionality.

New functionality
~~~~~~~~~~~~~~~~~

- The Population Stability Index can now be used to evaluate categorical variables (`dlaprins `_ and `Claudio Salvatore Arcidiacono `_)
- `RelativeFeatures` has the option to add a constant to avoid dividing by zero (`Morgan Sell `_ and `Soledad Galli `_)
- `SelectByShuffling` now accepts sample weights (`Soledad Galli `_)
- `WoEEncoder` now lets you know which variables fail in the encoding (`Soledad Galli `_)
- `WoEEncoder` has the option to add a constant to avoid dividing by zero (`Soledad Galli `_)

Bug fixes
~~~~~~~~~

- Fixed various bugs in `RareLabelEncoder()` (`Soledad Galli `_)
- Renamed `transform` method in base classes to `check_transform_input_and_state`, which fixed bugs raised when using `set_output(transform="pandas")` in various classes (`Soledad Galli `_ and `Claudio Salvatore Arcidiacono `_)

Code improvements
~~~~~~~~~~~~~~~~~

- Made code base compatible with pandas 2.0 (`Claudio Salvatore Arcidiacono `_)
- Moved docstrings of selection transformers to docstrings module (`Soledad Galli `_)

Version 1.6.0
-------------

Deployed: 16th March 2023

Contributors
~~~~~~~~~~~~

- `Gleb Levitski `_
- `Morgan Sell `_
- `Alfonso Tobar `_
- `Nodar Okroshiashvili `_
- `Luís Seabra `_
- `Kyle Gilde `_
- `Soledad Galli `_

In this release, we make Feature-engine transformers compatible with the `set_output` API from Scikit-learn,
which was released in version 1.2.0. We also make Feature-engine compatible with the newest direction of pandas,
in removing the `inplace` functionality that our transformers use under the hood. We introduce a major change:
most of the **categorical encoders can now encode variables even if they have missing data**.
We are also releasing **3 brand new transformers**: one for discretization, one for feature selection and one
for operations between datetime variables. We also made a major improvement in the performance of
`DropDuplicateFeatures` and fixed some smaller bugs here and there.

We'd like to thank all contributors for fixing bugs and expanding the functionality and documentation of
Feature-engine. Thank you so much to all contributors and to those of you who created issues flagging bugs or
requesting new functionality.

New transformers
~~~~~~~~~~~~~~~~

- **ProbeFeatureSelection**: introduces random features and selects variables whose importance is greater than the random ones (`Morgan Sell `_ and `Soledad Galli `_)
- **DatetimeSubtraction**: creates new features by subtracting datetime variables (`Kyle Gilde `_ and `Soledad Galli `_)
- **GeometricWidthDiscretiser**: sorts continuous variables into intervals determined by geometric progression (`Gleb Levitski `_)

New functionality
~~~~~~~~~~~~~~~~~

- Allow categorical encoders to encode variables with NaN (`Soledad Galli `_)
- Make transformers compatible with new `set_output` functionality from sklearn (`Soledad Galli `_)
- The `ArbitraryDiscretiser()` now includes the lowest limits in the intervals (`Soledad Galli `_)

New modules
~~~~~~~~~~~

- New **Datasets** module with functions to load specific datasets (`Alfonso Tobar `_)
- New **variable_handling** module with functions to automatically select numerical, categorical, or datetime variables (`Soledad Galli `_)

Bug fixes
~~~~~~~~~

- Fixed bug in `DropFeatures()` (`Luís Seabra `_)
- Fixed bug in `RecursiveFeatureElimination()` caused when only 1 feature remained in data (`Soledad Galli `_)

Documentation
~~~~~~~~~~~~~

- Add example code snippets to the selection module API docs (`Alfonso Tobar `_)
- Add example code snippets to the outlier module API docs (`Alfonso Tobar `_)
- Add example code snippets to the transformation module API docs (`Alfonso Tobar `_)
- Add example code snippets to the time series module API docs (`Alfonso Tobar `_)
- Add example code snippets to the preprocessing module API docs (`Alfonso Tobar `_)
- Add example code snippets to the wrapper module API docs (`Alfonso Tobar `_)
- Updated documentation using new Dataset module (`Alfonso Tobar `_ and `Soledad Galli `_)
- Reorganized Readme badges (`Gleb Levitski `_)
- New Jupyter notebooks for `GeometricWidthDiscretiser` (`Gleb Levitski `_)
- Fixed typos (`Gleb Levitski `_)
- Remove examples using the Boston house dataset (`Soledad Galli `_)
- Update sponsor page and contribute page (`Soledad Galli `_)

Deprecations
~~~~~~~~~~~~

- The class `PRatioEncoder` is no longer supported and was removed from the API (`Soledad Galli `_)

Code improvements
~~~~~~~~~~~~~~~~~

- Massive improvement in the performance (speed) of `DropDuplicateFeatures()` (`Nodar Okroshiashvili `_)
- Remove `inplace` and other issues related to pandas' new direction (`Luís Seabra `_)
- Move most docstrings to dedicated docstrings module (`Soledad Galli `_)
- Unnest tests for encoders (`Soledad Galli `_)
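
For readers unfamiliar with the idea behind `GeometricWidthDiscretiser` (listed under the new transformers in version 1.6.0), the sketch below shows one way to build interval limits whose widths follow a geometric progression. It is a plain-Python illustration and not Feature-engine's implementation: the helper name `geometric_width_limits` and the choice to split `log(1 + range)` evenly are assumptions made for this example.

```python
import math


def geometric_width_limits(minimum, maximum, bins):
    """Return bins + 1 interval limits whose widths grow geometrically.

    Hypothetical helper for illustration only; not part of Feature-engine.
    """
    if bins < 1:
        raise ValueError("bins must be a positive integer")
    if maximum <= minimum:
        raise ValueError("maximum must be greater than minimum")
    # Split log(1 + range) into equal steps, so that each interval is wider
    # than the previous one by the constant factor exp(step).
    step = math.log1p(maximum - minimum) / bins
    return [minimum + math.expm1(i * step) for i in range(bins + 1)]


limits = geometric_width_limits(0.0, 15.0, bins=4)
widths = [high - low for low, high in zip(limits, limits[1:])]
print(limits)  # approximately 0, 1, 3, 7, 15, up to floating-point rounding
print(widths)  # each width is about twice the previous one
```

With a range of 15 and 4 bins the growth factor is 2, so narrow intervals cover the low end of the variable and wide intervals cover the high end, which is the usual motivation for geometric-width binning on right-skewed data.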