Releases: scikit-learn-contrib/imbalanced-learn
Version 0.8.1
September 29, 2021
Maintenance
Make imbalanced-learn compatible with scikit-learn 1.0. #864 by Guillaume Lemaitre.
Version 0.8.0
February 18, 2021
Changelog
New features
- Add the function imblearn.metrics.macro_averaged_mean_absolute_error returning the average across classes of the MAE. This metric is used in ordinal classification. #780 by Aurélien Massiot.
- Add the class imblearn.metrics.pairwise.ValueDifferenceMetric to compute pairwise distances between samples containing only categorical values. #796 by Guillaume Lemaitre.
- Add the class imblearn.over_sampling.SMOTEN to over-sample data containing only categorical features. #802 by Guillaume Lemaitre.
- Add the possibility to pass any type of sampler to imblearn.ensemble.BalancedBaggingClassifier, unlocking the implementation of methods based on resampled bagging. #808 by Guillaume Lemaitre.
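To illustrate what "the average across classes of the MAE" means, here is a minimal pure-Python sketch. It is not the library's implementation; the helper name macro_averaged_mae and the toy labels are assumptions made for the example.

```python
from collections import defaultdict


def macro_averaged_mae(y_true, y_pred):
    """Average the per-class mean absolute error over the classes in y_true.

    Hypothetical helper for illustration only, not the imblearn function.
    """
    abs_errors = defaultdict(list)
    for truth, pred in zip(y_true, y_pred):
        # Group the absolute errors by the true class of each sample.
        abs_errors[truth].append(abs(truth - pred))
    # One MAE per class, then an unweighted mean over the classes.
    per_class_mae = [sum(errs) / len(errs) for errs in abs_errors.values()]
    return sum(per_class_mae) / len(per_class_mae)


# Imbalanced ordinal labels: class 1 dominates.
y_true = [1, 1, 1, 1, 2, 3]
y_pred = [1, 1, 1, 2, 3, 1]

plain_mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)
macro_mae = macro_averaged_mae(y_true, y_pred)
print(plain_mae, macro_mae)
```

Because every class contributes equally, errors on the rare classes 2 and 3 weigh more in the macro-averaged value than in the plain MAE, which is dominated by the majority class.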
Enhancements
- Add the option output_dict in imblearn.metrics.classification_report_imbalanced to return a dictionary instead of a string. #770 by Guillaume Lemaitre.
- Add an option to generate a smoothed bootstrap in imblearn.over_sampling.RandomOverSampler. It is controlled by the parameter shrinkage. This method is also known as Random Over-Sampling Examples (ROSE). #754 by Andrea Lorenzon and Guillaume Lemaitre.
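The idea of a smoothed bootstrap can be sketched in a few lines of pure Python: draw samples with replacement, then perturb each draw with Gaussian noise whose scale is governed by a shrinkage factor. This is a simplified sketch, not the library's code. The function name smoothed_bootstrap is hypothetical, and the noise scale (shrinkage times the per-feature standard deviation) is an assumption; the library may use a different bandwidth.

```python
import random
import statistics


def smoothed_bootstrap(samples, n_new, shrinkage, seed=0):
    """Draw n_new samples with replacement and add scaled Gaussian noise.

    Hypothetical helper; assumes noise std = shrinkage * per-feature std.
    """
    rng = random.Random(seed)
    n_features = len(samples[0])
    # Per-feature standard deviation of the class being over-sampled.
    stds = [
        statistics.pstdev([row[j] for row in samples])
        for j in range(n_features)
    ]
    new_samples = []
    for _ in range(n_new):
        base = rng.choice(samples)  # plain bootstrap draw
        new_samples.append(
            [x + rng.gauss(0.0, 1.0) * shrinkage * s for x, s in zip(base, stds)]
        )
    return new_samples


minority = [[1.0, 2.0], [1.5, 1.8], [0.9, 2.4]]
exact_copies = smoothed_bootstrap(minority, n_new=5, shrinkage=0.0)
smoothed = smoothed_bootstrap(minority, n_new=5, shrinkage=0.5)
```

With shrinkage set to 0 the sketch reduces to plain random over-sampling (exact duplicates); larger values spread the new samples around the originals.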
Bug fixes
- Fix a bug in imblearn.under_sampling.ClusterCentroids where voting="hard" could have led to selecting a sample from any class instead of the targeted class. #769 by Guillaume Lemaitre.
- Fix a bug in imblearn.FunctionSampler where validation was performed even with validate=False when calling fit. #790 by Guillaume Lemaitre.
Maintenance
- Remove requirements files in favour of adding the packages in the extras_require within the setup.py file. #816 by Guillaume Lemaitre.
- Change the website template to use pydata-sphinx-theme. #801 by Guillaume Lemaitre.
Deprecation
- The context manager imblearn.utils.testing.warns is deprecated in 0.8 and will be removed in 1.0. #815 by Guillaume Lemaitre.
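The changelog does not name a replacement here. As one possible approach, warnings can be checked with the standard library alone; pytest.warns is another common option in test suites. The function legacy_helper below is hypothetical and exists only for the example.

```python
import warnings


def legacy_helper():
    # Hypothetical function used only for illustration.
    warnings.warn("legacy_helper is deprecated", FutureWarning)
    return 42


with warnings.catch_warnings(record=True) as caught:
    # Make sure no warning is suppressed by an existing filter.
    warnings.simplefilter("always")
    result = legacy_helper()

# Keep only the warnings of the category we care about.
future_warnings = [w for w in caught if issubclass(w.category, FutureWarning)]
assert len(future_warnings) == 1
```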
Version 0.7.0
A release to bump the minimum version of scikit-learn to 0.23 with a couple of bug fixes.
Check the what's new for more information.
Version 0.6.2
Version 0.6.1
This is a bug-fix release to primarily resolve some packaging issues in version 0.6.0. It also includes minor documentation improvements and some bug fixes.
Changelog
Bug fixes
- Fix a bug in :class:imblearn.ensemble.BalancedRandomForestClassifier leading to a wrong number of samples used during fitting due to max_samples and therefore a bad computation of the OOB score. :pr:656 by :user:Guillaume Lemaitre <glemaitre>.
Version 0.6.0
Changelog
Changed models
..............
The following models might give some different sampling due to changes in
scikit-learn:
- :class:imblearn.under_sampling.ClusterCentroids
- :class:imblearn.under_sampling.InstanceHardnessThreshold
The following samplers will give different results due to changes in the
internal usage of the random state:
- :class:imblearn.over_sampling.SMOTENC
Bug fixes
.........
- :class:imblearn.under_sampling.InstanceHardnessThreshold now takes into account the random_state and will give deterministic results. In addition, cross_val_predict is used to take advantage of the parallelism. :pr:599 by :user:Shihab Shahriar Khan <Shihab-Shahriar>.
- Fix a bug in :class:imblearn.ensemble.BalancedRandomForestClassifier leading to a wrong computation of the OOB score. :pr:656 by :user:Guillaume Lemaitre <glemaitre>.
Maintenance
...........
- Update imports from scikit-learn after some modules were made private. The following imports have been changed: :class:sklearn.ensemble._base._set_random_states, :class:sklearn.ensemble._forest._parallel_build_trees, :class:sklearn.metrics._classification._check_targets, :class:sklearn.metrics._classification._prf_divide, :class:sklearn.utils.Bunch, :class:sklearn.utils._safe_indexing, :class:sklearn.utils._testing.assert_allclose, :class:sklearn.utils._testing.assert_array_equal, :class:sklearn.utils._testing.SkipTest. :pr:617 by :user:Guillaume Lemaitre <glemaitre>.
- Synchronize :mod:imblearn.pipeline with :mod:sklearn.pipeline. :pr:620 by :user:Guillaume Lemaitre <glemaitre>.
- Synchronize :class:imblearn.ensemble.BalancedRandomForestClassifier and add parameters max_samples and ccp_alpha. :pr:621 by :user:Guillaume Lemaitre <glemaitre>.
Enhancement
...........
- :class:imblearn.under_sampling.RandomUnderSampler, :class:imblearn.over_sampling.RandomOverSampler, and :class:imblearn.datasets.make_imbalance accept a Pandas DataFrame as input and will output a Pandas DataFrame. Similarly, they accept a Pandas Series as input and will output a Pandas Series. :pr:636 by :user:Guillaume Lemaitre <glemaitre>.
- :class:imblearn.FunctionSampler accepts a parameter validate allowing to check or not the input X and y. :pr:637 by :user:Guillaume Lemaitre <glemaitre>.
- :class:imblearn.under_sampling.RandomUnderSampler and :class:imblearn.over_sampling.RandomOverSampler can resample when non-finite values are present in X. :pr:643 by :user:Guillaume Lemaitre <glemaitre>.
- All samplers will output a Pandas DataFrame if a Pandas DataFrame was given as an input. :pr:644 by :user:Guillaume Lemaitre <glemaitre>.
- The sample generation in :class:imblearn.over_sampling.SMOTE, :class:imblearn.over_sampling.BorderlineSMOTE, :class:imblearn.over_sampling.SVMSMOTE, :class:imblearn.over_sampling.KMeansSMOTE, and :class:imblearn.over_sampling.SMOTENC is now vectorized, giving an additional speed-up when X is sparse. :pr:596 by :user:Matt Eding <MattEding>.
Deprecation
...........
- The following classes have been removed after 2 deprecation cycles: ensemble.BalanceCascade and ensemble.EasyEnsemble. :pr:617 by :user:Guillaume Lemaitre <glemaitre>.
- The following functions have been removed after 2 deprecation cycles: utils.check_ratio. :pr:617 by :user:Guillaume Lemaitre <glemaitre>.
- The parameters ratio and return_indices have been removed from all samplers. :pr:617 by :user:Guillaume Lemaitre <glemaitre>.
- The parameters m_neighbors, out_step, kind, svm_estimator have been removed from :class:imblearn.over_sampling.SMOTE. :pr:617 by :user:Guillaume Lemaitre <glemaitre>.
Version 0.5.0
Changed models
The following models or functions might give different results even if the
data X and y are the same.
- :class:imblearn.ensemble.RUSBoostClassifier default estimator changed from :class:sklearn.tree.DecisionTreeClassifier with full depth to a decision stump (i.e., a tree with max_depth=1).
Documentation
- Correct the definition of the ratio when using a float in sampling strategy for the over-sampling and under-sampling. :issue:525 by :user:Ariel Rossanigo <arielrossanigo>.
- Add :class:imblearn.over_sampling.BorderlineSMOTE and :class:imblearn.over_sampling.SVMSMOTE in the API documentation. :issue:530 by :user:Guillaume Lemaitre <glemaitre>.
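As a rough illustration of the float definition mentioned in the first item, the sketch below assumes the binary-class convention documented for imbalanced-learn: for over-sampling the float is the minority count after resampling divided by the majority count, and for under-sampling it is the minority count divided by the majority count after resampling. The helper names are hypothetical and the arithmetic is a sketch, not the library's code.

```python
def minority_count_after_over_sampling(n_majority, ratio):
    """Minority size such that n_minority_resampled / n_majority == ratio."""
    return int(ratio * n_majority)


def majority_count_after_under_sampling(n_minority, ratio):
    """Majority size such that n_minority / n_majority_resampled == ratio."""
    return int(n_minority / ratio)


# Toy binary problem: 1000 majority samples, 100 minority samples.
n_majority, n_minority = 1000, 100

over = minority_count_after_over_sampling(n_majority, ratio=0.5)
under = majority_count_after_under_sampling(n_minority, ratio=0.5)
print(over, under)
```

With a float of 0.5, over-sampling grows the minority class to half the majority size, while under-sampling shrinks the majority class to twice the minority size.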
Enhancement
- Add parallelisation for SMOTEENN and SMOTETomek. :pr:547 by :user:Michael Hsieh <Microsheep>.
- Add :class:imblearn.utils._show_versions. Updated the contribution guide and issue template showing how to print system and dependency information from the command line. :pr:557 by :user:Alexander L. Hayes <batflyer>.
- Add :class:imblearn.over_sampling.KMeansSMOTE which is an over-sampler clustering points before applying SMOTE. :pr:435 by :user:Stephan Heijl <StephanHeijl>.
Maintenance
- Make it possible to import imblearn and access submodules. :pr:500 by :user:Guillaume Lemaitre <glemaitre>.
- Remove support for Python 2, remove deprecation warning from scikit-learn 0.21. :pr:576 by :user:Guillaume Lemaitre <glemaitre>.
Bug
- Fix wrong usage of :class:keras.layers.BatchNormalization in the porto_seguro_keras_under_sampling.py example. The batch normalization was moved before the activation function and the bias was removed from the dense layer. :pr:531 by :user:Guillaume Lemaitre <glemaitre>.
- Fix a bug which converted the sparse matrices to COO format when stacking them in :class:imblearn.over_sampling.SMOTENC. This bug only affected old scipy versions. :pr:539 by :user:Guillaume Lemaitre <glemaitre>.
- Fix bug in :class:imblearn.pipeline.Pipeline where None could be the final estimator. :pr:554 by :user:Oliver Rausch <orausch>.
- Fix bug in :class:imblearn.over_sampling.SVMSMOTE and :class:imblearn.over_sampling.BorderlineSMOTE where the default parameter of n_neighbors was not set properly. :pr:578 by :user:Guillaume Lemaitre <glemaitre>.
- Fix bug by changing the default depth in :class:imblearn.ensemble.RUSBoostClassifier to get a decision stump as a weak learner as in the original paper. :pr:545 by :user:Christos Aridas <chkoar>.
- Allow importing keras directly from tensorflow in :mod:imblearn.keras. :pr:531 by :user:Guillaume Lemaitre <glemaitre>.
0.4.3
Version 0.4.2
Bug fixes
- Fix a bug in imblearn.over_sampling.SMOTENC in which the median of the standard deviation was used instead of half of the median of the standard deviation. By Guillaume Lemaitre in #491.
- Raise an error when passing target which is not supported, i.e. regression target or multilabel targets. Imbalanced-learn does not support this case. By Guillaume Lemaitre in #490.
0.4.1
Version 0.4
October, 2018
Version 0.4 is the last version of imbalanced-learn to support Python 2.7
and Python 3.4. Imbalanced-learn 0.5 will require Python 3.5 or higher.
Highlights
This release brings a set of new features as well as some API changes to
strengthen the foundation of imbalanced-learn.
As a new feature, 2 new modules, imblearn.keras and imblearn.tensorflow,
have been added, in which imbalanced-learn samplers can be used to generate
balanced mini-batches.
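The general idea behind balanced mini-batches can be sketched without any deep-learning dependency: under-sample every class to the size of the smallest one, shuffle, and slice the result into batches. This is a simplified pure-Python sketch under those assumptions; the generator name balanced_batches is hypothetical and this is not the code of imblearn.keras or imblearn.tensorflow.

```python
import random
from collections import defaultdict


def balanced_batches(X, y, batch_size, seed=0):
    """Yield (X_batch, y_batch) pairs drawn from a class-balanced subset.

    Hypothetical helper: randomly under-samples each class to the size of
    the rarest class, shuffles, then slices into mini-batches.
    """
    rng = random.Random(seed)
    indices_by_class = defaultdict(list)
    for idx, label in enumerate(y):
        indices_by_class[label].append(idx)
    # Every class keeps as many samples as the rarest class has.
    n_keep = min(len(idxs) for idxs in indices_by_class.values())
    selected = []
    for idxs in indices_by_class.values():
        selected.extend(rng.sample(idxs, n_keep))
    rng.shuffle(selected)
    for start in range(0, len(selected), batch_size):
        batch = selected[start:start + batch_size]
        yield [X[i] for i in batch], [y[i] for i in batch]


X = list(range(100))
y = [0] * 90 + [1] * 10  # 90 majority samples, 10 minority samples
batches = list(balanced_batches(X, y, batch_size=5))
```

Note that the pool of samples is balanced over a full pass, while an individual batch may still contain slightly more of one class than the other.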
The module imblearn.ensemble has been consolidated with new classifiers:
imblearn.ensemble.BalancedRandomForestClassifier,
imblearn.ensemble.EasyEnsembleClassifier, and
imblearn.ensemble.RUSBoostClassifier.
Support for strings has been added in
imblearn.over_sampling.RandomOverSampler and
imblearn.under_sampling.RandomUnderSampler. In addition, a new class
imblearn.over_sampling.SMOTENC allows generating samples with data sets
containing both continuous and categorical features.
imblearn.over_sampling.SMOTE has been simplified and broken down into 2
additional classes: imblearn.over_sampling.SVMSMOTE and
imblearn.over_sampling.BorderlineSMOTE.
There are also some changes regarding the API: the parameter
sampling_strategy has been introduced to replace the ratio parameter. In
addition, the return_indices argument has been deprecated and all samplers
will expose a sample_indices_ attribute whenever this is possible.