To SMOTE or not to SMOTE?
Summary of the paper https://arxiv.org/pdf/2201.08528.pdf
This paper examines the effect of balancing techniques (SMOTE, under/oversampling) on classification with imbalanced data.
- When the objective metric is proper: a metric is proper when it is optimized by a classifier that predicts the true class probabilities. For example, the Brier score is proper: for a sample whose true positive-class probability is p, the expected score p(1 - q)^2 + (1 - p)q^2 of a prediction q is minimized at q = p. AUC is not proper in general, but it is proper under the i.i.d. assumption.
  - Empirically, balancing can improve prediction performance for weak classifiers (MLP, SVM, decision tree, AdaBoost, and LGBM) but not for the SOTA classifiers (XGBoost and CatBoost). Strong classifiers without balancing still yield better prediction quality than weak classifiers with balancing.
- When the objective is a label metric:
  - Fixed threshold:
    - Balancing considerably improved prediction performance for all classifiers.
  - Optimized threshold:
    - Strong and medium classifiers: balancing and optimizing the decision threshold provide similar prediction quality. However, optimizing the decision threshold is recommended because it is simpler and cheaper to compute.
    - Very weak classifiers (MLP and SVM): balancing the data is significantly more beneficial than optimizing the decision threshold. Nevertheless, the resulting prediction quality is still significantly worse than that of a strong classifier (without oversampling).
  - When balancing (instead of optimizing the decision threshold), SMOTE-like methods were not significantly better than the simple random oversampler.
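
The two alternatives compared above, optimizing the decision threshold and simple random oversampling, can be sketched in a few lines of plain Python. This is a minimal illustration and not the paper's code: the function names `f1_score`, `best_threshold`, and `random_oversample` are hypothetical, F1 is assumed here as one example of a label metric, and the toy scores stand in for a classifier's predicted probabilities on a held-out validation set.

```python
import random


def f1_score(y_true, y_pred):
    """F1 for binary labels in {0, 1}; returns 0.0 when it is undefined."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def best_threshold(y_true, scores):
    """Pick the threshold (among the observed scores) that maximizes F1.

    Starts from the default 0.5 threshold and only moves away from it
    when a candidate gives a strictly better F1.
    """
    best_t = 0.5
    best_f1 = f1_score(y_true, [int(s >= best_t) for s in scores])
    for t in sorted(set(scores)):
        f1 = f1_score(y_true, [int(s >= t) for s in scores])
        if f1 > best_f1:
            best_t, best_f1 = t, f1
    return best_t, best_f1


def random_oversample(X, y, seed=0):
    """Duplicate randomly chosen rows until every class matches the largest one."""
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(X, y):
        by_class.setdefault(label, []).append(row)
    target = max(len(rows) for rows in by_class.values())
    X_out, y_out = [], []
    for label, rows in by_class.items():
        extra = [rng.choice(rows) for _ in range(target - len(rows))]
        for row in rows + extra:
            X_out.append(row)
            y_out.append(label)
    return X_out, y_out


# Toy validation set (made up for illustration): 8 negatives, 2 positives.
y_val = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
scores = [0.05, 0.10, 0.10, 0.15, 0.20, 0.20, 0.25, 0.40, 0.30, 0.45]

# Option 1: keep the training data as is and tune the threshold on validation data.
f1_default = f1_score(y_val, [int(s >= 0.5) for s in scores])
threshold, f1_tuned = best_threshold(y_val, scores)

# Option 2: balance the training data instead (features here are placeholders).
X_train = [[i] for i in range(10)]
X_bal, y_bal = random_oversample(X_train, y_val)
```

On this toy data no score reaches 0.5, so the default threshold predicts no positives at all, while the tuned threshold recovers both. Threshold tuning needs only one trained model and a pass over the validation scores, which is the simplicity and compute argument made above; oversampling requires retraining on the enlarged data set.
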
Conclusion
- Scenarios in which SMOTE-like oversampling can improve prediction performance and should be applied:
  - For a proper metric:
    - Balancing is significantly effective when using a weak classifier.
  - For a label metric:
    - When threshold optimization is possible:
      - Balancing was beneficial (over optimizing the decision threshold) only for the weak MLP and SVM classifiers. The best prediction quality for them was achieved by oversampling with SMOTE.
    - When it is not possible to optimize the decision threshold