Enhancing Algorithm Interpretability and Accuracy with Borderline-SMOTE and Bayesian Optimization
Published 2024-04-28
Keywords
- Credit score; Machine learning; Borderline-SMOTE; Bayesian optimization; XGBoost; SHAP tree interpreter
This work is licensed under a Creative Commons Attribution 4.0 International License.
Abstract
In recent years, machine learning technologies have made significant advances across various domains. However, in sectors such as credit scoring and healthcare, the limited interpretability of algorithms has raised concerns, especially for tasks that demand high security, and has occasionally led organizations to suboptimal decisions. Improving both the accuracy and the interpretability of algorithmic models is therefore crucial for sound decision-making. To address this, the Borderline-SMOTE method is proposed for data balancing, incorporating a control factor, posFac, to finely adjust the randomness in generating new samples. Additionally, a Bayesian optimization approach is used to tune the XGBoost model. SHAP values are then employed to interpret and analyze the predictions of the optimized XGBoost model, identifying the most influential features and characterizing the input features. This approach improves not only the predictive accuracy of the XGBoost model but also its interpretability, paving the way for broader research and application in various fields.
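To make the oversampling step concrete, here is a minimal pure-Python sketch of Borderline-SMOTE: minority samples whose nearest neighbours are mostly majority-class ("danger" points) are selected, and synthetic samples are interpolated between them and nearby minority points. The `pos_fac` parameter is an assumed stand-in for the paper's posFac control factor, modeled here as a scale on the random interpolation gap; the paper does not specify its exact form, so this is an illustrative interpretation, not the authors' implementation.

```python
import random

def borderline_smote(minority, majority, k=3, n_new=6, pos_fac=0.5, seed=0):
    """Toy Borderline-SMOTE on small lists of feature vectors.

    pos_fac is a hypothetical stand-in for the paper's posFac factor:
    it scales the random interpolation gap, biasing synthetic samples
    toward the original danger point when pos_fac < 1.
    """
    rng = random.Random(seed)

    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    labelled = [(p, 1) for p in minority] + [(p, 0) for p in majority]

    # Step 1: find "danger" minority points -- those whose k nearest
    # neighbours are at least half majority-class (but not all, which
    # would mark the point as noise).
    danger = []
    for p in minority:
        nn = sorted((q for q in labelled if q[0] != p),
                    key=lambda q: dist2(p, q[0]))[:k]
        maj = sum(1 for _, label in nn if label == 0)
        if k / 2 <= maj < k:
            danger.append(p)

    # Step 2: interpolate between each sampled danger point and one of
    # its minority-class neighbours; pos_fac scales the random gap.
    synthetic = []
    for _ in range(n_new):
        p = rng.choice(danger)
        nbrs = sorted((q for q in minority if q != p),
                      key=lambda q: dist2(p, q))[:k]
        q = rng.choice(nbrs)
        gap = rng.random() * pos_fac
        synthetic.append([pi + gap * (qi - pi) for pi, qi in zip(p, q)])
    return synthetic
```

In practice one would use `imblearn.over_sampling.BorderlineSMOTE` on the full feature matrix; this sketch only shows the danger-point selection and gap-scaled interpolation logic.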
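The SHAP analysis mentioned above rests on Shapley values: each feature's contribution is its average marginal effect over all coalitions of the other features. The brute-force sketch below computes exact Shapley values for a tiny model by replacing "absent" features with baseline values; SHAP's tree interpreter obtains the same quantities efficiently for tree ensembles such as XGBoost. The `predict`, `x`, and `baseline` names are illustrative, not from the paper.

```python
from itertools import combinations
from math import factorial

def shapley_values(predict, x, baseline):
    """Exact Shapley values of predict at x, enumerating all coalitions.

    Features outside a coalition are set to their baseline value.
    Exponential in the number of features -- fine for a demo, which is
    why SHAP uses specialised algorithms for real models.
    """
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for r in range(len(others) + 1):
            for s in combinations(others, r):
                # Classic Shapley weight |S|! (n - |S| - 1)! / n!
                w = factorial(len(s)) * factorial(n - len(s) - 1) / factorial(n)

                def value(subset):
                    z = [x[j] if j in subset else baseline[j] for j in range(n)]
                    return predict(z)

                phi[i] += w * (value(set(s) | {i}) - value(set(s)))
    return phi
```

For a linear model the Shapley value of each feature reduces to its coefficient times its deviation from the baseline, which makes the sketch easy to sanity-check.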