Comparative Study of Filter, Wrapper, and Hybrid Feature Selection Using Tree-Based Classifiers for Software Defect Prediction

November 26, 2025
December 10, 2025
December 27, 2025

Software defect prediction (SDP) improves software reliability by identifying modules likely to contain defects before release. SDP datasets commonly contain redundant or non-contributory metrics, which motivates feature selection to derive a more informative subset. This study compares three feature-selection strategies, the filter method SelectKBest (SKB), the wrapper method Recursive Feature Elimination (RFE), and the hybrid SKB+RFE, for improving tree-based classifiers on the NASA Metrics Data Program (MDP) datasets. Three classifiers are evaluated: Random Forest (RF), Extra Trees (ET), and Bagging with a Decision Tree base learner, using Area Under the Curve (AUC) as the primary performance metric. RFE combined with Extra Trees performs best, with an average AUC of 0.7855, followed by SKB+RFE+ET at 0.7809 and SKB+ET at 0.7776. These results indicate that an iterative wrapper such as RFE can identify more relevant feature subsets than the filter or hybrid strategies, and that wrapper-based methods are more stable across heterogeneous datasets. Because the experiments use no hyperparameter tuning and rely solely on class weighting rather than explicit resampling, the results offer empirical insight into the isolated influence of feature selection on predictive performance. Overall, RFE combined with Extra Trees offers the strongest predictive performance on the NASA MDP datasets and provides a foundation for developing more adaptive and robust models.
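
The three selection strategies compared in the abstract can be sketched with scikit-learn as follows. This is a minimal illustration, not the authors' code: the synthetic data stands in for the NASA MDP datasets, and the scoring function (f_classif), the feature counts (k_filter, k_final), the number of trees, and the 5-fold stratified cross-validation are all assumptions chosen only to make the example run quickly. The helper names make_tree, build_pipelines, and evaluate are hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.feature_selection import RFE, SelectKBest, f_classif
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import Pipeline


def make_tree(seed=0):
    # Untuned Extra Trees; class_weight="balanced" mirrors the use of class
    # weighting instead of resampling. n_estimators=50 is an assumption.
    return ExtraTreesClassifier(
        n_estimators=50, class_weight="balanced", random_state=seed
    )


def build_pipelines(k_filter=12, k_final=6):
    # k_filter and k_final are illustrative values, not taken from the study.
    return {
        # Filter: rank features by a univariate score and keep the top k.
        "SKB+ET": Pipeline([
            ("skb", SelectKBest(f_classif, k=k_final)),
            ("clf", make_tree()),
        ]),
        # Wrapper: repeatedly fit the model and drop the weakest feature.
        "RFE+ET": Pipeline([
            ("rfe", RFE(make_tree(), n_features_to_select=k_final)),
            ("clf", make_tree()),
        ]),
        # Hybrid: filter down to k_filter first, then refine with RFE.
        "SKB+RFE+ET": Pipeline([
            ("skb", SelectKBest(f_classif, k=k_filter)),
            ("rfe", RFE(make_tree(), n_features_to_select=k_final)),
            ("clf", make_tree()),
        ]),
    }


def evaluate(X, y, n_splits=5):
    # Selection happens inside each fold because it is part of the pipeline,
    # so the held-out fold never influences which features are chosen.
    cv = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=0)
    return {
        name: cross_val_score(pipe, X, y, cv=cv, scoring="roc_auc").mean()
        for name, pipe in build_pipelines().items()
    }


# Synthetic, imbalanced stand-in for a defect dataset (assumption).
X, y = make_classification(
    n_samples=300, n_features=20, n_informative=5, n_redundant=5,
    weights=[0.85, 0.15], random_state=0,
)
scores = evaluate(X, y)
for name, auc in scores.items():
    print(f"{name}: mean AUC = {auc:.4f}")
```

Placing each selector inside a Pipeline keeps the comparison fair, since all three strategies are fitted only on the training portion of every fold. The AUC values printed for this synthetic data are not comparable to the figures reported above.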

How to Cite

Rahmayanti, R., Herteno, R., Saputro, S. W., Saragih, T. H., & Abadi, F. (2025). Comparative Study of Filter, Wrapper, and Hybrid Feature Selection Using Tree-Based Classifiers for Software Defect Prediction. Indonesian Journal of Electronics, Electromedical Engineering, and Medical Informatics, 8(1), 1-16. https://doi.org/10.35882/ijeeemi.v8i1.294
