Enhancing Imbalanced Data Handling Using MWMOTE and K-Means Clustering
In machine learning and data mining, the quality of a dataset strongly influences model performance. One common issue is data imbalance, where one class has far fewer samples than another. This imbalance can bias models toward the majority class, resulting in poor predictive performance on minority class instances. To address this issue, this study employs a resampling approach that combines the MWMOTE (Majority Weighted Minority Oversampling Technique) method with K-Means Clustering. The MWMOTE algorithm generates synthetic samples for the minority class, while K-Means Clustering improves the distribution of the generated samples by grouping the minority data into well-structured clusters. Experimental results on 10 different datasets show that the proposed MWMOTE + K-Means approach improves classification performance. Compared to the baseline accuracy of 70%, the proposed method improves precision by 10%, recall by 40%, and F-measure by 40%. The computational cost increases slightly because of the additional clustering step required for synthetic data generation. Despite the longer computation time, the gains in classification metrics suggest that integrating K-Means with MWMOTE is a promising technique for handling imbalanced data. Future research could optimize the computational efficiency of this approach and compare it with other oversampling techniques.
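As a rough illustration of the cluster-then-oversample idea described in the abstract, the Python sketch below groups minority samples with a plain K-Means and then interpolates new points only between members of the same cluster. It is not the MWMOTE algorithm used in the study: it omits MWMOTE's weighting of minority samples by their proximity to the majority class and picks samples uniformly at random. The function names `kmeans` and `cluster_oversample`, and the parameters `k`, `iters`, `n_new`, and `seed`, are hypothetical and chosen only for this example.

```python
import random
from math import dist  # Euclidean distance, Python 3.8+


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means on a list of tuples; returns one label per point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # initial centers drawn from the data
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest center.
        labels = [min(range(k), key=lambda c: dist(p, centers[c])) for p in points]
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous center
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels


def cluster_oversample(minority, n_new, k=2, seed=0):
    """Create n_new synthetic minority points by interpolating inside clusters.

    Assumption: simplified stand-in for the paper's method, with uniform
    sampling instead of MWMOTE's majority-based weights.
    Requires at least two minority points.
    """
    rng = random.Random(seed)
    labels = kmeans(minority, k, seed=seed)
    clusters = [[p for p, lab in zip(minority, labels) if lab == c] for c in range(k)]
    # Interpolation needs two distinct points, so drop clusters of size < 2.
    clusters = [c for c in clusters if len(c) >= 2]
    if not clusters:  # fall back to treating all minority points as one cluster
        clusters = [list(minority)]
    synthetic = []
    for _ in range(n_new):
        a, b = rng.sample(rng.choice(clusters), 2)
        t = rng.random()  # position along the segment from a to b
        synthetic.append(tuple(x + t * (y - x) for x, y in zip(a, b)))
    return synthetic


if __name__ == "__main__":
    minority = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3),
                (5.0, 5.0), (5.2, 5.1), (5.1, 4.8)]
    for point in cluster_oversample(minority, n_new=5, k=2, seed=1):
        print(point)
```

Restricting interpolation to pairs within one cluster keeps synthetic points inside regions that the minority class already occupies, rather than in the empty space between distant groups, which is the benefit the abstract attributes to the clustering step.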
Copyright (c) 2025 Meida Cahyo Untoro (Author)

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Authors who publish with this journal agree to the following terms:
- Authors retain copyright and grant the journal the right of first publication, with the work simultaneously licensed under a Creative Commons Attribution-ShareAlike 4.0 International License (CC BY-SA 4.0) that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as this can lead to productive exchanges as well as earlier and greater citation of published work (see The Effect of Open Access).





