A Cluster Based Classification for Imbalanced Data Using SMOTE

IOP Conference Series: Materials Science and Engineering

Abstract: Data repositories are growing rapidly because organizations such as governments, corporations, and healthcare providers generate data in large amounts. Large volumes of data are produced, processed, collected, and analysed online, which creates a need to transform this data into valuable information. The process of extracting knowledge from large amounts of data is referred to as data mining. The proposed hybrid approach can be evaluated with different classifiers, such as Naïve Bayes and random forest. In the proposed methodology, we find that the SMOTE algorithm, which uses the k-nearest neighbour algorithm, draws on only some minority-class instances when creating synthetic samples, which sometimes leads to overfitting; a more effective oversampling approach can therefore be developed.
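
To make the oversampling step mentioned in the abstract concrete, here is a minimal sketch of the standard SMOTE idea: each synthetic sample is placed at a random point on the line segment between a minority-class instance and one of its k nearest minority-class neighbours. This is not the paper's hybrid cluster-based method; the function name `smote_sketch`, its parameters, and the toy data are assumptions made for illustration only.

```python
import numpy as np


def smote_sketch(minority, n_new, k=5, seed=0):
    """Illustrative SMOTE-style oversampling (hypothetical helper, not the paper's code).

    minority: array of shape (n_samples, n_features) holding minority-class points.
    n_new:    number of synthetic points to generate.
    k:        number of nearest minority neighbours considered per point.
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(minority, dtype=float)
    n = len(X)
    if n < 2:
        raise ValueError("need at least two minority samples to interpolate")
    k = min(k, n - 1)

    # Pairwise squared Euclidean distances between minority samples.
    diff = X[:, None, :] - X[None, :, :]
    dist = np.einsum("ijk,ijk->ij", diff, diff)
    np.fill_diagonal(dist, np.inf)  # a point is not its own neighbour

    # Indices of the k nearest minority neighbours of each sample.
    neighbours = np.argsort(dist, axis=1)[:, :k]

    # Pick a base sample and one of its neighbours for every synthetic point.
    base = rng.integers(0, n, size=n_new)
    chosen = neighbours[base, rng.integers(0, k, size=n_new)]

    # Interpolate: base + gap * (neighbour - base), with gap drawn from [0, 1).
    gap = rng.random((n_new, 1))
    return X[base] + gap * (X[chosen] - X[base])


# Toy minority class with six two-dimensional points (made-up values).
X_min = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0],
                  [1.0, 1.0], [0.5, 0.5], [0.2, 0.8]])
synthetic = smote_sketch(X_min, n_new=10, k=3)
```

Because every synthetic point is an interpolation between two existing minority instances, the new samples stay inside the region already covered by the minority class. That is consistent with the abstract's remark that reusing a limited set of minority instances can sometimes lead to overfitting.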

Authors: Rajesh Kumar Tripathi, Linesh Raja, Ankit Kumar, Pankaj Dadheech, Abhishek Kumar, M N Nachappa

DOI: https://doi.org/10.1088/1757-899x/1099/1/012080

Publish Year: 2021