GKS Algorithmic Technique for Early Defect Prediction (GKS: A Genetic Feature K-means Clustering with Support Vector Machines)

P. Patchaiammal and R. Thirumalaiselvi


Post-production defects are a major cause of rework and of software failure. Reducing them requires identifying defect-related features as early as possible. Many researchers are developing intelligent decision support systems for software to improve defect detection, and machine learning techniques already identify defects with reasonable accuracy. To improve accuracy further, we combine supervised and unsupervised learning to design an intelligent decision support system for early defect prediction. In this paper, clustering and classification algorithms are combined with a genetically selected feature set to form the GKS (Genetic feature K-means clustering with Support Vector Machines) technique. The technique works in three parts. First, a predictive analysis is carried out on the Bugzilla Eclipse dataset and features are selected using a genetic algorithm. Second, clusters are formed by the unsupervised K-means clustering technique. Third, performance measures such as precision, recall, F1-score, and accuracy are computed using the supervised SVM (Support Vector Machine) classification technique, and the results are plotted as a ROC-AUC curve. Finally, the cluster labels are mapped to the defect taxonomy list to assign each defect feature to the phase in which the defect occurs. Feature classification is improved by using the K-means centroid algorithm together with SVM. This paper also provides a model for implementing the GKS technique; the model classifies features according to the post-production defect list in order to obtain a better defect taxonomy.
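The three parts described above can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation. The synthetic data (standing in for the Bugzilla Eclipse dataset), the class-separation fitness used by the genetic search, the linear Pegasos-style SVM, the 60/20 hold-out split, and all function names (`ga_select`, `kmeans`, `train_linear_svm`, `binary_metrics`) are assumptions made for illustration only.

```python
import random

def ga_select(scores, rng, pop_size=12, generations=25, penalty=0.5):
    """Tiny genetic algorithm over 0/1 feature masks (hypothetical fitness)."""
    n = len(scores)
    def fitness(mask):
        # Reward well-separated features, charge a fixed cost per kept feature.
        return sum(s - penalty for s, bit in zip(scores, mask) if bit)
    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]           # truncation selection (elitist)
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n)            # one-point crossover
            child = a[:cut] + b[cut:]
            flip = rng.randrange(n)              # single-bit mutation
            child[flip] = 1 - child[flip]
            children.append(child)
        pop = parents + children
    best = max(pop, key=fitness)
    return best if any(best) else [1] * n        # never return an empty mask

def kmeans(rows, k, rng, iters=20):
    """Plain Lloyd's algorithm; returns one cluster label per row."""
    centroids = rng.sample(rows, k)
    labels = [0] * len(rows)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: sum((a - b) ** 2
                      for a, b in zip(r, centroids[c]))) for r in rows]
        for c in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def train_linear_svm(rows, y, rng, lam=0.01, epochs=50):
    """Pegasos-style subgradient descent on the hinge loss; y in {-1, +1}."""
    w, b, t = [0.0] * len(rows[0]), 0.0, 0
    order = list(range(len(rows)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            t += 1
            eta = 1.0 / (lam * t)
            margin = y[i] * (sum(wj * xj for wj, xj in zip(w, rows[i])) + b)
            w = [wj * (1 - eta * lam) for wj in w]
            if margin < 1:                       # inside the margin: hinge step
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, rows[i])]
                b += eta * y[i]
    return w, b

def binary_metrics(y_true, y_pred):
    """Precision, recall, F1-score and accuracy for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)
    tn = sum(t == 0 and p == 0 for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"precision": precision, "recall": recall, "f1": f1,
            "accuracy": (tp + tn) / len(pairs)}

rng = random.Random(0)
# Synthetic stand-in data: features 0-1 are informative, features 2-3 are noise.
defect = [i % 2 for i in range(80)]
rows = [[rng.gauss(4.0 * y, 1.0), rng.gauss(4.0 * y, 1.0),
         rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)] for y in defect]

def separation(j):
    """Assumed per-feature score: gap between class means over summed spreads."""
    groups = [[r[j] for r, y in zip(rows, defect) if y == c] for c in (0, 1)]
    means = [sum(g) / len(g) for g in groups]
    spreads = [(sum((v - m) ** 2 for v in g) / len(g)) ** 0.5
               for g, m in zip(groups, means)]
    return abs(means[1] - means[0]) / (sum(spreads) + 1e-9)

# Part 1: genetic feature selection.
mask = ga_select([separation(j) for j in range(4)], rng)
selected = [[v for v, bit in zip(r, mask) if bit] for r in rows]
# Part 2: unsupervised K-means clustering on the selected features.
clusters = kmeans(selected, 2, rng)
# Part 3: SVM trained to predict the cluster labels, scored on held-out rows.
train_idx, test_idx = range(60), range(60, 80)
w, b = train_linear_svm([selected[i] for i in train_idx],
                        [1 if clusters[i] == 1 else -1 for i in train_idx], rng)
pred = [1 if sum(wj * xj for wj, xj in zip(w, selected[i])) + b >= 0 else 0
        for i in test_idx]
metrics = binary_metrics([clusters[i] for i in test_idx], pred)
print(mask, metrics)
```

In practice a library such as scikit-learn would replace the hand-written K-means and SVM. The sketch only shows how the output of each part feeds the next: the feature mask restricts the clustering input, and the cluster labels become the classification targets.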


Machine Learning (ML), GKS (A Genetic Feature K-means Clustering with Support Vector Machines), Feature Engineering, MAE (Mean Absolute Error), AUC (Area Under the Curve), ROC (Receiver Operating Characteristic curve).

Paper Details
Issue: Issue 1