Data science plays a pivotal role in many fields, and healthcare is one of its prominent applications. Breast cancer is the most common type of cancer among women and alone claimed 627,000 lives. This high mortality calls for attention to early detection, so that preventive action can be taken in time. As a contributor to state-of-the-art technology development, data mining has many applications in predicting breast cancer. This work focuses on applying different data mining classification techniques to predict malignant and benign breast cancer. The Breast Cancer Wisconsin data set from the UCI repository is used as the experimental data set, with the clump thickness attribute used as the evaluation class. The performance of twelve algorithms, including AdaBoost M1, Decision Table, JRip, Lazy IBk, Logistic Regression, Multiclass Classifier, Multilayer Perceptron, Naive Bayes, Random Forest, and Random Tree, is analyzed on this data set.

Keywords: Data Mining, Classification Techniques, UCI repository, Breast Cancer, Classification Algorithms
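As a rough illustration of the kind of comparison the abstract describes, the sketch below evaluates two classifiers under 10-fold cross-validation and ranks them by accuracy. Everything in it is an assumption made for illustration: the data is synthetic rather than the Breast Cancer Wisconsin data set, the function names are hypothetical, and only a majority-class baseline and a k-nearest-neighbour classifier (the family that IBk belongs to) are included. The abstract states neither the evaluation protocol nor the toolkit the authors used.

```python
import random
from collections import Counter


def make_synthetic(n=200, seed=0):
    """Hypothetical stand-in data: two integer features in 1..10.

    The label is 1 when the feature sum exceeds 11, otherwise 0.
    This is NOT the Breast Cancer Wisconsin data set.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = [rng.randint(1, 10), rng.randint(1, 10)]
        y = 1 if x[0] + x[1] > 11 else 0
        rows.append((x, y))
    return rows


def majority_classifier(train):
    """Baseline: always predict the most frequent training label."""
    label = Counter(y for _, y in train).most_common(1)[0][0]
    return lambda x: label


def knn_classifier(train, k=3):
    """k-nearest-neighbour classifier using squared Euclidean distance."""
    def predict(x):
        nearest = sorted(
            train,
            key=lambda row: sum((a - b) ** 2 for a, b in zip(row[0], x)),
        )[:k]
        return Counter(y for _, y in nearest).most_common(1)[0][0]
    return predict


def cross_val_accuracy(rows, fit, folds=10, seed=0):
    """Overall accuracy across `folds` disjoint test folds."""
    rows = rows[:]
    random.Random(seed).shuffle(rows)
    correct = 0
    for i in range(folds):
        test = rows[i::folds]
        train = [r for j, r in enumerate(rows) if j % folds != i]
        predict = fit(train)
        correct += sum(predict(x) == y for x, y in test)
    return correct / len(rows)


data = make_synthetic()
results = {
    "majority baseline": cross_val_accuracy(data, majority_classifier),
    "3-nearest neighbours": cross_val_accuracy(data, knn_classifier),
}
# Print the classifiers from most to least accurate.
for name, acc in sorted(results.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {acc:.3f}")
```

Reusing the same shuffled folds for every classifier keeps the comparison like-for-like. Further classifiers could be added to `results` in the same way, as long as each exposes a fit function that returns a predictor.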