The rise of the Internet and of digital technologies has brought many challenges. Much of the data on the web is unstructured and unorganized, which makes it difficult to use and manipulate. As a result, machine learning and deep learning techniques for text classification have gained importance, and the field has attracted growing interest from researchers. This paper classifies pedagogical content using two models: K-Nearest Neighbor (KNN), a conventional machine learning model, and the Long Short-Term Memory (LSTM) recurrent neural network, a deep learning model. The results indicate that classification accuracy on the pedagogical content reaches 92.52% with the KNN model and 87.71% with the LSTM model.
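To make the KNN side of the comparison concrete, the following is a minimal sketch of nearest-neighbor text classification. It is not the paper's implementation: the bag-of-words features, cosine similarity, the choice of k = 3, the function names, and the toy training sentences are all assumptions made for illustration.

```python
import math
from collections import Counter


def vectorize(text):
    """Lowercase, split on whitespace, and count tokens (bag of words)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def knn_predict(train, text, k=3):
    """Return the majority label among the k training texts most similar to `text`."""
    query = vectorize(text)
    ranked = sorted(
        train,
        key=lambda pair: cosine(vectorize(pair[0]), query),
        reverse=True,
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy data standing in for labeled pedagogical content.
TRAIN = [
    ("solve the quadratic equation for x", "math"),
    ("compute the derivative of the function", "math"),
    ("add the fractions and simplify", "math"),
    ("the cell membrane controls transport", "biology"),
    ("photosynthesis occurs in plant cells", "biology"),
    ("dna stores genetic information in the cell", "biology"),
]

if __name__ == "__main__":
    print(knn_predict(TRAIN, "simplify the equation for x"))
    print(knn_predict(TRAIN, "genetic information in plant cells"))
```

A real pipeline would typically replace the raw token counts with weighted features such as TF-IDF and tune k on held-out data; the sketch only shows the voting mechanism.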