Vision Transformers for Weeds and Crops Classification of High Resolution UAV Images

Crop and weed monitoring is an important challenge for agriculture and food production today. Thanks to recent advances in data acquisition and computation technologies, agriculture is evolving toward smarter, more precise farming to meet the demand for high-yield, high-quality crop production. Classification and recognition in Unmanned Aerial Vehicle (UAV) images are important phases of crop monitoring. Deep learning models based on Convolutional Neural Networks (CNNs) have achieved high performance in image classification in the agricultural domain. Despite this success, CNNs still face challenges such as high computation cost and the need for large labelled datasets. The transformer architecture from natural language processing offers an alternative approach to dealing with these limitations. Using the self-attention paradigm, Vision Transformer (ViT) models can achieve competitive or better results without applying any convolution operations. In this paper, we adopt the self-attention mechanism via ViT models for plant classification of weeds and crops: red beet, off-type beet (green leaves), parsley, and spinach. Our experiments show that, with a small set of labelled training data, ViT models outperform the state-of-the-art CNN-based models EfficientNet and ResNet, with a top accuracy of 99.8% achieved by the ViT model.
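To make the self-attention idea concrete, the following is a minimal pure-Python sketch of how an image can be cut into patches and how each patch can attend to every other patch without any convolution. It is not the authors' implementation: the helper names (`split_into_patches`, `self_attention`), the toy 4x4 grayscale image, and the use of identity query/key/value projections are all assumptions made for illustration. A real ViT learns linear projections, adds positional embeddings and a class token, and stacks many multi-head layers.

```python
import math


def split_into_patches(image, patch):
    """Split an H x W grayscale image (list of rows) into flattened patch vectors."""
    height, width = len(image), len(image[0])
    patches = []
    for top in range(0, height - patch + 1, patch):
        for left in range(0, width - patch + 1, patch):
            patches.append([float(image[top + i][left + j])
                            for i in range(patch) for j in range(patch)])
    return patches


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Scaled dot-product self-attention.

    Assumption: queries, keys, and values are the tokens themselves
    (identity projections); a trained ViT would learn these projections.
    """
    dim = len(tokens[0])
    outputs, weights = [], []
    for query in tokens:
        # Similarity of this patch to every patch, scaled by sqrt(dim).
        scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
                  for key in tokens]
        row = softmax(scores)
        weights.append(row)
        # Each output token is a weighted average of all value vectors.
        outputs.append([sum(row[j] * tokens[j][c] for j in range(len(tokens)))
                        for c in range(dim)])
    return outputs, weights


# Toy 4x4 "image" (hypothetical data) split into four 2x2 patches.
image = [[(r * 4 + c) / 16 for c in range(4)] for r in range(4)]
tokens = split_into_patches(image, 2)
attended, weights = self_attention(tokens)
```

Each row of `weights` describes how strongly one patch attends to every patch in the image, which is what lets the model relate distant image regions in a single layer.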

