The integration of Unmanned Aerial Vehicles (UAVs) and Unmanned Ground Vehicles (UGVs) is increasingly central to the development of intelligent autonomous systems for applications such as search and rescue, environmental monitoring, and logistics. However, precise real-time coordination between these platforms presents major challenges, particularly when external localization infrastructure such as GPS or GNSS is unavailable or degraded [1]. This paper proposes a vision-based, data-driven framework for real-time UAV-UGV integration, focusing on robust UGV detection and heading angle prediction for navigation and coordination. The system employs a fine-tuned YOLOv5 model to detect UGVs and extract bounding-box features, which a lightweight artificial neural network (ANN) then uses to estimate the UAV's required heading angle. Ground-truth data for training were generated with a VICON motion capture system, yielding a dataset of over 13,000 annotated images collected in a controlled lab environment. The trained ANN achieves a mean absolute error of 0.1506° and a root mean squared error of 0.1957°, providing accurate heading angle predictions from monocular camera input alone. In experimental evaluations, the system attains 95% accuracy in UGV detection. This work contributes a vision-based, infrastructure-independent solution with strong potential for deployment in GPS/GNSS-denied environments, supporting reliable multi-agent coordination under realistic dynamic conditions. A demonstration video showcasing the system's real-time performance, including UGV detection, heading angle prediction, and UAV alignment under dynamic conditions, is available at: https://github.com/Kooroshraf/UAV-UGV-Integration
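To make the detection-to-heading pipeline concrete, the following is a minimal sketch, not the authors' implementation: it assumes a fine-tuned YOLOv5 checkpoint at a hypothetical path (ugv_yolov5.pt), normalized bounding-box coordinates (x, y, w, h) as the ANN's four input features, and illustrative layer sizes. The paper does not specify the exact feature set, architecture, or file names.

```python
# Minimal sketch of the described UGV-detection + heading-angle pipeline.
# Assumptions (not from the paper): checkpoint path "ugv_yolov5.pt",
# four normalized bbox features as ANN inputs, and the layer sizes below.
import torch
import torch.nn as nn

# Load a fine-tuned YOLOv5 detector via the official ultralytics hub entry.
detector = torch.hub.load('ultralytics/yolov5', 'custom', path='ugv_yolov5.pt')

# Lightweight ANN mapping bounding-box features to a heading angle (degrees).
# In the paper's setup, such a network would be trained with an MSE-style
# regression loss against VICON-derived ground-truth heading angles.
heading_net = nn.Sequential(
    nn.Linear(4, 32), nn.ReLU(),
    nn.Linear(32, 16), nn.ReLU(),
    nn.Linear(16, 1),
)

def predict_heading(frame):
    """Detect the UGV in a monocular frame and estimate the UAV heading angle."""
    results = detector(frame)
    boxes = results.xywhn[0]          # rows: [x_c, y_c, w, h, conf, class], normalized
    if boxes.shape[0] == 0:
        return None                   # no UGV detected in this frame
    features = boxes[0, :4].float()   # highest-confidence detection's bbox
    with torch.no_grad():
        return heading_net(features).item()
```

In deployment, predict_heading would run per frame on the UAV's monocular camera stream, and the returned angle would feed the UAV's yaw controller for alignment with the UGV.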