We present AVOD, an Aggregate View Object Detection network for autonomous driving scenarios. The proposed neural network architecture uses LIDAR point clouds and RGB images to generate features that are shared by two subnetworks: a region proposal network (RPN) and a second-stage detector network. The proposed RPN uses a novel architecture capable of performing multimodal feature fusion on high-resolution feature maps to generate reliable 3D object proposals for multiple object classes in road scenes. Using these proposals, the second-stage detection network performs accurate oriented 3D bounding box regression and category classification to predict the extents, orientation, and class of objects in 3D space. Our proposed architecture is shown to produce state-of-the-art results on the KITTI 3D object detection benchmark while running in real time with a low memory footprint, making it a suitable candidate for deployment on autonomous vehicles. Code is available at: https://github.com/kujason/avod
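To make the notion of an oriented 3D bounding box (extents plus orientation) concrete, the following minimal sketch converts one box into its eight corner points. The parameterization used here (centroid, length/width/height, and a yaw angle about a vertical z axis) and the function name `oriented_box_corners` are assumptions chosen for illustration; they are not necessarily the encoding or coordinate convention that AVOD or KITTI use.

```python
import math


def oriented_box_corners(center, size, yaw):
    """Return the 8 corners of a 3D box rotated by `yaw` about the z axis.

    center: (x, y, z) of the box centroid
    size:   (length, width, height) along the box's own x, y, z axes
    yaw:    rotation in radians about the vertical axis (assumed convention)
    """
    cx, cy, cz = center
    length, width, height = size
    cos_y, sin_y = math.cos(yaw), math.sin(yaw)

    corners = []
    # Bottom face first (dz = -0.5), then top face (dz = +0.5).
    for dz in (-0.5, 0.5):
        for dx, dy in ((0.5, 0.5), (0.5, -0.5), (-0.5, -0.5), (-0.5, 0.5)):
            # Corner offset in the box's local frame.
            lx, ly = dx * length, dy * width
            # Rotate the offset by yaw, then translate to the centroid.
            x = cx + cos_y * lx - sin_y * ly
            y = cy + sin_y * lx + cos_y * ly
            corners.append((x, y, cz + dz * height))
    return corners


# Hypothetical car-sized box, rotated a quarter turn.
corners = oriented_box_corners(
    center=(10.0, 2.0, 1.0), size=(4.0, 2.0, 1.5), yaw=math.pi / 2
)
```

Because the corners are generated symmetrically around the centroid, their mean recovers `center` for any yaw, which is a convenient sanity check when experimenting with other box encodings.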