This letter addresses the 3D coverage path planning (CPP) problem for terrain reconstruction of unknown, obstacle-rich environments. Due to sensing limitations, the proposed method, called CT-CPP, performs layered scanning of the 3D region to collect terrain data, where the traveling sequence is optimized using the concept of a coverage tree (CT) with a TSP-inspired tree traversal strategy. The CT-CPP method is validated on a high-fidelity underwater simulator and the results are compared to an existing terrain-following CPP method. The results show that CT-CPP yields significant reductions in trajectory length, energy consumption, and reconstruction error.
Over the years, the separate fields of motion planning, mapping, and human trajectory prediction have advanced considerably. However, the literature is still sparse in providing practical frameworks that enable mobile manipulators to perform whole-body movements and account for the predicted motion of moving obstacles. Previous optimisation-based motion planning approaches that use distance fields have suffered from the high computational cost required to update the environment representation. We demonstrate that GPU-accelerated predicted composite distance fields significantly reduce the computation time compared to calculating distance fields from scratch. We integrate this technique with a complete motion planning and perception framework that accounts for the predicted motion of humans in dynamic environments, enabling reactive and pre-emptive motion planning that incorporates predicted motions. To achieve this, we propose and implement a novel human trajectory prediction method that combines intention recognition with trajectory optimisation-based motion planning. We validate our resultant framework on a real-world Toyota Human Support Robot (HSR) using live RGB-D sensor data from the onboard camera. In addition to providing analysis on a publicly available dataset, we release the Oxford Indoor Human Motion (Oxford-IHM) dataset and demonstrate state-of-the-art performance in human trajectory prediction. The Oxford-IHM dataset is a human trajectory prediction dataset in which people walk between regions of interest in an indoor environment. Both static and robot-mounted RGB-D cameras observe the people while tracked with a motion-capture system.
In this paper, we present a solution for roadside LiDAR object detection that combines two unsupervised learning algorithms. The 3D point cloud data are first converted into spherical coordinates and filled into an azimuth grid matrix using a hash function. The raw LiDAR data are then rearranged into spatial-temporal data structures that store range, azimuth, and intensity information. The Dynamic Mode Decomposition method is applied to decompose the point cloud data into low-rank backgrounds and sparse foregrounds based on intensity channel pattern recognition. The Triangle Algorithm automatically finds the dividing value that separates moving targets from the static background according to range information. After intensity and range background subtraction, the foreground moving objects are detected using a density-based detector and encoded into a state-space model for tracking. The output of the proposed model includes vehicle trajectories that can enable many mobility and safety applications. The method was validated against a commercial traffic data collection platform and demonstrated to be an efficient and reliable solution for infrastructure LiDAR object detection. In contrast to previous methods that operate directly on scattered and discrete point clouds, the proposed method can establish a less sophisticated linear relationship among the 3D measurement data, which captures the spatial-temporal structure that we often desire.
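The first step described above, converting points to spherical coordinates and hashing them into an azimuth grid, might look roughly like the following sketch. The abstract does not specify the hash function, the grid resolution, or the input layout, so `azimuth_bin`, `resolution_deg`, the `(ring, x, y, z, intensity)` tuple format, and the keep-the-nearest-return rule are all assumptions made for illustration.

```python
import math


def to_spherical(x, y, z):
    """Convert a Cartesian point to (range, azimuth_deg, elevation_deg)."""
    rng = math.sqrt(x * x + y * y + z * z)
    azimuth = math.degrees(math.atan2(y, x)) % 360.0
    if rng > 0:
        # Clamp to guard against rounding pushing the ratio outside [-1, 1].
        elevation = math.degrees(math.asin(max(-1.0, min(1.0, z / rng))))
    else:
        elevation = 0.0
    return rng, azimuth, elevation


def azimuth_bin(azimuth_deg, resolution_deg=0.5):
    """Hypothetical hash: map an azimuth angle to a grid column index."""
    n_bins = int(round(360.0 / resolution_deg))
    return int(azimuth_deg / resolution_deg) % n_bins


def fill_grid(points, resolution_deg=0.5):
    """Fill a sparse (ring, column) -> (range, azimuth, intensity) grid.

    `points` is an iterable of (ring, x, y, z, intensity) tuples (assumed
    layout). If two points hash to the same cell, the nearer one is kept
    (an assumption, not something the abstract states).
    """
    grid = {}
    for ring, x, y, z, intensity in points:
        rng, azimuth, _ = to_spherical(x, y, z)
        key = (ring, azimuth_bin(azimuth, resolution_deg))
        if key not in grid or rng < grid[key][0]:
            grid[key] = (rng, azimuth, intensity)
    return grid
```

Stacking one such grid per frame would give the range, azimuth, and intensity channels over time that the later background-subtraction steps operate on.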
Sampling-based motion planning algorithms are widely used in robotics because they are very effective in high-dimensional spaces. However, the success rate and quality of the solutions depend on an adequate selection of their parameters, such as the distance between states, the local planner, and the sampling distribution. For robots with large configuration spaces or dynamic restrictions, selecting these parameters is a challenging task. This paper proposes a method for improving the performance of one of the most popular families of sampling-based algorithms, Rapidly-exploring Random Trees (RRTs), by adjusting the sampling method. The idea is to replace the uniform probability density function (U-PDF) with a custom distribution (C-PDF) learned from previously successful queries in similar tasks. With a few samples, our method builds a custom distribution that allows the RRT to grow toward promising states that will lead to a solution. We tested our method in several autonomous driving tasks such as parking maneuvers, obstacle clearance, and narrow-passage scenarios. The results show that the proposed method outperforms the original RRT and several improved versions in terms of success rate, tree density, and computation time. In addition, the proposed method requires a relatively small set of examples, unlike current deep learning techniques that require a vast number of examples.
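To illustrate the idea of swapping the uniform sampler for one learned from earlier successful queries, here is a minimal sketch. The abstract does not say how the C-PDF is constructed, so this example substitutes a simple kernel-style sampler: it perturbs a randomly chosen example state with Gaussian noise and occasionally falls back to uniform sampling. The names `make_sampler`, `bandwidth`, and `bias` are hypothetical.

```python
import random


def make_sampler(examples, bounds, bandwidth=0.5, bias=0.8, rng=None):
    """Return a function that draws states for an RRT-style planner.

    examples  : states taken from previously successful queries
    bounds    : list of (low, high) limits, one per state dimension
    bandwidth : standard deviation of the Gaussian perturbation (assumed)
    bias      : probability of sampling near an example instead of uniformly
    """
    rng = rng or random.Random()

    def sample():
        if examples and rng.random() < bias:
            # Sample near a state that appeared in an earlier solution.
            center = rng.choice(examples)
            point = [rng.gauss(c, bandwidth) for c in center]
        else:
            # Uniform fallback so the whole space can still be explored.
            point = [rng.uniform(lo, hi) for lo, hi in bounds]
        # Clamp the sample to the configuration-space limits.
        return tuple(min(max(v, lo), hi) for v, (lo, hi) in zip(point, bounds))

    return sample
```

A planner would call `sample()` wherever it previously drew a uniform random state. Keeping `bias` below 1 leaves some uniform draws in the mix, so regions not covered by the examples can still be reached.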
Optical scanning gauges mounted on robots are commonly used in quality inspection, such as verifying the dimensional specification of sheet structures. Coverage path planning (CPP) significantly influences the accuracy and efficiency of robotic quality inspection. Traditional CPP strategies focus on minimizing the number of viewpoints or the traveling distance of robots under the condition of full coverage inspection. The measurement uncertainty incurred when collecting the scanning data has received less attention in free-form surface inspection. To address this problem, a novel CPP method with an optimal viewpoint sampling strategy is proposed to incorporate the measurement uncertainty of key measurement points (MPs) into free-form surface inspection. First, the feasible ranges of measurement uncertainty are calculated based on the tolerance specifications of the MPs. The initial feasible viewpoint set is generated considering the measurement uncertainty and the visibility of MPs. Then, the inspection cost function is built to evaluate the number of selected viewpoints and the average measurement uncertainty in the fields of view (FOVs) of all the selected viewpoints. Afterward, an enhanced rapidly-exploring random tree (RRT*) algorithm is proposed for viewpoint sampling using the inspection cost function and CPP optimization. Case studies, including simulation tests and inspection experiments, have been conducted to evaluate the effectiveness of the proposed method. Results show that the scanning precision of key MPs is significantly improved compared with the benchmark method.
Accurate rail localization is a crucial part of the railway driving support system for safety monitoring. LiDAR can obtain point clouds that carry 3D information about the railway environment, even in darkness and adverse weather conditions. In this paper, a real-time rail recognition method based on 3D point clouds is proposed to address challenges such as the disorder, uneven density, and large volume of the point clouds. A voxel down-sampling method is first presented to balance the density of railway point clouds, and a pyramid partition is designed to divide the 3D scanning area into voxels of different volumes. Then, a feature encoding module is developed to find the nearest neighbor points and to aggregate their local geometric features for the center point. Finally, a multi-scale neural network is proposed to generate the prediction results for each voxel and the rail location. The experiments are conducted on 9 sequences of 3D point cloud data for the railway. The results show that the method performs well in detecting rails with straight, curved, and other complex topologies.
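As a rough illustration of voxel down-sampling for density balancing, the sketch below groups points into a uniform grid and keeps one centroid per occupied voxel. It uses a single fixed `voxel_size`, which is an assumption made for brevity; the pyramid partition with voxels of different volumes described above is not reproduced here.

```python
import math
from collections import defaultdict


def voxel_downsample(points, voxel_size=1.0):
    """Replace all points that fall in the same voxel with their centroid.

    points     : iterable of (x, y, z) tuples
    voxel_size : edge length of the cubic voxels (uniform; assumed)
    """
    buckets = defaultdict(list)
    for x, y, z in points:
        # Integer voxel index along each axis.
        key = (
            math.floor(x / voxel_size),
            math.floor(y / voxel_size),
            math.floor(z / voxel_size),
        )
        buckets[key].append((x, y, z))

    centroids = []
    for members in buckets.values():
        n = len(members)
        centroids.append(tuple(sum(p[d] for p in members) / n for d in range(3)))
    return centroids
```

Dense regions close to the sensor collapse to one point per voxel while sparse regions are left largely unchanged, which is what evens out the density.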
We present Neural A*, a novel data-driven search method for path planning problems. Despite the recent increasing attention to data-driven path planning, a machine learning approach to search-based planning is still challenging due to the discrete nature of search algorithms. In this work, we reformulate a canonical A* search algorithm to be differentiable and couple it with a convolutional encoder to form an end-to-end trainable neural network planner. Neural A* solves a path planning problem by encoding a problem instance to a guidance map and then performing the differentiable A* search with the guidance map. By learning to match the search results with ground-truth paths provided by experts, Neural A* can produce a path consistent with the ground truth accurately and efficiently. Our extensive experiments confirmed that Neural A* outperformed state-of-the-art data-driven planners in terms of the search optimality and efficiency trade-off, and furthermore, successfully predicted realistic human trajectories by directly performing search-based planning on natural image inputs.
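For readers unfamiliar with how a guidance map steers the search, the following sketch runs ordinary A* on a grid whose per-cell entry costs play the role of the guidance map. It is a plain, non-differentiable A* written for illustration, not the differentiable reformulation described above. The 4-connected neighborhood, the Manhattan heuristic, the use of `None` for obstacles, and the requirement that costs be at least 1 are assumptions of this example.

```python
import heapq


def astar(cost_map, start, goal):
    """Find a minimum-cost path on a grid with per-cell entry costs.

    cost_map : 2D list; cost_map[r][c] is the cost of entering cell (r, c),
               or None for an obstacle. Costs are assumed to be >= 1 so
               that the Manhattan heuristic stays admissible.
    Returns the path as a list of (row, col) cells, or None if unreachable.
    """
    rows, cols = len(cost_map), len(cost_map[0])

    def heuristic(cell):
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    open_heap = [(heuristic(start), 0.0, start)]
    best_g = {start: 0.0}
    parent = {start: None}
    closed = set()

    while open_heap:
        _, g_cur, cur = heapq.heappop(open_heap)
        if cur in closed:
            continue  # stale heap entry
        closed.add(cur)
        if cur == goal:
            path = []
            while cur is not None:
                path.append(cur)
                cur = parent[cur]
            return path[::-1]
        r, c = cur
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if not (0 <= nr < rows and 0 <= nc < cols):
                continue
            step = cost_map[nr][nc]
            if step is None:
                continue  # obstacle cell
            new_g = g_cur + step
            if new_g < best_g.get(nxt, float("inf")):
                best_g[nxt] = new_g
                parent[nxt] = cur
                heapq.heappush(open_heap, (new_g + heuristic(nxt), new_g, nxt))
    return None
```

Raising the cost of a cell makes the search avoid it without forbidding it outright, which is the sense in which a learned cost map can guide the planner toward expert-like paths.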
3D Morphable Model (3DMM) based methods have achieved great success in recovering 3D face shapes from single-view images. However, the facial textures recovered by such methods lack the fidelity exhibited in the input images. Recent work demonstrates high-quality facial texture recovery with generative networks trained on a large-scale database of high-resolution UV maps of face textures, which is hard to prepare and not publicly available. In this paper, we introduce a method to reconstruct 3D facial shapes with high-fidelity textures from single-view images in-the-wild, without the need to capture a large-scale face texture database. The main idea is to refine the initial texture generated by a 3DMM-based method with facial details from the input image. To this end, we propose to use graph convolutional networks to reconstruct the detailed colors for the mesh vertices instead of reconstructing the UV map. Experiments show that our method can generate high-quality results and outperforms state-of-the-art methods in both qualitative and quantitative comparisons.
Semantic reconstruction of indoor scenes refers to both scene understanding and object reconstruction. Existing works either address one part of this problem or focus on independent objects. In this paper, we bridge the gap between understanding and reconstruction, and propose an end-to-end solution to jointly reconstruct room layout, object bounding boxes and meshes from a single image. Instead of separately resolving scene understanding and object reconstruction, our method builds upon a holistic scene context and proposes a coarse-to-fine hierarchy with three components: 1. room layout with camera pose; 2. 3D object bounding boxes; 3. object meshes. We argue that understanding the context of each component can assist the task of parsing the others, which enables joint understanding and reconstruction. The experiments on the SUN RGB-D and Pix3D datasets demonstrate that our method consistently outperforms existing methods in indoor layout estimation, 3D object detection and mesh reconstruction.
Single-image piece-wise planar 3D reconstruction aims to simultaneously segment plane instances and recover 3D plane parameters from an image. Most recent approaches leverage convolutional neural networks (CNNs) and achieve promising results. However, these methods are limited to detecting a fixed number of planes in a certain learned order. To tackle this problem, we propose a novel two-stage method based on associative embedding, inspired by its recent success in instance segmentation. In the first stage, we train a CNN to map each pixel to an embedding space where pixels from the same plane instance have similar embeddings. Then, the plane instances are obtained by grouping the embedding vectors in planar regions via an efficient mean shift clustering algorithm. In the second stage, we estimate the parameters for each plane instance by considering both pixel-level and instance-level consistencies. With the proposed method, we are able to detect an arbitrary number of planes. Extensive experiments on public datasets validate the effectiveness and efficiency of our method. Furthermore, our method runs at 30 fps at test time and could thus facilitate many real-time applications such as visual SLAM and human-robot interaction. Code is available at https://github.com/svip-lab/PlanarReconstruction.
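The grouping step in the first stage can be pictured with a textbook mean shift over embedding vectors, sketched below. This is a basic Gaussian-kernel version written for clarity, not the efficient variant the abstract refers to. The `bandwidth`, `iters`, and `merge_tol` parameters are assumptions of this example.

```python
import math


def _sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_shift(points, bandwidth=0.5, iters=30, merge_tol=None):
    """Cluster embedding vectors with Gaussian-kernel mean shift.

    Returns (labels, centers): one cluster label per input point and the
    list of cluster centers. The number of clusters is not fixed in advance.
    """
    if merge_tol is None:
        merge_tol = bandwidth / 2.0
    dim = len(points[0]) if points else 0

    # Start one mode at every point and shift it toward higher density.
    modes = [list(p) for p in points]
    for _ in range(iters):
        for i, m in enumerate(modes):
            weights = [
                math.exp(-_sq_dist(m, p) / (2.0 * bandwidth ** 2)) for p in points
            ]
            total = sum(weights)
            modes[i] = [
                sum(w * p[d] for w, p in zip(weights, points)) / total
                for d in range(dim)
            ]

    # Merge modes that converged to (almost) the same location.
    centers, labels = [], []
    for m in modes:
        for k, c in enumerate(centers):
            if math.sqrt(_sq_dist(m, c)) < merge_tol:
                labels.append(k)
                break
        else:
            centers.append(m)
            labels.append(len(centers) - 1)
    return labels, centers
```

Because the number of clusters falls out of the data rather than being set beforehand, this kind of grouping is what allows an arbitrary number of plane instances to be detected.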
Recent advances in fluorescence microscopy enable the acquisition of 3D image volumes with better quality and deeper penetration into tissue. Segmentation is a required step to characterize and analyze biological structures in the images. 3D segmentation using deep learning has achieved promising results in microscopy images. One issue is that deep learning techniques require a large set of ground-truth data, which is impractical to annotate manually for microscopy volumes. This paper describes a 3D nuclei segmentation method using 3D convolutional neural networks. A set of synthetic volumes and the corresponding ground-truth volumes are generated automatically using a generative adversarial network. Segmentation results demonstrate that our proposed method is capable of segmenting nuclei successfully in 3D for various data sets.