Abstract
With constant biomedical and technological advances, the world's economic and social structures will be challenged in the coming years: on the one hand, a large number of jobs will become automated and the work paradigm will shift from low-paying repetitive tasks to high-paying technical jobs. On the other hand, population pyramids will invert, so the number of dependent people will equal the number of people of working age. These disruptive transformations will, without a doubt, require the development of robots capable of understanding and interacting with their surroundings, whether to carry out mundane and risky tasks or to take care of the household, the elderly or the children. The development of intelligent service robots is therefore becoming more a necessity than a luxury. These robots must be able to move in, perceive and interact with the world around them. The main goal of this research has been the development of a general perception pipeline that allows robotic platforms to recognize, detect and segment objects in a dynamic and cluttered environment with no prior knowledge. To this end, the synergy between machine learning (deep neural networks) and 3D segmentation methods was explored. The foundations, architectures and mathematical theory of deep learning were assessed to find the network models best suited to the object recognition, detection and segmentation tasks. Two state-of-the-art network architectures, YOLO and Mask R-CNN, were evaluated, retrained and implemented for diverse applications. The datasets employed were both standard (COCO, SUN RGB-D, ImageNet) and custom (doors and cabinets, maritime). The 2D instance segmentation derived from the forward pass of the networks serves as a region of interest that masks a rectified depth image; 3D points contained within each object's physical space are then generated.
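The masking and back-projection step described above can be sketched as follows; this is a minimal illustration using the standard pinhole camera model, with intrinsic values and the toy mask chosen for the example rather than taken from the thesis:

```python
import numpy as np

def mask_to_points(depth, mask, fx, fy, cx, cy):
    """Back-project the depth pixels inside a 2D instance mask
    to 3D points using the pinhole model (depth in metres)."""
    v, u = np.nonzero(mask)          # pixel coordinates inside the mask
    z = depth[v, u]
    valid = z > 0                    # discard missing depth readings
    u, v, z = u[valid], v[valid], z[valid]
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.column_stack((x, y, z))

# Toy example: a 4x4 depth image at 1 m with a 2x2 instance mask.
depth = np.full((4, 4), 1.0)
mask = np.zeros((4, 4), dtype=bool)
mask[1:3, 1:3] = True
pts = mask_to_points(depth, mask, fx=500.0, fy=500.0, cx=2.0, cy=2.0)
print(pts.shape)  # (4, 3)
```

In the pipeline, the mask would come from the network's forward pass and the intrinsics from the RGB-D camera calibration.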
Noise and background information were filtered using several geometric segmentation procedures selected according to the object's label; these include primitive fitting with RANSAC, Euclidean clustering and region-growing segmentation. Occlusions and geometric errors in the reconstructed object models were reduced through multi-viewpoint perspectives and point cloud registration with the ICP/NICP algorithms. Thus, 3D models of all recognized classes were derived simultaneously, in real time, with a high degree of precision and robustness. From these reconstructions, additional information such as position, orientation, size and grasping data can be obtained. The robot's pre-learnt behaviours were adapted based on the data extracted from the surroundings; otherwise, new behaviours were inferred from previous knowledge. This perception pipeline was also integrated into more complex systems (from small mobile robots to humanoids) to demonstrate the feasibility of intelligent service robots that can interact with the world around them. To this end, additional modules had to be combined: sensor calibration (RGB-D cameras, laser scanners), robot modelling (ROS and MoveIt!), motion planning (autonomous navigation, arm trajectory planning and manipulation), sensory data acquisition, fusion and processing (for localization and perception), object grasping and manipulation, episodic memorization (Deep-ART) and reasoning (FF planner). These advanced systems were able to deal with the task intelligence problem by learning, reasoning about and executing specific behaviours based on the environment. The framework was also extended with SLAM algorithms to create semantic maps of the environment, that is, robust 3D maps in which all class instances are segmented and can be interacted with individually. The software integration was accomplished under the ROS framework, employing a large variety of libraries, packages and custom nodes to build these complex systems.
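The RANSAC-based background filtering mentioned above can be sketched in plain NumPy; this is a minimal illustration of dominant-plane removal (for example, a table top under the objects), with the threshold, iteration count and toy scene being assumptions for the example, not parameters from the thesis:

```python
import numpy as np

def ransac_plane(points, n_iters=200, threshold=0.01, seed=None):
    """Fit the dominant plane with RANSAC and return its inlier mask.
    Removing the inliers leaves the off-plane object clusters."""
    rng = np.random.default_rng(seed)
    best_inliers = np.zeros(len(points), dtype=bool)
    for _ in range(n_iters):
        sample = points[rng.choice(len(points), 3, replace=False)]
        normal = np.cross(sample[1] - sample[0], sample[2] - sample[0])
        norm = np.linalg.norm(normal)
        if norm < 1e-9:              # degenerate (collinear) sample
            continue
        normal /= norm
        dist = np.abs((points - sample[0]) @ normal)
        inliers = dist < threshold
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    return best_inliers

# Toy scene: 100 points on a flat plane plus 20 "object" points above it.
rng = np.random.default_rng(0)
plane = np.column_stack((rng.uniform(0, 1, 100),
                         rng.uniform(0, 1, 100),
                         np.zeros(100)))
obj = rng.uniform(0.4, 0.6, (20, 3)) + np.array([0.0, 0.0, 0.3])
cloud = np.vstack((plane, obj))
inliers = ransac_plane(cloud, seed=0)
print(inliers.sum())  # 100: the plane points; the object points survive filtering
```

Euclidean clustering or region growing would then split the remaining off-plane points into individual object candidates.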
The experiments were carried out both in simulated physical environments (Gazebo, Webots) and in real-life scenarios (kitchens and offices). This dissertation presents the research conducted during the three years of the PhD program, both at the Technical University of Denmark (DTU) and the Korea Advanced Institute of Science and Technology (KAIST). The results are disseminated in six published conference papers (one of them also published as a journal article) and one additional report.
| Original language | English |
|---|---|
| Publisher | Technical University of Denmark |
| Number of pages | 212 |
| Publication status | Published - 2018 |
Fingerprint
Dive into the research topics of 'Merging Machine Learning and 3D Imaging Methods for Complex Robot-Object Interaction Systems'. Together they form a unique fingerprint.

Projects
- 1 Finished
- Managing Complex Systems and Applications fusing methods from AI and Control and Signal Processing
Maurin, A. L. (PhD Student), Ravn, O. (Main Supervisor), Andersen, N. A. (Supervisor), Kim, J.-H. (Supervisor), Lund, H. H. (Examiner), Krüger, N. (Examiner) & Nilsson, K. (Examiner)
Technical University of Denmark
15/12/2015 → 30/09/2019
Project: PhD