3D vision technology is the process of estimating 3D geometry from 2D image data. In recent years it has reached a maturity that allows for real-world usage, no longer being confined to the laboratory. We see this in the availability of commercial 3D scanners (e.g. Kinect, RealSense, GOM) and their many applications (e.g. self-driving cars, automation, quality control). However, as any engineer knows, the transition from lab to real-world application is not trivial and often brings unforeseen challenges. In this sense, much 3D vision technology is built on an unknown foundation, as there have been few studies of its practical problems and limitations. This thesis contributes such studies to several subjects within the field of 3D vision, encompassing dataset creation, empirical evaluation, and system engineering.

Datasets are essential to quantitative evaluation and testing. We have therefore created datasets for two fields that are lacking in that area. The first is a dataset for Non-Rigid Structure from Motion (NRSfM). NRSfM estimates the 3D geometry of a deforming object from a 2D point sequence; the dataset thus comprises 2D point sequences with a recorded 3D reference. We accomplished this using structured light scanning and several stop-motion animatronics, which allowed for much greater deformation variety than has previously been available. Structured light scanning provides dense reference geometry and surface normals, which allowed us to create occlusion-based missing data for each point sequence, something that has not been done before. The second dataset is built for the evaluation of rendering techniques for challenging scenes. In it, we record a series of images along with precise geometry, radiometry, environment, and camera pose. The intent is for a rendering algorithm to use these data to recreate the recorded images. Datasets serve little purpose unless they are used.
Therefore, we have applied our NRSfM dataset to analyze the field using 16 methods representative of the state of the art. Our factorial analysis shows not only which methods give the most precise results, but also overall trends in the field: for example, which deformations are the most challenging to reconstruct and how the camera impacts reconstruction quality. We also show that the field's previous reliance on random missing data has led to algorithms that handle the missing data caused by self-occlusion poorly.

We have also evaluated several structured light techniques on biological material. Structured light is designed under the assumption of diffuse reflection, but most biological materials exhibit heavy subsurface scattering. We show that this results in a subtle, systematic overestimation of depth (up to 1 mm), even for state-of-the-art techniques. However, we also demonstrate that a large part of this error can be corrected with a linear, geometry-based model.

This thesis also presents vision-based solutions to practical problems, as some information can only be gained through application. First, we investigate the interaction between 3D vision and robotics by engineering a solution for non-rigid bin picking. Our system shows that the problem is solvable, but error correction remains a major concern: errors from sources such as calibration, the 3D scanner, segmentation, and pose estimation may seem insignificant individually, but are problematic when taken as a whole. Second, we designed an algorithm for the automatic measurement of contact surface areas for use in tribology testing. The method performs measurements with an error of less than 0.4 µm.
Number of pages: 142
Publication status: Published - 2018
Series: DTU Compute PHD-2018