Real-world perception systems often run on hardware with limited resources in order to meet the cost and power constraints of the host system. Deploying deep neural networks on such resource-constrained hardware has become possible through model compression techniques as well as efficient and hardware-aware architecture design. However, model adaptation is additionally required to cope with diverse operating environments. In this work, we address the problem of training deep neural networks on resource-constrained hardware in the context of visual domain adaptation. We select the task of monocular depth estimation, where our goal is to adapt a pre-trained model to the target domain's data. While the source domain includes labels, we assume an unlabelled target domain, as is common in real-world applications. We then present an adversarial learning approach that is tailored to training on devices with limited resources. Since visual domain adaptation, i.e. neural network training, has not previously been explored for resource-constrained hardware, we present the first feasibility study for image-based depth estimation. Our experiments show that visual domain adaptation is practical only for efficient network architectures and training sets on the order of a few hundred samples. Models and code are publicly available.
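The core idea of adversarial unsupervised domain adaptation can be illustrated with a toy, self-contained sketch. This is a generic illustration, not the authors' implementation: a discriminator is trained to tell source features from target features, while an adaptation parameter (here, a simple 1-D shift applied to target features) is updated adversarially to fool it. The 1-D Gaussian setup and all names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 1-D features: the two domains differ by a shift of +3.
src = rng.normal(0.0, 1.0, 500)   # labelled source domain
tgt = rng.normal(3.0, 1.0, 500)   # unlabelled target domain

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Domain discriminator: logistic regression d(x) = sigmoid(w*x + b),
# trained to output 1 for source features and 0 for target features.
w, b = 0.0, 0.0
# Adaptation parameter: a learnable shift applied to target features,
# updated adversarially so shifted targets look like source features.
shift = 0.0

lr_d, lr_g = 0.1, 0.1
for step in range(2000):
    t = tgt + shift
    # --- discriminator step: minimise domain-classification loss
    ps, pt = sigmoid(w * src + b), sigmoid(w * t + b)
    grad_w = np.mean((ps - 1.0) * src) + np.mean(pt * t)
    grad_b = np.mean(ps - 1.0) + np.mean(pt)
    w -= lr_d * grad_w
    b -= lr_d * grad_b
    # --- adversarial step: gradient of -log d(target) w.r.t. shift,
    # i.e. move target features toward the source distribution
    pt = sigmoid(w * (tgt + shift) + b)
    grad_shift = np.mean((pt - 1.0) * w)
    shift -= lr_g * grad_shift

# The learned shift roughly cancels the synthetic domain gap (shift ~ -3),
# leaving the discriminator near chance level.
print(shift)
```

In a depth-estimation setting, the scalar shift would be replaced by the encoder of the depth network, and the discriminator would operate on its feature maps; the alternating update structure stays the same.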
|Title of host publication||Proceedings of 2021 IEEE/CVF International Conference on Computer Vision Workshops|
|Publication status||Published - 2021|
|Event||2021 IEEE/CVF International Conference on Computer Vision Workshops - Virtual Event, Montreal, Canada|
Duration: 11 Oct 2021 → 17 Oct 2021