Lightweight Monocular Depth Estimation through Guided Decoding

Michael Rudolph, Youssef Dawoud, Ronja Guldenring, Lazaros Nalpantidis, Vasileios Belagiannis

Research output: Chapter in Book/Report/Conference proceeding › Article in proceedings › Research › peer-review

Abstract

We present a lightweight encoder-decoder architecture for monocular depth estimation, specifically designed for embedded platforms. Our main contribution is the Guided Upsampling Block (GUB), from which we build the decoder of our model. Motivated by the concept of guided image filtering, the GUB uses the input image to guide the decoder in upsampling the feature representation and reconstructing the depth map, achieving high-resolution results with fine-grained details. Based on multiple GUBs, our model outperforms related methods on the NYU Depth V2 dataset in terms of accuracy while delivering up to 35.1 fps on the NVIDIA Jetson Nano and up to 144.5 fps on the NVIDIA Xavier NX. Similarly, on the KITTI dataset, inference is possible with up to 23.7 fps on the Jetson Nano and 102.9 fps on the Xavier NX. Our code and models are publicly available at https://github.com/mic-rud/GuidedDecoding.
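The abstract does not spell out the GUB internals, but it cites guided image filtering as the motivating idea: the high-resolution image steers the upsampling of a low-resolution depth map. As an illustration of that underlying concept only, and not of the paper's learned block, a minimal NumPy sketch of image-guided depth upsampling via a classic guided filter (He et al.) might look like this; the function names and parameters are our own assumptions:

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def box_mean(x, r):
    """Mean over a (2r+1)x(2r+1) box at each pixel, with edge padding."""
    p = np.pad(x, r, mode="edge")
    return sliding_window_view(p, (2 * r + 1, 2 * r + 1)).mean(axis=(-2, -1))

def guided_upsample(depth_lr, guide, r=2, eps=1e-4):
    """Upsample a low-res depth map using the full-res image as guide.

    Illustrative sketch of guided filtering (He et al.), NOT the paper's
    learned Guided Upsampling Block: nearest-neighbour upsampling is
    followed by a guided filter with the image as the guidance signal.
    """
    scale = guide.shape[0] // depth_lr.shape[0]
    p = np.kron(depth_lr, np.ones((scale, scale)))  # nearest-neighbour upsample
    I = guide
    m_I, m_p = box_mean(I, r), box_mean(p, r)
    var_I = box_mean(I * I, r) - m_I * m_I
    cov_Ip = box_mean(I * p, r) - m_I * m_p
    a = cov_Ip / (var_I + eps)   # local linear coefficient on the guide
    b = m_p - a * m_I            # local offset
    # Smooth the coefficients, then apply the local linear model per pixel.
    return box_mean(a, r) * I + box_mean(b, r)
```

Because the output is a locally linear function of the guide, edges in the image reappear as sharp transitions in the upsampled depth, which is the behaviour the GUB learns end-to-end rather than computing in closed form.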
Original language: English
Title of host publication: Proceedings of 2022 International Conference on Robotics and Automation
Publisher: IEEE
Publication date: 2022
Pages: 2344-2350
ISBN (Print): 978-1-7281-9682-4
DOIs
Publication status: Published - 2022
Event: 2022 International Conference on Robotics and Automation - Philadelphia, United States
Duration: 23 May 2022 - 27 May 2022

Conference

Conference: 2022 International Conference on Robotics and Automation
Country/Territory: United States
City: Philadelphia
Period: 23/05/2022 - 27/05/2022
