Deciphering pixel insights: A deep dive into deep learning strategies for enhanced indoor depth estimation

Krisna Pinasthika, Fitri Utaminingrum, Chih‑Yang Y. Lin, Chikamune Wada, Timothy K. Shih

Research output: Contribution to journal › Article › peer-review

Abstract

Depth estimation is a crucial task for autonomous systems because it provides information about the distance between the system and its surroundings. Light Detection and Ranging (LiDAR) sensors and stereo cameras have traditionally been used for distance measurement, despite their significant cost. Monocular cameras offer a more cost-effective solution but lack inherent depth information. The synergy of big data and deep learning has led to various advanced architectures for monocular depth estimation. However, because monocular depth estimation is an ill-posed problem, we incorporate Attention Gates (AG) within an encoder-decoder architecture. This helps prevent pattern recognition failures caused by objects that vary in size yet share identical depth values. Our research involves evaluating popular pretrained architectures, assessing the impact of using AG, and creating effective head blocks to tackle depth estimation challenges. Notably, our approach demonstrates improved evaluation metrics on the DIODE dataset, positioning Attention U-Net as a promising solution. Using Attention U-Net for monocular depth estimation on low-cost autonomous systems could therefore reduce cost relative to measuring distance with LiDAR or stereo cameras.

1 https://github.com/KrisnaPinasthika/Deciphering-Pixel-Insights
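
To illustrate what an Attention Gate does on a skip connection, here is a minimal NumPy sketch of an additive attention gate in the style commonly used in Attention U-Net. It is not the authors' implementation (their code is at the repository linked above). The function name `attention_gate`, the parameter names, the channel-last layout, and the assumption that the gating signal has already been resized to the skip feature resolution are all illustrative choices made here.

```python
import numpy as np


def sigmoid(x):
    """Element-wise logistic function."""
    return 1.0 / (1.0 + np.exp(-x))


def attention_gate(skip, gate, w_x, w_g, psi, b=0.0, b_psi=0.0):
    """Additive attention gate on channel-last feature maps (illustrative).

    skip: (H, W, C_x) encoder features carried by the skip connection
    gate: (H, W, C_g) decoder gating signal, ASSUMED already resized to (H, W)
    w_x:  (C_x, C_int) weights of a 1x1 convolution applied to `skip`
    w_g:  (C_g, C_int) weights of a 1x1 convolution applied to `gate`
    psi:  (C_int,) weights projecting to one attention value per pixel
    """
    # A 1x1 convolution is a per-pixel matrix product over the channel axis.
    q = np.maximum(skip @ w_x + gate @ w_g + b, 0.0)  # ReLU of the summed projections
    alpha = sigmoid(q @ psi + b_psi)                  # (H, W) attention coefficients
    # Scale every channel of the skip features by the per-pixel coefficient.
    return skip * alpha[..., None], alpha


# Hypothetical shapes and random weights, for demonstration only.
rng = np.random.default_rng(0)
skip = rng.normal(size=(8, 8, 16))
gate = rng.normal(size=(8, 8, 32))
w_x = 0.1 * rng.normal(size=(16, 4))
w_g = 0.1 * rng.normal(size=(32, 4))
psi = 0.1 * rng.normal(size=4)

gated, alpha = attention_gate(skip, gate, w_x, w_g, psi)
print(gated.shape, alpha.shape)
```

In a trained network the weights would be learned, so the coefficients in `alpha` can suppress skip features in regions the decoder finds irrelevant before they are merged into the decoder path.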

Original language: English
Article number: 100216
Journal: International Journal of Information Management Data Insights
Volume: 4
Issue number: 1
State: Published - Apr 2024

Keywords

  • Attention Gates
  • Computer vision
  • Deep learning
  • Fully convolutional networks
  • Monocular depth estimation
  • Transfer learning
