NL-DSE: Non-Local Neural Network with Decoder-Squeeze-and-Excitation for Monocular Depth Estimation

Tsung Han Tsai, Wei Chung Wan

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Monocular depth estimation has been a popular and challenging problem for many years. Methods based on convolutional neural networks (CNNs) with an encoder-decoder architecture have been proposed and show reasonable results. In this paper, we propose an SE-Net-based module for the decoder part of the encoder-decoder architecture to improve the result. The proposed DSE (Decoder-Squeeze-and-Excitation) module handles the whole up-sampling process of the decoder globally. We also incorporate the spatial attention method of the Non-local Network to design the Non-Local Decoder-Squeeze-and-Excitation (NL-DSE) module. The NL-DSE module is installed in the network, evaluated on the NYU Depth V2 dataset, and achieves higher accuracy. Moreover, the design is independent of the specific encoder-decoder architecture and can be applied to other encoder-decoder networks to improve their accuracy.
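The two building blocks named in the abstract can be illustrated with a minimal NumPy sketch: an SE-Net-style channel recalibration and an embedded-Gaussian non-local (spatial attention) block. This is not the authors' NL-DSE implementation. The function names, weight shapes, reduction ratio `r`, and the order in which the two blocks are composed are all assumptions made for illustration, since the abstract does not specify them.

```python
import numpy as np

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

def squeeze_excite(x, w1, w2):
    """SE-Net-style channel recalibration (illustrative, not the paper's DSE).
    x: (C, H, W); w1: (C // r, C); w2: (C, C // r)."""
    z = x.mean(axis=(1, 2))                    # squeeze: global average pool -> (C,)
    s = sigmoid(w2 @ np.maximum(w1 @ z, 0.0))  # excitation: FC, ReLU, FC, sigmoid
    return x * s[:, None, None]                # rescale each channel by its gate

def non_local(x, w_theta, w_phi, w_g, w_out):
    """Embedded-Gaussian non-local block with a residual connection.
    x: (C, H, W); w_theta, w_phi, w_g: (C2, C); w_out: (C, C2)."""
    c, h, w = x.shape
    flat = x.reshape(c, h * w)                 # (C, N) with N = H * W positions
    theta, phi, g = w_theta @ flat, w_phi @ flat, w_g @ flat
    logits = theta.T @ phi                     # (N, N) pairwise similarities
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    attn = np.exp(logits)
    attn = attn / attn.sum(axis=1, keepdims=True)         # softmax over positions
    y = g @ attn.T                             # (C2, N): attention-weighted sum of g
    return x + (w_out @ y).reshape(c, h, w)

# Hypothetical sizes and random weights, chosen only for the demonstration.
rng = np.random.default_rng(0)
C, H, W, r = 8, 4, 4, 2
x = rng.standard_normal((C, H, W))
w1 = 0.1 * rng.standard_normal((C // r, C))
w2 = 0.1 * rng.standard_normal((C, C // r))
w_theta, w_phi, w_g = (0.1 * rng.standard_normal((C // 2, C)) for _ in range(3))
w_out = 0.1 * rng.standard_normal((C, C // 2))

# Assumed composition: spatial attention first, then channel recalibration.
out = squeeze_excite(non_local(x, w_theta, w_phi, w_g, w_out), w1, w2)
```

Both blocks preserve the shape of the feature map, which is consistent with the abstract's claim that such a module can be dropped into other encoder-decoder networks.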

Original language: English
Title of host publication: ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing, Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9781728163277
DOIs
State: Published - 2023
Event: 48th IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2023 - Rhodes Island, Greece
Duration: 4 Jun 2023 to 10 Jun 2023

Publication series

Name: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Volume: 2023-June
ISSN (Print): 1520-6149

Conference

Conference: 48th IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2023
Country/Territory: Greece
City: Rhodes Island
Period: 4/06/23 to 10/06/23

Keywords

  • CNNs
  • SE-Net
  • encoder-decoder
  • monocular depth estimation
  • up-sampling recalibration
