HAPiCLR: heuristic attention pixel-level contrastive loss representation learning for self-supervised pretraining

Van Nhiem Tran, Shen Hsuan Liu, Chi En Huang, Muhammad Saqlain Aslam, Kai Lin Yang, Yung-Hui Li, Jia Ching Wang

Research output: Contribution to journal › Article › peer-review

Abstract

Recent self-supervised contrastive learning methods are powerful for robust representation learning, pulling together semantic features from different cropped views of the same image while pushing apart features from other images in the embedding space. However, training contrastive models is quite inefficient: in the high-dimensional embedding space, images can differ from one another in many ways. We address this problem with heuristic attention pixel-level contrastive loss representation learning (HAPiCLR), a self-supervised joint-embedding contrastive framework that operates at the pixel level and makes use of heuristic mask information. HAPiCLR leverages pixel-level information from the object's contextual representation instead of identifying pair-wise differences in instance-level representations. It thus enhances contrastive learning objectives without requiring large batch sizes, memory banks, or queues, reducing both the memory footprint and the processing needed for large datasets. Furthermore, combining the HAPiCLR loss with other contrastive objectives such as the SimCLR or MoCo loss yields considerable performance gains on all downstream tasks, including image classification, object detection, and instance segmentation.
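The abstract does not reproduce the exact loss formulation, but a minimal PyTorch sketch may help illustrate the idea it describes: under the assumption that the heuristic mask marks the object's foreground pixels in each augmented view, dense backbone features are pooled over that mask and contrasted across the two views with an InfoNCE-style objective, using only the other images in the batch as negatives (so no memory bank or queue is needed). The function names (masked_pool, pixel_level_contrastive_loss) and the pool-then-contrast design are illustrative assumptions, not HAPiCLR's published implementation.

```python
import torch
import torch.nn.functional as F

def masked_pool(feat, mask):
    """Average dense features over the heuristic foreground mask.

    feat: (B, C, H, W) dense feature map from the backbone.
    mask: (B, 1, h, w) binary foreground mask (the heuristic attention cue).
    """
    mask = F.interpolate(mask.float(), size=feat.shape[-2:], mode="nearest")
    # Sum features over foreground pixels, divide by the pixel count.
    return (feat * mask).sum(dim=(2, 3)) / mask.sum(dim=(2, 3)).clamp(min=1.0)

def pixel_level_contrastive_loss(feat1, feat2, mask1, mask2, temperature=0.1):
    """InfoNCE-style loss on mask-pooled embeddings of two augmented views.

    Matching views of the same image are positives; all other images in the
    batch act as negatives.
    """
    z1 = F.normalize(masked_pool(feat1, mask1), dim=1)  # (B, C)
    z2 = F.normalize(masked_pool(feat2, mask2), dim=1)  # (B, C)
    logits = z1 @ z2.t() / temperature                  # (B, B) similarities
    targets = torch.arange(z1.size(0), device=z1.device)
    # Symmetrize over both view orderings.
    return 0.5 * (F.cross_entropy(logits, targets) +
                  F.cross_entropy(logits.t(), targets))
```

In training, this term would be added to an instance-level objective, e.g. total_loss = simclr_loss + lam * pixel_level_contrastive_loss(...), where lam is a weighting hyperparameter (an assumption here, not a value from the paper).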

Original language: English
Journal: Visual Computer
Publication status: Accepted/In press - 2024
