Learning to Remember Beauty Products

Toan H. Vu, An Dang, Jia Ching Wang

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review


This paper develops a deep learning model for the beauty product image retrieval problem. The proposed model has two main components: an encoder and a memory. The encoder extracts features from a deep convolutional neural network at multiple scales and aggregates them into feature embeddings. Using an attention mechanism and a data augmentation method, it learns to focus on foreground objects and ignore the background in images, so it can extract more relevant features. The memory stores representative states of all database images in its stacks, and it can be updated during the training process. Based on the memory, we introduce a distance loss that regularizes the embedding vectors from the encoder to be more discriminative. Our model is fully end-to-end and requires no manual feature aggregation or post-processing. Experimental results on the Perfect-500K dataset demonstrate the effectiveness of the proposed model in terms of retrieval accuracy.
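
The abstract does not give the form of the memory or of the distance loss, so the following is only a minimal sketch of one way such a memory-based regularizer could look. It assumes one memory slot per database image, a momentum-style slot update, and a triplet-style margin loss against the nearest non-matching slot. The names `Memory`, `update`, `memory_distance_loss`, `momentum`, and `margin` are hypothetical and are not taken from the paper.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


class Memory:
    """Hypothetical memory: one representative state per database image."""

    def __init__(self, num_items, dim):
        self.slots = [[0.0] * dim for _ in range(num_items)]

    def update(self, index, embedding, momentum=0.5):
        # Assumed update rule: blend the old state with the new embedding,
        # then re-normalize. The paper may use a different rule.
        old = self.slots[index]
        mixed = [momentum * o + (1.0 - momentum) * e
                 for o, e in zip(old, embedding)]
        self.slots[index] = l2_normalize(mixed)


def memory_distance_loss(embedding, index, memory, margin=0.2):
    """Triplet-style loss (an assumption): pull the embedding toward its own
    slot and push it away from the closest other slot by at least `margin`."""
    pos = sq_dist(embedding, memory.slots[index])
    neg = min(sq_dist(embedding, slot)
              for i, slot in enumerate(memory.slots) if i != index)
    return max(0.0, pos - neg + margin)


# Toy example with three database images and 2-D embeddings.
memory = Memory(num_items=3, dim=2)
memory.slots = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
loss_match = memory_distance_loss([1.0, 0.0], 0, memory)     # embedding sits on its own slot
loss_mismatch = memory_distance_loss([0.0, 1.0], 0, memory)  # embedding sits on another slot
```

In this toy setup the loss is zero when an embedding coincides with its own slot and is well separated from the others, and positive when it lies closer to another image's slot, which is the kind of discriminative pressure the abstract attributes to the distance loss.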

Original language: English
Title of host publication: MM 2020 - Proceedings of the 28th ACM International Conference on Multimedia
Publisher: Association for Computing Machinery, Inc
Number of pages: 5
ISBN (Electronic): 9781450379885
State: Published - 12 Oct 2020
Event: 28th ACM International Conference on Multimedia, MM 2020 - Virtual, Online, United States
Duration: 12 Oct 2020 to 16 Oct 2020

Publication series

Name: MM 2020 - Proceedings of the 28th ACM International Conference on Multimedia


Conference: 28th ACM International Conference on Multimedia, MM 2020
Country/Territory: United States
City: Virtual, Online


Keywords

  • attention mechanism
  • beauty product image retrieval
  • memory
  • triplet loss

