360° video, also known as omnidirectional video (ODV), immersive video, or spherical video, has become increasingly popular and has drawn considerable research attention. To deliver 360° video effectively, it is important to understand how humans perceive and interact with it, so that efficient techniques for encoding, transmission, and rendering can be developed. Visual Attention Estimation in HMD (Head-Mounted Display) is a competition that encourages contestants to design lightweight models for predicting human eye attention in 360° videos. The models must not only achieve high accuracy but also perform well on HMDs. We adopt the ATSal approach and fine-tune its expert models on the dataset provided for this competition, achieving 0.840 AUC-J, 0.476 CC, 3.206 KLC, 1.478 NSS, and 0.412 SIM; our submission currently ranks first on the qualification competition leaderboard.