Deep learning, and especially Generative Adversarial Networks (GANs), has yielded promising results for image super-resolution. However, super-resolution becomes considerably harder when the scale difference between input and output is large. In this paper we propose a multi-step super-resolution method that upscales the image gradually to obtain the best results. Video super-resolution (VSR) needs different treatment from single image super-resolution (SISR): it requires a temporal connection between frames, which most existing studies have not fully explored. This temporal feature is important for maintaining video consistency, in terms of both video quality and motion continuity. Using loss functions that account for this temporal feature, we can avoid inconsistencies in the frames that would otherwise accumulate over time. Our method generates super-resolution video that maintains both video quality and motion continuity. Quantitatively, it achieves higher Peak Signal-to-Noise Ratio (PSNR) scores than state-of-the-art methods on the Vimeo90K, Vid4, and Fireworks datasets, with 37.70, 29.91, and 31.28, respectively. These results show that our model outperforms other state-of-the-art methods across different datasets.
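For readers unfamiliar with the metric reported above, PSNR is conventionally computed from the mean squared error (MSE) between a reference frame and a reconstructed frame as 10 log10(MAX^2 / MSE), in decibels. The following is a minimal sketch of that standard definition only. The function name `psnr`, the flat-list pixel representation, and the 8-bit peak value of 255 are assumptions made for illustration, and this is not the evaluation code used to obtain the scores above, which may differ in color space or cropping.

```python
import math


def psnr(reference, estimate, max_value=255.0):
    """Return PSNR in dB between two equally sized flat sequences of pixel values.

    Hypothetical helper for illustration; max_value=255 assumes 8-bit pixels.
    """
    if len(reference) != len(estimate) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    # Mean squared error over all pixels.
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)

    # Identical inputs have zero error, so PSNR is unbounded.
    if mse == 0:
        return math.inf

    return 10.0 * math.log10(max_value ** 2 / mse)


# Example: every pixel is off by 10 intensity levels, so MSE = 100.
reference_pixels = [100, 100, 100, 100]
estimate_pixels = [110, 90, 110, 90]
score = psnr(reference_pixels, estimate_pixels)
```

For a video, a per-dataset score is usually obtained by averaging such per-frame values, although averaging conventions vary between papers.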