With the explosive growth in mobile traffic demand, one promising solution is to offload cellular traffic to small base stations to improve system efficiency. As system complexity increases, network operators face severe challenges and are turning to machine learning-based solutions. In this work, we propose an energy-aware mobile traffic offloading scheme for heterogeneous networks that jointly applies deep Q-network (DQN) decision making and advanced traffic demand forecasting. The base station control model is trained and verified on an open dataset from a major telecom operator. The performance evaluation shows that DQN with traffic forecasting outperforms the other schemes considered at all levels of mobile traffic demand. Moreover, the advantage of accurate traffic prediction is more significant under higher traffic loads.