An emerging trend in music exploration is to organize and search songs according to the emotions they convey. However, research on automatic playlist generation (APG) has focused primarily on metadata and audio similarity, and mainstream solutions treat APG as a static problem. This paper argues that APG is better modeled as a continuous optimization problem and proposes an adaptive preference model for personalized, emotion-based APG. The main idea is to collect a user's music-playing behavior, e.g., rating, skipping, and replaying, as immediate feedback for learning the user's preferences for music emotion within a playlist. Reinforcement learning is adopted to learn the user's current preferences, which are then used to generate personalized playlists. Learning parameters are tuned by simulating two hypothetical users, and a two-month user study is conducted to evaluate the APG solutions. The results show that the proposed approach reduces the Miss Ratio by 10% compared with the baseline approach.
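
To make the feedback loop concrete, the following is a minimal sketch of how playback events might drive an incremental preference update and a preference-guided playlist. It is not the paper's model: the reward values in `FEEDBACK_REWARD`, the learning rate `alpha`, the exploration rate `epsilon`, the emotion labels, and all function names are assumptions chosen for illustration, and the update rule is a generic bandit-style running average rather than the specific reinforcement learning formulation used in the study.

```python
import random

# Hypothetical reward for each playback event (assumed values, not from the paper).
# An explicit rating could be mapped onto the same scale in a similar way.
FEEDBACK_REWARD = {"skip": -1.0, "listen": 0.5, "replay": 1.0}


def update_preference(prefs, emotion, feedback, alpha=0.2):
    """Move the preference for `emotion` toward the observed reward.

    Incremental update: new = old + alpha * (reward - old).
    """
    reward = FEEDBACK_REWARD[feedback]
    old = prefs.get(emotion, 0.0)
    prefs[emotion] = old + alpha * (reward - old)
    return prefs[emotion]


def choose_emotion(prefs, emotions, epsilon=0.1, rng=random):
    """Epsilon-greedy choice: explore with probability epsilon, else exploit."""
    if rng.random() < epsilon:
        return rng.choice(emotions)
    return max(emotions, key=lambda e: prefs.get(e, 0.0))


def generate_playlist(prefs, library, length, epsilon=0.1, rng=random):
    """Build a playlist of (song, emotion) pairs guided by current preferences.

    `library` maps an emotion label to a list of song titles.
    """
    remaining = {e: list(songs) for e, songs in library.items()}
    playlist = []
    while len(playlist) < length:
        available = [e for e, songs in remaining.items() if songs]
        if not available:
            break  # the library has run out of songs
        emotion = choose_emotion(prefs, available, epsilon, rng)
        playlist.append((remaining[emotion].pop(0), emotion))
    return playlist


if __name__ == "__main__":
    prefs = {}
    update_preference(prefs, "calm", "replay")
    update_preference(prefs, "angry", "skip")
    library = {"calm": ["Song A", "Song B"], "angry": ["Song C"]}
    print(generate_playlist(prefs, library, length=3, epsilon=0.0))
```

Because each event updates the preferences immediately, a playlist generated later in a session can differ from one generated earlier, which is the sense in which the problem is treated as continuous rather than static.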