In IEEE 802.11, the CSMA/CA protocol applies an exponential backoff scheme to alleviate contention among clients attempting to transmit data at the same time. A client randomly chooses a number of backoff time slots bounded by the contention window. Because the initial contention window size is fixed for each device regardless of the network's congestion status, the traditional scheme may worsen congestion when the window is too small or waste radio resources when it is too large. In this paper, we propose a reinforcement learning model, rewarded by throughput, that dynamically adjusts the contention window at periodic intervals. The proposed scheme uses Q-learning, with the reward function determined by comparing the throughput measured in the current interval against that measured in the previous interval. The simulation results are compared with the rule-based approach used in current 802.11 networks and show that the proposed scheme effectively decreases the collision rate and significantly increases system throughput. We also vary the number of clients during the simulation to examine the adaptability of the proposed scheme; the results show that it remains superior to the rule-based scheme in such a dynamic environment.
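The core idea described above, a Q-learning agent that periodically shifts the contention window and is rewarded by the change in measured throughput, can be sketched as follows. This is a minimal illustration, not the paper's implementation; the contention-window levels, learning rate `alpha`, discount `gamma`, and exploration rate `epsilon` are all assumed values chosen for the example.

```python
# Minimal sketch of throughput-rewarded Q-learning for contention window (CW)
# adjustment. All parameter values and the action set are illustrative
# assumptions, not the scheme's actual configuration.
import random

CW_LEVELS = [15, 31, 63, 127, 255, 511, 1023]  # candidate CW sizes (states)
ACTIONS = [-1, 0, 1]                           # shrink, keep, or grow CW level

alpha, gamma, epsilon = 0.5, 0.9, 0.1
Q = {(s, a): 0.0 for s in range(len(CW_LEVELS)) for a in ACTIONS}

def choose_action(state):
    """Epsilon-greedy selection over the CW-level actions."""
    if random.random() < epsilon:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: Q[(state, a)])

def step(state, action, throughput_now, throughput_prev):
    """One periodic update.

    The reward is the difference between the throughput measured in the
    current interval and that measured in the previous interval, matching
    the reward definition in the text.
    """
    reward = throughput_now - throughput_prev
    next_state = min(max(state + action, 0), len(CW_LEVELS) - 1)
    best_next = max(Q[(next_state, a)] for a in ACTIONS)
    Q[(state, action)] += alpha * (reward + gamma * best_next - Q[(state, action)])
    return next_state
```

In use, a device would measure throughput over each interval, call `step` once per interval, and then set its contention window to `CW_LEVELS[state]`; positive throughput changes reinforce the last adjustment, while negative changes discourage it.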