Toward a deeper understanding of systemic risk and overconfidence, we propose a reinforcement learning system of interbank lending and borrowing in a stochastic environment with partial information. Each bank may lend money to or borrow money from a central bank, and minimizes its transaction costs through a linear quadratic regulator. In addition, to describe overconfidence within the proposed model, we modify the parameter driven by stochastic economic factors. Furthermore, because the strategies of all players are driven by distribution uncertainty, a reinforcement learning procedure must be applied to obtain the optimal strategies and then a possible Nash equilibrium. The existence and uniqueness of the obtained equilibrium must then be verified. Based on the solution, we also discuss the financial implications for understanding overconfidence and protecting the financial system from systemic risk.