This paper investigates the optimal transmission policy for two-way (TW) energy harvesting (EH) dual-relay networks under a stochastic EH model. The network contains two solar-powered relays with finite-capacity batteries, and at most one relay is selected at a time to assist the information exchange. The objective is to optimize the network's long-term outage performance by adapting the relay selection and power allocation to the system's stochastic conditions. To this end, the design problem is formulated as a Markov decision process (MDP). From the MDP, the optimal transmission policy, which specifies the adaptive relay selection and power allocation, is derived, and a method for computing the network's expected outage probability is presented. The optimal policy is then analyzed asymptotically, revealing two properties: an active relaying property and a power convergence property. Based on these properties, the long-term outage probability of the network is shown to approach a limit at sufficiently high signal-to-noise ratios (SNRs); this limit is determined by the probability that both relays' batteries are empty under the optimal policy. Finally, computer simulations validate the analysis.
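For orientation, a generic discounted-reward MDP satisfies the Bellman optimality equation sketched below. This is the textbook form under assumed notation, not necessarily the formulation used in this work. All symbols are illustrative: the state $s$ (assumed to collect the channel, solar, and battery conditions), the action $a$ (a relay selection and transmit power level), the per-step reward $R(s,a)$ (for example, the negative conditional outage probability), the transition probability $P(s' \mid s, a)$, and the discount factor $\lambda \in (0,1)$.

```latex
% Generic Bellman optimality equation (assumed notation, illustrative only)
V^{*}(s) = \max_{a \in \mathcal{A}(s)}
  \Big[ R(s,a) + \lambda \sum_{s'} P(s' \mid s, a)\, V^{*}(s') \Big],
\qquad
\pi^{*}(s) = \arg\max_{a \in \mathcal{A}(s)}
  \Big[ R(s,a) + \lambda \sum_{s'} P(s' \mid s, a)\, V^{*}(s') \Big].
```

Under this sketch, the policy $\pi^{*}$ maps each system state to a relay and power choice, and the feasible action set $\mathcal{A}(s)$ would be restricted by the energy available in each battery.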