In this paper, we propose an optimal relay transmission policy for an energy harvesting (EH) two-way relay network based on a stochastic EH model, wherein the relay is solar-powered and equipped with a finite-sized battery. The policy minimizes the long-term average outage probability by adapting the relay transmission power to the wireless channel states, the battery energy amount, and the causal solar energy states. The design problem is formulated as a Markov decision process (MDP), and the conditional outage probabilities for both the decode-and-forward (DF) and amplify-and-forward (AF) cooperation protocols are adopted as the reward functions. We uncover a monotonic and bounded differential structure for the expected total discounted reward, and prove that the optimal transmission policy has a threshold structure with respect to the battery energy amount at sufficiently high signal-to-noise ratios (SNRs). Finally, we analyze the outage probability performance and reveal a saturation structure: in high-SNR regimes, the expected outage probability converges to the battery empty probability instead of going to zero. We also propose a saturation-free condition that guarantees zero outage probability at high SNRs. Computer simulations confirm our theoretical analysis and show that the proposed optimal transmission policy outperforms the other policies compared.
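To make the MDP formulation concrete, the following is a minimal sketch of value iteration on a toy battery model, not the model analyzed in this paper. Everything in it is an illustrative assumption: the function names `outage` and `value_iteration`, the battery capacity, the i.i.d. harvest distribution (standing in for the stochastic solar model), the discount factor, and the single-link Rayleigh-fading outage expression used in place of the DF and AF conditional outage probabilities. The state is reduced to the battery level alone, and the per-stage outage probability is treated as a cost to be minimized.

```python
import math


def outage(power_units, snr_per_unit=10.0, threshold=1.0):
    """Toy per-stage cost: outage probability of a single Rayleigh-fading link.

    Assumption: a stand-in for the paper's conditional outage probabilities.
    Spending zero energy means no transmission, i.e. certain outage.
    """
    if power_units == 0:
        return 1.0
    return 1.0 - math.exp(-threshold / (snr_per_unit * power_units))


def value_iteration(capacity=5, harvest_pmf=(0.5, 0.3, 0.2),
                    gamma=0.9, tol=1e-9, max_iter=10_000):
    """Minimize the expected total discounted outage over battery levels.

    harvest_pmf[h] is the (assumed i.i.d.) probability of harvesting h units.
    Returns (values, policy), both indexed by battery level 0..capacity.
    """
    values = [0.0] * (capacity + 1)
    policy = [0] * (capacity + 1)
    for _ in range(max_iter):
        new_values = []
        for battery in range(capacity + 1):
            best_cost, best_action = math.inf, 0
            # The relay may spend any amount up to the stored energy.
            for action in range(battery + 1):
                # Harvested energy beyond the capacity is lost (finite battery).
                future = sum(
                    prob * values[min(battery - action + h, capacity)]
                    for h, prob in enumerate(harvest_pmf)
                )
                cost = outage(action) + gamma * future
                if cost < best_cost:
                    best_cost, best_action = cost, action
            new_values.append(best_cost)
            policy[battery] = best_action
        delta = max(abs(new - old) for new, old in zip(new_values, values))
        values = new_values
        if delta < tol:
            break
    return values, policy


if __name__ == "__main__":
    values, policy = value_iteration()
    for battery, (value, action) in enumerate(zip(values, policy)):
        print(f"battery={battery}  spend={action}  discounted_outage={value:.4f}")
```

In this toy setting an empty battery always forces an outage, which mirrors the role that the battery empty probability plays in the saturation result; the full problem additionally conditions the decision on the channel and solar states.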