This paper proposes a learning-assisted beam search scheme for indoor millimeter wave (mmWave) networks with multiple base stations. Directional antennas are commonly used in the mmWave frequency range to achieve high data rates and to compensate for the high free-space path loss. However, establishing reliable communication links with narrow beams is challenging in indoor environments with mobile devices, since the sector search space grows with device mobility and base station density. To address this issue, we develop a multi-state Q-learning approach that incorporates base station selection into the beam selection process. By exploiting radio environment data from ray tracing simulations, the proposed learning approach enables fast and reliable beam selection across different indoor environments and mobility patterns. Simulation results show that the proposed scheme outperforms beam search schemes based on the existing exhaustive search approach and the original Q-learning approach in terms of beam search latency, link outage times, and aggregate throughput.
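To make the idea of folding base station selection into beam selection concrete, the following is a minimal sketch of tabular Q-learning over a joint (base station, beam) action space. It is an illustration under stated assumptions, not the multi-state formulation proposed in this paper: the state is assumed to be a discretized device position, the reward function `toy_link_quality` is a hypothetical stand-in for ray tracing data, the mobility model is a fixed cyclic walk, and all constants and hyperparameters are chosen arbitrarily.

```python
import random

# All sizes below are hypothetical and chosen only for illustration.
N_POSITIONS = 6              # discretized device positions (states)
N_BS = 2                     # base stations
N_BEAMS = 4                  # beams (sectors) per base station
N_ACTIONS = N_BS * N_BEAMS   # joint action: one (base station, beam) pair


def toy_link_quality(pos, bs, beam):
    """Hypothetical stand-in for ray tracing data: a reward in (0, 1].

    By construction, the best pair at each position is
    bs = pos % N_BS and beam = pos % N_BEAMS.
    """
    bs_match = 1.0 if bs == pos % N_BS else 0.3
    beam_offset = abs(beam - pos % N_BEAMS)
    return bs_match / (1.0 + beam_offset)


def train(episodes=2000, alpha=0.2, gamma=0.5, epsilon=0.2, seed=0):
    """Epsilon-greedy tabular Q-learning over joint (bs, beam) actions."""
    rng = random.Random(seed)
    q = [[0.0] * N_ACTIONS for _ in range(N_POSITIONS)]
    for _ in range(episodes):
        pos = rng.randrange(N_POSITIONS)
        for _ in range(N_POSITIONS):
            if rng.random() < epsilon:
                action = rng.randrange(N_ACTIONS)  # explore
            else:
                action = max(range(N_ACTIONS), key=lambda a: q[pos][a])  # exploit
            bs, beam = divmod(action, N_BEAMS)  # decode the joint action
            reward = toy_link_quality(pos, bs, beam)
            next_pos = (pos + 1) % N_POSITIONS  # assumed toy mobility pattern
            target = reward + gamma * max(q[next_pos])
            q[pos][action] += alpha * (target - q[pos][action])
            pos = next_pos
    return q


def select(q, pos):
    """Return the greedy (base station, beam) pair for a given position."""
    action = max(range(N_ACTIONS), key=lambda a: q[pos][a])
    return divmod(action, N_BEAMS)
```

After training, `select(train(), pos)` returns a single (base station, beam) pair from a table lookup, whereas an exhaustive search would sweep all `N_BS * N_BEAMS` candidates at every position. Encoding the pair as one action index is only one possible design; it keeps the Q-table two-dimensional at the cost of an action space that grows with the product of base stations and beams.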