Deep neural networks (DNN) have performed impressively in the processing of multimedia signals. Most DNN-based approaches were developed to handle real-valued data; very few have been designed for complex-valued data, despite their being essential for processing various types of multimedia signal. Accordingly, this work presents a complex-valued deep recurrent neural network (C-DRNN) for singing voice separation. The C-DRNN operates on the complex-valued short-time discrete Fourier transform (STFT) domain. A key aspect of the C-DRNN is that the activations and weights are complex-valued. The goal herein is to reconstruct the singing voice and the background music from a mixed signal. For error back-propagation, ℂℝ-calculus is utilized to calculate the complex-valued gradients of the objective function. To reinforce model regularity, two constraints are incorporated into the objective function of the C-DRNN. The first is an additional masking layer that ensures the sum of separated sources equals the input mixture. The second is a discriminative term that preserves the mutual difference between two separated sources. Finally, the proposed method is evaluated using the MIR-1K dataset and a singing voice separation task. Experimental results demonstrate that the proposed method outperforms the state-of-the-art DNN-based methods.