This paper presents a configurable multi-mode QR decomposition (QRD) processor. It supports 4×4 and 8×8 QRD with multi-layer sorting for single-user MIMO precoding, and it can also perform block-based sorting for multi-user MIMO precoding. Both a forward mode for decomposition and a backward mode for signal precoding are provided. The QRD processor is implemented as a pipelined systolic array. An in-place strategy with a pointer-based control mechanism is proposed for the sorting buffers, which reduces the number of D flip-flops by 42.3%. Implemented in 90 nm CMOS technology, the proposed processor achieves a throughput of 9.45 MQRD/s when decomposing an 8×8 channel matrix with sorting, and it outperforms related works in terms of throughput and hardware efficiency.
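As a rough software illustration of what a QRD with sorting computes, the sketch below performs a sorted QR decomposition with modified Gram-Schmidt. It is not the processor's systolic-array architecture. Several things here are assumptions made only for illustration: the function name `sorted_qrd`, the ascending-column-norm ordering (the sorting criterion is not specified above), and the use of a permutation list `perm` as a software analogue of pointer-based reordering, where column indices are swapped rather than column data. The input is assumed to have full column rank.

```python
import math


def sorted_qrd(H):
    """Sorted QR decomposition via modified Gram-Schmidt (illustrative only).

    H is a list of rows (real or complex entries), assumed full column rank.
    Returns (Q, R, perm) where Q[k] is the k-th orthonormal column, R is
    upper triangular, and H[:, perm] = Q @ R.
    """
    n, m = len(H), len(H[0])
    # Column storage: cols[j] holds (the residual of) original column j.
    cols = [[H[i][j] for i in range(n)] for j in range(m)]
    # Pointer table: position -> original column index. Sorting swaps
    # entries of this table instead of moving column data.
    perm = list(range(m))
    R = [[0.0] * m for _ in range(m)]
    Q = []
    for k in range(m):
        # Assumed criterion: process the remaining column of smallest norm.
        norms = [sum(abs(x) ** 2 for x in cols[perm[j]]) for j in range(k, m)]
        p = k + norms.index(min(norms))
        perm[k], perm[p] = perm[p], perm[k]
        # Keep the already-computed rows of R consistent with the swap.
        for r in range(k):
            R[r][k], R[r][p] = R[r][p], R[r][k]
        v = cols[perm[k]]
        rkk = math.sqrt(sum(abs(x) ** 2 for x in v))
        R[k][k] = rkk
        q = [x / rkk for x in v]
        Q.append(q)
        # Remove the component along q from every remaining column.
        for j in range(k + 1, m):
            c = cols[perm[j]]
            rkj = sum(qi.conjugate() * ci for qi, ci in zip(q, c))
            R[k][j] = rkj
            cols[perm[j]] = [ci - rkj * qi for ci, qi in zip(c, q)]
    return Q, R, perm


# Small usage example on a 2x2 real matrix.
H = [[3.0, 1.0], [4.0, 2.0]]
Q, R, perm = sorted_qrd(H)
print(perm)  # [1, 0]: the second column has the smaller norm
```

Swapping only the entries of `perm` leaves the column storage untouched during sorting, which is the general idea behind reordering through pointers; how the processor's sorting buffers actually realize this in hardware is not described here.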