An efficient architecture based on a fast algorithm for MPEG-4 shape coding is proposed. The authors apply a fast shape coding algorithm, contour-based binary motion estimation (CBBME), which exploits the properties of a boundary mask. By combining block-matching motion estimation with an extended approach that exploits the centre-biased motion vector distribution and shrinks the search range, a large number of search points in binary motion estimation (BME) can be skipped. A dedicated architecture for the proposed CBBME algorithm is then developed. With certain optimisations and design considerations, both memory accesses and processing cycles are reduced. The average number of clock cycles needed to process one binary alpha block is 1708, only 56% of that required by the previous design. In addition, a prototype chip for shape coding has been implemented and verified. The die area is 2.4 × 2.4 mm² in TSMC 0.18 μm CMOS technology, and the maximum clock frequency is 53 MHz.
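
To put the reported figures in perspective, the short Python sketch below derives three quantities from the numbers stated in the abstract: the cycle count implied for the previous design, the time per binary alpha block at the maximum clock, and the resulting block rate. The function and constant names are illustrative assumptions, not part of the original design. The 56% figure is presumably rounded, so the implied previous cycle count is only approximate, and the block rate is an upper bound that assumes the chip runs at its maximum clock frequency with the average cycle count.

```python
# Figures taken directly from the abstract.
CYCLES_PER_BAB = 1708      # average clock cycles per binary alpha block (BAB)
RATIO_TO_PREVIOUS = 0.56   # proposed cycles / previous-design cycles (rounded)
MAX_CLOCK_HZ = 53e6        # maximum clock frequency of the prototype chip


def implied_previous_cycles(cycles=CYCLES_PER_BAB, ratio=RATIO_TO_PREVIOUS):
    """Approximate cycles per BAB of the previous design implied by the ratio."""
    return cycles / ratio


def time_per_bab_us(cycles=CYCLES_PER_BAB, clock_hz=MAX_CLOCK_HZ):
    """Average processing time for one BAB, in microseconds."""
    return cycles / clock_hz * 1e6


def babs_per_second(cycles=CYCLES_PER_BAB, clock_hz=MAX_CLOCK_HZ):
    """Upper bound on BABs processed per second at the given clock."""
    return clock_hz / cycles


if __name__ == "__main__":
    print(f"Implied previous design: ~{implied_previous_cycles():.0f} cycles per BAB")
    print(f"Time per BAB at 53 MHz:  ~{time_per_bab_us():.1f} us")
    print(f"Block rate at 53 MHz:    ~{babs_per_second():.0f} BABs per second")
```

Under these assumptions, the previous design would need roughly 3050 cycles per block, and the proposed design would process one block in about 32 μs at 53 MHz.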