Taking full advantage of 3DIC technology requires innovative EDA tools that can optimize a complex multi-layered system. However, optimization algorithms for multi-layered 3DICs are usually computationally expensive. Parallel computing has been considered a solution for managing the rapidly growing computational requirements of future EDA tools. This paper proposes PP3D, a parallel 3DIC partitioning algorithm. PP3D improves execution speed by exposing the parallelism in the Fiduccia-Mattheyses (FM) algorithm, and it coordinates the parallel execution to retain optimization quality. A design methodology is also proposed to streamline the optimization from the PP3D algorithm down to the underlying GPGPU many-core architecture. Results on the ISPD98 benchmark demonstrate an average runtime speedup of 15X and a maximum speedup of 37X.