This work develops a topic-model-based hierarchical representation for identifying the latent characteristics behind frame-level musical features. Frame-level features and music clips are regarded as acoustic words and acoustic documents, respectively. A Gaussian hierarchical latent Dirichlet allocation (G-hLDA) model is proposed to find the latent topics behind each acoustic document. G-hLDA handles the continuous features directly instead of transforming them into discrete words, which reduces the information loss caused by discretization-based vector quantization. Specifically, each latent topic identified by G-hLDA is represented as a node in an infinitely deep, infinitely branching tree. For a music clip, the number of its acoustic words at each node is computed to form the hierarchical representation. The proposed representation hierarchically captures both the components shared among music clips and the components unique to each clip, resulting in improved performance. Experimental results on a guitar playing technique database demonstrate that the proposed method outperforms the baselines.
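The final counting step described above can be illustrated with a minimal sketch. It assumes that inference has already assigned every acoustic word (frame) of a clip to one tree node; the node labels, the example assignments, and the function name `hierarchical_representation` are hypothetical and serve only to show how per-node counts form a fixed-length vector. The G-hLDA inference itself is not shown.

```python
from collections import Counter


def hierarchical_representation(word_nodes, all_nodes):
    """Count how many acoustic words of one clip fall on each tree node.

    word_nodes: node label assigned to each frame of the clip (assumed given).
    all_nodes:  fixed ordering of the tree nodes used as vector dimensions.
    """
    counts = Counter(word_nodes)
    return [counts.get(node, 0) for node in all_nodes]


# Hypothetical tree: a root, two children, and one grandchild.
all_nodes = ["root", "root/a", "root/b", "root/a/x"]

# Hypothetical per-frame node assignments for two music clips.
clip_1 = ["root", "root", "root/a", "root/a/x", "root/a/x"]
clip_2 = ["root", "root/b", "root/b", "root/b"]

rep_1 = hierarchical_representation(clip_1, all_nodes)
rep_2 = hierarchical_representation(clip_2, all_nodes)

print(rep_1)  # [2, 1, 0, 2]
print(rep_2)  # [1, 0, 3, 0]
```

In this toy case both clips place words on the root node (a shared component), while the deeper nodes separate them (unique components), which mirrors the intuition behind the hierarchical representation.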