Abstract
In this paper, we propose a fast time-frequency masking technique for blind source separation (BSS) that relies on the sparseness of the source signals to automatically separate two input sounds mixed in a single signal. Because the sources are sparse, they can be distinguished once the mixture is transformed into the time-frequency domain. Most previous methods do not address the effect of different source angles on accuracy. To overcome this problem, we first define two features: the normalized level ratio and the phase difference. Next, we use our method to decrease the variance of the direction of arrival (DOA), which reduces the variance of the features and therefore the number of k-means iterations. Finally, a mask is generated according to the clustered features. Our method does not require any prior information or parameter estimation. The motivation of the proposed design is to integrate the BSS system into smart voice appliances. In the application scenario, any non-human sound may appear and is regarded as interference. We use the signal-to-distortion ratio (SDR) and the signal-to-interference ratio (SIR) for comparison. Based on the proposed system, we then present a hardware design implemented in the TSMC 90-nm CMOS process. The cost-effective design uses about 120 K gates and operates at a frequency of 10 MHz. With low-power design considerations, the power consumption is only 2.92 mW.
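The feature, clustering, and masking steps described in the abstract can be illustrated with a minimal NumPy sketch. This is not the paper's implementation. It assumes two-channel STFT matrices `X1` and `X2` are already available, it guesses a common form for the normalized level ratio and the phase difference, and it omits the DOA variance reduction step, whose details the abstract does not give. The function names `tf_features`, `kmeans`, and `binary_masks` are hypothetical.

```python
import numpy as np

def tf_features(X1, X2, eps=1e-12):
    """Per-bin features from two STFT matrices of shape (freq, time).

    Assumed definitions (not taken from the paper):
      level ratio      = |X1| / sqrt(|X1|^2 + |X2|^2), in [0, 1]
      phase difference = angle(X1 * conj(X2)) / pi,    in (-1, 1]
    """
    norm = np.sqrt(np.abs(X1) ** 2 + np.abs(X2) ** 2) + eps
    level_ratio = np.abs(X1) / norm
    phase_diff = np.angle(X1 * np.conj(X2)) / np.pi
    return np.stack([level_ratio.ravel(), phase_diff.ravel()], axis=1)

def kmeans(feats, k=2, iters=20):
    """Plain k-means with a deterministic start spread along feature 0."""
    order = np.argsort(feats[:, 0])
    picks = order[np.linspace(0, len(order) - 1, k).astype(int)]
    centers = feats[picks].copy()
    labels = np.zeros(len(feats), dtype=int)
    for _ in range(iters):
        # Squared distance from every point to every center.
        dist = ((feats[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        updated = np.array([
            feats[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(updated, centers):
            break  # converged; tighter clusters need fewer iterations
        centers = updated
    return labels, centers

def binary_masks(X1, X2, k=2):
    """One boolean time-frequency mask per cluster."""
    labels, _ = kmeans(tf_features(X1, X2), k)
    labels = labels.reshape(X1.shape)
    return [labels == j for j in range(k)]
```

Under these assumptions, each separated source would be estimated by multiplying a mask with `X1` and applying an inverse STFT. The cluster order is arbitrary, so the mask index does not identify a particular source.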
Original language | English |
---|---|
Pages (from-to) | 67-77 |
Number of pages | 11 |
Journal | Integration, the VLSI Journal |
Volume | 82 |
DOIs | |
State | Published - Jan 2022 |
Keywords
- Blind separation
- Convolutive BSS
- Reduction of DOA variance
- Time-frequency mask
- VLSI Design