To address the problem of feature mismatch across experimental databases, a key issue in cross-corpus speech emotion recognition, an auditory attention model based on Chirplets is proposed for feature extraction. First, to extract spectral features, the auditory attention model is employed to detect variational emotion features. Then, a selective attention mechanism model is proposed to extract the salient gist features, which are shown to relate to the expected performance in cross-corpus testing. Furthermore, Chirplet time-frequency atoms are introduced into the model. By forming a complete atom database, the Chirplet improves spectral feature extraction, including the amount of information captured. Since samples from multiple databases exhibit multi-component characteristics, the Chirplet expands the scale of the feature vector in the time-frequency domain. Experimental results show that, compared with the traditional feature model, the proposed feature extraction approach with the prototypical classifier achieves a significant improvement in cross-corpus speech emotion recognition. In addition, the proposed method is more robust when the training set and the testing set come from inconsistent sources.
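As an illustration of the kind of time-frequency atom the abstract refers to (not the authors' implementation; all parameter values here are assumptions), a Gaussian chirplet atom extends a Gabor atom with a linear chirp rate, so a dictionary built over a grid of chirp rates can match multi-component, frequency-sweeping structure:

```python
import numpy as np

def chirplet_atom(n, t0, f0, c, sigma, fs=16000.0):
    """Gaussian chirplet atom: a Gabor atom whose instantaneous
    frequency sweeps linearly at chirp rate c (Hz/s) around f0 (Hz).
    t0 is the time center (s), sigma the Gaussian width (s)."""
    t = np.arange(n) / fs - t0
    envelope = np.exp(-np.pi * (t / sigma) ** 2)
    phase = 2.0 * np.pi * (f0 * t + 0.5 * c * t ** 2)
    atom = envelope * np.exp(1j * phase)
    return atom / np.linalg.norm(atom)   # normalize to unit energy

# A tiny (hypothetical) dictionary over a grid of center frequencies
# and chirp rates; a real atom database would be far denser.
atoms = [chirplet_atom(512, 0.016, f0, c, 0.008)
         for f0 in (300.0, 600.0) for c in (-2e4, 0.0, 2e4)]
```

Setting the chirp rate c to zero recovers an ordinary Gabor atom, which is why a chirplet dictionary strictly enlarges the representable feature set.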
Several factors influencing the intelligibility of enhanced whisper in the joint time-frequency domain are evaluated. Specifically, both the spectrum density and different regions of the enhanced spectrum are analyzed. Experimental results show that, for a spectrum of a certain density, the joint time-frequency gain-modification-based speech enhancement algorithm achieves a significant improvement in intelligibility. In addition, the region where the estimated spectrum is smaller than the clean spectrum contributes most to the intelligibility improvement of the enhanced whisper, whereas the region where the estimated spectrum exceeds twice the clean spectrum is detrimental to speech intelligibility perception in the whisper context.
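The three-way spectral partition described above can be made concrete with a short sketch (an illustrative helper, not the paper's analysis code): each time-frequency bin is assigned to the under-estimated region, the moderately over-estimated region, or the strongly over-estimated region by comparing the estimated magnitude against the clean magnitude.

```python
import numpy as np

def spectrum_regions(est, clean):
    """Partition time-frequency bins into the three regions discussed:
    under-estimated (est < clean), moderate (clean <= est <= 2*clean),
    and strongly over-estimated (est > 2*clean)."""
    under = est < clean
    over2 = est > 2.0 * clean
    mid = ~(under | over2)
    return under, mid, over2

# Toy magnitudes for four bins (hypothetical values).
est = np.array([0.5, 1.0, 3.0, 1.5])
clean = np.array([1.0, 1.0, 1.0, 1.0])
under, mid, over2 = spectrum_regions(est, clean)
```

The three boolean masks are mutually exclusive and cover every bin, so per-region intelligibility contributions can be measured independently.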
To alleviate the conflict between audibility and distortion in the conventional loudness compensation method, an adaptive multichannel loudness compensation method is proposed for hearing aids. The linear and wide dynamic range compression (WDRC) methods are alternately employed according to the dynamic range of the band-passed signal and the hearing range (HR) of the patient. To further reduce the distortion caused by the WDRC and improve the output signal-to-noise ratio (SNR) under noise conditions, an adaptive adjustment of the compression ratio is presented. Experimental results demonstrate that the output SNR of the proposed method in babble noise is improved by at least 1.73 dB compared with the WDRC compensation method, and the average speech intelligibility is improved by 6.0% and 5.7%, respectively, compared with the linear and WDRC compensation methods.
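A minimal sketch of the per-band decision described above, assuming a simple static gain rule (the input level bounds and HR values are illustrative, not from the paper): when the band's input dynamic range already fits inside the patient's hearing range, a constant linear gain suffices; otherwise a WDRC-style mapping compresses the input range into the HR, with compression ratio equal to the ratio of the two ranges.

```python
def band_gain_db(level_db, hr_low, hr_high, input_low=30.0, input_high=90.0):
    """Gain (dB) for one channel: map input levels in
    [input_low, input_high] dB SPL into the patient's hearing
    range [hr_low, hr_high] dB SPL.  Linear gain if the input
    range fits the HR; otherwise WDRC-style compression."""
    in_range = input_high - input_low
    hr_range = hr_high - hr_low
    if in_range <= hr_range:           # linear amplification is enough
        return hr_low - input_low      # constant gain in dB
    ratio = in_range / hr_range        # compression ratio > 1
    out_level = hr_low + (level_db - input_low) / ratio
    return out_level - level_db        # level-dependent gain
```

With a narrowed HR of, say, 50-80 dB SPL and a 30-90 dB input range, soft sounds get about +20 dB while loud sounds get about -10 dB, which is the audibility/distortion trade-off the adaptive ratio adjustment then refines.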
In order to recognize people's annoyance emotions in the working environment and evaluate emotional well-being, emotional speech in a work environment is induced to obtain adequate samples of emotional speech, and a Mandarin database with two thousand samples is built. In searching for annoyance-type emotion features, the prosodic feature and the voice quality feature parameters of the emotional statements are extracted first. Then an improved back propagation (BP) neural network based on the shuffled frog leaping algorithm (SFLA) is proposed to recognize the emotion. The recognition capabilities of the BP, radial basis function (RBF) and SFLA neural networks are compared experimentally. The results show that the recognition ratio of the SFLA neural network is 4.7% better than that of the BP neural network and 4.3% better than that of the RBF neural network. The experimental results demonstrate that the random initial data trained by the SFLA can optimize the connection weights and thresholds of the neural network, speed up the convergence and improve the recognition rate.
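As a rough sketch of how SFLA can optimize a weight vector (a generic minimizer over an arbitrary cost function, not the authors' network training code; population sizes and the test function are assumptions): the population of "frogs" is sorted by fitness, dealt into memeplexes, and each memeplex's worst frog leaps toward its local best, then toward the global best, and is randomly reinitialized if neither leap improves it.

```python
import numpy as np

def sfla_minimize(f, dim, frogs=20, memeplexes=4, iters=50, rng=None):
    """Minimal shuffled frog-leaping sketch: minimize f over R^dim.
    Returns the best frog found.  For NN training, f would be the
    network's error as a function of its flattened weights/thresholds."""
    rng = np.random.default_rng(rng)
    pop = rng.uniform(-1, 1, (frogs, dim))
    for _ in range(iters):
        order = np.argsort([f(x) for x in pop])
        pop = pop[order]                          # pop[0] = global best
        for m in range(memeplexes):
            idx = np.arange(m, frogs, memeplexes)  # deal into memeplexes
            best, worst = idx[0], idx[-1]
            for leader in (pop[best], pop[0]):     # local best, then global
                step = rng.uniform(0, 1, dim) * (leader - pop[worst])
                cand = pop[worst] + step
                if f(cand) < f(pop[worst]):
                    pop[worst] = cand
                    break
            else:                                  # censor: random restart
                pop[worst] = rng.uniform(-1, 1, dim)
    return pop[np.argmin([f(x) for x in pop])]

# Toy usage: minimize a 3-D sphere function.
best = sfla_minimize(lambda x: float(np.sum(x ** 2)), dim=3, rng=0)
```

Because only worst frogs are ever replaced, the global best is monotonically preserved, which is the property that makes SFLA a reasonable initializer/optimizer for BP connection weights and thresholds.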
Because of the specific characteristics of the underwater acoustic channel, spectrum sensing entails many difficulties in cognitive underwater acoustic communication (CUAC) networks, such as severe frequency-dependent attenuation and low signal-to-noise ratios. To overcome these problems, two cooperative compressive spectrum sensing (CCSS) schemes are proposed for different scenarios (with and without channel state information). To strengthen collaboration among secondary users (SUs), a cognitive central node (CCN) is provided to collect data from the SUs. Thus, the proposed schemes can obtain spatial diversity gains and exploit the joint sparse structure to improve the performance of spectrum sensing. Since the channel occupancy is sparse, we formulate the spectrum sensing problems as sparse vector recovery problems, and then present two CCSS algorithms based on path-wise coordinate optimization (PCO) and multi-task Bayesian compressive sensing (MT-BCS), respectively. Simulation results corroborate the effectiveness of the proposed methods in detecting spectrum holes in the underwater acoustic environment.
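The sparse-recovery formulation can be illustrated with a plain coordinate-descent lasso solver, which is the basic building block behind path-wise coordinate optimization (this is a generic sketch under a noiseless toy model, not the paper's PCO or MT-BCS algorithm; the problem sizes and regularization weight are assumptions):

```python
import numpy as np

def lasso_cd(A, y, lam, iters=100):
    """Coordinate-descent lasso: recover a sparse occupancy vector x
    from compressive measurements y ~ A x by cycling through the
    coordinates and applying the soft-threshold update."""
    n = A.shape[1]
    x = np.zeros(n)
    col_sq = (A ** 2).sum(axis=0)
    r = y - A @ x                        # running residual
    for _ in range(iters):
        for j in range(n):
            r += A[:, j] * x[j]          # remove coordinate j from residual
            rho = A[:, j] @ r
            x[j] = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_sq[j]
            r -= A[:, j] * x[j]          # put the updated coordinate back
    return x

# Toy sensing problem: 80 sub-bands, 3 occupied, 40 compressive measurements.
rng = np.random.default_rng(1)
A = rng.standard_normal((40, 80))        # measurement matrix
x_true = np.zeros(80)
x_true[[5, 23, 61]] = [1.0, -1.0, 1.5]   # sparse channel occupancy
y = A @ x_true                           # noiseless measurements
x_hat = lasso_cd(A, y, lam=1.0)
```

In the cooperative setting, the CCN would solve a joint version of this problem in which the SUs' recovery tasks share a common support, which is where the spatial diversity gain comes from.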