To address the problem of feature mismatch across experimental databases, a key issue in cross-corpus speech emotion recognition, an auditory attention model based on Chirplets is proposed for feature extraction. First, to extract spectral features, the auditory attention model is employed to detect variational emotion features. Then, a selective attention mechanism model is proposed to extract the salient gist features, which are shown to relate to the expected performance in cross-corpus testing. Furthermore, Chirplet time-frequency atoms are introduced into the model. By forming a complete atom database, the Chirplet improves spectral feature extraction, including the amount of information captured. Since samples from multiple databases exhibit multi-component characteristics, the Chirplet expands the scale of the feature vector in the time-frequency domain. Experimental results show that, compared with the traditional feature model, the proposed feature extraction approach with the prototypical classifier achieves a significant improvement in cross-corpus speech emotion recognition. In addition, the proposed method is more robust when the training set and the testing set come from inconsistent sources.
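As an illustration of the kind of time-frequency atom the abstract refers to (not the authors' implementation; all parameter values here are assumptions), a Gaussian chirplet atom extends a Gabor atom with a linear chirp rate, so a dictionary built over a grid of chirp rates can match multi-component, frequency-sweeping structure:

```python
import numpy as np

def chirplet_atom(n, t0, f0, c, sigma, fs=16000.0):
    """Gaussian chirplet atom: a Gabor atom whose instantaneous
    frequency sweeps linearly at chirp rate c (Hz/s) around f0 (Hz).
    t0 is the time center (s), sigma the Gaussian width (s)."""
    t = np.arange(n) / fs - t0
    envelope = np.exp(-np.pi * (t / sigma) ** 2)
    phase = 2.0 * np.pi * (f0 * t + 0.5 * c * t ** 2)
    atom = envelope * np.exp(1j * phase)
    return atom / np.linalg.norm(atom)   # normalize to unit energy

# A tiny (hypothetical) dictionary over a grid of center frequencies
# and chirp rates; a real atom database would be far denser.
atoms = [chirplet_atom(512, 0.016, f0, c, 0.008)
         for f0 in (300.0, 600.0) for c in (-2e4, 0.0, 2e4)]
```

Setting the chirp rate c to zero recovers an ordinary Gabor atom, which is why a chirplet dictionary strictly enlarges the representable feature set.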
Several factors influencing the intelligibility of enhanced whisper in the joint time-frequency domain are evaluated. Specifically, both the spectrum density and different regions of the enhanced spectrum are analyzed. Experimental results show that, for a spectrum of a certain density, the joint time-frequency gain-modification-based speech enhancement algorithm achieves a significant improvement in intelligibility. In addition, the region where the estimated spectrum is smaller than the clean spectrum contributes most to the intelligibility improvement of the enhanced whisper, whereas the region where the estimated spectrum exceeds twice the clean spectrum is detrimental to speech intelligibility perception in the whisper context.
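The three-way spectral partition described above can be made concrete with a short sketch (an illustrative helper, not the paper's analysis code): each time-frequency bin is assigned to the under-estimated region, the moderately over-estimated region, or the strongly over-estimated region by comparing the estimated magnitude against the clean magnitude.

```python
import numpy as np

def spectrum_regions(est, clean):
    """Partition time-frequency bins into the three regions discussed:
    under-estimated (est < clean), moderate (clean <= est <= 2*clean),
    and strongly over-estimated (est > 2*clean)."""
    under = est < clean
    over2 = est > 2.0 * clean
    mid = ~(under | over2)
    return under, mid, over2

# Toy magnitudes for four bins (hypothetical values).
est = np.array([0.5, 1.0, 3.0, 1.5])
clean = np.array([1.0, 1.0, 1.0, 1.0])
under, mid, over2 = spectrum_regions(est, clean)
```

The three boolean masks are mutually exclusive and cover every bin, so per-region intelligibility contributions can be measured independently.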
To alleviate the conflict between audibility and distortion in the conventional loudness compensation method, an adaptive multichannel loudness compensation method is proposed for hearing aids. The linear and wide dynamic range compression (WDRC) methods are alternately employed according to the dynamic range of the band-passed signal and the hearing range (HR) of the patient. To further reduce the distortion caused by the WDRC and improve the output signal-to-noise ratio (SNR) under noise conditions, an adaptive adjustment of the compression ratio is presented. Experimental results demonstrate that the output SNR of the proposed method in babble noise is improved by at least 1.73 dB compared with the WDRC compensation method, and the average speech intelligibility is improved by 6.0% and 5.7%, respectively, compared with the linear and WDRC compensation methods.
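A minimal sketch of the per-band decision described above, assuming a simple static gain rule (the input level bounds and HR values are illustrative, not from the paper): when the band's input dynamic range already fits inside the patient's hearing range, a constant linear gain suffices; otherwise a WDRC-style mapping compresses the input range into the HR, with compression ratio equal to the ratio of the two ranges.

```python
def band_gain_db(level_db, hr_low, hr_high, input_low=30.0, input_high=90.0):
    """Gain (dB) for one channel: map input levels in
    [input_low, input_high] dB SPL into the patient's hearing
    range [hr_low, hr_high] dB SPL.  Linear gain if the input
    range fits the HR; otherwise WDRC-style compression."""
    in_range = input_high - input_low
    hr_range = hr_high - hr_low
    if in_range <= hr_range:           # linear amplification is enough
        return hr_low - input_low      # constant gain in dB
    ratio = in_range / hr_range        # compression ratio > 1
    out_level = hr_low + (level_db - input_low) / ratio
    return out_level - level_db        # level-dependent gain
```

With a narrowed HR of, say, 50-80 dB SPL and a 30-90 dB input range, soft sounds get about +20 dB while loud sounds get about -10 dB, which is the audibility/distortion trade-off the adaptive ratio adjustment then refines.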
In order to recognize people's annoyance emotions in the working environment and evaluate emotional well-being, emotional speech in a work environment is induced to obtain adequate samples of emotional speech, and a Mandarin database with two thousand samples is built. In searching for annoyance-type emotion features, the prosodic feature and the voice quality feature parameters of the emotional statements are extracted first. Then an improved back propagation (BP) neural network based on the shuffled frog leaping algorithm (SFLA) is proposed to recognize the emotion. The recognition capabilities of the BP, radial basis function (RBF) and SFLA neural networks are compared experimentally. The results show that the recognition ratio of the SFLA neural network is 4.7% better than that of the BP neural network and 4.3% better than that of the RBF neural network. The experimental results demonstrate that the random initial data trained by the SFLA can optimize the connection weights and thresholds of the neural network, speed up the convergence and improve the recognition rate.
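As a rough sketch of how SFLA can optimize a weight vector (a generic minimizer over an arbitrary cost function, not the authors' network training code; population sizes and the test function are assumptions): the population of "frogs" is sorted by fitness, dealt into memeplexes, and each memeplex's worst frog leaps toward its local best, then toward the global best, and is randomly reinitialized if neither leap improves it.

```python
import numpy as np

def sfla_minimize(f, dim, frogs=20, memeplexes=4, iters=50, rng=None):
    """Minimal shuffled frog-leaping sketch: minimize f over R^dim.
    Returns the best frog found.  For NN training, f would be the
    network's error as a function of its flattened weights/thresholds."""
    rng = np.random.default_rng(rng)
    pop = rng.uniform(-1, 1, (frogs, dim))
    for _ in range(iters):
        order = np.argsort([f(x) for x in pop])
        pop = pop[order]                          # pop[0] = global best
        for m in range(memeplexes):
            idx = np.arange(m, frogs, memeplexes)  # deal into memeplexes
            best, worst = idx[0], idx[-1]
            for leader in (pop[best], pop[0]):     # local best, then global
                step = rng.uniform(0, 1, dim) * (leader - pop[worst])
                cand = pop[worst] + step
                if f(cand) < f(pop[worst]):
                    pop[worst] = cand
                    break
            else:                                  # censor: random restart
                pop[worst] = rng.uniform(-1, 1, dim)
    return pop[np.argmin([f(x) for x in pop])]

# Toy usage: minimize a 3-D sphere function.
best = sfla_minimize(lambda x: float(np.sum(x ** 2)), dim=3, rng=0)
```

Because only worst frogs are ever replaced, the global best is monotonically preserved, which is the property that makes SFLA a reasonable initializer/optimizer for BP connection weights and thresholds.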
Because of the specific characteristics of the underwater acoustic channel, spectrum sensing entails many difficulties in cognitive underwater acoustic communication (CUAC) networks, such as severe frequency-dependent attenuation and low signal-to-noise ratios. To overcome these problems, two cooperative compressive spectrum sensing (CCSS) schemes are proposed for different scenarios (with and without channel state information). To strengthen collaboration among secondary users (SUs), a cognitive central node (CCN) is provided to collect data from the SUs. Thus, the proposed schemes can obtain spatial diversity gains and exploit the joint sparse structure to improve the performance of spectrum sensing. Since the channel occupancy is sparse, we formulate the spectrum sensing problems as sparse vector recovery problems, and then present two CCSS algorithms based on path-wise coordinate optimization (PCO) and multi-task Bayesian compressive sensing (MT-BCS), respectively. Simulation results corroborate the effectiveness of the proposed methods in detecting spectrum holes in the underwater acoustic environment.
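The sparse-recovery formulation can be illustrated with a plain coordinate-descent lasso solver, which is the basic building block behind path-wise coordinate optimization (this is a generic sketch under a noiseless toy model, not the paper's PCO or MT-BCS algorithm; the problem sizes and regularization weight are assumptions):

```python
import numpy as np

def lasso_cd(A, y, lam, iters=100):
    """Coordinate-descent lasso: recover a sparse occupancy vector x
    from compressive measurements y ~ A x by cycling through the
    coordinates and applying the soft-threshold update."""
    n = A.shape[1]
    x = np.zeros(n)
    col_sq = (A ** 2).sum(axis=0)
    r = y - A @ x                        # running residual
    for _ in range(iters):
        for j in range(n):
            r += A[:, j] * x[j]          # remove coordinate j from residual
            rho = A[:, j] @ r
            x[j] = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_sq[j]
            r -= A[:, j] * x[j]          # put the updated coordinate back
    return x

# Toy sensing problem: 80 sub-bands, 3 occupied, 40 compressive measurements.
rng = np.random.default_rng(1)
A = rng.standard_normal((40, 80))        # measurement matrix
x_true = np.zeros(80)
x_true[[5, 23, 61]] = [1.0, -1.0, 1.5]   # sparse channel occupancy
y = A @ x_true                           # noiseless measurements
x_hat = lasso_cd(A, y, lam=1.0)
```

In the cooperative setting, the CCN would solve a joint version of this problem in which the SUs' recovery tasks share a common support, which is where the spatial diversity gain comes from.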