OPERATOR RECOGNITION DEVICE, OPERATOR RECOGNITION METHOD AND OPERATOR RECOGNITION PROGRAM
    1.
    Patent application
    OPERATOR RECOGNITION DEVICE, OPERATOR RECOGNITION METHOD AND OPERATOR RECOGNITION PROGRAM (status: lapsed)

    Publication No.: US20090254757A1

    Publication date: 2009-10-08

    Application No.: US11910415

    Filing date: 2006-03-24

    IPC classification: G06F21/00 G06K9/00 G10L17/00

    CPC classification: G10L17/16 G10L17/10

    Abstract: An operator recognition device is provided that refuses to register data, such as HMM data, whose feature quantity is prone to recognition errors when recognizing an operator, thereby reducing the likelihood of recognition errors and giving stable recognition performance. When registering the HMM data used in recognition processing, a speaker recognition device 100 rejects the HMM data of a password whose spoken-voice feature quantity is similar to a feature quantity indicated by already-registered HMM data, so that HMM data estimated to cause recognition errors easily during the recognition process is not registered.
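
The registration check described in this abstract can be illustrated with a minimal sketch. It assumes each password's HMM data is summarized by a single feature vector and that similarity is measured by Euclidean distance; neither detail is stated in the abstract, and the names `try_register`, `registry`, and `min_distance` are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def try_register(registry, name, features, min_distance):
    """Register `features` under `name` unless an existing entry is too similar.

    Assumption: 'similar' means closer than `min_distance` to any vector
    that is already registered.
    """
    for existing in registry.values():
        if euclidean(features, existing) < min_distance:
            return False  # likely to be confused at recognition time: reject
    registry[name] = features
    return True


registry = {}
first = try_register(registry, "alice", [0.0, 1.0, 2.0], min_distance=0.5)
near = try_register(registry, "bob", [0.1, 1.0, 2.1], min_distance=0.5)
far = try_register(registry, "carol", [3.0, 0.0, -1.0], min_distance=0.5)
print(first, near, far, sorted(registry))
```

The second candidate lies close to the first and is refused, while the third is far from every registered entry and is accepted.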

    Operator recognition device, operator recognition method and operator recognition program
    2.
    Granted patent
    Operator recognition device, operator recognition method and operator recognition program (status: lapsed)

    Publication No.: US07979718B2

    Publication date: 2011-07-12

    Application No.: US11910415

    Filing date: 2006-03-24

    IPC classification: H04K1/00 H04L9/00

    CPC classification: G10L17/16 G10L17/10

    Abstract: An operator recognition device is provided that refuses to register data, such as HMM data, whose feature quantity is prone to recognition errors when recognizing an operator, thereby reducing the likelihood of recognition errors and giving stable recognition performance. When registering the HMM data used in recognition processing, a speaker recognition device 100 rejects the HMM data of a password whose spoken-voice feature quantity is similar to a feature quantity indicated by already-registered HMM data, so that HMM data estimated to cause recognition errors easily during the recognition process is not registered.

    Voice recognition system
    3.
    Patent application
    Voice recognition system (status: under examination, published)

    Publication No.: US20050091053A1

    Publication date: 2005-04-28

    Application No.: US10995509

    Filing date: 2004-11-24

    CPC classification: G10L25/78

    Abstract: A trained vector creating part 15 creates a characteristic of an unvoiced sound in advance as a trained vector V. Meanwhile, a threshold value THD for distinguishing a voice from a background sound is created based on the predictive residual power ε of a sound produced during a non-voice period. When a voice is actually uttered, an inner product computation part 18 calculates the inner product of a feature vector A of an input signal Sa and the trained vector V. A first threshold value judging part 19 judges that it is a voice section when the inner product is equal to or larger than a predetermined value θ, while a second threshold value judging part 21 judges that it is a voice section when the predictive residual power ε of the input signal Sa is larger than the threshold value THD. When at least one of the first threshold value judging part 19 and the second threshold value judging part 21 judges that it is a voice section, a voice section determining part 300 makes the final judgment that it is a voice section and cuts out the input signal Saf, which is in units of frames and corresponds to this voice section, as the voice Svc to be recognized.
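
The two-threshold decision in this abstract can be sketched as follows. The sketch assumes each frame arrives as a pair of a feature vector and an already-computed predictive residual power; how those values are derived is outside the abstract. The function names and sample numbers are hypothetical.

```python
def inner_product(a, v):
    """Inner product of a frame's feature vector A and the trained vector V."""
    return sum(x * y for x, y in zip(a, v))


def is_voice_frame(feature, trained, theta, residual_power, thd):
    """A frame counts as voice if either judgment fires:
    inner product >= theta, or predictive residual power > THD."""
    return inner_product(feature, trained) >= theta or residual_power > thd


def voice_frame_indices(frames, trained, theta, thd):
    """Return the indices of the frames judged to belong to a voice section."""
    return [
        i
        for i, (feature, residual_power) in enumerate(frames)
        if is_voice_frame(feature, trained, theta, residual_power, thd)
    ]


trained_v = [1.0, 0.0]  # hypothetical trained vector V
frames = [
    ([0.1, 0.9], 0.5),  # neither judgment fires
    ([0.8, 0.1], 0.3),  # inner product 0.8 >= theta
    ([0.0, 0.2], 3.0),  # residual power 3.0 > THD
    ([0.2, 0.2], 2.0),  # residual power equals THD, so not larger
]
voiced = voice_frame_indices(frames, trained_v, theta=0.5, thd=2.0)
print(voiced)
```

Combining the two judgments with a logical OR mirrors the "at least one of" condition in the abstract.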

    Voice recognition system
    4.
    Granted patent
    Voice recognition system (status: lapsed)

    Publication No.: US06937981B2

    Publication date: 2005-08-30

    Application No.: US09954151

    Filing date: 2001-09-18

    Abstract: A multiplicative distortion Hm(cep) is subtracted from a voice HMM 5, a multiplicative distortion Ha(cep) of the uttered voice is subtracted from a noise HMM 6 formed by HMM, and the subtraction results Sm(cep) and {Nm(cep)−Ha(cep)} are combined with each other to form a combined HMM 18 in the cepstrum domain. A cepstrum R^a(cep), obtained by subtracting the multiplicative distortion Ha(cep) from the cepstrum Ra(cep) of the uttered voice, is compared with the distribution R^m(cep) of the combined HMM 18 in the cepstrum domain, and the combined HMM with the maximum likelihood is output as the voice recognition result.

    Voice recognition system
    5.
    Granted patent
    Voice recognition system (status: lapsed)

    Publication No.: US07016837B2

    Publication date: 2006-03-21

    Application No.: US09953905

    Filing date: 2001-09-18

    IPC classification: G10L15/20

    CPC classification: G10L15/20 G10L15/142

    Abstract: An initial combination HMM 16 is generated from a voice HMM 10 having multiplicative distortions and an initial noise HMM 17 of additive noise, and at the same time a Jacobian matrix J is calculated by a Jacobian matrix calculating section 19. A noise variation Namh(cep), which combines an estimated value Ha^(cep) of the multiplicative distortions obtained from actually uttered voice, additive noise Na(cep) obtained in a non-utterance period, and additive noise Nm(cep) of the initial noise HMM 17, is multiplied by the Jacobian matrix J; the product is combined with the initial combination HMM 16 to generate an adaptive HMM 26. An adaptive HMM 26 matched to the observation value series RNah(cep) generated from the actual uttered voice can thereby be generated in advance. When voice recognition is performed by collating the observation value series RNah(cep) with the adaptive HMM 26, the influences of the multiplicative distortions and additive distortions are counterbalanced, giving an effect equivalent to performing voice recognition on clean voice, so a robust voice recognition system can be achieved.
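
The adaptation step in this abstract (multiply the noise variation by the Jacobian matrix, then combine the result with the initial combination HMM) can be sketched for a single mean vector. The sketch assumes "combined" means element-wise addition and takes the noise variation as given; the abstract does not say how Ha^(cep), Na(cep) and Nm(cep) are merged into it. All names and numbers are hypothetical.

```python
def mat_vec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def adapt_mean(initial_mean, jacobian, noise_variation):
    """Shift an initial combined-model mean by J times the noise variation.

    Assumption: the combination with the initial model is additive.
    """
    shift = mat_vec(jacobian, noise_variation)
    return [mu + s for mu, s in zip(initial_mean, shift)]


initial_mean = [1.0, 2.0]                # mean taken from the initial combination model
jacobian = [[0.5, 0.0], [0.0, 0.25]]     # hypothetical Jacobian matrix J
noise_variation = [2.0, 4.0]             # hypothetical noise variation vector
adapted_mean = adapt_mean(initial_mean, jacobian, noise_variation)
print(adapted_mean)
```

Because the update is a single matrix-vector product plus an addition, the adapted model can be refreshed cheaply whenever the noise estimate changes.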

    ACOUSTIC MODEL REGISTRATION APPARATUS, TALKER RECOGNITION APPARATUS, ACOUSTIC MODEL REGISTRATION METHOD AND ACOUSTIC MODEL REGISTRATION PROCESSING PROGRAM
    6.
    Patent application
    ACOUSTIC MODEL REGISTRATION APPARATUS, TALKER RECOGNITION APPARATUS, ACOUSTIC MODEL REGISTRATION METHOD AND ACOUSTIC MODEL REGISTRATION PROCESSING PROGRAM (status: under examination, published)

    Publication No.: US20100063817A1

    Publication date: 2010-03-11

    Application No.: US12531219

    Filing date: 2007-03-14

    IPC classification: G10L15/06

    CPC classification: G10L17/04

    Abstract: An acoustic model registration apparatus, a talker recognition apparatus, an acoustic model registration method and an acoustic model registration processing program are provided, each of which reliably prevents an acoustic model with a low talker recognition capability from being registered. When a talker speaks N utterances and the utterance sounds are input through the microphone 1, the sound feature quantity extraction part 4 extracts sound feature quantities that indicate the acoustic features of the input utterance sounds, one sound feature quantity per utterance. The talker model generation part 5 generates a talker model based on the extracted sound feature quantities of the N utterances, and the collation part 6 calculates the degree of similarity between each sound feature quantity of the N utterances and the generated talker model. Only when all N calculated degrees of similarity are equal to or greater than the threshold value does the similarity verifying part 9 direct that the generated talker model be registered in the talker model database as a talker model for talker recognition.
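
The all-utterances-must-pass rule in this abstract can be sketched as below. The sketch assumes the talker model is the mean of the N feature vectors and that similarity is cosine similarity; the abstract specifies neither, and the function names are hypothetical. Feature vectors are assumed to be non-zero.

```python
import math


def mean_vector(vectors):
    """Component-wise mean, used here as a stand-in talker model."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def cosine(a, b):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def build_and_verify(utterance_features, threshold):
    """Build a model from N utterances; return it only if every utterance
    is at least `threshold`-similar to it, otherwise return None."""
    model = mean_vector(utterance_features)
    scores = [cosine(features, model) for features in utterance_features]
    accepted = all(score >= threshold for score in scores)
    return (model if accepted else None), scores


consistent = [[1.0, 0.0], [0.9, 0.1], [1.0, 0.1]]
inconsistent = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
good_model, good_scores = build_and_verify(consistent, threshold=0.95)
bad_model, bad_scores = build_and_verify(inconsistent, threshold=0.95)
print(good_model is not None, bad_model is None)
```

Requiring every utterance to pass, rather than the average, means a single outlier utterance is enough to block registration.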

    Apparatus and method for speech recognition
    7.
    Granted patent
    Apparatus and method for speech recognition (status: lapsed)

    Publication No.: US07257532B2

    Publication date: 2007-08-14

    Application No.: US10667150

    Filing date: 2003-09-22

    Applicant: Soichi Toyama

    Inventor: Soichi Toyama

    IPC classification: G10L15/00

    CPC classification: G10L15/07 G10L15/20

    Abstract: Before executing speech recognition, a composite acoustic model adapted to noise is generated by composing a noise adaptive representative acoustic model, generated by noise-adaptation of each representative acoustic model, with the difference models stored in advance in a storing section. Then, a noise and speaker adaptive acoustic model is generated by applying speaker-adaptation to the composite acoustic model using the feature vector series of uttered speech. A renewal difference model is generated from the difference between the noise and speaker adaptive acoustic model and the noise adaptive representative acoustic model, and replaces the difference model stored in the storing section. Speech recognition is performed by comparing the feature vector series of the uttered speech to be recognized with the composite acoustic model adapted to noise and speaker, which is generated by composing the noise adaptive representative acoustic model and the renewal difference model.

    Acoustic signal processing unit
    8.
    Granted patent
    Acoustic signal processing unit (status: lapsed)

    Publication No.: US5444784A

    Publication date: 1995-08-22

    Application No.: US64804

    Filing date: 1993-05-21

    Applicant: Soichi Toyama

    Inventor: Soichi Toyama

    IPC classification: G10K15/12 H03G3/00

    CPC classification: G10K15/12 Y10S84/26

    Abstract: A sound echo machine, as an acoustic signal processing unit of the present invention, comprises an adder to which an input signal is fed and a delay circuit that delays the signal fed from the adder for a certain time and repeatedly feeds it back to the adder to generate an echo sound. It further comprises an input signal level detector that detects the level of the input signal and sends it to a frequency oscillator, whose oscillating frequency varies in accordance with the detected signal level and is then fed to the delay circuit so as to modulate the delay time at a predetermined cycle. This creates an acoustic field in which a listener feels as if reflected sounds of various levels are coming toward him from various directions. In addition, a sound effecter, as an acoustic signal processing unit, comprises a plurality of acoustic signal processing sections, a plurality of attenuators each connected to these acoustic signal processing sections, and an adder that sums all the signals from these attenuators. It further comprises a signal mixing ratio control section that monitors the input acoustic signal level and determines the signal mixing ratio among the respective output signals of the acoustic signal processing sections in accordance with the monitored level, whereby even a simple structure can provide a specific sound effect.

    Speech Recognition Device and Speech Recognition Method
    9.
    Patent application
    Speech Recognition Device and Speech Recognition Method (status: in force)

    Publication No.: US20080270127A1

    Publication date: 2008-10-30

    Application No.: US11547322

    Filing date: 2005-03-15

    IPC classification: G10L21/02 G10L15/20

    Abstract: A voice recognition device and a voice recognition method are provided that enhance noise adaptation processing in voice recognition processing and reduce the memory capacity used. Acoustic models are subjected to clustering processing to calculate the centroid of each cluster and the differential vector between the centroid and each model; model composition is carried out between each kind of assumed noise model and the calculated centroid; and the centroid of each composition model and the differential vectors are stored in a memory. In actual recognition processing, the centroid best suited to the environment estimated by utterance environment estimation is extracted from the memory, the model is restored from the extracted centroid using the differential vector stored in the memory, and noise adaptation processing is executed on the basis of the restored model.
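
The memory-saving scheme in this abstract can be sketched for a single cluster. Model composition is replaced here by plain vector addition purely as a placeholder; the abstract does not specify the composition method, and all names and values are hypothetical.

```python
def centroid_of(models):
    """Component-wise mean of the models in one cluster."""
    n = len(models)
    return [sum(column) / n for column in zip(*models)]


def add(a, b):
    return [x + y for x, y in zip(a, b)]


def subtract(a, b):
    return [x - y for x, y in zip(a, b)]


def compose(centroid, noise):
    """Placeholder for model composition (assumption: simple addition)."""
    return add(centroid, noise)


def build_memory(cluster_models, noise_models):
    """Store one composed centroid per assumed noise, plus one shared set of
    differential vectors, instead of composing every model with every noise."""
    centroid = centroid_of(cluster_models)
    diffs = [subtract(model, centroid) for model in cluster_models]
    composed = {name: compose(centroid, noise) for name, noise in noise_models.items()}
    return composed, diffs


def restore(composed_centroid, diffs):
    """Rebuild per-model vectors for the selected noise environment."""
    return [add(composed_centroid, diff) for diff in diffs]


cluster = [[1.0, 2.0], [3.0, 4.0]]
noises = {"car": [0.5, 0.5], "street": [1.0, 0.0]}
composed, diffs = build_memory(cluster, noises)
restored_car = restore(composed["car"], diffs)
print(composed["car"], diffs, restored_car)
```

With K noise types and M models per cluster, this layout stores K centroids and M differential vectors rather than K times M composed models, which is the memory reduction the abstract points to.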

    Speech recognition system with an adaptive acoustic model
    10.
    Granted patent
    Speech recognition system with an adaptive acoustic model (status: lapsed)

    Publication No.: US07065488B2

    Publication date: 2006-06-20

    Application No.: US09964677

    Filing date: 2001-09-28

    IPC classification: G10L15/28 G10L15/20 G10L21/02

    Abstract: At the time of speaker adaptation, first feature vector generation sections (7, 8, 9) generate a feature vector series [Ci, M] from which the additive noise and multiplicative noise are removed. A second feature vector generation section (12) generates a feature vector series [Si, M] that includes the features of the additive noise and multiplicative noise. A path search section (10) conducts a path search by comparing the feature vector series [Ci, M] with the standard vector [an, M] of the standard voice HMM (300). When the speaker adaptation section (11) performs a correlation operation on the average feature vector [S^n, M] of the standard vector [an, M] corresponding to the path search result Dv and the feature vector series [Si, M], the adaptive vector [xn, M] is generated. The adaptive vector [xn, M] updates the feature vector of the speaker adaptive acoustic model (400) used for speech recognition.
