Speech synthesis method
    1.
    Granted invention patent
    Status: Lapsed

    Publication No.: US07130799B1

    Publication date: 2006-10-31

    Application No.: US09684331

    Filing date: 2000-10-10

    IPC classification: G10L13/00

    CPC classification: G10L13/04 G10L13/07

    Abstract: A speech synthesis method that synthesizes natural-sounding speech is disclosed. A standardized frame power value of the n-th frame is calculated when the frame power values at the head and tail frames of a phoneme are standardized. The average of the power values sampled at a predetermined frequency interval from the power frequency characteristics of the n-th frame is set as a mean frame power value. The sum of squares of the signal levels in one frame of a frequency signal from a sound source is calculated as a frame power correction value. A speech envelope signal is calculated as a function of the standardized frame power value, the frame power correction value, and the mean frame power value. The amplitude level of a speech waveform signal supplied from a vocal tract filter is adjusted according to the level of the speech envelope signal.
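
    Two of the quantities named in this abstract can be sketched directly from their definitions. The function names, the list-based inputs, and the `step` parameter below are illustrative assumptions; the abstract does not give the form of the envelope function itself, so it is not reproduced here.

```python
def frame_power_correction(frame):
    """Sum of squares of the signal levels in one frame."""
    return sum(x * x for x in frame)


def mean_frame_power(power_spectrum, step):
    """Average of the power values sampled every `step` bins.

    `power_spectrum` stands in for the power frequency characteristics
    of one frame; `step` is the assumed sampling interval in bins.
    """
    if step < 1:
        raise ValueError("step must be at least 1")
    samples = power_spectrum[::step]
    return sum(samples) / len(samples)
```

    Both helpers operate on a single frame, so a caller would apply them once per frame n before combining the results in whatever envelope function the patent specifies.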

    Digital signal processing device and audio apparatus using the same
    2.
    Granted invention patent
    Status: Lapsed

    Publication No.: US5255323A

    Publication date: 1993-10-19

    Application No.: US880302

    Filing date: 1992-05-05

    IPC classification: G01R23/16 H03G3/00 H03G5/00

    CPC classification: G01R23/16 H03G3/002 H03G5/005

    Abstract: A digital signal processing device outputs the data held in an output register of a DSP in synchronism with a second clock pulse whose frequency is lower than that of a first clock pulse used for arithmetic processing in the DSP. Accordingly, data output from the DSP can be read directly by a microcomputer, and the contents of, for example, a coefficient memory and a delay time memory can be updated in accordance with the read data. Further, the digital signal processing device can be applied to an audio apparatus such as a loudness controller or a spectrum indicating apparatus.

    Apparatus and method for speech recognition
    3.
    Granted invention patent
    Status: Lapsed

    Publication No.: US07257532B2

    Publication date: 2007-08-14

    Application No.: US10667150

    Filing date: 2003-09-22

    Applicant: Soichi Toyama

    Inventor: Soichi Toyama

    IPC classification: G10L15/00

    CPC classification: G10L15/07 G10L15/20

    Abstract: Before speech recognition is executed, a composite acoustic model adapted to noise is generated by composing each noise adaptive representative acoustic model, obtained by noise adaptation of a representative acoustic model, with the difference models stored in advance in a storing section. A noise and speaker adaptive acoustic model is then generated by applying speaker adaptation to the composite acoustic model using the feature vector series of uttered speech. A renewal difference model is generated from the difference between the noise and speaker adaptive acoustic model and the noise adaptive representative acoustic model, and it replaces the difference model stored in the storing section. Speech recognition is performed by comparing the feature vector series of the uttered speech to be recognized with the composite acoustic model, adapted to noise and speaker, that is generated by composing the noise adaptive representative acoustic model with the renewal difference model.
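
    The bookkeeping between representative, difference, and renewal difference models can be illustrated with a minimal sketch. It assumes each model is reduced to a single mean vector and that "composition" is plain vector addition; neither assumption comes from the abstract, and the function names are hypothetical.

```python
def compose(representative, difference):
    """Composite model = noise adaptive representative model + difference model.

    Assumption: both models are mean vectors of equal length.
    """
    return [r + d for r, d in zip(representative, difference)]


def renew_difference(speaker_adapted, representative):
    """Renewal difference = speaker adapted model - representative model."""
    return [s - r for s, r in zip(speaker_adapted, representative)]


# Hypothetical values, for illustration only.
representative = [1.0, 2.0]        # after noise adaptation
stored_difference = [0.5, -0.5]    # held in the storing section
composite = compose(representative, stored_difference)

speaker_adapted = [2.0, 1.0]       # result of speaker adaptation (assumed)
stored_difference = renew_difference(speaker_adapted, representative)
```

    Under these assumptions, composing the representative model with the renewed difference reproduces the speaker adapted model, which is why only the difference needs to be stored between sessions.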

    Voice recognition system
    4.
    Invention application
    Status: Under examination (published)

    Publication No.: US20050091053A1

    Publication date: 2005-04-28

    Application No.: US10995509

    Filing date: 2004-11-24

    CPC classification: G10L25/78

    Abstract: A trained vector creating part 15 creates, in advance, a trained vector V that represents the characteristic of an unvoiced sound. Meanwhile, a threshold value THD for distinguishing a voice from background sound is created based on the predictive residual power ε of the sound present during a non-voice period. When a voice is actually uttered, an inner product computation part 18 calculates the inner product of a feature vector A of an input signal Sa and the trained vector V. A first threshold value judging part 19 judges the input to be a voice section when the inner product is equal to or larger than a predetermined value θ, while a second threshold value judging part 21 judges it to be a voice section when the predictive residual power ε of the input signal Sa is larger than the threshold value THD. When at least one of the first threshold value judging part 19 and the second threshold value judging part 21 judges the input to be a voice section, a voice section determining part 300 finally judges it to be a voice section and cuts out the input signal Saf, which is in units of frames and corresponds to this voice section, as a voice Svc to be recognized.
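
    The two-threshold decision in this abstract can be sketched as a logical OR of two tests per frame. The frame representation (a dict with `feature`, `residual_power`, and `samples` keys) and the function names are assumptions made for illustration; how V, θ, and THD are trained is not shown.

```python
def inner_product(a, v):
    """Inner product of feature vector A and trained vector V."""
    return sum(x * y for x, y in zip(a, v))


def is_voice_frame(feature, trained, residual_power, theta, thd):
    """A frame counts as voice if either judging part says so."""
    first = inner_product(feature, trained) >= theta   # judging part 19
    second = residual_power > thd                      # judging part 21
    return first or second


def cut_voice_frames(frames, trained, theta, thd):
    """Keep the samples of the frames judged to be voice sections."""
    return [
        f["samples"]
        for f in frames
        if is_voice_frame(f["feature"], trained, f["residual_power"], theta, thd)
    ]
```

    Using OR rather than AND means a frame is kept when either cue fires, which favors not missing weak unvoiced sounds at the cost of admitting more background frames.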

    Acoustic signal processing unit
    5.
    Granted invention patent
    Status: Lapsed

    Publication No.: US5444784A

    Publication date: 1995-08-22

    Application No.: US64804

    Filing date: 1993-05-21

    Applicant: Soichi Toyama

    Inventor: Soichi Toyama

    IPC classification: G10K15/12 H03G3/00

    CPC classification: G10K15/12 Y10S84/26

    Abstract: A sound echo machine, as an acoustic signal processing unit of the present invention, comprises an adder to which an input signal is fed and a delay circuit that delays the signal fed from the adder for a certain time and repeatedly feeds it back to the adder to generate an echo sound. It further comprises an input signal level detector that detects the level of the input signal and sends it to a frequency oscillator, which varies its oscillating frequency in accordance with the detected signal level and feeds the result to the delay circuit so as to modulate the delay time at a predetermined cycle. The machine can thereby create an acoustic field in which a listener feels as if reflected sounds of various levels are arriving from various directions. A sound effecter, as another acoustic signal processing unit, comprises a plurality of acoustic signal processing sections, a plurality of attenuators each connected to one of these sections, and an adder for summing all the signals from the attenuators. It further comprises a signal mixing ratio control section that monitors the input acoustic signal level and determines the mixing ratio among the output signals of the acoustic signal processing sections in accordance with the monitored level, whereby even a simple structure can provide a specific sound effect.
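
    The adder-plus-delay feedback loop at the core of the echo machine can be sketched as follows. This is a simplified illustration with a fixed delay: the level-dependent modulation of the delay time described in the abstract is omitted, and the `feedback` gain is an assumption added so that the echoes decay.

```python
def feedback_echo(samples, delay, feedback):
    """Feed the adder output back into the adder after `delay` samples.

    out[n] = samples[n] + feedback * out[n - delay]
    `feedback` (assumed, not from the abstract) should be below 1.0
    in magnitude so that the repeated echoes die away.
    """
    if delay < 1:
        raise ValueError("delay must be at least one sample")
    out = []
    for n, x in enumerate(samples):
        delayed = out[n - delay] if n >= delay else 0.0
        out.append(x + feedback * delayed)
    return out
```

    Feeding a single impulse through this loop yields a train of echoes spaced `delay` samples apart, each scaled by `feedback` relative to the previous one.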

    OPERATOR RECOGNITION DEVICE, OPERATOR RECOGNITION METHOD AND OPERATOR RECOGNITION PROGRAM
    6.
    Invention application
    Status: Lapsed

    Publication No.: US20090254757A1

    Publication date: 2009-10-08

    Application No.: US11910415

    Filing date: 2006-03-24

    IPC classification: G06F21/00 G06K9/00 G10L17/00

    CPC classification: G10L17/16 G10L17/10

    Abstract: An operator recognition device is provided that excludes from registration data, such as HMM data, whose characteristic amount makes recognition errors likely when recognizing an operator, thereby reducing the possibility of recognition errors and achieving stable recognition performance. When registering HMM data to be used in recognition processing, a speaker recognition device 100 refuses to register HMM data of a password whose spoken-voice characteristic amount is similar to the characteristic amount indicated by HMM data that is already registered, and thus does not allow the registration of HMM data for which recognition errors are estimated to occur easily during the recognition process.
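
    The registration rule can be sketched as a similarity check against everything already registered. Here a plain feature vector stands in for the characteristic amount of HMM data, and Euclidean distance with a `min_distance` threshold stands in for the similarity test; both are assumptions, since the abstract does not say how similarity is measured.

```python
import math


def try_register(registered, candidate, min_distance):
    """Register `candidate` unless it is too close to an existing entry.

    Returns True when the candidate was added to `registered`,
    False when registration was refused.
    """
    for existing in registered:
        if math.dist(existing, candidate) < min_distance:
            return False  # too similar: recognition errors would be likely
    registered.append(candidate)
    return True
```

    A refused registration leaves the registered set unchanged, so the caller can prompt for a different password and try again.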

    Speech Recognition Device and Speech Recognition Method
    7.
    Invention application
    Status: In force

    Publication No.: US20080270127A1

    Publication date: 2008-10-30

    Application No.: US11547322

    Filing date: 2005-03-15

    IPC classification: G10L21/02 G10L15/20

    Abstract: A voice recognition device and a voice recognition method are provided that enhance noise adaptation processing in voice recognition and reduce the memory capacity used. Acoustic models are clustered to calculate the centroid of each cluster and the differential vector between the centroid and each model; model composition is carried out between each kind of assumed noise model and the calculated centroid; and the centroid of each composition model and the differential vectors are stored in a memory. In actual recognition processing, the centroid best suited to the environment estimated by utterance environment estimation is extracted from the memory, model restoration is carried out on the extracted centroid using the differential vectors stored in the memory, and noise adaptation processing is executed on the basis of the restored model.
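
    The memory saving comes from storing one noise-composed centroid per noise type plus a single set of differential vectors, instead of every model for every noise type. A minimal sketch of the centroid, differential-vector, and restoration steps follows; it treats each acoustic model as a mean vector and takes the noise-composed centroid as a given input, both of which are simplifying assumptions.

```python
def centroid(models):
    """Component-wise mean of the models in one cluster."""
    n = len(models)
    return [sum(column) / n for column in zip(*models)]


def differential_vectors(models, center):
    """Differential vector between the centroid and each model."""
    return [[m - c for m, c in zip(model, center)] for model in models]


def restore(selected_centroid, diffs):
    """Rebuild each model from a stored centroid plus its differential vector."""
    return [[c + d for c, d in zip(selected_centroid, diff)] for diff in diffs]
```

    Calling `restore` with the original centroid returns the original models; calling it with a noise-composed centroid yields the models shifted toward that noise condition.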

    Speech recognition system with an adaptive acoustic model
    8.
    Granted invention patent
    Status: Lapsed

    Publication No.: US07065488B2

    Publication date: 2006-06-20

    Application No.: US09964677

    Filing date: 2001-09-28

    IPC classification: G10L15/28 G10L15/20 G10L21/02

    Abstract: At the time of speaker adaptation, first feature vector generation sections (7, 8, 9) generate a feature vector series [Ci, M] from which the additive noise and multiplicative noise are removed. A second feature vector generation section (12) generates a feature vector series [Si, M] that includes the features of the additive noise and multiplicative noise. A path search section (10) conducts a path search by comparing the feature vector series [Ci, M] with the standard vector [an, M] of the standard voice HMM (300). When the speaker adaptation section (11) performs a correlation operation on the average feature vector [S^n, M] of the standard vector [an, M] corresponding to the path search result Dv and the feature vector series [Si, M], the adaptive vector [xn, M] is generated. The adaptive vector [xn, M] updates the feature vector of the speaker adaptive acoustic model (400) used for speech recognition.

    Voice recognition system
    9.
    Granted invention patent
    Status: Lapsed

    Publication No.: US06937981B2

    Publication date: 2005-08-30

    Application No.: US09954151

    Filing date: 2001-09-18

    Abstract: A multiplicative distortion Hm(cep) is subtracted from a voice HMM 5, a multiplicative distortion Ha(cep) of the uttered voice is subtracted from a noise HMM 6, and the subtraction results Sm(cep) and {Nm(cep)−Ha(cep)} are combined with each other to form a combined HMM 18 in the cepstrum domain. A cepstrum R^a(cep), obtained by subtracting the multiplicative distortion Ha(cep) from the cepstrum Ra(cep) of the uttered voice, is compared with the distribution R^m(cep) of the combined HMM 18 in the cepstrum domain, and the combined HMM with the maximum likelihood is output as the voice recognition result.

Pitch control apparatus for setting coefficients for cross-fading operation in accordance with intervals between write address and a number of read addresses in a sampling cycle
    10.
    Granted invention patent
    Status: Lapsed

    Publication No.: US5522010A

    Publication date: 1996-05-28

    Application No.: US425226

    Filing date: 1995-04-18

    Applicant: Soichi Toyama

    Inventor: Soichi Toyama

    Abstract: A pitch control apparatus suppresses the occurrence of a tremolo tone when interval control is performed. Input audio signal data is written to the memory position at a designated writing address, in a predetermined order, every sampling cycle. A plurality of reading addresses of the memory are designated every sampling cycle and are set in an order different from the predetermined order at every cycle that is a predetermined multiple of the sampling cycle. Data is read from the memory positions at the designated reading addresses, a coefficient is set in accordance with the address interval between the writing address and each designated reading address, the data read out at the reading addresses are multiplied by the associated coefficients, and the results are added together as output data. The maximum value Dmax of the interval between the reading addresses is set as Dmax = Tdmax / {(1 - (1/Jn)) · T0} when the pitch is to be raised, and as Dmax = Tdmax / {(1 + (1/Jn)) · T0} when the pitch is to be lowered, where T0 denotes the sampling cycle of the input audio signal data, Jn denotes how many times longer than the sampling cycle T0 the cycle for skipping sampling data or reading sampling data twice should be, and Tdmax denotes the allowable time for a time-dependent data shift between the reading addresses. The allowable time is set to 45 to 80 msec, within which the reverberation phenomenon is not markedly disturbing.
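
    The two expressions for the maximum read-address interval differ only in the sign in front of 1/Jn, so they can be evaluated by one helper. The function name, argument names, and the example values in the usage note are illustrative assumptions; only the formula Dmax = Tdmax / {(1 ∓ (1/Jn)) · T0} comes from the abstract.

```python
def max_read_interval(td_max, jn, t0, raise_pitch):
    """Evaluate Dmax = Tdmax / ((1 -/+ 1/Jn) * T0).

    td_max: allowable time shift Tdmax, in seconds
    jn:     Jn, the skip or repeat cycle as a multiple of T0
    t0:     sampling cycle T0, in seconds
    raise_pitch: True uses (1 - 1/Jn), False uses (1 + 1/Jn)
    """
    sign = -1.0 if raise_pitch else 1.0
    factor = 1.0 + sign / jn
    if factor <= 0.0 or t0 <= 0.0:
        raise ValueError("Jn must exceed 1 when raising pitch, and T0 must be positive")
    return td_max / (factor * t0)
```

    For instance, `max_read_interval(0.05, 2, 0.001, True)` evaluates the raised-pitch expression with Tdmax = 50 msec, Jn = 2, and T0 = 1 msec; with the same inputs the lowered-pitch expression gives a smaller value because its denominator is larger.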
