Method and apparatus for discriminative estimation of parameters in maximum a posteriori (MAP) speaker adaptation condition and voice recognition method and apparatus including these
    11.
    Patent application
    Status: Lapsed

    Publication No.: US20050065793A1

    Publication date: 2005-03-24

    Application No.: US10898382

    Filing date: 2004-07-26

    IPC classification: G10L15/07 G10L15/12 G10L19/12

    CPC classification: G10L15/07

    Abstract: A method and apparatus for discriminative estimation of parameters in a maximum a posteriori (MAP) speaker adaptation condition, and a voice recognition apparatus having the apparatus and a voice recognition method using the method are provided. The method for discriminative estimation of parameters in a maximum a posteriori (MAP) speaker adaptation condition, in which at least speaker-independent model parameters and prior density parameters, which are standards in recognizing a speaker's voice, are obtained as the result of model training after fetching training sets on a plurality of speakers from a training database, has the steps of (a) classifying adaptation data among training sets for respective speakers; (b) obtaining model parameters adapted from adaptation data on each speaker by using the initial values of the parameters; (c) searching a plurality of candidate hypotheses on each uttered sentence of training sets by using the adapted model parameters, and calculating gradients of speaker-independent model parameters by measuring the degree of errors on each training sentence; and (d) when training sets of all speakers are adapted, updating parameters, which were set at the initial stage, based on the calculated gradients.
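
    Step (b) of the abstract adapts model parameters from a speaker's adaptation data. The sketch below shows the textbook MAP update of a single Gaussian mean under a conjugate prior, which may or may not match the update used in the patent. The names `map_adapt_mean`, `tau` (prior weight) and `weights` (per-frame occupancy) are assumptions made for illustration, and the discriminative gradient steps (c) and (d) are not covered.

```python
def map_adapt_mean(prior_mean, tau, frames, weights=None):
    """Textbook MAP estimate of a Gaussian mean (illustrative, not the patent's code).

    prior_mean : speaker-independent mean, used as the prior mean
    tau        : prior weight (> 0); larger values keep the result near the prior
    frames     : adaptation feature vectors from one speaker
    weights    : per-frame occupancy of this Gaussian; defaults to 1.0 each
    """
    if weights is None:
        weights = [1.0] * len(frames)
    dim = len(prior_mean)
    occupancy = sum(weights)
    weighted_sum = [sum(w * x[d] for w, x in zip(weights, frames))
                    for d in range(dim)]
    # Interpolate between the prior mean and the data, weighted by tau and occupancy.
    return [(tau * prior_mean[d] + weighted_sum[d]) / (tau + occupancy)
            for d in range(dim)]
```

    With no adaptation frames the function returns the prior mean unchanged, and as the amount of data grows the estimate moves toward the sample mean.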

    Apparatus and method for transmitting sound
    12.
    Granted patent
    Status: In force

    Publication No.: US06782106B1

    Publication date: 2004-08-24

    Application No.: US09562890

    Filing date: 2000-05-01

    IPC classification: H04R1/10

    Abstract: An apparatus and method for transmitting sound are provided. The apparatus includes an external sound receiver for receiving external sounds and converting them into an external sound signal, a volume controller for outputting sound signals only if each of the volumes of the sound signal of a sound producing device and the sound signal of the external sound receiver exceeds a predetermined reference level, and a mixer for mixing the sound signal of the sound producing device with the sound signals output from the volume controller and outputting the result. The apparatus mixes ambient sounds having volume exceeding a certain volume with the sound of a sound producing device and transmits the mixed sounds to a pair of headphones which are a sound receiver for a user, thereby allowing the user to hear an ambient alarm sound while the user is listening to the sound of the sound producing device and making it possible for the user to audibly detect danger. Consequently, the apparatus provides user safety.
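
    The gate-then-mix behaviour described in the abstract can be sketched per frame as follows. This is only an illustration under stated assumptions: volume is measured as RMS, samples are floats in [-1, 1], and `reference_level` is a hypothetical stand-in for the "predetermined reference level". The function names are not from the patent.

```python
import math


def rms(frame):
    """Root-mean-square level of one frame of samples."""
    if not frame:
        return 0.0
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def gate_and_mix(device_frame, ambient_frame, reference_level):
    """Mix ambient sound into the device sound only when both exceed the level.

    Assumption: the volume controller opens when the RMS of both signals is
    above reference_level; otherwise only the device sound is passed on.
    """
    gate_open = (rms(device_frame) > reference_level
                 and rms(ambient_frame) > reference_level)
    if not gate_open:
        return list(device_frame)
    # Sample-wise sum, clipped to the valid range.
    return [max(-1.0, min(1.0, d + a))
            for d, a in zip(device_frame, ambient_frame)]
```

    A quiet ambient frame leaves the device signal untouched, while a loud one (for example an alarm) is added to it.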

    Speaker verification system and method using spoken continuous, random length digit string
    13.
    Granted patent
    Status: Lapsed

    Publication No.: US06496800B1

    Publication date: 2002-12-17

    Application No.: US09562889

    Filing date: 2000-05-01

    IPC classification: G10L17/00

    CPC classification: G10L17/24

    Abstract: A speaker verification system using the voice of a user uttering a continuous, random length digit string is provided. The speaker verification system includes a random digit generator for generating a continuous, random length digit string; a user interface for providing the continuous, random length digit string; a feature extractor for extracting voice features from the user's voice uttering the continuous, random length digit string; a digit voice verification unit for comparing the voice features with items in a speaker-independent continuous digit voice model to derive a digit string corresponding to items in the speaker-independent continuous digit voice model, which match the voice features, and for determining whether the derived digit string is identical to the digit string provided to the user via the user interface; and a speaker verification unit for comparing the voice features with a speaker-dependent model of the user to measure the similarity between them. The speaker-dependent model of the user includes previously determined features of the user's voice and determines whether to approve or reject the user based on the similarity.
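
    The control flow in the abstract (prompt with a random-length digit string, check the spoken digits, then check the speaker) can be sketched as below. The recognizer and the speaker-dependent model are out of scope: `recognized_digits` and `similarity` are assumed to be their outputs, and the length bounds and `threshold` are hypothetical values chosen only for illustration.

```python
import random


def make_prompt(min_len=4, max_len=8, rng=random):
    """Generate a digit string whose length is itself chosen at random."""
    length = rng.randint(min_len, max_len)
    return "".join(rng.choice("0123456789") for _ in range(length))


def verify(prompt, recognized_digits, similarity, threshold=0.7):
    """Two-stage decision: digit string check first, then speaker similarity.

    recognized_digits : string derived by a speaker-independent digit recognizer
    similarity        : score from comparing features with the speaker's model
    """
    if recognized_digits != prompt:
        return False  # the wrong digits were spoken, e.g. a replayed recording
    return similarity >= threshold
```

    Because the prompt changes in both content and length on every attempt, a recording of an earlier session would normally fail the first check.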

    System and method for human body communication
    14.
    Patent application
    Status: In force

    Publication No.: US20070190940A1

    Publication date: 2007-08-16

    Application No.: US11517393

    Filing date: 2006-09-08

    IPC classification: H04B5/00 H04B7/00

    Abstract: A human body communication system. The human body communication system includes a controlled device measuring a capacitance that corresponds to the distance to a human body, and transmitting information on the measured capacitance through a wireless medium; and a control device receiving the information, and then, based on the information, determining a transmission power and, with the determined transmission power, transmitting a control command of a user to the controlled device using the human body as a medium.

    Method, medium, and system masking audio signals using voice formant information
    15.
    Patent application
    Status: Under examination (published)

    Publication No.: US20070055513A1

    Publication date: 2007-03-08

    Application No.: US11489549

    Filing date: 2006-07-20

    IPC classification: G10L15/20

    CPC classification: G10L2021/02087

    Abstract: A method, medium, and system for masking voice information of a communication device. The method of masking a user's voice through an output of a masking signal similar to a formant of voice data may include dividing the voice data received into frames of a predetermined size, transforming the frames on a frequency axis thereof, regarded as a domain, obtaining formant information of intensive signal regions in the transformed frames, generating a sound signal disturbing the formant information with reference to the formant information, and outputting the sound signal in accordance with a time point when the voice signal is output.
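
    The pipeline in the abstract (framing, frequency transform, locating strong spectral regions, generating a disturbing signal) can be illustrated roughly as follows. This is a crude sketch, not the patented method: it uses a naive DFT, treats the strongest spectral peaks as a stand-in for formants (real formant estimation usually works on the spectral envelope), and emits plain sinusoids at those bins. All function names and the `count` and `level` parameters are assumptions.

```python
import cmath
import math


def frames(samples, size):
    """Split samples into non-overlapping frames of a fixed size."""
    return [samples[i:i + size]
            for i in range(0, len(samples) - size + 1, size)]


def magnitude_spectrum(frame):
    """Magnitudes of DFT bins 0..n/2, computed naively in O(n^2)."""
    n = len(frame)
    return [abs(sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n)))
            for k in range(n // 2 + 1)]


def strongest_bins(mags, count=2):
    """Indices of the `count` largest local maxima (crude formant stand-in)."""
    peaks = [k for k in range(1, len(mags) - 1)
             if mags[k] > mags[k - 1] and mags[k] >= mags[k + 1]]
    best = sorted(peaks, key=lambda k: mags[k], reverse=True)[:count]
    return sorted(best)


def masking_frame(frame, count=2, level=0.5):
    """Sinusoids at the strongest bins of `frame`, as a simple masking signal."""
    n = len(frame)
    bins = strongest_bins(magnitude_spectrum(frame), count)
    return [level * sum(math.sin(2 * math.pi * k * t / n) for k in bins)
            for t in range(n)]
```

    In use, each masking frame would be played at the same time as the voice frame it was derived from, which corresponds to the abstract's requirement that the sound signal be output in step with the voice signal.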

    Speech enhancement method
    16.
    Granted patent
    Status: In force

    Publication No.: US06778954B1

    Publication date: 2004-08-17

    Application No.: US09572232

    Filing date: 2000-05-17

    IPC classification: G10L21/02

    CPC classification: G10L21/0208

    Abstract: A speech enhancement method, including the steps of: (a) segmenting an input speech signal into a plurality of frames and transforming each frame signal into a signal of the frequency domain; (b) computing the signal-to-noise ratio of a current frame, and computing signal-to-noise ratio of a frame immediately preceding the current frame; (c) computing the predicted signal-to-noise ratio of the current frame which is predicted based on the preceding frame and computing the speech absence probability using the signal-to-noise ratio and predicted signal-to-noise ratio of the current frame; (d) correcting the two signal-to-noise ratios obtained in the step (b) based on the speech absence probability computed in the step (c); (e) computing the gain of the current frame with the two corrected signal-to-noise ratios obtained in the step (d), and multiplying the speech spectrum of the current frame by the computed gain; (f) estimating the noise and speech power for the next frame to calculate the predicted signal-to-noise ratio for the next frame, and providing the predicted signal-to-noise ratio for the next frame as the predicted signal-to-noise ratio of the current frame for the step (c); and (g) transforming the result spectrum of the step (e) into a signal of the time domain. The noise spectrum is estimated in speech presence intervals based on the speech absence probability, as well as in speech absence intervals, and the predicted SNR and gain are updated on a per-channel basis of each frame according to the noise spectrum estimate, which in turn improves the speech spectrum in various noise environments.
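
    The per-frame loop of measured SNR, predicted SNR, gain and feedback to the next frame can be illustrated with a much simpler scheme than the one claimed. The sketch below uses the well-known decision-directed SNR estimate with a Wiener-style gain, applied per frequency bin to power spectra. It omits the speech absence probability and the SNR correction of steps (c) and (d), so it is a stand-in rather than the patented method. The smoothing factor `alpha` and all names are assumptions.

```python
def enhance_frame(noisy_power, noise_power, prev_speech_power, alpha=0.98):
    """Return (gains, speech_power) for one frame, bin by bin.

    noisy_power       : |Y|^2 of the current frame
    noise_power       : current noise power estimate (must be > 0)
    prev_speech_power : speech power estimated for the preceding frame
    """
    gains, speech_power = [], []
    for y2, n2, s2_prev in zip(noisy_power, noise_power, prev_speech_power):
        measured_snr = y2 / n2  # SNR measured on the current frame
        # Predicted SNR: mostly carried over from the previous frame,
        # partly from the current measurement.
        predicted_snr = (alpha * (s2_prev / n2)
                         + (1.0 - alpha) * max(measured_snr - 1.0, 0.0))
        gain = predicted_snr / (1.0 + predicted_snr)  # Wiener-style gain
        gains.append(gain)
        # Estimated speech power, fed back as prev_speech_power next frame.
        speech_power.append(gain * gain * y2)
    return gains, speech_power
```

    The caller multiplies each spectral bin of the current frame by the matching gain, passes `speech_power` into the next call, and finally transforms the result back to the time domain.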

Encoding and decoding method for linear predictive coding (LPC) coefficient
    17.
    Granted patent
    Status: Lapsed

    Publication No.: US5822723A

    Publication date: 1998-10-13

    Application No.: US710943

    Filing date: 1996-09-24

    IPC classification: H03M7/30 G10L19/07 G10L3/02

    CPC classification: G10L19/07

    Abstract: A speech signal encoding/decoding method is provided. The method of encoding LPC coefficients includes dividing the nth-order line spectral frequencies into lower, middle and upper code vectors, quantizing the middle code vectors using a middle code book to generate a first index, selecting one of a plurality of lower code books according to the lowermost line spectral frequency of the middle code vector and the line spectral frequencies of the lower code vectors, and quantizing the lower code vectors using the selected lower code book to generate a second index, selecting one of a plurality of upper code books according to the uppermost line spectral frequency of the middle code vector and the line spectral frequencies of the upper code vectors, quantizing the upper code vectors using the selected upper code book to generate a third index, and transmitting the first, second and third indexes. In the above quantization, the line spectral frequencies are quantized using a linked split vector quantization (LSVQ), and the search of the code book is efficiently performed, so that the spectral distortion and outlier percentages are lower at 23 bits/frame than those of the split vector quantization (SVQ) at 24 bits/frame.
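
    The "linked" idea in the abstract is that the middle sub-vector is quantized first and its boundary values then steer which lower and upper code books are searched. The sketch below is a simplified illustration: code books are chosen only from the quantized middle vector's lowest and highest values, using hypothetical threshold lists `lower_edges` and `upper_edges`, whereas the abstract also involves the lower and upper line spectral frequencies themselves in the selection. Code book contents and all names are assumptions.

```python
import bisect


def nearest(codebook, vector):
    """Index of the code book entry with the smallest squared error."""
    def dist(entry):
        return sum((a - b) ** 2 for a, b in zip(entry, vector))
    return min(range(len(codebook)), key=lambda i: dist(codebook[i]))


def encode_lsf(lsf, split, middle_cb, lower_cbs, upper_cbs,
               lower_edges, upper_edges):
    """Return (middle, lower, upper) indices for one LSF vector.

    split       : (lo, hi) positions dividing lsf into lower/middle/upper parts
    lower_cbs   : list of lower code books; one is picked per frame
    lower_edges : sorted thresholds on the quantized middle vector's lowest
                  value that decide which lower code book to search
    upper_cbs, upper_edges : same, using the middle vector's highest value
    """
    lo, hi = split
    lower, middle, upper = lsf[:lo], lsf[lo:hi], lsf[hi:]
    i_mid = nearest(middle_cb, middle)           # first index
    q_mid = middle_cb[i_mid]
    lower_cb = lower_cbs[bisect.bisect_right(lower_edges, q_mid[0])]
    upper_cb = upper_cbs[bisect.bisect_right(upper_edges, q_mid[-1])]
    i_low = nearest(lower_cb, lower)             # second index
    i_up = nearest(upper_cb, upper)              # third index
    return i_mid, i_low, i_up
```

    Since the selection in this sketch depends only on the quantized middle vector, a decoder holding the same code books and thresholds can repeat it from the first index alone, so no extra bits are needed to signal which code book was used.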
