WARPED SPECTRAL AND FINE ESTIMATE AUDIO ENCODING

    Publication number: WO2012075476A3

    Publication date: 2012-06-07

    Application number: PCT/US2011/063196

    Filing date: 2011-12-03

    Abstract: A warped spectral estimate of an original audio signal can be used to encode a representation of a fine estimate of the original signal. The representation of the warped spectral estimate and the representation of the fine estimate can be sent to a speech recognition system. The representation of the warped spectral estimate can be passed to a speech recognition engine, where it may be used for speech recognition. The representation of the warped spectral estimate can also be used along with the representation of the fine estimate to reconstruct a representation of the original audio signal.

    MULTI-SENSORY SPEECH ENHANCEMENT USING A SPEECH-STATE MODEL
    2.
    Invention application
    Status: under examination (published)

    Publication number: WO2007001821A2

    Publication date: 2007-01-04

    Application number: PCT/US2006/022863

    Filing date: 2006-06-13

    CPC classification number: G10L21/0208 G10L2021/02165

    Abstract: A method and apparatus determine a likelihood of a speech state based on an alternative sensor signal and an air conduction microphone signal. The likelihood of the speech state is used, together with the alternative sensor signal and the air conduction microphone signal, to estimate a clean speech value for a clean speech signal.

    MULTI-SENSORY SPEECH ENHANCEMENT USING A CLEAN SPEECH PRIOR
    4.
    Invention application
    Status: under examination (published)

    Publication number: WO2007001768A2

    Publication date: 2007-01-04

    Application number: PCT/US2006/022058

    Filing date: 2006-06-06

    CPC classification number: H04R3/005 G10L21/0208 H04R2460/13

    Abstract: A method and apparatus determine a channel response for an alternative sensor using an alternative sensor signal and an air conduction microphone signal. The channel response and a prior probability distribution for clean speech values are then used to estimate a clean speech value.

    COMBINED SPEECH AND ALTERNATE INPUT MODALITY TO A MOBILE DEVICE
    5.
    Invention application
    Status: under examination (published)

    Publication number: WO2007053294A1

    Publication date: 2007-05-10

    Application number: PCT/US2006/040537

    Filing date: 2006-10-16

    CPC classification number: G10L15/22

    Abstract: Both speech and alternate modality inputs are used in inputting information spoken into a mobile device. The alternate modality inputs can be used to perform sequential commitment of words in a speech recognition result.

    METHOD OF DETERMINING UNCERTAINTY ASSOCIATED WITH NOISE REDUCTION
    6.
    Invention application
    Status: under examination (published)

    Publication number: WO2003100769A1

    Publication date: 2003-12-04

    Application number: PCT/US2003/016032

    Filing date: 2003-05-20

    CPC classification number: G10L15/20 G10L21/0208

    Abstract: A method and apparatus are provided for determining uncertainty in noise reduction based on a parametric model of speech distortion. The method is first used to reduce noise in a noisy signal. In particular, noise is reduced (304) from a representation of a portion of a noisy signal to produce a representation of a cleaned signal by utilizing an acoustic environment model (413). The uncertainty associated with the noise reduction process is then computed. In one embodiment, the uncertainty of the noise reduction process is used, in conjunction with the noise-reduced signal, to decode (306) a pattern state.

    GRAPHEME-TO-PHONEME CONVERSION USING ACOUSTIC DATA
    7.
    Invention application
    Status: under examination (published)

    Publication number: WO2009075990A1

    Publication date: 2009-06-18

    Application number: PCT/US2008/083249

    Filing date: 2008-11-12

    CPC classification number: G10L13/08 G10L15/063 G10L15/187

    Abstract: Described is the use of acoustic data to improve grapheme-to-phoneme conversion for speech recognition, such as to more accurately recognize spoken names in a voice-dialing system. A joint model of acoustics and graphonemes (acoustic data, phoneme sequences, grapheme sequences and an alignment between phoneme sequences and grapheme sequences) is described, as is retraining by maximum likelihood training and discriminative training in adapting graphoneme model parameters using acoustic data. Also described is the unsupervised collection of grapheme labels for received acoustic data, thereby automatically obtaining a substantial number of actual samples that may be used in retraining. Speech input that does not meet a confidence threshold may be filtered out so as not to be used by the retrained model.

    INDEXING AND SEARCHING SPEECH WITH TEXT META-DATA
    8.
    Invention application
    Status: under examination (published)

    Publication number: WO2007056032A1

    Publication date: 2007-05-18

    Application number: PCT/US2006/042733

    Filing date: 2006-10-31

    CPC classification number: G06F17/30778 G06F17/30746 G06F17/30749 G10L15/197

    Abstract: An index for searching spoken documents having speech data and text meta-data is created by obtaining probabilities of occurrence of words and positional information of the words of the speech data and combining them with at least positional information of the words in the text meta-data. A single index can be created because the speech data and the text meta-data are treated the same and considered only as different categories.
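
    Illustrative sketch (not taken from the application): one way a single index over both speech data and text meta-data might be laid out. The function name `build_unified_index`, the tuple layout, the "speech" category label, and the choice to give text words a probability of 1.0 are all assumptions made for this example.

```python
from collections import defaultdict


def build_unified_index(doc_id, speech_hypotheses, text_metadata):
    """Build one inverted index covering speech and text meta-data.

    speech_hypotheses: iterable of (word, position, probability) tuples,
        e.g. taken from a recognizer's word lattice (assumed input shape).
    text_metadata: dict mapping a category name (e.g. "title") to its text.

    Returns {word: [(doc_id, category, position, probability), ...]}.
    """
    index = defaultdict(list)

    # Speech words carry the recognizer's probability of occurrence.
    for word, position, probability in speech_hypotheses:
        index[word.lower()].append((doc_id, "speech", position, probability))

    # Text words are known exactly, so this sketch assigns probability 1.0
    # (an assumption) and records only their position within the category.
    for category, text in text_metadata.items():
        for position, word in enumerate(text.split()):
            index[word.lower()].append((doc_id, category, position, 1.0))

    return dict(index)
```

    Because every posting has the same shape, a query can scan one index and use the category field only to weight or filter hits.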

    SPEECH INDEX PRUNING
    9.
    Invention application
    Status: under examination (published)

    Publication number: WO2007056029A1

    Publication date: 2007-05-18

    Application number: PCT/US2006/042723

    Filing date: 2006-10-31

    CPC classification number: G06F17/30778 G06F17/30746 G10L15/197

    Abstract: A speech segment is indexed by identifying at least two alternative word sequences for the speech segment. For each word in the alternative sequences, information is placed in an entry for the word in the index. Speech units are eliminated from entries in the index based on a comparison of a probability that the word appears in the speech segment and a threshold value.
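
    Illustrative sketch (not taken from the application): threshold-based pruning of index entries built from alternative word sequences. The function name `build_pruned_index`, the input shape, and the way a word's probability is estimated (summing the normalized probabilities of the hypotheses that place it at a given position) are assumptions made for this example.

```python
from collections import defaultdict


def build_pruned_index(segment_id, alternatives, threshold):
    """Index one speech segment and drop low-probability entries.

    alternatives: list of (word_sequence, probability) pairs, one per
        alternative recognition hypothesis (at least two are expected).
    threshold: entries whose estimated probability falls below this
        value are left out of the index.

    Returns {word: [(segment_id, position, probability), ...]}.
    """
    total = sum(probability for _, probability in alternatives)

    # Estimate the probability that a word occurs at a position by summing
    # the normalized probabilities of all hypotheses that agree on it.
    posterior = defaultdict(float)
    for words, probability in alternatives:
        for position, word in enumerate(words):
            posterior[(word, position)] += probability / total

    # Keep only the entries that meet the threshold.
    index = defaultdict(list)
    for (word, position), probability in posterior.items():
        if probability >= threshold:
            index[word].append((segment_id, position, probability))
    return dict(index)
```

    Raising the threshold shrinks the index at the cost of recall, since rarer alternatives are no longer searchable.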

    INTEGRATED SPEECH RECOGNITION AND SEMANTIC CLASSIFICATION
    10.
    Invention application
    Status: under examination (published)

    Publication number: WO2008089470A1

    Publication date: 2008-07-24

    Application number: PCT/US2008/051584

    Filing date: 2008-01-21

    CPC classification number: G10L15/1815

    Abstract: A novel system integrates speech recognition and semantic classification, so that acoustic scores in a speech recognizer that accepts spoken utterances may be taken into account when training both language models and semantic classification models. For example, a joint association score may be defined that is indicative of a correspondence of a semantic class and a word sequence for an acoustic signal. The joint association score may incorporate parameters such as weighting parameters for signal-to-class modeling of the acoustic signal, language model parameters and scores, and acoustic model parameters and scores. The parameters may be revised to raise the joint association score of a target word sequence with a target semantic class relative to the joint association score of a competitor word sequence with the target semantic class. The parameters may be designed so that the semantic classification errors in the training data are minimized.
