System and method for identifying semantic intent from acoustic information
    2.
    Granted patent
    Legal status: in force

    Publication No.: US07634406B2

    Publication date: 2009-12-15

    Application No.: US11009630

    Filing date: 2004-12-10

    IPC classification: G10L15/06

    CPC classification: G10L15/19 G10L15/1815

    Abstract: In accordance with one embodiment of the present invention, unanticipated semantic intents are discovered in audio data in an unsupervised manner. For instance, the audio acoustics are clustered based on semantic intent and representative acoustics are chosen for each cluster. A human reviewer then need only listen to a small number of representative acoustics for each cluster (possibly only one per cluster) to identify the unforeseen semantic intents.
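The cluster-then-review idea in this abstract can be illustrated with a minimal sketch. It assumes each utterance has already been mapped to a fixed-length feature vector (the abstract does not say how), and it swaps in a simple greedy threshold clustering with Euclidean distance plus a medoid as the "representative". The names `cluster_by_threshold`, `medoid`, and the `radius` parameter are hypothetical and are not taken from the patent.

```python
from math import dist  # Euclidean distance, Python 3.8+


def cluster_by_threshold(vectors, radius):
    """Greedy leader clustering (an assumed stand-in for intent clustering).

    Each vector joins the first cluster whose leader (first member) lies
    within `radius`; otherwise it starts a new cluster.
    Returns a list of clusters, each a list of indices into `vectors`.
    """
    clusters = []
    for i, v in enumerate(vectors):
        for members in clusters:
            if dist(v, vectors[members[0]]) <= radius:
                members.append(i)
                break
        else:
            clusters.append([i])
    return clusters


def medoid(vectors, members):
    """Index of the member with the smallest total distance to its cluster."""
    return min(
        members,
        key=lambda i: sum(dist(vectors[i], vectors[j]) for j in members),
    )


# Toy 2-D "utterance embeddings" (made-up data for illustration only).
utterances = [(0.0, 0.0), (0.2, 0.0), (0.1, 0.1),
              (5.0, 5.0), (5.0, 5.4), (5.0, 5.2)]

clusters = cluster_by_threshold(utterances, radius=1.0)
representatives = [medoid(utterances, c) for c in clusters]
print(clusters)         # two clusters of three utterances each
print(representatives)  # one index per cluster for a human to listen to
```

With six utterances and two clusters, a reviewer would listen to two recordings instead of six; the saving grows with the size of the audio collection.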

    Structured models of repetition for speech recognition
    3.
    Granted patent
    Legal status: in force

    Publication No.: US08965765B2

    Publication date: 2015-02-24

    Application No.: US12233826

    Filing date: 2008-09-19

    IPC classification: G10L15/00 G10L15/18

    CPC classification: G10L15/1822

    Abstract: Described is a technology by which a structured model of repetition is used to determine the words spoken by a user, and/or a corresponding database entry, based in part on a prior utterance. For a repeated utterance, a joint probability analysis is performed on (at least some of) the corresponding word sequences (as recognized by one or more recognizers) and associated acoustic data. For example, a generative probabilistic model or a maximum entropy model may be used in the analysis. The second utterance may be a repetition of the first utterance using the exact words, or another structural transformation thereof relative to the first utterance, such as an extension that adds one or more words, a truncation that removes one or more words, or a whole or partial spelling of one or more words.
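As a rough illustration of scoring two utterances jointly, the sketch below multiplies each recognizer hypothesis probability by a prior over the structural relation between the two word sequences (exact repeat, extension, truncation). This is a toy factorization under stated assumptions, not the model claimed in the patent: the n-best lists, the `RELATION_PRIOR` values, and the function names are all hypothetical, and spelling transformations are omitted.

```python
def relation(first, second):
    """Classify how the second word sequence relates to the first."""
    a, b = first.split(), second.split()
    if a == b:
        return "exact"
    if len(b) > len(a) and b[:len(a)] == a:
        return "extension"    # second utterance adds words
    if len(b) < len(a) and a[:len(b)] == b:
        return "truncation"   # second utterance drops trailing words
    return "unrelated"


# Assumed prior over structural transformations (illustrative numbers only).
RELATION_PRIOR = {"exact": 0.5, "extension": 0.25,
                  "truncation": 0.2, "unrelated": 0.05}


def best_joint(nbest1, nbest2):
    """Return (score, hyp1, hyp2) maximizing p1 * p2 * prior(relation)."""
    return max(
        (p1 * p2 * RELATION_PRIOR[relation(h1, h2)], h1, h2)
        for h1, p1 in nbest1
        for h2, p2 in nbest2
    )


# Hypothetical n-best lists: (word sequence, recognizer probability).
nbest1 = [("star box", 0.6), ("starbucks", 0.4)]
nbest2 = [("starbucks coffee", 0.7), ("star bucks coffee", 0.3)]

score, h1, h2 = best_joint(nbest1, nbest2)
print(h1, "|", h2)
```

In this toy data the first utterance's top hypothesis ("star box") loses to "starbucks" once the second utterance is taken into account, because only the pair ("starbucks", "starbucks coffee") forms a plausible extension.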

    SEARCH LEXICON EXPANSION
    4.
    Patent application
    Legal status: in force

    Publication No.: US20120158703A1

    Publication date: 2012-06-21

    Application No.: US12970477

    Filing date: 2010-12-16

    IPC classification: G06F17/30

    Abstract: One or more techniques and/or systems are disclosed for creating an expanded or improved lexicon for use in search-based semantic tagging. A set of first documents can be identified using a set of first lexicon elements as queries, and one or more first document patterns can be extracted from the set of first documents. The document patterns can be used to find one or more second documents in a query log that comprise the document patterns, which are associated with query terms used to return the second documents. The query terms for the second documents can be extracted and used to expand the lexicon. Elements within the lexicon may be weighted based upon relevance to different query domains, for example.
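The expansion loop described in this abstract can be sketched as follows. The sketch assumes the query log is a list of (query, clicked URL) pairs and that a "document pattern" is simply the host plus the first path segment of the URL; both are simplifying assumptions made here for illustration, as are the names `url_pattern`, `expand_lexicon`, and `min_support`. Domain-based weighting is not shown.

```python
from collections import Counter
from urllib.parse import urlparse


def url_pattern(url):
    """Assumed 'document pattern': host plus first path segment."""
    parts = urlparse(url)
    first_segment = parts.path.strip("/").split("/")[0]
    return f"{parts.netloc}/{first_segment}"


def expand_lexicon(seed, query_log, min_support=2):
    """Grow a lexicon from a query log of (query, clicked_url) pairs."""
    seed = {s.lower() for s in seed}
    # Step 1: patterns of documents returned for the seed lexicon elements.
    patterns = Counter(
        url_pattern(url) for q, url in query_log if q.lower() in seed
    )
    frequent = {p for p, n in patterns.items() if n >= min_support}
    # Step 2: queries whose documents match those patterns become new terms.
    new_terms = {
        q.lower() for q, url in query_log if url_pattern(url) in frequent
    } - seed
    return seed | new_terms


# Made-up query log for illustration.
query_log = [
    ("inception", "https://movies.example.com/title/tt1"),
    ("titanic", "https://movies.example.com/title/tt2"),
    ("avatar", "https://movies.example.com/title/tt3"),
    ("the matrix", "https://movies.example.com/title/tt4"),
    ("weather", "https://news.example.com/weather/today"),
]

expanded = expand_lexicon(["Inception", "Titanic"], query_log)
print(sorted(expanded))
```

Starting from two movie titles, the sketch picks up "avatar" and "the matrix" because their clicked documents share the pattern seen for the seeds, while "weather" is left out.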

    Grapheme-to-phoneme conversion using acoustic data
    5.
    Granted patent
    Legal status: in force

    Publication No.: US08180640B2

    Publication date: 2012-05-15

    Application No.: US13164683

    Filing date: 2011-06-20

    IPC classification: G10L15/04

    Abstract: Described is the use of acoustic data to improve grapheme-to-phoneme conversion for speech recognition, such as to more accurately recognize spoken names in a voice-dialing system. A joint model of acoustics and graphonemes (acoustic data, phoneme sequences, grapheme sequences and an alignment between phoneme sequences and grapheme sequences) is described, as is retraining by maximum likelihood training and discriminative training in adapting graphoneme model parameters using acoustic data. Also described is the unsupervised collection of grapheme labels for received acoustic data, thereby automatically obtaining a substantial number of actual samples that may be used in retraining. Speech input that does not meet a confidence threshold may be filtered out so as to not be used by the retrained model.
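The confidence filtering and retraining steps can be illustrated with a deliberately simplified sketch. Instead of a graphoneme n-gram model, it assumes each collected sample is a (grapheme label, recognized phoneme sequence, confidence) triple and re-estimates whole-word pronunciation probabilities by relative frequency, which is the maximum likelihood estimate for that simplified model. The sample data, the threshold value, and the name `retrain_pronunciations` are hypothetical.

```python
from collections import Counter, defaultdict


def retrain_pronunciations(samples, threshold):
    """Relative-frequency pronunciation estimates from confident samples.

    `samples` holds (graphemes, phonemes, confidence) triples; samples
    whose confidence falls below `threshold` are filtered out.
    """
    counts = defaultdict(Counter)
    for graphemes, phonemes, confidence in samples:
        if confidence < threshold:
            continue  # does not meet the confidence threshold
        counts[graphemes][tuple(phonemes)] += 1
    return {
        word: {pron: n / sum(c.values()) for pron, n in c.items()}
        for word, c in counts.items()
    }


# Made-up voice-dialing samples for illustration.
samples = [
    ("joaquin", ["w", "aa", "k", "iy", "n"], 0.92),
    ("joaquin", ["w", "aa", "k", "iy", "n"], 0.88),
    ("joaquin", ["jh", "ow", "k", "w", "ih", "n"], 0.81),
    ("joaquin", ["jh", "oy", "k", "ih", "n"], 0.35),  # filtered out
]

model = retrain_pronunciations(samples, threshold=0.5)
for pron, p in model["joaquin"].items():
    print(" ".join(pron), round(p, 3))
```

The low-confidence sample never reaches the counts, so the spoken-data pronunciation ends up with two thirds of the probability mass for this name.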

    Search lexicon expansion
    6.
    Granted patent

    Publication No.: US09928296B2

    Publication date: 2018-03-27

    Application No.: US12970477

    Filing date: 2010-12-16

    IPC classification: G06F17/30 G06F17/27

    Abstract: One or more techniques and/or systems are disclosed for creating an expanded or improved lexicon for use in search-based semantic tagging. A set of first documents can be identified using a set of first lexicon elements as queries, and one or more first document patterns can be extracted from the set of first documents. The document patterns can be used to find one or more second documents in a query log that comprise the document patterns, which are associated with query terms used to return the second documents. The query terms for the second documents can be extracted and used to expand the lexicon. Elements within the lexicon may be weighted based upon relevance to different query domains, for example.

    GRAPHEME-TO-PHONEME CONVERSION USING ACOUSTIC DATA
    7.
    Patent application
    Legal status: in force

    Publication No.: US20090150153A1

    Publication date: 2009-06-11

    Application No.: US11952267

    Filing date: 2007-12-07

    IPC classification: G10L15/00

    Abstract: Described is the use of acoustic data to improve grapheme-to-phoneme conversion for speech recognition, such as to more accurately recognize spoken names in a voice-dialing system. A joint model of acoustics and graphonemes (acoustic data, phoneme sequences, grapheme sequences and an alignment between phoneme sequences and grapheme sequences) is described, as is retraining by maximum likelihood training and discriminative training in adapting graphoneme model parameters using acoustic data. Also described is the unsupervised collection of grapheme labels for received acoustic data, thereby automatically obtaining a substantial number of actual samples that may be used in retraining. Speech input that does not meet a confidence threshold may be filtered out so as to not be used by the retrained model.

    Presenting search results according to query domains
    8.
    Granted patent

    Publication No.: US09684741B2

    Publication date: 2017-06-20

    Application No.: US12479371

    Filing date: 2009-06-05

    IPC classification: G06F17/30 G10L15/26 G06N99/00

    Abstract: A query may be applied against search engines that respectively return a set of search results relating to various items discovered in the searched data sets. However, presenting numerous and varied search results may be difficult on mobile devices with small displays and limited computational resources. Instead, search results may be associated with search domains representing various information types (e.g., contacts, public figures, places, projects, movies, music, and books) and presented by grouping search results with associated query domains, e.g., in a tabbed user interface. The query may be received through an input device associated with a particular input domain, and may be transitioned to the query domain of a particular search engine (e.g., by recognizing phonemes of a voice query using an acoustic model; matching phonemes with query terms according to a pronunciation model; and generating a recognition result according to a vocabulary of an n-gram language model).
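The grouping step, where results are bucketed by query domain so that each domain can back one tab, can be sketched as below. The result records, the domain names, and the function `group_by_domain` are hypothetical; the speech-recognition pipeline mentioned at the end of the abstract is not modeled.

```python
from collections import defaultdict


def group_by_domain(results, domain_order):
    """Bucket result titles by domain, keeping a fixed tab order.

    Domains with no results are skipped so no empty tab is shown.
    """
    grouped = defaultdict(list)
    for result in results:
        grouped[result["domain"]].append(result["title"])
    return [(d, grouped[d]) for d in domain_order if d in grouped]


# Made-up results from several search engines for the query "alice".
results = [
    {"title": "Alice Smith (mobile)", "domain": "contacts"},
    {"title": "Alice in Wonderland", "domain": "movies"},
    {"title": "Alice's Adventures in Wonderland", "domain": "books"},
    {"title": "Alice Smith (work)", "domain": "contacts"},
]

tabs = group_by_domain(results, ["contacts", "places", "movies", "books"])
for domain, titles in tabs:
    print(domain, "->", titles)
```

On a small display, each `(domain, titles)` pair would map to one tab, so the user sees a short list per information type rather than one long mixed list.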

    Grapheme-to-phoneme conversion using acoustic data
    9.
    Granted patent
    Legal status: in force

    Publication No.: US07991615B2

    Publication date: 2011-08-02

    Application No.: US11952267

    Filing date: 2007-12-07

    IPC classification: G10L15/04

    Abstract: Described is the use of acoustic data to improve grapheme-to-phoneme conversion for speech recognition, such as to more accurately recognize spoken names in a voice-dialing system. A joint model of acoustics and graphonemes (acoustic data, phoneme sequences, grapheme sequences and an alignment between phoneme sequences and grapheme sequences) is described, as is retraining by maximum likelihood training and discriminative training in adapting graphoneme model parameters using acoustic data. Also described is the unsupervised collection of grapheme labels for received acoustic data, thereby automatically obtaining a substantial number of actual samples that may be used in retraining. Speech input that does not meet a confidence threshold may be filtered out so as to not be used by the retrained model.

    STRUCTURED MODELS OF REPITITION FOR SPEECH RECOGNITION
    10.
    Patent application
    Legal status: in force

    Publication No.: US20100076765A1

    Publication date: 2010-03-25

    Application No.: US12233826

    Filing date: 2008-09-19

    IPC classification: G10L15/00

    CPC classification: G10L15/1822

    Abstract: Described is a technology by which a structured model of repetition is used to determine the words spoken by a user, and/or a corresponding database entry, based in part on a prior utterance. For a repeated utterance, a joint probability analysis is performed on (at least some of) the corresponding word sequences (as recognized by one or more recognizers) and associated acoustic data. For example, a generative probabilistic model or a maximum entropy model may be used in the analysis. The second utterance may be a repetition of the first utterance using the exact words, or another structural transformation thereof relative to the first utterance, such as an extension that adds one or more words, a truncation that removes one or more words, or a whole or partial spelling of one or more words.
