Intra-language statistical machine translation
    1.
    Granted patent
    Intra-language statistical machine translation (active)

    Publication number: US08615388B2

    Publication date: 2013-12-24

    Application number: US12058328

    Filing date: 2008-03-28

    IPC classification: G06F17/28

    CPC classification: G06F17/2818 G06F17/2827

    Abstract: Training data may be provided, the training data including pairs of source phrases and target phrases. The pairs may be used to train an intra-language statistical machine translation model, where the intra-language statistical machine translation model, when given an input phrase of text in the human language, can compute probabilities of semantic equivalence of the input phrase to possible translations of the input phrase in the human language. The statistical machine translation model may be used to translate between queries and listings. The queries may be text strings in the human language submitted to a search engine. The listing strings may be text strings of formal names of real world entities that are to be searched by the search engine to find matches for the query strings.
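
    As an illustration of the idea in this abstract, the following is a minimal sketch and not the patented method: it estimates how likely a listing is to be a translation of a query by relative frequency over training pairs. The function names (`train_phrase_table`, `rank_translations`) and the sample query/listing pairs are hypothetical.

```python
from collections import Counter, defaultdict


def train_phrase_table(pairs):
    """Estimate P(target | source) by relative frequency over (source, target) pairs."""
    counts = defaultdict(Counter)
    for source, target in pairs:
        counts[source.lower()][target] += 1
    table = {}
    for source, targets in counts.items():
        total = sum(targets.values())
        table[source] = {t: c / total for t, c in targets.items()}
    return table


def rank_translations(table, phrase):
    """Return candidate listings for a query, most probable first."""
    candidates = table.get(phrase.lower(), {})
    return sorted(candidates.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical (query, listing) training pairs, both in the same language.
pairs = [
    ("joes pizza", "Joe's Pizzeria and Grill"),
    ("joes pizza", "Joe's Pizzeria and Grill"),
    ("joes pizza", "Joe's Pizza Express"),
    ("city library", "Springfield Public Library"),
]
table = train_phrase_table(pairs)
ranked = rank_translations(table, "Joes Pizza")
```

    A real statistical translation model would typically also align sub-phrases and smooth the estimates; this sketch only shows the query-to-listing probability lookup.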

    INTRA-LANGUAGE STATISTICAL MACHINE TRANSLATION
    2.
    Patent application
    INTRA-LANGUAGE STATISTICAL MACHINE TRANSLATION (active)

    Publication number: US20090248422A1

    Publication date: 2009-10-01

    Application number: US12058328

    Filing date: 2008-03-28

    IPC classification: G10L11/00 G06F17/28

    CPC classification: G06F17/2818 G06F17/2827

    Abstract: Training data may be provided, the training data including pairs of source phrases and target phrases. The pairs may be used to train an intra-language statistical machine translation model, where the intra-language statistical machine translation model, when given an input phrase of text in the human language, can compute probabilities of semantic equivalence of the input phrase to possible translations of the input phrase in the human language. The statistical machine translation model may be used to translate between queries and listings. The queries may be text strings in the human language submitted to a search engine. The listing strings may be text strings of formal names of real world entities that are to be searched by the search engine to find matches for the query strings.

    Voice aware demographic personalization
    3.
    Granted patent
    Voice aware demographic personalization (active)

    Publication number: US07949526B2

    Publication date: 2011-05-24

    Application number: US11810086

    Filing date: 2007-06-04

    IPC classification: G10L15/00

    Abstract: A voice interaction system is configured to analyze an utterance and identify inherent attributes that are indicative of a demographic characteristic of the system user that spoke the utterance. The system then selects and presents a personalized response to the user, the response being selected based at least in part on the identified demographic characteristic. In one embodiment, the demographic characteristic is one or more of the caller's age, gender, ethnicity, education level, emotional state, health status and geographic group. In another embodiment, the selection of the response is further based on consideration of corroborative caller data.

    Automated data cleanup by substitution of words of the same pronunciation and different spelling in speech recognition
    4.
    Granted patent
    Automated data cleanup by substitution of words of the same pronunciation and different spelling in speech recognition (active)

    Publication number: US09460708B2

    Publication date: 2016-10-04

    Application number: US12561521

    Filing date: 2009-09-17

    Abstract: The described implementations relate to automated data cleanup. One system includes a language model generated from language model seed text and a dictionary of possible data substitutions. This system also includes a transducer configured to cleanse a corpus utilizing the language model and the dictionary. The transducer can process speech recognition data in some cases by substituting a second word for a first word which shares pronunciation with the first word but is spelled differently. In some cases, this can be accomplished by establishing corresponding probabilities of the first word and second word based on a third word that appears in sequence with the first word.
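
    The substitution step described in this abstract can be sketched as follows. This is a simplified illustration under stated assumptions, not the patented transducer: the "language model" is a plain bigram count table built from seed sentences, the neighbouring word used for scoring is the preceding word, and the names `build_bigram_counts`, `cleanse`, and the sample homophone dictionary are hypothetical.

```python
from collections import Counter


def build_bigram_counts(seed_sentences):
    """Count adjacent word pairs in the seed text (a toy bigram language model)."""
    counts = Counter()
    for sentence in seed_sentences:
        tokens = sentence.lower().split()
        counts.update(zip(tokens, tokens[1:]))
    return counts


def cleanse(tokens, bigrams, homophones):
    """Replace a word with a same-sounding spelling when the previous word favors it."""
    cleaned = []
    for i, word in enumerate(tokens):
        candidates = [word] + homophones.get(word, [])
        if i > 0 and len(candidates) > 1:
            prev = cleaned[-1]
            # max() returns the first maximal item, so ties keep the original word.
            word = max(candidates, key=lambda w: bigrams[(prev, w)])
        cleaned.append(word)
    return cleaned


# Hypothetical seed text and substitution dictionary.
seed = [
    "turn right at the light",
    "you have the right to vote",
    "please write a letter",
]
homophones = {"right": ["write"], "write": ["right"]}
bigrams = build_bigram_counts(seed)

fixed = cleanse("please right a letter".split(), bigrams, homophones)
```

    Here "right" after "please" is replaced by "write" because only the pair ("please", "write") occurs in the seed text; a production system would use smoothed probabilities rather than raw counts.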

    Presenting search results according to query domains
    5.
    Granted patent

    Publication number: US09684741B2

    Publication date: 2017-06-20

    Application number: US12479371

    Filing date: 2009-06-05

    IPC classification: G06F17/30 G10L15/26 G06N99/00

    Abstract: A query may be applied against search engines that respectively return a set of search results relating to various items discovered in the searched data sets. However, presenting numerous and varied search results may be difficult on mobile devices with small displays and limited computational resources. Instead, search results may be associated with search domains representing various information types (e.g., contacts, public figures, places, projects, movies, music, and books) and presented by grouping search results with associated query domains, e.g., in a tabbed user interface. The query may be received through an input device associated with a particular input domain, and may be transitioned to the query domain of a particular search engine (e.g., by recognizing phonemes of a voice query using an acoustic model; matching phonemes with query terms according to a pronunciation model; and generating a recognition result according to a vocabulary of an n-gram language model.)
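
    The grouping of results by query domain mentioned in this abstract can be pictured with a small sketch. It assumes each result already carries a domain label; the function name `group_by_domain` and the sample results are hypothetical, and the speech recognition steps are not modeled.

```python
def group_by_domain(results):
    """Group (domain, title) results into one list per domain, e.g. one tab each."""
    tabs = {}
    for domain, title in results:
        tabs.setdefault(domain, []).append(title)
    return tabs


# Hypothetical results returned by several search engines for one query.
results = [
    ("contacts", "Alex Jordan (mobile)"),
    ("movies", "Jordan: A Documentary"),
    ("places", "Jordan Street Cafe"),
    ("movies", "The Jordan Files"),
]
tabs = group_by_domain(results)
```

    Because Python dictionaries preserve insertion order, the tabs appear in the order in which each domain first occurs in the result list.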

    PRESENTING SEARCH RESULTS ACCORDING TO QUERY DOMAINS
    6.
    Patent application
    PRESENTING SEARCH RESULTS ACCORDING TO QUERY DOMAINS (active)

    Publication number: US20100312782A1

    Publication date: 2010-12-09

    Application number: US12479371

    Filing date: 2009-06-05

    IPC classification: G06F17/30 G10L15/26

    Abstract: A query may be applied against search engines that respectively return a set of search results relating to various items discovered in the searched data sets. However, presenting numerous and varied search results may be difficult on mobile devices with small displays and limited computational resources. Instead, search results may be associated with search domains representing various information types (e.g., contacts, public figures, places, projects, movies, music, and books) and presented by grouping search results with associated query domains, e.g., in a tabbed user interface. The query may be received through an input device associated with a particular input domain, and may be transitioned to the query domain of a particular search engine (e.g., by recognizing phonemes of a voice query using an acoustic model; matching phonemes with query terms according to a pronunciation model; and generating a recognition result according to a vocabulary of an n-gram language model.)

    DYNAMICALLY ADDING PERSONALIZATION FEATURES TO LANGUAGE MODELS FOR VOICE SEARCH
    7.
    Patent application
    DYNAMICALLY ADDING PERSONALIZATION FEATURES TO LANGUAGE MODELS FOR VOICE SEARCH (active)

    Publication number: US20120316877A1

    Publication date: 2012-12-13

    Application number: US13158453

    Filing date: 2011-06-12

    IPC classification: G10L15/04

    CPC classification: G10L15/197 G10L15/07

    Abstract: A dynamic exponential, feature-based, language model is continually adjusted per utterance by a user, based on the user's usage history. This adjustment of the model is done incrementally per user, over a large number of users, each with a unique history. The user history can include previously recognized utterances, text queries, and other user inputs. The history data for a user is processed to derive features. These features are then added into the language model dynamically for that user.
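
    To make the notion of an exponential, feature-based model with per-user features concrete, here is a toy sketch. It is an assumption-laden illustration rather than the model in the application: each candidate gets a base log score plus a weighted count of words that also occur in the user's history, and scores are normalized with a softmax. The name `personalized_probs`, the single overlap feature, and the sample data are hypothetical.

```python
import math


def personalized_probs(candidates, base_logscore, user_history, weight=1.0):
    """Score candidates as exp(base + weight * history_overlap), normalized to sum to 1."""
    history_words = {w for query in user_history for w in query.lower().split()}
    scores = {}
    for cand in candidates:
        # Feature derived from the user's history: number of shared words.
        overlap = sum(1 for w in cand.lower().split() if w in history_words)
        scores[cand] = base_logscore[cand] + weight * overlap
    z = sum(math.exp(s) for s in scores.values())
    return {cand: math.exp(s) / z for cand, s in scores.items()}


# Two acoustically similar recognition candidates with equal base scores.
candidates = ["thai palace", "tie palace"]
base = {"thai palace": 0.0, "tie palace": 0.0}

generic = personalized_probs(candidates, base, user_history=[])
personal = personalized_probs(candidates, base, user_history=["thai food near me"])
```

    With an empty history both candidates are equally likely; once the history contains "thai", the shared-word feature shifts probability toward "thai palace".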

    Speech processing with predictive language modeling
    8.
    Granted patent
    Speech processing with predictive language modeling (active)

    Publication number: US08145484B2

    Publication date: 2012-03-27

    Application number: US12268447

    Filing date: 2008-11-11

    Applicant: Geoffrey Zweig

    Inventor: Geoffrey Zweig

    IPC classification: G10L15/00

    Abstract: The described implementations relate to speech spelling by a user. One method identifies one or more symbols that may match a user utterance and displays an individual symbol for confirmation by the user.

    Task specific code generation for speech recognition decoding
    9.
    Patent application
    Task specific code generation for speech recognition decoding (under examination, published)

    Publication number: US20050033576A1

    Publication date: 2005-02-10

    Application number: US10637219

    Filing date: 2003-08-08

    IPC classification: G06F9/44 G10L15/28 G10L15/00

    CPC classification: G06F8/30 G10L15/28

    Abstract: A code generation program is provided that reads in the task-specific parameters of a speech recognition system and produces a source-language decoder program that is specialized to these parameters. The decoder program is then compiled and distributed. The process of profile-driven code optimization may be used to further enhance the output program. For ease of distribution, the system may be compiled in several parts, and assembled (linked) later, for example through the mechanism of dynamically loaded libraries.
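
    The generate-then-compile flow in this abstract can be illustrated with a small sketch. It is not the decoder generator from the application: it emits Python source for one beam-pruning routine with the task parameters baked in as constants, then compiles it at run time. The names `generate_decoder_source`, `compile_decoder`, `prune`, and the parameters `num_states` and `beam_width` are hypothetical.

```python
def generate_decoder_source(num_states, beam_width):
    """Emit source code for a pruning routine specialized to fixed task parameters."""
    return f'''
def prune(scores):
    """Return the {beam_width} best of exactly {num_states} state indices."""
    assert len(scores) == {num_states}
    ranked = sorted(range({num_states}), key=lambda s: scores[s], reverse=True)
    return ranked[:{beam_width}]
'''


def compile_decoder(source):
    """Compile generated source and return the specialized function."""
    namespace = {}
    exec(compile(source, "<generated_decoder>", "exec"), namespace)
    return namespace["prune"]


# Task-specific parameters are read once, then fixed in the generated code.
source = generate_decoder_source(num_states=4, beam_width=2)
prune = compile_decoder(source)
best = prune([0.1, 0.9, 0.5, 0.7])
```

    In a compiled language the same idea lets the compiler unroll loops and fold constants for the specific task; the Python version only demonstrates the workflow.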

    Speech Processing
    10.
    Patent application
    Speech Processing (active)

    Publication number: US20100121639A1

    Publication date: 2010-05-13

    Application number: US12268447

    Filing date: 2008-11-11

    Applicant: Geoffrey Zweig

    Inventor: Geoffrey Zweig

    IPC classification: G10L15/00 G10L15/18 G10L21/00

    Abstract: The described implementations relate to speech spelling by a user. One method identifies one or more symbols that may match a user utterance and displays an individual symbol for confirmation by the user.