TEMPLATE CONCATENATION FOR CAPTURING MULTIPLE CONCEPTS IN A VOICE QUERY
    1.
    Patent application
    Status: under examination (published)

    Publication No.: US20110314003A1

    Publication date: 2011-12-22

    Application No.: US12817233

    Filing date: 2010-06-17

    IPC classification: G06F17/30

    Abstract: Architecture that provides the capability to identify which parts (terms and phrases) of a voice query have been covered by predefined phrase templates, and then to concatenate matching phrase templates into a new paraphrased query. A match-drop-continue algorithm is disclosed that progressively masks out the portions (phrases, terms) of the query matched to the phrase templates. Ultimately, the matched phrase templates are accumulated and organized together dynamically into a rephrased version of the original voice query. A user interface is provided that allows the user to confirm/summarize the multiple concepts in a progressive manner.
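
Illustrative sketch (not taken from the patent): one way to read the match-drop-continue loop described in the abstract. The template table, the concept labels, and the longest-template-first ordering are all assumptions made for the example; the filing may order and score matches differently.

```python
from typing import Dict, List, Tuple

# Hypothetical phrase templates: surface phrase -> concept label.
TEMPLATES: Dict[str, str] = {
    "italian restaurants": "cuisine=italian",
    "near the airport": "location=airport",
    "open late": "hours=late",
}


def match_drop_continue(
    query: str, templates: Dict[str, str]
) -> Tuple[List[str], List[str]]:
    """Return (matched concepts in query order, words no template covered)."""
    words = query.lower().split()
    masked = [False] * len(words)
    hits: List[Tuple[int, str]] = []  # (start index in query, concept)

    # Assumption: longer templates are tried first so they win overlaps.
    for phrase in sorted(templates, key=lambda p: -len(p.split())):
        p_words = phrase.split()
        n = len(p_words)
        i = 0
        while i + n <= len(words):
            window_free = not any(masked[i:i + n])
            if window_free and words[i:i + n] == p_words:
                hits.append((i, templates[phrase]))  # match
                for j in range(i, i + n):
                    masked[j] = True                 # drop (mask out)
                i += n                               # continue after the match
            else:
                i += 1

    concepts = [concept for _, concept in sorted(hits)]
    leftover = [w for w, m in zip(words, masked) if not m]
    return concepts, leftover


concepts, leftover = match_drop_continue(
    "Find Italian restaurants near the airport open late", TEMPLATES
)
paraphrase = "; ".join(concepts)  # concatenated, rephrased form of the query
```

The leftover words show which parts of the query no template covered, which is the information a confirmation interface would need to prompt the user.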

    Intra-language statistical machine translation
    2.
    Granted patent
    Status: in force

    Publication No.: US08615388B2

    Publication date: 2013-12-24

    Application No.: US12058328

    Filing date: 2008-03-28

    IPC classification: G06F17/28

    CPC classification: G06F17/2818 G06F17/2827

    Abstract: Training data may be provided, the training data including pairs of source phrases and target phrases. The pairs may be used to train an intra-language statistical machine translation model, where the intra-language statistical machine translation model, when given an input phrase of text in the human language, can compute probabilities of semantic equivalence of the input phrase to possible translations of the input phrase in the human language. The statistical machine translation model may be used to translate between queries and listings. The queries may be text strings in the human language submitted to a search engine. The listing strings may be text strings of formal names of real-world entities that are to be searched by the search engine to find matches for the query strings.
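
Illustrative sketch (not taken from the patent): a toy version of training on source/target phrase pairs and then scoring candidate listings for a query. It estimates P(listing | query) by whole-phrase relative frequency, which is a deliberate simplification; a real statistical translation model would work at the word or sub-phrase level with alignment and smoothing. The example pairs are invented.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, List, Tuple

PhraseTable = Dict[str, Dict[str, float]]


def train_phrase_table(pairs: Iterable[Tuple[str, str]]) -> PhraseTable:
    """Estimate P(target | source) by relative frequency over the pairs."""
    counts: Dict[str, Counter] = defaultdict(Counter)
    for source, target in pairs:
        counts[source][target] += 1
    table: PhraseTable = {}
    for source, targets in counts.items():
        total = sum(targets.values())
        table[source] = {t: c / total for t, c in targets.items()}
    return table


def translate(table: PhraseTable, phrase: str) -> List[Tuple[str, float]]:
    """Return candidate same-language translations, most probable first."""
    candidates = table.get(phrase, {})
    return sorted(candidates.items(), key=lambda kv: -kv[1])


# Hypothetical (query, listing) pairs, e.g. harvested from search logs.
pairs = [
    ("kung ho restaurant", "Kung Ho Cuisine of China"),
    ("kung ho restaurant", "Kung Ho Cuisine of China"),
    ("kung ho restaurant", "Kung Ho Cuisine of China"),
    ("kung ho restaurant", "Kung Ho Bistro"),
]
table = train_phrase_table(pairs)
ranked = translate(table, "kung ho restaurant")
```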

    Spelling Using a Fuzzy Pattern Search
    3.
    Patent application
    Status: under examination (published)

    Publication No.: US20120323967A1

    Publication date: 2012-12-20

    Application No.: US13159442

    Filing date: 2011-06-14

    IPC classification: G06F17/30

    CPC classification: G06F16/685 G06F16/93

    Abstract: A multimedia system configured to receive user input in the form of a spelled character sequence is provided. In one implementation, a spell mode is initiated, and a user spells a character sequence. The multimedia system performs spelling recognition and recognizes a sequence of character representations having a possible ambiguity resulting from any user and/or system errors. The sequence of character representations with the possible ambiguity yields multiple search keys. The multimedia system performs a fuzzy pattern search by scoring each target item from a finite dataset of target items based on the multiple search keys. One or more relevant items are ranked and presented to the user for selection, each relevant item being a target item that exceeds a relevancy threshold. The user selects the intended character sequence from the one or more relevant items.
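
Illustrative sketch (not taken from the patent): how an ambiguous spelled sequence could yield multiple search keys that are then scored against a finite set of target items. Representing ambiguity as per-position candidate lists, using `difflib.SequenceMatcher` as the similarity score, and the 0.6 threshold are all assumptions for the example.

```python
from difflib import SequenceMatcher
from itertools import product
from typing import List, Sequence, Tuple


def expand_keys(char_options: Sequence[Sequence[str]]) -> List[str]:
    """Every spelling consistent with the per-position candidate characters."""
    return ["".join(chars) for chars in product(*char_options)]


def fuzzy_search(
    char_options: Sequence[Sequence[str]],
    targets: Sequence[str],
    threshold: float = 0.6,
) -> List[Tuple[str, float]]:
    """Score each target by its best match over all keys; keep and rank hits."""
    keys = expand_keys(char_options)
    scored = []
    for item in targets:
        score = max(
            SequenceMatcher(None, key, item.lower()).ratio() for key in keys
        )
        if score > threshold:  # relevant only if it exceeds the threshold
            scored.append((item, score))
    return sorted(scored, key=lambda pair: -pair[1])


# The recognizer could not tell "b" from "d", or "t" from "p".
heard = [["b", "d"], ["e"], ["a"], ["t", "p"]]
results = fuzzy_search(heard, ["Beat", "Zzzz"])
```

Because the key set is the Cartesian product of the per-position candidates, it grows quickly with the number of ambiguous positions; a practical system would likely prune it.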

    INTRA-LANGUAGE STATISTICAL MACHINE TRANSLATION
    4.
    Patent application
    Status: in force

    Publication No.: US20090248422A1

    Publication date: 2009-10-01

    Application No.: US12058328

    Filing date: 2008-03-28

    IPC classification: G10L11/00 G06F17/28

    CPC classification: G06F17/2818 G06F17/2827

    Abstract: Training data may be provided, the training data including pairs of source phrases and target phrases. The pairs may be used to train an intra-language statistical machine translation model, where the intra-language statistical machine translation model, when given an input phrase of text in the human language, can compute probabilities of semantic equivalence of the input phrase to possible translations of the input phrase in the human language. The statistical machine translation model may be used to translate between queries and listings. The queries may be text strings in the human language submitted to a search engine. The listing strings may be text strings of formal names of real-world entities that are to be searched by the search engine to find matches for the query strings.

    Constructing a classifier for classifying queries
    5.
    Granted patent
    Status: in force

    Publication No.: US08407214B2

    Publication date: 2013-03-26

    Application No.: US12145508

    Filing date: 2008-06-25

    Applicant: Xiao Li, Ye-Yi Wang

    Inventor: Xiao Li, Ye-Yi Wang

    IPC classification: G06F17/30

    CPC classification: G06F17/30672

    Abstract: To construct a classifier, a data structure correlating queries to items identified by the queries is received, where the data structure contains initial labeled queries that have been labeled with respect to predetermined classes, and unlabeled queries that have not been labeled with respect to the predetermined classes. The data structure is used to label at least some of the unlabeled queries with respect to the predetermined classes. Queries in the data structure that have been labeled with respect to the predetermined classes are used as training data to train the classifier.
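
Illustrative sketch (not taken from the patent): one simple way a query-to-item data structure could be used to label unlabeled queries before training. The data structure is modeled as a mapping from each query to the set of items it identified, and labels spread by majority vote among labeled queries that share an item. The majority-vote rule, the round limit, and the example data are assumptions; the filing may use a different propagation method.

```python
from collections import Counter
from typing import Dict, Set


def propagate_labels(
    clicks: Dict[str, Set[str]], seeds: Dict[str, str], rounds: int = 3
) -> Dict[str, str]:
    """Spread seed labels to unlabeled queries that share items with them."""
    labels = dict(seeds)
    for _ in range(rounds):
        new_labels: Dict[str, str] = {}
        for query, items in clicks.items():
            if query in labels:
                continue
            # Vote among already-labeled queries sharing at least one item.
            votes = Counter(
                labels[other]
                for other, other_items in clicks.items()
                if other in labels and items & other_items
            )
            if votes:
                new_labels[query] = votes.most_common(1)[0][0]
        if not new_labels:
            break  # nothing left to label
        labels.update(new_labels)
    return labels


# Hypothetical query -> identified-items data.
clicks = {
    "cheap flights": {"fares.example", "tickets.example"},
    "plane tickets": {"tickets.example"},
    "pizza recipe": {"recipes.example"},
    "how to make pizza": {"recipes.example"},
    "weather": {"weather.example"},
}
seeds = {"cheap flights": "travel", "pizza recipe": "food"}
training_labels = propagate_labels(clicks, seeds)
```

The resulting `training_labels` mapping (seed labels plus propagated ones) is what would be handed to a classifier as training data; queries with no path to a seed, such as "weather" here, stay unlabeled.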

    SEARCH LEXICON EXPANSION
    6.
    Patent application
    Status: in force

    Publication No.: US20120158703A1

    Publication date: 2012-06-21

    Application No.: US12970477

    Filing date: 2010-12-16

    IPC classification: G06F17/30

    Abstract: One or more techniques and/or systems are disclosed for creating an expanded or improved lexicon for use in search-based semantic tagging. A set of first documents can be identified using a set of first lexicon elements as queries, and one or more first document patterns can be extracted from the set of first documents. The document patterns can be used to find one or more second documents in a query log that comprise the document patterns, which are associated with query terms used to return the second documents. The query terms for the second documents can be extracted and used to expand the lexicon. Elements within the lexicon may be weighted based upon relevance to different query domains, for example.
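
Illustrative sketch (not taken from the patent): the seed-to-pattern-to-new-terms loop from the abstract, reduced to a toy query log of (query, returned URL) pairs. Treating a "document pattern" as host plus first path segment, and looking up the first documents in the same log, are assumptions for the example. Domain-based weighting is omitted.

```python
from typing import Iterable, Set, Tuple


def url_pattern(url: str) -> str:
    """Hypothetical document pattern: host plus first path segment."""
    return "/".join(url.split("/")[:2])


def expand_lexicon(
    seeds: Set[str], query_log: Iterable[Tuple[str, str]]
) -> Set[str]:
    """Grow a lexicon with queries whose documents match seed patterns."""
    log = list(query_log)
    # First documents: those returned for seed lexicon elements.
    patterns = {url_pattern(url) for query, url in log if query in seeds}
    expanded = set(seeds)
    # Second documents: any logged document matching an extracted pattern.
    for query, url in log:
        if url_pattern(url) in patterns:
            expanded.add(query)
    return expanded


# Hypothetical query log entries.
query_log = [
    ("casablanca", "movies.example/title/1"),
    ("vertigo", "movies.example/title/2"),
    ("showtimes", "movies.example/theaters/9"),
    ("pasta", "food.example/recipe/3"),
]
movie_lexicon = expand_lexicon({"casablanca"}, query_log)
```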

    Search lexicon expansion
    7.
    Granted patent

    Publication No.: US09928296B2

    Publication date: 2018-03-27

    Application No.: US12970477

    Filing date: 2010-12-16

    IPC classification: G06F17/30 G06F17/27

    Abstract: One or more techniques and/or systems are disclosed for creating an expanded or improved lexicon for use in search-based semantic tagging. A set of first documents can be identified using a set of first lexicon elements as queries, and one or more first document patterns can be extracted from the set of first documents. The document patterns can be used to find one or more second documents in a query log that comprise the document patterns, which are associated with query terms used to return the second documents. The query terms for the second documents can be extracted and used to expand the lexicon. Elements within the lexicon may be weighted based upon relevance to different query domains, for example.

    Generating implicit labels and training a tagging model using such labels
    8.
    Granted patent
    Status: in force

    Publication No.: US08250015B2

    Publication date: 2012-08-21

    Application No.: US12419336

    Filing date: 2009-04-07

    Applicant: Xiao Li, Ye-Yi Wang

    Inventor: Xiao Li, Ye-Yi Wang

    IPC classification: G06F17/00 G06N5/00

    CPC classification: G06K9/6296

    Abstract: A training module is described for training a conditional random field (CRF) tagging model. The training module trains the tagging model based on an explicitly-labeled training set and an implicitly-labeled training set. The explicitly-labeled training set includes explicit labels that are manually selected via human annotation, while the implicitly-labeled training set includes implicit labels that are generated in an unsupervised manner. In one approach, the training module can train the tagging model by treating the implicit labels as non-binding evidence that has a bearing on values of hidden state sequence variables. In another approach, the training module can treat the implicit labels as binding or hard evidence. A labeling system is also described for providing the implicit labels.
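
Illustrative sketch (not taken from the patent): one common way to treat partial labels as hard evidence in a linear-chain CRF, shown here as an assumption about what "binding" evidence could mean. Positions carrying an implicit label are clamped to that label, and the training objective is the log-probability mass of all label sequences consistent with the clamps. The potentials and the evidence vector are made-up numbers; the soft (non-binding) variant is not shown.

```python
import math
from typing import List, Optional, Sequence


def logsumexp(values: Sequence[float]) -> float:
    """Numerically stable log(sum(exp(v))) for a non-empty sequence."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def constrained_log_partition(
    emit: List[List[float]],
    trans: List[List[float]],
    evidence: Sequence[Optional[int]],
) -> float:
    """Forward algorithm restricted to sequences that agree with evidence.

    emit[t][y]  : log-potential of label y at position t
    trans[a][b] : log-potential of moving from label a to label b
    evidence[t] : label forced at position t, or None if unconstrained
    """
    n_labels = len(trans)

    def allowed(t: int) -> List[int]:
        return list(range(n_labels)) if evidence[t] is None else [evidence[t]]

    alpha = {y: emit[0][y] for y in allowed(0)}
    for t in range(1, len(emit)):
        alpha = {
            y: emit[t][y]
            + logsumexp([alpha[p] + trans[p][y] for p in alpha])
            for y in allowed(t)
        }
    return logsumexp(list(alpha.values()))


def partial_log_likelihood(emit, trans, evidence) -> float:
    """log P(sequences consistent with evidence) under the CRF."""
    free = [None] * len(emit)
    return constrained_log_partition(
        emit, trans, evidence
    ) - constrained_log_partition(emit, trans, free)


# Hypothetical potentials for 3 positions and 2 labels.
emit = [[0.1, 0.5], [0.3, -0.2], [0.0, 0.4]]
trans = [[0.2, -0.1], [0.0, 0.3]]
objective = partial_log_likelihood(emit, trans, [None, 1, None])
```

Maximizing `objective` with respect to the potentials would push probability toward sequences that respect the clamped label while leaving the other positions free.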

    CONSTRUCTING A CLASSIFIER FOR CLASSIFYING QUERIES
    9.
    Patent application
    Status: in force

    Publication No.: US20090327260A1

    Publication date: 2009-12-31

    Application No.: US12145508

    Filing date: 2008-06-25

    Applicant: Xiao Li, Ye-Yi Wang

    Inventor: Xiao Li, Ye-Yi Wang

    IPC classification: G06F7/06 G06F3/048 G06F17/30

    CPC classification: G06F17/30672

    Abstract: To construct a classifier, a data structure correlating queries to items identified by the queries is received, where the data structure contains initial labeled queries that have been labeled with respect to predetermined classes, and unlabeled queries that have not been labeled with respect to the predetermined classes. The data structure is used to label at least some of the unlabeled queries with respect to the predetermined classes. Queries in the data structure that have been labeled with respect to the predetermined classes are used as training data to train the classifier.

    ACQUISITION OF SEMANTIC CLASS LEXICONS FOR QUERY TAGGING
    10.
    Patent application
    Status: in force

    Publication No.: US20100268725A1

    Publication date: 2010-10-21

    Application No.: US12426370

    Filing date: 2009-04-20

    IPC classification: G06F17/30 G06F17/27

    Abstract: A user's search experience may be enhanced by providing additional content based upon an understanding of the user's intent. Query tagging, the assigning of semantic labels to terms within a query, is one technique that may be utilized to determine the context of a user's search query. Accordingly, as provided herein, a query tagging model may be updated using one or more stratified lexicons. A list data structure (e.g., lists of phrases obtained from web pages) and seed distribution data (e.g., pre-labeled probability data) may be used by a graph learning technique to obtain an expanded set of phrases and their respective probabilities of corresponding with particular lexicons (e.g., semantic class lexicons). The expanded set of phrases may be used to group phrases into stratified lexicons. The stratified lexicons may be used as features for updating and/or executing the query tagging model.
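
Illustrative sketch (not taken from the patent): grouping an expanded phrase set into stratified lexicons by membership probability. The probabilities are taken as given (the graph learning step that produces them is not modeled), and the stratum boundaries of 0.8 and 0.5 are arbitrary assumptions.

```python
from typing import Dict, List, Sequence


def stratify(
    phrase_probs: Dict[str, float], boundaries: Sequence[float] = (0.8, 0.5)
) -> List[List[str]]:
    """Split phrases into strata, highest-confidence stratum first.

    boundaries must be in descending order; a phrase lands in the first
    stratum whose lower bound it meets, or in the last stratum otherwise.
    """
    strata: List[List[str]] = [[] for _ in range(len(boundaries) + 1)]
    for phrase, prob in sorted(phrase_probs.items()):
        for level, bound in enumerate(boundaries):
            if prob >= bound:
                strata[level].append(phrase)
                break
        else:
            strata[-1].append(phrase)
    return strata


def stratum_feature(phrase: str, strata: List[List[str]]) -> int:
    """Feature for a tagger: index of the phrase's stratum, or -1 if absent."""
    for level, members in enumerate(strata):
        if phrase in members:
            return level
    return -1


# Hypothetical probabilities that each phrase belongs to a "city" lexicon.
city_probs = {"new york": 0.95, "springfield": 0.6, "jackson": 0.3}
city_strata = stratify(city_probs)
```

A tagging model could then use `stratum_feature` as one input per candidate phrase, so that high-confidence and low-confidence lexicon members are weighted differently.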
