ACQUISITION OF SEMANTIC CLASS LEXICONS FOR QUERY TAGGING
    1.
    Type: Patent application
    Legal status: Active

    Publication number: US20100268725A1

    Publication date: 2010-10-21

    Application number: US12426370

    Filing date: 2009-04-20

    IPC classification: G06F17/30, G06F17/27

    Abstract: A user's search experience may be enhanced by providing additional content based upon an understanding of the user's intent. Query tagging, the assigning of semantic labels to terms within a query, is one technique that may be utilized to determine the context of a user's search query. Accordingly, as provided herein, a query tagging model may be updated using one or more stratified lexicons. A list data structure (e.g., lists of phrases obtained from web pages) and seed distribution data (e.g., pre-labeled probability data) may be used by a graph learning technique to obtain an expanded set of phrases and their respective probabilities of corresponding to particular lexicons (e.g., semantic class lexicons). The expanded set of phrases may be used to group phrases into stratified lexicons. The stratified lexicons may be used as features for updating and/or executing the query tagging model.
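
    The abstract does not say which graph learning technique is used. As a minimal sketch, assuming a simple averaging label propagation over a bipartite graph of phrases and web lists, the flow from seed distributions to stratified lexicons might look like this. The function names, the example lists, and the probability thresholds are all hypothetical.

```python
from collections import defaultdict


def propagate(lists, seeds, classes, iterations=20):
    """Spread seed class distributions over a phrase/list bipartite graph."""
    uniform = {c: 1.0 / len(classes) for c in classes}
    phrases = sorted({p for members in lists.values() for p in members})
    dist = {p: dict(seeds.get(p, uniform)) for p in phrases}
    for _ in range(iterations):
        # A list's distribution is the average over its member phrases.
        list_dist = {
            name: {c: sum(dist[p][c] for p in members) / len(members)
                   for c in classes}
            for name, members in lists.items()
        }
        for p in phrases:
            if p in seeds:
                continue  # seed phrases stay clamped to their pre-labeled values
            containing = [n for n, members in lists.items() if p in members]
            # An unlabeled phrase averages over the lists that contain it.
            dist[p] = {c: sum(list_dist[n][c] for n in containing) / len(containing)
                       for c in classes}
    return dist


def stratify(dist, cls, thresholds=(0.8, 0.5)):
    """Group phrases into strata by their probability of belonging to cls."""
    strata = defaultdict(list)
    for phrase in sorted(dist):
        for level, threshold in enumerate(thresholds):
            if dist[phrase][cls] >= threshold:
                strata[level].append(phrase)
                break
    return dict(strata)


# Hypothetical input: two phrase lists scraped from web pages, two seed phrases.
classes = ["brand", "type"]
lists = {
    "list_a": ["canon", "nikon", "sony"],
    "list_b": ["camera", "lens", "sony"],
}
seeds = {
    "canon": {"brand": 1.0, "type": 0.0},
    "camera": {"brand": 0.0, "type": 1.0},
}
dist = propagate(lists, seeds, classes)
brand_strata = stratify(dist, "brand")
```

    Each stratum could then serve as a separate lexicon feature for the tagging model, so that high-confidence and lower-confidence members are weighted differently.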

    Constructing a classifier for classifying queries
    2.
    Type: Granted patent
    Legal status: Active

    Publication number: US08407214B2

    Publication date: 2013-03-26

    Application number: US12145508

    Filing date: 2008-06-25

    Applicants: Xiao Li, Ye-Yi Wang

    Inventors: Xiao Li, Ye-Yi Wang

    IPC classification: G06F17/30

    CPC classification: G06F17/30672

    Abstract: To construct a classifier, a data structure correlating queries to items identified by the queries is received, where the data structure contains initial labeled queries that have been labeled with respect to predetermined classes, and unlabeled queries that have not been labeled with respect to the predetermined classes. The data structure is used to label at least some of the unlabeled queries with respect to the predetermined classes. Queries in the data structure that have been labeled with respect to the predetermined classes are used as training data to train the classifier.
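
    A rough sketch of this two-stage idea follows. It assumes the data structure is a query-to-clicked-item map, that labels spread one hop through shared items by majority vote, and that the classifier is a small naive Bayes model over query words. None of these choices come from the abstract, and all names and data are hypothetical.

```python
import math
from collections import Counter, defaultdict


def label_via_clicks(clicks, labeled):
    """Label unlabeled queries by voting over labeled queries sharing an item."""
    by_item = defaultdict(set)
    for query, items in clicks.items():
        for item in items:
            by_item[item].add(query)
    result = dict(labeled)
    for query, items in clicks.items():
        if query in labeled:
            continue
        votes = Counter(
            labeled[other]
            for item in items
            for other in by_item[item]
            if other in labeled
        )
        if votes:  # queries with no labeled neighbor stay unlabeled
            result[query] = votes.most_common(1)[0][0]
    return result


def train_naive_bayes(labeled):
    """Train a word-level naive Bayes classifier with add-one smoothing."""
    word_counts = defaultdict(Counter)
    class_counts = Counter()
    for query, cls in labeled.items():
        class_counts[cls] += 1
        word_counts[cls].update(query.split())
    vocab = {w for counts in word_counts.values() for w in counts}
    n_queries = sum(class_counts.values())

    def classify(query):
        def score(cls):
            total = sum(word_counts[cls].values())
            log_prior = math.log(class_counts[cls] / n_queries)
            return log_prior + sum(
                math.log((word_counts[cls][w] + 1) / (total + len(vocab)))
                for w in query.split()
            )
        return max(class_counts, key=score)

    return classify


# Hypothetical click data: query -> items (URLs) clicked for that query.
clicks = {
    "cheap flights": ["expedia.example"],
    "airline tickets": ["expedia.example", "kayak.example"],
    "flights to paris": ["kayak.example"],
    "python tutorial": ["docs.example"],
    "learn python": ["docs.example"],
}
initial_labels = {"cheap flights": "travel", "python tutorial": "programming"}

expanded = label_via_clicks(clicks, initial_labels)
classify = train_naive_bayes(expanded)
```

    In this example only some unlabeled queries gain a label ("flights to paris" has no directly labeled neighbor), which mirrors the abstract's "at least some of the unlabeled queries".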

    SEARCH LEXICON EXPANSION
    3.
    Type: Patent application
    Legal status: Active

    Publication number: US20120158703A1

    Publication date: 2012-06-21

    Application number: US12970477

    Filing date: 2010-12-16

    IPC classification: G06F17/30

    Abstract: One or more techniques and/or systems are disclosed for creating an expanded or improved lexicon for use in search-based semantic tagging. A set of first documents can be identified using a set of first lexicon elements as queries, and one or more first document patterns can be extracted from the set of first documents. The document patterns can be used to find one or more second documents in a query log that comprise the document patterns, which are associated with query terms used to return the second documents. The query terms for the second documents can be extracted and used to expand the lexicon. Elements within the lexicon may be weighted based upon relevance to different query domains, for example.
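
    The expansion loop can be illustrated with a small sketch. It assumes the query log is a list of (query, clicked URL) pairs and that a "document pattern" is a URL prefix made of the host plus the first path segment. Both are assumptions made for illustration, as are the function names and the `min_support` parameter.

```python
import re
from collections import Counter


def url_pattern(url):
    """Generalize a URL to its host plus first path segment, or None."""
    match = re.match(r"(https?://[^/]+/[^/]+/)", url)
    return match.group(1) if match else None


def expand_lexicon(lexicon, query_log, min_support=2):
    """Expand a lexicon with queries whose documents share a common pattern."""
    # 1. First documents: those returned for queries that are lexicon elements.
    first_docs = [url for query, url in query_log if query in lexicon]
    # 2. Keep patterns seen in at least min_support first documents.
    counts = Counter(p for p in map(url_pattern, first_docs) if p)
    patterns = {p for p, n in counts.items() if n >= min_support}
    # 3. Second documents match a pattern; their queries join the lexicon.
    candidates = {query for query, url in query_log
                  if url_pattern(url) in patterns}
    return set(lexicon) | candidates


# Hypothetical query log of (query, clicked URL) pairs.
query_log = [
    ("inception", "http://movies.example/title/inception"),
    ("the matrix", "http://movies.example/title/the-matrix"),
    ("interstellar", "http://movies.example/title/interstellar"),
    ("inception", "http://news.example/story/123"),
    ("weather", "http://news.example/story/456"),
]
movie_lexicon = expand_lexicon({"inception", "the matrix"}, query_log)
```

    The support threshold keeps a single stray click (here the news story) from pulling unrelated queries such as "weather" into the lexicon.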

    Search lexicon expansion
    4.
    Type: Granted patent

    Publication number: US09928296B2

    Publication date: 2018-03-27

    Application number: US12970477

    Filing date: 2010-12-16

    IPC classification: G06F17/30, G06F17/27

    Abstract: One or more techniques and/or systems are disclosed for creating an expanded or improved lexicon for use in search-based semantic tagging. A set of first documents can be identified using a set of first lexicon elements as queries, and one or more first document patterns can be extracted from the set of first documents. The document patterns can be used to find one or more second documents in a query log that comprise the document patterns, which are associated with query terms used to return the second documents. The query terms for the second documents can be extracted and used to expand the lexicon. Elements within the lexicon may be weighted based upon relevance to different query domains, for example.

    Generating implicit labels and training a tagging model using such labels
    5.
    Type: Granted patent
    Legal status: Active

    Publication number: US08250015B2

    Publication date: 2012-08-21

    Application number: US12419336

    Filing date: 2009-04-07

    Applicants: Xiao Li, Ye-Yi Wang

    Inventors: Xiao Li, Ye-Yi Wang

    IPC classification: G06F17/00, G06N5/00

    CPC classification: G06K9/6296

    Abstract: A training module is described for training a conditional random field (CRF) tagging model. The training module trains the tagging model based on an explicitly-labeled training set and an implicitly-labeled training set. The explicitly-labeled training set includes explicit labels that are manually selected via human annotation, while the implicitly-labeled training set includes implicit labels that are generated in an unsupervised manner. In one approach, the training module can train the tagging model by treating the implicit labels as non-binding evidence that has a bearing on values of hidden state sequence variables. In another approach, the training module can treat the implicit labels as binding or hard evidence. A labeling system is also described for providing the implicit labels.

    TEMPLATE CONCATENATION FOR CAPTURING MULTIPLE CONCEPTS IN A VOICE QUERY
    6.
    Type: Patent application
    Legal status: Under examination (published)

    Publication number: US20110314003A1

    Publication date: 2011-12-22

    Application number: US12817233

    Filing date: 2010-06-17

    IPC classification: G06F17/30

    Abstract: An architecture is described that identifies which parts (terms and phrases) of a voice query are covered by predefined phrase templates, and then concatenates the matching phrase templates into a new paraphrased query. A match-drop-continue algorithm is disclosed that progressively masks out the portions (phrases, terms) of the query matched to the phrase templates. Ultimately, the matched phrase templates are accumulated and organized together dynamically into a rephrased version of the original voice query. A user interface is provided that allows the user to confirm/summarize the multiple concepts in a progressive manner.
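
    One way to picture a match-drop-continue loop is sketched below. It assumes phrase templates are regular expressions tagged with a concept name and that masking means overwriting the matched span with a placeholder character. The templates, concept names, and return shape are hypothetical and are not taken from the application.

```python
import re

# Hypothetical phrase templates: (pattern, concept name).
TEMPLATES = [
    (re.compile(r"\b(?:italian|chinese|mexican) (?:food|restaurants?)\b"), "cuisine"),
    (re.compile(r"\bnear (?:me|downtown)\b"), "location"),
    (re.compile(r"\bopen (?:now|late)\b"), "hours"),
]


def match_drop_continue(query, templates=TEMPLATES):
    """Return (paraphrase, concepts, leftover words) for a recognized query."""
    remaining = query.lower()
    matched = []
    progress = True
    while progress:
        progress = False
        for pattern, concept in templates:
            m = pattern.search(remaining)
            if m:
                # Match: record where the phrase occurred and what it covers.
                matched.append((m.start(), concept, m.group(0)))
                # Drop: mask the span with same-length padding so that
                # offsets of later matches stay comparable.
                remaining = (remaining[:m.start()]
                             + "#" * len(m.group(0))
                             + remaining[m.end():])
                # Continue: rescan the masked query from the first template.
                progress = True
                break
    matched.sort()  # restore the order in which phrases were spoken
    paraphrase = " ".join(text for _, _, text in matched)
    concepts = [concept for _, concept, _ in matched]
    leftover = remaining.replace("#", " ").split()
    return paraphrase, concepts, leftover


paraphrase, concepts, leftover = match_drop_continue(
    "Find Italian restaurants near me that are open now please"
)
```

    The leftover words are the parts of the query that no template covered, which a user interface could surface when asking the user to confirm the recognized concepts.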

    CONSTRUCTING A CLASSIFIER FOR CLASSIFYING QUERIES
    7.
    Type: Patent application
    Legal status: Active

    Publication number: US20090327260A1

    Publication date: 2009-12-31

    Application number: US12145508

    Filing date: 2008-06-25

    Applicants: Xiao Li, Ye-Yi Wang

    Inventors: Xiao Li, Ye-Yi Wang

    IPC classification: G06F7/06, G06F3/048, G06F17/30

    CPC classification: G06F17/30672

    Abstract: To construct a classifier, a data structure correlating queries to items identified by the queries is received, where the data structure contains initial labeled queries that have been labeled with respect to predetermined classes, and unlabeled queries that have not been labeled with respect to the predetermined classes. The data structure is used to label at least some of the unlabeled queries with respect to the predetermined classes. Queries in the data structure that have been labeled with respect to the predetermined classes are used as training data to train the classifier.

    GENERATING IMPLICIT LABELS AND TRAINING A TAGGING MODEL USING SUCH LABELS
    8.
    Type: Patent application
    Legal status: Active

    Publication number: US20100256969A1

    Publication date: 2010-10-07

    Application number: US12419336

    Filing date: 2009-04-07

    Applicants: Xiao Li, Ye-Yi Wang

    Inventors: Xiao Li, Ye-Yi Wang

    IPC classification: G06F17/50

    CPC classification: G06K9/6296

    Abstract: A training module is described for training a conditional random field (CRF) tagging model. The training module trains the tagging model based on an explicitly-labeled training set and an implicitly-labeled training set. The explicitly-labeled training set includes explicit labels that are manually selected via human annotation, while the implicitly-labeled training set includes implicit labels that are generated in an unsupervised manner. In one approach, the training module can train the tagging model by treating the implicit labels as non-binding evidence that has a bearing on values of hidden state sequence variables. In another approach, the training module can treat the implicit labels as binding or hard evidence. A labeling system is also described for providing the implicit labels.

    Acquisition of semantic class lexicons for query tagging
    9.
    Type: Granted patent
    Legal status: Active

    Publication number: US09336299B2

    Publication date: 2016-05-10

    Application number: US12426370

    Filing date: 2009-04-20

    IPC classification: G06F17/30, G06F17/27

    Abstract: A user's search experience may be enhanced by providing additional content based upon an understanding of the user's intent. Query tagging, the assigning of semantic labels to terms within a query, is one technique that may be utilized to determine the context of a user's search query. Accordingly, as provided herein, a query tagging model may be updated using one or more stratified lexicons. A list data structure (e.g., lists of phrases obtained from web pages) and seed distribution data (e.g., pre-labeled probability data) may be used by a graph learning technique to obtain an expanded set of phrases and their respective probabilities of corresponding to particular lexicons (e.g., semantic class lexicons). The expanded set of phrases may be used to group phrases into stratified lexicons. The stratified lexicons may be used as features for updating and/or executing the query tagging model.