System and method for providing interactive feature selection for training a document classification system
    1.
    Invention application
    Status: Under examination (published)

    Publication No.: US20060212142A1

    Publication date: 2006-09-21

    Application No.: US11376989

    Application date: 2006-03-15

    IPC classification: G05B13/02

    Abstract: A method for facilitating development of a document classification function comprises selecting a feature of a document, the feature being less than an entirety of the document; presenting the feature to a human subject; asking the human subject for a feature relevance value of the feature; and generating a classification function using the feature relevance value. The method may also include the steps of presenting the document to the human subject at the same time as presenting the feature; asking the human subject for a document relevance value that measures relevance of the document to a category; and wherein the generating of the classification function also uses the document relevance value.
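
    The abstract above describes generating a classification function from feature relevance values supplied by a human subject. A minimal Python sketch of that idea follows, assuming a simple additive scorer over whitespace tokens. The names `make_classifier` and `relevance`, the example values, and the threshold are hypothetical and are not taken from the patent; the optional document relevance value is not modeled.

```python
from typing import Callable, Dict


def make_classifier(feature_relevance: Dict[str, float],
                    threshold: float = 0.0) -> Callable[[str], bool]:
    """Build a classification function from human-supplied relevance values."""
    def classify(document: str) -> bool:
        tokens = set(document.lower().split())
        # Sum the relevance of every rated feature that occurs in the document.
        score = sum(w for feature, w in feature_relevance.items() if feature in tokens)
        return score > threshold
    return classify


# Hypothetical answers collected from a human subject for a "sports" category.
relevance = {"goal": 1.0, "match": 0.5, "invoice": -1.0}
is_sports = make_classifier(relevance)
```

    Calling `is_sports("The match ended with a late goal")` sums the weights of "match" and "goal" and compares the total to the threshold.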

    System and Method for Automatically Detecting and Interactively Displaying Information About Entities, Activities, and Events from Multiple-Modality Natural Language Sources
    2.
    Invention application
    Status: Under examination (published)

    Publication No.: US20130332450A1

    Publication date: 2013-12-12

    Application No.: US13493659

    Application date: 2012-06-11

    IPC classification: G06F17/30 G06F17/28

    Abstract: A method for automatically extracting and organizing information by a processing device from a plurality of data sources is provided. A natural language processing information extraction pipeline that includes an automatic detection of entities is applied to the data sources. Information about detected entities is identified by analyzing products of the natural language processing pipeline. Identified information is grouped into equivalence classes containing equivalent information. At least one displayable representation of the equivalence classes is created. An order in which the at least one displayable representation is displayed is computed. A combined representation of the equivalence classes that respects the order in which the displayable representation is displayed is produced.
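
    The grouping, ordering, and combining steps in the abstract above can be sketched in Python. This is only an illustration under stated assumptions: facts arrive as (entity, text) pairs, two facts are treated as equivalent when they match after lowercasing and whitespace normalization, and larger classes are displayed first. The function names and the sample data are hypothetical; the patent does not specify any of them.

```python
from collections import defaultdict


def group_equivalent(facts):
    """Group (entity, text) facts into equivalence classes by a normalized key."""
    classes = defaultdict(list)
    for entity, text in facts:
        key = (entity.lower(), " ".join(text.lower().split()))
        classes[key].append((entity, text))
    return classes


def combined_view(facts):
    """Return one display line per class, ordered by class size (largest first)."""
    classes = group_equivalent(facts)
    # Ties are broken by the normalized key so the order is deterministic.
    ordered = sorted(classes.items(), key=lambda kv: (-len(kv[1]), kv[0]))
    return [f"{members[0][0]}: {members[0][1]} (x{len(members)})"
            for _, members in ordered]


# Hypothetical extraction output from two sources mentioning the same event.
facts = [("Acme", "opened a plant"),
         ("ACME", "Opened  a plant"),
         ("Acme", "hired a CEO")]
view = combined_view(facts)
```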

    Ad Relevance In Sponsored Search
    3.
    Invention application
    Status: Under examination (published)

    Publication No.: US20110270672A1

    Publication date: 2011-11-03

    Application No.: US12769446

    Application date: 2010-04-28

    IPC classification: G06Q30/00 G06F15/18

    Abstract: Techniques for improving advertisement relevance for sponsored search advertising. The method includes steps for processing a click history data structure containing at least a plurality of query-advertisement pairs, populating a first translation table containing a co-occurrence count field, populating a second translation table containing an expected clicks field, and calculating a click propensity score for an advertisement using the click history data structure, the first translation table (for determining overall click likelihood across all historical traffic), and using the second translation table (for removing biases present in the first translation table). Other method steps calculate a second click propensity score for a second advertisement, then ranking the first advertisement relative to the second advertisement for comparing a click propensity score to a threshold for filtering low quality ad candidates from a plurality of ad candidates, and then ranking advertisements for optimizing placement of ads on a sponsored search display page.
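
    One way to read the two tables in the abstract above is as a clicks-over-expected-clicks ratio: the first table counts observed clicks per word pair, and the second estimates how many clicks position alone would have produced. The Python sketch below follows that reading. It is an assumption, not the patent's stated formula; the position priors in `POSITION_CTR`, the word-pair keys, the averaging step, the threshold, and all function names are hypothetical.

```python
from collections import defaultdict

# Assumed position priors: click rate of each slot regardless of ad relevance.
POSITION_CTR = {1: 0.30, 2: 0.15, 3: 0.05}


def build_tables(history):
    """Populate two word-pair tables from (query, ad, position, clicked) rows."""
    cooccurrence = defaultdict(float)     # observed clicks per (query word, ad word)
    expected_clicks = defaultdict(float)  # clicks expected from position alone
    for query, ad, position, clicked in history:
        for q in query.split():
            for a in ad.split():
                if clicked:
                    cooccurrence[(q, a)] += 1.0
                expected_clicks[(q, a)] += POSITION_CTR[position]
    return cooccurrence, expected_clicks


def click_propensity(query, ad, cooccurrence, expected_clicks):
    """Average clicks-over-expected-clicks ratio across known word pairs."""
    ratios = []
    for q in query.split():
        for a in ad.split():
            expected = expected_clicks.get((q, a), 0.0)
            if expected > 0:
                ratios.append(cooccurrence.get((q, a), 0.0) / expected)
    return sum(ratios) / len(ratios) if ratios else 0.0


def rank_ads(query, ads, tables, threshold=0.5):
    """Drop candidates scoring below the threshold, then sort best first."""
    scored = [(click_propensity(query, ad, *tables), ad) for ad in ads]
    return [ad for score, ad in sorted(scored, reverse=True) if score >= threshold]


# Hypothetical click history rows: (query, ad text, slot position, clicked?).
history = [("cheap flights", "flights deals", 1, True),
           ("cheap flights", "flights deals", 2, False),
           ("cheap flights", "car insurance", 1, False)]
tables = build_tables(history)
```

    Dividing by expected clicks keeps an ad from scoring well merely because it was usually shown in the top slot.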

    SYSTEM AND METHOD TO IDENTIFY CONTEXT-DEPENDENT TERM IMPORTANCE OF QUERIES FOR PREDICTING RELEVANT SEARCH ADVERTISEMENTS
    5.
    Invention application
    Status: Under examination (published)

    Publication No.: US20110131205A1

    Publication date: 2011-06-02

    Application No.: US12626894

    Application date: 2009-11-28

    IPC classification: G06F17/30

    CPC classification: G06F16/3334

    Abstract: An improved system and method for identifying context-dependent term importance of queries is provided. A query term importance model is learned using supervised learning of context-dependent term importance for queries and is then applied for advertisement prediction using term importance weights of query terms as query features. For instance, a query term importance model for query rewriting may predict rewritten queries that match a query with term importance weights assigned as query features. Or a query term importance model for advertisement prediction may predict relevant advertisements for a query with term importance weights assigned as query features. In an embodiment, a sponsored advertisement selection engine selects sponsored advertisements scored by a query term importance engine that applies a query term importance model using term importance weights as query features and inverse document frequency weights as advertisement features to assign a relevance score.
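
    The last sentence of the abstract above combines term importance weights on the query side with inverse document frequency weights on the advertisement side. A minimal Python sketch of such a score follows, assuming a plain weighted sum over shared terms. The learned importance model is not reproduced here: the weights in `query_weights` are hypothetical stand-ins for its output, and the smoothed IDF formula and function names are likewise assumptions.

```python
import math


def idf_weights(ads):
    """Inverse document frequency of each term across the ad collection."""
    doc_freq = {}
    for ad in ads:
        for term in set(ad.lower().split()):
            doc_freq[term] = doc_freq.get(term, 0) + 1
    # The +1.0 keeps terms that appear in every ad from vanishing (an assumption).
    return {t: math.log(len(ads) / c) + 1.0 for t, c in doc_freq.items()}


def relevance(query_weights, ad, idf):
    """Sum of query term importance times ad-side IDF over shared terms."""
    ad_terms = set(ad.lower().split())
    return sum(w * idf.get(t, 0.0) for t, w in query_weights.items() if t in ad_terms)


ads = ["cheap hotel new york", "new york pizza delivery", "cheap car rental"]
idf = idf_weights(ads)

# Hypothetical importance weights for the query "new york hotel".
query_weights = {"new": 0.2, "york": 0.3, "hotel": 0.9}
best_ad = max(ads, key=lambda ad: relevance(query_weights, ad, idf))
```

    Because "hotel" carries most of the query's importance and is rare among the ads, the hotel ad outranks the pizza ad even though both match "new york".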

    SYSTEM AND METHOD FOR PREDICTING CONTEXT-DEPENDENT TERM IMPORTANCE OF SEARCH QUERIES
    6.
    Invention application
    Status: Under examination (published)

    Publication No.: US20110131157A1

    Publication date: 2011-06-02

    Application No.: US12626892

    Application date: 2009-11-28

    IPC classification: G06F17/30 G06F15/18

    CPC classification: G06Q30/0251

    Abstract: An improved system and method for identifying context-dependent term importance of queries is provided. A query term importance model is learned using supervised learning of context-dependent term importance for queries and is then applied for advertisement prediction using term importance weights of query terms as query features. For instance, a query term importance model for query rewriting may predict rewritten queries that match a query with term importance weights assigned as query features. Or a query term importance model for advertisement prediction may predict relevant advertisements for a query with term importance weights assigned as query features. In an embodiment, a sponsored advertisement selection engine selects sponsored advertisements scored by a query term importance engine that applies a query term importance model using term importance weights as query features and inverse document frequency weights as advertisement features to assign a relevance score.
