-
Publication No.: US20100318540A1
Publication date: 2010-12-16
Application No.: US12484256
Filing date: 2009-06-15
IPC classification: G06F17/30
CPC classification: G06F17/30864, G06K9/6257
Abstract: Described is a technology for identifying sample data items (e.g., documents corresponding to query-URL pairs) having the greatest likelihood of being mislabeled when previously judged, and selecting those data items for re-judging. In one aspect, lambda gradient scores (information associated with ranked sample data items that indicates a relative direction and how “strongly” to move each data item for lowering a ranking cost) are summed for pairs of sample data items to compute re-judgment scores for each of those sample data items. The re-judgment scores indicate a relative likelihood of mislabeling. Once the selected sample data items are re-judged, a new training set is available, whereby a new ranker may be trained.
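Illustrative note: the abstract describes summing lambda gradient scores over pairs of sample items to obtain a per-item re-judgment score, but it does not give the formula. The sketch below is only one plausible reading. It assumes a RankNet-style pairwise lambda and takes the absolute value of each item's summed lambdas as its score. The function name `rejudgment_scores` and the parameter `sigma` are hypothetical and do not come from the patent.

```python
import math
from itertools import combinations


def rejudgment_scores(labels, scores, sigma=1.0):
    """Return one re-judgment score per item for a single query.

    labels: human relevance judgments (higher means more relevant).
    scores: current ranker outputs for the same items.
    Assumption: RankNet-style pairwise lambdas, summed per item.
    """
    lambdas = [0.0] * len(labels)
    for i, j in combinations(range(len(labels)), 2):
        if labels[i] == labels[j]:
            continue  # equal labels contribute no pairwise gradient
        hi, lo = (i, j) if labels[i] > labels[j] else (j, i)
        # Large when the ranker scores the higher-labeled item lower.
        lam = sigma / (1.0 + math.exp(sigma * (scores[hi] - scores[lo])))
        lambdas[hi] += lam  # pushed up the ranking
        lambdas[lo] -= lam  # pushed down the ranking
    # A large magnitude means label and ranker disagree strongly.
    return [abs(x) for x in lambdas]


labels = [2, 1, 0, 2]
scores = [3.0, 2.0, 1.0, -2.0]  # last item: high label, low ranker score
result = rejudgment_scores(labels, scores)
candidate = max(range(len(result)), key=result.__getitem__)
```

Here the last item receives the largest score, so under these assumptions it would be the first candidate sent back for re-judging.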
-
Publication No.: US08935258B2
Publication date: 2015-01-13
Application No.: US12484256
Filing date: 2009-06-15
CPC classification: G06F17/30864, G06K9/6257
Abstract: Described is a technology for identifying sample data items (e.g., documents corresponding to query-URL pairs) having the greatest likelihood of being mislabeled when previously judged, and selecting those data items for re-judging. In one aspect, lambda gradient scores (information associated with ranked sample data items that indicates a relative direction and how “strongly” to move each data item for lowering a ranking cost) are summed for pairs of sample data items to compute re-judgment scores for each of those sample data items. The re-judgment scores indicate a relative likelihood of mislabeling. Once the selected sample data items are re-judged, a new training set is available, whereby a new ranker may be trained.
-
Publication No.: US20080177717A1
Publication date: 2008-07-24
Application No.: US11625076
Filing date: 2007-01-19
Applicant: Girish Kumar, Bhuvan Middha, Gaurav Sareen, Janine Crumb, Jason Douglas, Silviu Petru Cucerzan
Inventor: Girish Kumar, Bhuvan Middha, Gaurav Sareen, Janine Crumb, Jason Douglas, Silviu Petru Cucerzan
IPC classification: G06F7/00
CPC classification: G06F17/3064, Y10S707/99933
Abstract: Computerized methods and systems for generating a suggested query list with suggested search terms displayed as highlighted text utilizing a user-defined query are provided. Query search terms are received by a user-interface display. Upon inputting query search terms, the user-interface automatically generates a suggested query list. The suggested query list is associated with the query search term and the suggested query list is comprised of at least one suggested search term. A query suggestion architecture determines if the query search term and the suggested search term are a match, and if so, highlights the suggested search term that is not a match. The user interface displays the highlighted terms to assist in refining a search. The present invention further provides a stemming algorithm that extracts the root form of the query search term.
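Illustrative note: the abstract describes comparing query terms against suggested terms (after reducing them to a root form) and highlighting the suggested terms that do not match. The sketch below shows one way this could look at the word level. The suffix-stripping `naive_stem` is a deliberately crude stand-in and not the patent's stemming algorithm, and `highlight_suggestion` with its `mark` parameter is a hypothetical name chosen for this example.

```python
def naive_stem(word):
    """Crude suffix stripping; an assumed placeholder for a real stemmer."""
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def highlight_suggestion(query, suggestion, mark="**"):
    """Wrap suggestion words whose stem is absent from the query in `mark`."""
    query_stems = {naive_stem(w) for w in query.lower().split()}
    out = []
    for word in suggestion.split():
        if naive_stem(word.lower()) in query_stems:
            out.append(word)  # already covered by the query
        else:
            out.append(f"{mark}{word}{mark}")  # new term, so highlight it
    return " ".join(out)


print(highlight_suggestion("running shoes", "running shoe stores"))
# prints: running shoe **stores**
```

Only "stores" is highlighted, because "shoe" and "shoes" reduce to the same root under this simple stemmer.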
-
Publication No.: US20120215774A1
Publication date: 2012-08-23
Application No.: US13030541
Filing date: 2011-02-18
Applicant: THOMAS WILLIAM FINLEY, HERBERT DE MELO DUARTE, BHUVAN MIDDHA, DEHU QI, TANTON HOLT GIBBS, SAMBAVI MUTHUKRISHNAN
Inventor: THOMAS WILLIAM FINLEY, HERBERT DE MELO DUARTE, BHUVAN MIDDHA, DEHU QI, TANTON HOLT GIBBS, SAMBAVI MUTHUKRISHNAN
IPC classification: G06F17/30
CPC classification: G01P15/034, G01P15/0802, G06F17/30663, G06F17/30864
Abstract: Methods, systems, and computer-readable media for a method of propagating signals across a web graph. A signal describes a document or otherwise provides useful information about a document in a web graph. A web graph is a collection of documents that are related to one another through links, such as hyperlinks. The signals are propagated in the sense that information from the related pages is associated with the target page even though the information may not be directly found in the target page. This information may then be used by a search engine to determine that a particular page is relevant to a search query.
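Illustrative note: the abstract describes associating information from linked pages with a target page. The sketch below shows a single-hop version of that idea. The data layout (a dict of outgoing links and a dict of weighted signals per page), the `decay` factor, and the name `propagate_signals` are all assumptions for this example. The abstract does not specify how signals are weighted or how many hops are followed.

```python
def propagate_signals(links, signals, decay=0.5):
    """Copy each page's signals to the pages it links to, scaled by `decay`.

    links:   {source_page: [target_page, ...]}
    signals: {page: {signal_name: weight}}
    Returns a new dict; the inputs are left unchanged.
    """
    # Start from each page's own signals.
    result = {page: dict(sig) for page, sig in signals.items()}
    for source, targets in links.items():
        for target in targets:
            target_signals = result.setdefault(target, {})
            # Use the original signals so propagation stays one hop deep.
            for name, weight in signals.get(source, {}).items():
                target_signals[name] = target_signals.get(name, 0.0) + decay * weight
    return result


links = {"a": ["b"]}
signals = {"a": {"recipes": 1.0}, "b": {"pasta": 1.0}}
print(propagate_signals(links, signals))
# prints: {'a': {'recipes': 1.0}, 'b': {'pasta': 1.0, 'recipes': 0.5}}
```

Page "b" now carries the "recipes" signal even though that information appears only on the page linking to it.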
-
Publication No.: US07680778B2
Publication date: 2010-03-16
Application No.: US11625076
Filing date: 2007-01-19
Applicant: Bhuvan Middha, Girish Kumar, Gaurav Sareen, Janine Crumb, Jason Douglas, Silviu Petru Cucerzan
Inventor: Bhuvan Middha, Girish Kumar, Gaurav Sareen, Janine Crumb, Jason Douglas, Silviu Petru Cucerzan
IPC classification: G06F7/00
CPC classification: G06F17/3064, Y10S707/99933
Abstract: Computerized methods and systems for generating a suggested query list with suggested search terms displayed as highlighted text utilizing a user-defined query are provided. Query search terms are received by a user-interface display. Upon inputting query search terms, the user-interface automatically generates a suggested query list. The suggested query list is associated with the query search term and the suggested query list is comprised of at least one suggested search term. A query suggestion architecture determines if the query search term and the suggested search term are a match, and if so, highlights the suggested search term that is not a match. The user interface displays the highlighted terms to assist in refining a search. The present invention further provides a stemming algorithm that extracts the root form of the query search term.
-
Publication No.: US08880517B2
Publication date: 2014-11-04
Application No.: US13030541
Filing date: 2011-02-18
Applicant: Thomas William Finley, Herbert De Melo Duarte, Bhuvan Middha, Dehu Qi, Tanton Holt Gibbs, Sambavi Muthukrishnan
Inventor: Thomas William Finley, Herbert De Melo Duarte, Bhuvan Middha, Dehu Qi, Tanton Holt Gibbs, Sambavi Muthukrishnan
IPC classification: G06F17/30
CPC classification: G01P15/034, G01P15/0802, G06F17/30663, G06F17/30864
Abstract: Methods, systems, and computer-readable media for a method of propagating signals across a web graph. A signal describes a document or otherwise provides useful information about a document in a web graph. A web graph is a collection of documents that are related to one another through links, such as hyperlinks. The signals are propagated in the sense that information from the related pages is associated with the target page even though the information may not be directly found in the target page. This information may then be used by a search engine to determine that a particular page is relevant to a search query.
-
Publication No.: US20130238608A1
Publication date: 2013-09-12
Application No.: US13413651
Filing date: 2012-03-07
IPC classification: G06F17/30
CPC classification: G06F16/334
Abstract: Architecture that generates signals/features that capture the match between intent of a query and category of documents. For example, for a query intent related to “autos”, documents that belong to categories related to “Autos” receive a higher score than documents of a “computers” category. The architecture can be applied to a search ecosystem where query intent classification and document category classifier are available, learns the mapping between query intent and document category, and introduces category-match features to a ranking algorithm, thereby improving search result relevance. The architecture learns the mapping between two existing and different taxonomies to create a category match signal from which the ranking algorithm can learn. Moreover, architecture adapts to a complex ecosystem where different taxonomies on the query side and document side exist through learning a mapping score between at least two taxonomies.
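Illustrative note: the abstract describes learning a mapping score between a query-intent taxonomy and a document-category taxonomy, then exposing that score as a ranking feature. The sketch below assumes the mapping is estimated as the relative frequency of each document category among relevant results for a given intent. That estimator, the training-pair format, and the names `learn_mapping` and `category_match_feature` are assumptions for this example and are not stated in the abstract.

```python
from collections import Counter, defaultdict


def learn_mapping(relevant_pairs):
    """Estimate a mapping score for each (intent, category) pair.

    relevant_pairs: iterable of (query_intent, document_category) tuples
    taken from query-document pairs judged relevant (assumed input).
    """
    counts = defaultdict(Counter)
    for intent, category in relevant_pairs:
        counts[intent][category] += 1
    mapping = {}
    for intent, category_counts in counts.items():
        total = sum(category_counts.values())
        mapping[intent] = {c: n / total for c, n in category_counts.items()}
    return mapping


def category_match_feature(mapping, intent, category):
    """Feature value passed to the ranker; 0.0 for unseen combinations."""
    return mapping.get(intent, {}).get(category, 0.0)


pairs = [("autos", "Autos")] * 3 + [("autos", "Computers")]
mapping = learn_mapping(pairs)
print(category_match_feature(mapping, "autos", "Autos"))      # prints: 0.75
print(category_match_feature(mapping, "autos", "Computers"))  # prints: 0.25
```

With this toy data, an "Autos" document scores higher than a "Computers" document for an "autos" query, which mirrors the example in the abstract.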