RANKING SEARCH RESULTS USING AUTHOR EXTRACTION
    2.
    Invention application
    Status: Pending (published)

    Publication No.: US20090182723A1

    Publication date: 2009-07-16

    Application No.: US11972613

    Filing date: 2008-01-10

    IPC classification: G06F17/30

    CPC classification: G06F16/38

    Abstract: Architecture that extracts author information from general documents and uses the author information for search results ranking. The architecture performs automatic author value extraction and makes the extracted value available at index time for subsequent use at query processing and results ranking. Machine learning (e.g., a perceptron algorithm) is employed, with a set of input features for the perceptron algorithm utilized for author value extraction. The extracted author value is converted into a feature for input to a ranking function that generates a ranking score for each document. The input features can also be weighted according to weighting criteria.
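
The pipeline described in this abstract (a perceptron picks out the author value, which then becomes a ranking feature) can be sketched roughly as follows. This is only an illustration under stated assumptions: the three input features, the training samples, and the helper names `train_perceptron`, `predict`, and `author_match_feature` are hypothetical and do not come from the patent, which does not list its features in the abstract.

```python
from typing import List, Sequence, Tuple

Sample = Tuple[List[float], int]  # (feature vector, label); label is +1 or -1


def train_perceptron(samples: Sequence[Sample], epochs: int = 10) -> Tuple[List[float], float]:
    """Classic perceptron rule: adjust weights only on misclassified samples."""
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in samples:
            score = sum(w * xi for w, xi in zip(weights, x)) + bias
            if y * score <= 0:  # wrong side of (or on) the decision boundary
                weights = [w + y * xi for w, xi in zip(weights, x)]
                bias += y
    return weights, bias


def predict(weights: List[float], bias: float, x: List[float]) -> int:
    """Return +1 if the candidate fragment is classified as an author value."""
    return 1 if sum(w * xi for w, xi in zip(weights, x)) + bias > 0 else -1


def author_match_feature(query: str, author: str) -> float:
    """Hypothetical ranking feature: 1.0 if any query term appears in the author value."""
    return 1.0 if set(query.lower().split()) & set(author.lower().split()) else 0.0


# Assumed features per candidate text fragment (not from the patent):
# [preceded by "by"/"author", looks like a capitalized name, appears near the top]
SAMPLES: List[Sample] = [
    ([1.0, 1.0, 1.0], 1),
    ([1.0, 1.0, 0.0], 1),
    ([0.0, 1.0, 0.0], -1),
    ([0.0, 0.0, 1.0], -1),
    ([0.0, 0.0, 0.0], -1),
]
WEIGHTS, BIAS = train_perceptron(SAMPLES)
```

At index time, a fragment accepted by `predict` would be stored as the document's author value; at query time, `author_match_feature` would be one of several inputs to the ranking function.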

    Enterprise Search Method and System
    4.
    Invention application
    Status: In force

    Publication No.: US20100228711A1

    Publication date: 2010-09-09

    Application No.: US12391484

    Filing date: 2009-02-24

    IPC classification: G06F7/06 G06F17/30 G06F3/048

    Abstract: A system and method for enterprise search includes one or more computer-readable media storing computer-executable instructions that, when executed on one or more processors, perform acts including extracting one or more of term data, personal data and metadata from one or more predetermined resources; retrieving a set of information derived from the extracted term data, personal data and metadata responsive to a query; and receiving feedback responsive to the set of information, the feedback augmenting at least one of the one or more predetermined resources.

    Mining and Conveying Social Relationships
    5.
    Invention application
    Status: Pending (published)

    Publication No.: US20110078188A1

    Publication date: 2011-03-31

    Application No.: US12568622

    Filing date: 2009-09-28

    IPC classification: G06F17/30 G06F3/00

    CPC classification: G06Q30/02 G06Q50/01

    Abstract: Techniques and tools described herein mine social information from a source and store the social information in a database. Responsive to a search object, the techniques search the stored social information and determine social relationships. The techniques further provide, via a graphical user interface, the social relationships determined from the social information stored in the database. In several embodiments, the techniques enable social relationship feedback.

    Search results ranking using editing distance and document information
    6.
    Granted invention patent
    Status: In force

    Publication No.: US08812493B2

    Publication date: 2014-08-19

    Application No.: US12101951

    Filing date: 2008-04-11

    IPC classification: G06F7/00

    CPC classification: G06F17/2211 G06F17/30864

    Abstract: Architecture for extracting document information from documents received as search results based on a query string, and computing an edit distance between the data string and the query string. The edit distance is employed in determining relevance of the document as part of result ranking by detecting near-matches of a whole query or part of the query. The edit distance evaluates how close the query string is to a given data stream that includes document information such as TAUC (title, anchor text, URL, clicks) information, etc. The architecture includes index-time splitting of compound terms in the URL to allow more effective discovery of query terms. Additionally, index-time filtering of anchor text is utilized to find the top N anchors of one or more of the document results. The TAUC information can be input to a neural network (e.g., 2-layer) to improve relevance metrics for ranking the search results.
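
The edit-distance comparison between a query string and a document data string can be illustrated with a short sketch. It assumes the standard Levenshtein distance (insert, delete, substitute, each at cost 1) computed over whitespace-separated terms; the abstract does not say which distance variant or cost model the patent uses, and the names `edit_distance` and `near_match_score` are hypothetical.

```python
from typing import Sequence


def edit_distance(a: Sequence, b: Sequence) -> int:
    """Levenshtein distance between two sequences (characters or terms)."""
    prev = list(range(len(b) + 1))  # distance from empty prefix of a to each prefix of b
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(
                prev[j] + 1,             # delete x
                cur[j - 1] + 1,          # insert y
                prev[j - 1] + (x != y),  # substitute, or match at no cost
            ))
        prev = cur
    return prev[-1]


def near_match_score(query: str, data: str) -> float:
    """Assumed normalization: 1.0 for identical term sequences, 0.0 for no overlap in alignment."""
    q, d = query.lower().split(), data.lower().split()
    longest = max(len(q), len(d), 1)
    return 1.0 - edit_distance(q, d) / longest


# Example: compare a query against a (hypothetical) document title.
score = near_match_score("search ranking", "Search Results Ranking")
```

A score like this could be computed per TAUC field (title, anchor text, URL, clicks) and passed on as input to the ranking model.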

    Ranking search results using feature extraction
    7.
    Granted invention patent
    Status: Lapsed

    Publication No.: US07716198B2

    Publication date: 2010-05-11

    Application No.: US11019091

    Filing date: 2004-12-21

    IPC classification: G06F17/30

    CPC classification: G06F17/30684

    Abstract: Methods and computer-readable media are provided for ranking search results using feature extraction data. Each of the results of a search engine query is parsed to obtain data, such as text, formatting information, metadata, and the like. The text, the formatting information and the metadata are passed through a feature extraction application to extract data that may be used to improve a ranking of the search results based on relevance of the search results to the search engine query. The feature extraction application extracts features, such as titles, found in any of the text based on formatting information applied to or associated with the text. The extracted titles, the text, the formatting information and the metadata for any given search results item are processed according to a field weighting application for determining a ranking of the given search results item. Ranked search results items may then be displayed according to ranking.
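
Field weighting, as mentioned in this abstract, can be pictured with a minimal sketch: each field of a result (extracted title, metadata, body text) contributes query-term hits scaled by a per-field weight. The weights, field names, and the function `field_weighted_score` are assumptions made for illustration; the abstract does not disclose the actual weighting function.

```python
from typing import Dict, List, Tuple

# Hypothetical weights: a hit in the extracted title counts more than one in the body.
FIELD_WEIGHTS: Dict[str, float] = {"extracted_title": 3.0, "metadata": 2.0, "body": 1.0}


def field_weighted_score(query: str, fields: Dict[str, str],
                         weights: Dict[str, float] = FIELD_WEIGHTS) -> float:
    """Sum of (field weight x number of query-term occurrences in that field)."""
    terms = query.lower().split()
    score = 0.0
    for name, text in fields.items():
        tokens = text.lower().split()
        hits = sum(tokens.count(term) for term in terms)
        score += weights.get(name, 0.0) * hits
    return score


def rank(query: str, results: Dict[str, Dict[str, str]]) -> List[Tuple[str, float]]:
    """Order result ids by descending field-weighted score."""
    scored = [(doc_id, field_weighted_score(query, f)) for doc_id, f in results.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Example with two made-up search results.
RESULTS = {
    "doc1": {"extracted_title": "Quarterly sales report", "body": "the report covers sales"},
    "doc2": {"extracted_title": "Untitled", "body": "sales report sales"},
}
RANKED = rank("sales report", RESULTS)
```

Here the document whose extracted title contains the query terms ranks first even though the other document repeats a term more often in its body, which is the effect field weighting is meant to produce.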

    Ranking search results using feature extraction
    8.
    Invention application
    Status: Lapsed

    Publication No.: US20060136411A1

    Publication date: 2006-06-22

    Application No.: US11019091

    Filing date: 2004-12-21

    IPC classification: G06F17/30

    CPC classification: G06F17/30684

    Abstract: Methods and computer-readable media are provided for ranking search results using feature extraction data. Each of the results of a search engine query is parsed to obtain data, such as text, formatting information, metadata, and the like. The text, the formatting information and the metadata are passed through a feature extraction application to extract data that may be used to improve a ranking of the search results based on relevance of the search results to the search engine query. The feature extraction application extracts features, such as titles, found in any of the text based on formatting information applied to or associated with the text. The extracted titles, the text, the formatting information and the metadata for any given search results item are processed according to a field weighting application for determining a ranking of the given search results item. Ranked search results items may then be displayed according to ranking.

    SEARCH RESULTS RANKING USING EDITING DISTANCE AND DOCUMENT INFORMATION
    10.
    Invention application
    Status: In force

    Publication No.: US20090259651A1

    Publication date: 2009-10-15

    Application No.: US12101951

    Filing date: 2008-04-11

    IPC classification: G06F17/30

    CPC classification: G06F17/2211 G06F17/30864

    Abstract: Architecture for extracting document information from documents received as search results based on a query string, and computing an edit distance between the data string and the query string. The edit distance is employed in determining relevance of the document as part of result ranking by detecting near-matches of a whole query or part of the query. The edit distance evaluates how close the query string is to a given data stream that includes document information such as TAUC (title, anchor text, URL, clicks) information, etc. The architecture includes index-time splitting of compound terms in the URL to allow more effective discovery of query terms. Additionally, index-time filtering of anchor text is utilized to find the top N anchors of one or more of the document results. The TAUC information can be input to a neural network (e.g., 2-layer) to improve relevance metrics for ranking the search results.
