Link analysis for enterprise environment
    11.
    Granted invention patent
    Link analysis for enterprise environment (status: in force)

    Publication No.: US08433712B2

    Publication date: 2013-04-30

    Application No.: US11680548

    Filing date: 2007-02-28

    IPC classification: G06F17/00 G06F7/00

    CPC classification: G06F17/30979 G06F21/41

    Abstract: A flexible and extensible architecture allows for secure searching across an enterprise. Such an architecture can provide a simple Internet-like search experience to users searching secure content inside (and outside) the enterprise. The architecture allows for the crawling and searching of a variety of sources across an enterprise, regardless of whether any of these sources conform to a conventional user role model. The architecture further allows security attributes to be submitted at query time, for example, in order to provide real-time secure access to enterprise resources. The user query can also be transformed to support dynamic querying, which yields a more current result list than can be obtained with static queries.
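
The abstract above mentions security attributes submitted at query time. A minimal sketch of that idea, assuming a hypothetical in-memory index in which each document carries a set of allowed groups (the `Document` class, `secure_search` function, and group names are illustrative and not taken from the patent):

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Document:
    """An indexed item plus the groups allowed to read it (hypothetical model)."""
    title: str
    allowed_groups: frozenset = field(default_factory=frozenset)


def secure_search(index, term, user_groups):
    """Return titles that match `term` and are readable by `user_groups`.

    `user_groups` arrives with each query instead of being stored in the
    index, so a change in group membership takes effect on the next search.
    """
    groups = frozenset(user_groups)
    return [
        doc.title
        for doc in index
        if term.lower() in doc.title.lower() and doc.allowed_groups & groups
    ]


index = [
    Document("Payroll report", frozenset({"hr"})),
    Document("Payroll FAQ", frozenset({"hr", "staff"})),
    Document("Build report", frozenset({"engineering"})),
]

print(secure_search(index, "payroll", ["staff"]))  # ['Payroll FAQ']
```

Checking access per query avoids re-indexing when permissions change, at the cost of doing the check on every search.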

    Changing ranking algorithms based on customer settings
    12.
    Granted invention patent
    Changing ranking algorithms based on customer settings (status: in force)

    Publication No.: US08412717B2

    Publication date: 2013-04-02

    Application No.: US13169688

    Filing date: 2011-06-27

    IPC classification: G06F7/00 G06F17/30

    CPC classification: G06F17/30867 G06F17/30699

    Abstract: Search term ranking algorithms can be generated and updated based on customer settings, such as where a ranking algorithm is modeled as a combination function of different ranking factors. An end user of a search system provides personalized preferences for weighted attributes, either generally or for a single instance of a query. The user can also indicate the relative importance of one or more ranking factors by assigning different weights to the factors. Ranking factors can specify document attributes, such as document title, document body, and document page rank. Based on the attribute weights and the received user query, a ranking algorithm function produces a relevance value for each document that reflects the user preferences and personalization configurations.
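
One way to read "a combination function of different ranking factors" is as a weighted sum. The sketch below assumes that reading; the factor definitions, the `score` and `rank` helpers, and the sample documents are all hypothetical and are not the patented algorithm.

```python
def score(doc, query, weights):
    """Weighted sum of per-factor scores for one document (assumed model)."""
    terms = query.lower().split()
    factors = {
        # Number of query terms found in each text attribute.
        "title": sum(t in doc["title"].lower() for t in terms),
        "body": sum(t in doc["body"].lower() for t in terms),
        # Precomputed link-based score stored with the document.
        "page_rank": doc["page_rank"],
    }
    return sum(weights.get(name, 0.0) * value for name, value in factors.items())


def rank(docs, query, weights):
    """Order documents by descending score under the given weights."""
    return sorted(docs, key=lambda d: score(d, query, weights), reverse=True)


docs = [
    {"id": "a", "title": "Expense policy", "body": "How to file travel claims.", "page_rank": 0.2},
    {"id": "b", "title": "Travel guide", "body": "Expense limits for travel.", "page_rank": 0.9},
]

title_heavy = {"title": 3.0, "body": 1.0, "page_rank": 0.5}
link_heavy = {"title": 0.5, "body": 0.5, "page_rank": 5.0}

print([d["id"] for d in rank(docs, "expense", title_heavy)])  # ['a', 'b']
print([d["id"] for d in rank(docs, "expense", link_heavy)])   # ['b', 'a']
```

The same query yields a different order under each weight set, which is the effect of per-user or per-query weight preferences.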

    Crawling secure data sources
    13.
    Granted invention patent
    Crawling secure data sources (status: in force)

    Publication No.: US08601028B2

    Publication date: 2013-12-03

    Application No.: US13536488

    Filing date: 2012-06-28

    IPC classification: G06F17/30

    Abstract: It is desirable to provide a secure search mechanism for searching over any and all content, such as across an enterprise. A secure search, however, requires access to the secure content repositories holding the data to be searched. In some cases the credentials required to crawl a repository may be extremely sensitive, or the user may be reluctant or unwilling to store user identification information in memory or on disk for any longer than is absolutely necessary. An approach is provided that allows a user or an administrator to supply security credentials that are stored and used only during a crawl and are erased from the system when the crawl is complete.
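
The "store only during a crawl, then erase" behaviour can be sketched with a context manager. This is an illustration under assumptions: `temporary_credentials`, `crawl`, and the `store` dictionary are hypothetical names, and the overwrite is best effort only, because the original Python string passed in as `password` may still exist elsewhere in memory.

```python
from contextlib import contextmanager


@contextmanager
def temporary_credentials(store, source, username, password):
    """Keep credentials for `source` in `store` only while a crawl runs."""
    secret = bytearray(password, "utf-8")  # mutable, so it can be overwritten
    store[source] = (username, secret)
    try:
        yield store[source]
    finally:
        # Runs on success and on failure: zero the bytes, then drop the entry.
        for i in range(len(secret)):
            secret[i] = 0
        del store[source]


def crawl(source, credentials):
    """Placeholder for a real crawl; it only reports what it was given."""
    username, secret = credentials
    return f"crawled {source} as {username} ({len(secret)}-byte secret)"


store = {}
with temporary_credentials(store, "hr-db", "crawler", "s3cret") as creds:
    result = crawl("hr-db", creds)

print(result)  # crawled hr-db as crawler (6-byte secret)
print(store)   # {}
```

Placing the cleanup in `finally` means the credentials are removed even if the crawl raises an exception.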

    INDEXING SECURE ENTERPRISE DOCUMENTS USING GENERIC REFERENCES
    14.
    Invention patent application
    INDEXING SECURE ENTERPRISE DOCUMENTS USING GENERIC REFERENCES (status: in force)

    Publication No.: US20130173582A1

    Publication date: 2013-07-04

    Application No.: US13539622

    Filing date: 2012-07-02

    IPC classification: G06F17/30

    Abstract: A web crawler indexes documents, recording information about document contents as well as metadata such as a URL. However, some applications rely on URLs that change frequently or that are constructed to include user information so that the content retrieved is customized to the user. An approach is provided for storing generic URLs in an index at crawl time and customizing them for the user at search time. A callback mechanism may be used to dynamically transform the generic URL into a URL that is specific to the user issuing the query and/or includes current information that may change frequently. In this way, when the query or search results are returned to the user, the user receives links that are active and valid for that particular user, directing the user to the appropriate site, application, etc., without requiring continuous updating of a very large index.
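
The generic-URL idea can be illustrated by storing a URL template at crawl time and filling it in through a callback at search time. The placeholder syntax (`${user}`, `${week}`), the `fill_placeholders` callback, and the example host are assumptions made for this sketch; the abstract does not specify a format.

```python
from string import Template
from urllib.parse import quote

# Stored once at crawl time: a generic URL with placeholders, not a per-user URL.
index = {
    "timesheet": "https://apps.example.com/timesheet?owner=${user}&week=${week}",
}


def fill_placeholders(generic_url, user, week):
    """Callback that turns a generic URL into one specific to the current user."""
    return Template(generic_url).substitute(user=quote(user), week=quote(week))


def search(index, term, callback, **context):
    """Return resolved URLs for entries whose key contains `term`."""
    return [callback(url, **context) for key, url in index.items() if term in key]


print(search(index, "timesheet", fill_placeholders, user="a.lee", week="2013-W27"))
# ['https://apps.example.com/timesheet?owner=a.lee&week=2013-W27']
```

Because only the template is indexed, a change in user or date never requires touching the index; the callback supplies current values on each query.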

    CRAWLING SECURE DATA SOURCES
    15.
    Invention patent application
    CRAWLING SECURE DATA SOURCES (status: in force)

    Publication No.: US20120272304A1

    Publication date: 2012-10-25

    Application No.: US13536488

    Filing date: 2012-06-28

    IPC classification: G06F21/00

    Abstract: It is desirable to provide a secure search mechanism for searching over any and all content, such as across an enterprise. A secure search, however, requires access to the secure content repositories holding the data to be searched. In some cases the credentials required to crawl a repository may be extremely sensitive, or the user may be reluctant or unwilling to store user identification information in memory or on disk for any longer than is absolutely necessary. An approach is provided that allows a user or an administrator to supply security credentials that are stored and used only during a crawl and are erased from the system when the crawl is complete.

    Re-ranking search results from an enterprise system
    16.
    Granted invention patent
    Re-ranking search results from an enterprise system (status: in force)

    Publication No.: US08239414B2

    Publication date: 2012-08-07

    Application No.: US13110461

    Filing date: 2011-05-18

    IPC classification: G06F17/00

    Abstract: A flexible and extensible architecture allows for secure searching across an enterprise. Such an architecture can provide a simple Internet-like search experience to users searching secure content inside (and outside) the enterprise. The architecture allows for the crawling and searching of a variety of sources across an enterprise, regardless of whether any of these sources conform to a conventional user role model. The architecture further allows security, recency, or other attributes to be submitted at query time, for example, in order to re-rank query results from enterprise resources. The user query can also be transformed to support dynamic querying, which yields a more current result list than can be obtained with static queries.
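
As a rough illustration of re-ranking with a recency attribute supplied at query time, the sketch below adds a freshness bonus to a base relevance score. The decay formula, the 30-day scale, and the `rerank` function are assumptions chosen for the example and do not come from the patent.

```python
from datetime import date


def rerank(results, today, recency_weight):
    """Re-order (title, base_score, last_modified) tuples using a recency bonus."""
    def combined(item):
        _title, base_score, modified = item
        age_days = (today - modified).days
        freshness = 1.0 / (1.0 + age_days / 30.0)  # 1.0 today, decays with age
        return base_score + recency_weight * freshness

    return sorted(results, key=combined, reverse=True)


results = [
    ("Old handbook", 0.9, date(2010, 1, 1)),
    ("New handbook", 0.7, date(2012, 7, 1)),
]
today = date(2012, 8, 1)

print([r[0] for r in rerank(results, today, recency_weight=0.0)])
# ['Old handbook', 'New handbook']
print([r[0] for r in rerank(results, today, recency_weight=1.0)])
# ['New handbook', 'Old handbook']
```

With a weight of zero the original order is kept; raising the weight lets a newer document overtake an older one with a higher base score.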

    Link Analysis for Enterprise Environment
    17.
    Invention patent application
    Link Analysis for Enterprise Environment (status: in force)

    Publication No.: US20070208734A1

    Publication date: 2007-09-06

    Application No.: US11680548

    Filing date: 2007-02-28

    IPC classification: G06F17/30

    CPC classification: G06F17/30979 G06F21/41

    Abstract: A flexible and extensible architecture allows for secure searching across an enterprise. Such an architecture can provide a simple Internet-like search experience to users searching secure content inside (and outside) the enterprise. The architecture allows for the crawling and searching of a variety of sources across an enterprise, regardless of whether any of these sources conform to a conventional user role model. The architecture further allows security attributes to be submitted at query time, for example, in order to provide real-time secure access to enterprise resources. The user query can also be transformed to support dynamic querying, which yields a more current result list than can be obtained with static queries.

    Extensible mechanism for detecting duplicate search items
    18.
    Invention patent application
    Extensible mechanism for detecting duplicate search items (status: in force)

    Publication No.: US20080222063A1

    Publication date: 2008-09-11

    Application No.: US11714418

    Filing date: 2007-03-06

    IPC classification: G06F15/18

    CPC classification: H04L51/12

    Abstract: Systems, methods, and other embodiments associated with identifying and selectively deleting duplicate search results are described. One example system embodiment includes logic to receive an identity indicator from a search logic. The identity indicator is associated with a search item that the search logic determines to be relevant to a search request. The example system may also include logic to determine whether the search result associated with the identity indicator is a duplicate result based on comparing the identity indicator to another identity indicator associated with another search result.
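
Duplicate detection by comparing identity indicators can be sketched with a pluggable identity function, which is one plausible reading of "extensible". The two identity functions here (`content_hash`, `url_identity`) and the sample results are hypothetical.

```python
import hashlib


def content_hash(item):
    """Identity indicator from normalized text: case and spacing are ignored."""
    normalized = " ".join(item["text"].lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def url_identity(item):
    """Alternative identity indicator based only on the URL."""
    return item["url"].rstrip("/").lower()


def drop_duplicates(results, identity=content_hash):
    """Keep the first result for each identity indicator, preserving order."""
    seen = set()
    unique = []
    for item in results:
        key = identity(item)
        if key in seen:
            continue  # same indicator as an earlier result: treat as duplicate
        seen.add(key)
        unique.append(item)
    return unique


results = [
    {"url": "https://intranet.example.com/a", "text": "Quarterly  results"},
    {"url": "https://mirror.example.com/a", "text": "quarterly results"},
    {"url": "https://intranet.example.com/b", "text": "Annual results"},
]

print(len(drop_duplicates(results)))                         # 2
print(len(drop_duplicates(results, identity=url_identity)))  # 3
```

Swapping the identity function changes what counts as a duplicate without touching the filtering loop.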

    Extensible mechanism for detecting duplicate search items
    19.
    Granted invention patent
    Extensible mechanism for detecting duplicate search items (status: in force)

    Publication No.: US07756798B2

    Publication date: 2010-07-13

    Application No.: US11714418

    Filing date: 2007-03-06

    IPC classification: G06N5/00

    CPC classification: H04L51/12

    Abstract: Systems, methods, and other embodiments associated with identifying and selectively deleting duplicate search results are described. One example system embodiment includes logic to receive an identity indicator from a search logic. The identity indicator is associated with a search item that the search logic determines to be relevant to a search request. The example system may also include logic to determine whether the search result associated with the identity indicator is a duplicate result based on comparing the identity indicator to another identity indicator associated with another search result.

    Document summarization
    20.
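    Invention patent application
    Document summarization (status: in force)

    Publication No.: US20080109399A1

    Publication date: 2008-05-08

    Application No.: US11647871

    Filing date: 2006-12-29

    IPC classification: G06F17/30

    CPC classification: G06F17/30719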

    Abstract: Systems, methods, and other embodiments associated with automatically summarizing a document are described. One method embodiment includes computing term scores for members of a set of terms in a document to be summarized and computing sentence scores for sentences in a set of sentences in the document. The method embodiment also includes computing a set of entries for a term-sentence matrix that relates terms to sentences. The method embodiment also includes computing a dominant topic for the document and simultaneously ranking the set of terms and the set of sentences based on the dominant topic. The method embodiment provides one or more summarization items selected from the set of terms and/or the set of sentences.
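
The abstract does not say how the dominant topic is computed. The sketch below substitutes a common stand-in, power iteration on a term-sentence count matrix, which scores terms and sentences together by mutual reinforcement. The tokenization, the count-based matrix entries, and the `summarize` function are assumptions for illustration only.

```python
import re


def summarize(text, n_terms=3, n_sentences=1, iterations=50):
    """Rank terms and sentences jointly using a term-sentence matrix."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    tokenized = [re.findall(r"[a-z]+", s.lower()) for s in sentences]
    terms = sorted({t for words in tokenized for t in words})
    # matrix[i][j] = number of times term i occurs in sentence j.
    matrix = [[words.count(t) for words in tokenized] for t in terms]

    def score_terms(sent_scores):
        return [sum(a * s for a, s in zip(row, sent_scores)) for row in matrix]

    sent_scores = [1.0] * len(sentences)
    for _ in range(iterations):
        term_scores = score_terms(sent_scores)
        # A sentence scores highly when it contains high-scoring terms.
        sent_scores = [
            sum(matrix[i][j] * term_scores[i] for i in range(len(terms)))
            for j in range(len(sentences))
        ]
        norm = sum(s * s for s in sent_scores) ** 0.5 or 1.0
        sent_scores = [s / norm for s in sent_scores]

    term_scores = score_terms(sent_scores)
    ranked_terms = sorted(zip(term_scores, terms), key=lambda p: (-p[0], p[1]))
    ranked_sents = sorted(range(len(sentences)), key=lambda j: -sent_scores[j])
    return (
        [t for _, t in ranked_terms[:n_terms]],
        [sentences[j] for j in ranked_sents[:n_sentences]],
    )


text = (
    "Search engines index documents. "
    "Search engines rank documents by relevance. "
    "Cats sleep."
)
top_terms, top_sentences = summarize(text)
print(top_terms)
print(top_sentences)
```

In this example the unrelated sentence about cats shares no terms with the others, so its score decays toward zero and the summary is drawn from the search-related sentences.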
