METHOD AND SYSTEM FOR DETECTING WHEN AN OUTGOING COMMUNICATION CONTAINS CERTAIN CONTENT
    61.
    Patent application
    Status: in force

    Publication No.: US20090313706A1

    Publication date: 2009-12-17

    Application No.: US12510186

    Filing date: 2009-07-27

    Abstract: A method and system for detecting whether an outgoing communication contains confidential information or other target information is provided. The detection system is provided with a collection of documents that contain confidential information, referred to as “confidential documents.” When the detection system is provided with an outgoing communication, it compares the content of the outgoing communication to the content of the confidential documents. If the outgoing communication contains confidential information, then the detection system may prevent the outgoing communication from being sent outside the organization. The detection system detects confidential information based on the similarity between the content of an outgoing communication and the content of confidential documents that are known to contain confidential information.
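
The similarity comparison described in this abstract can be sketched as follows. This is a minimal illustration, not the patented method: the abstract does not say how similarity is measured, so the bag-of-words cosine similarity, the function names (`term_vector`, `cosine_similarity`, `is_confidential`), the sample texts, and the 0.6 threshold are all assumptions made for the example.

```python
import re
from collections import Counter
from math import sqrt


def term_vector(text):
    """Lowercase the text and count how often each word occurs."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine of the angle between two word-count vectors (0.0 if either is empty)."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def is_confidential(message, confidential_docs, threshold=0.6):
    """Flag the message if it is similar enough to any known confidential document.

    The threshold is a hypothetical tuning parameter, not a value from the source.
    """
    message_vector = term_vector(message)
    return any(
        cosine_similarity(message_vector, term_vector(doc)) >= threshold
        for doc in confidential_docs
    )


# Hypothetical sample data.
CONFIDENTIAL_DOCS = ["The merger with Contoso closes in March at 42 dollars per share."]
LEAK = "FYI the merger with Contoso closes in March at 42 dollars per share"
HARMLESS = "Lunch is at noon in the cafeteria today"

if __name__ == "__main__":
    for outgoing in (LEAK, HARMLESS):
        action = "block" if is_confidential(outgoing, CONFIDENTIAL_DOCS) else "send"
        print(action, "->", outgoing)
```

A gateway built on this idea would call `is_confidential` before a message leaves the organization and hold the message when it returns `True`.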

    Query-based snippet clustering for search result grouping
    62.
    Granted patent
    Status: in force

    Publication No.: US07617176B2

    Publication date: 2009-11-10

    Application No.: US10889841

    Filing date: 2004-07-13

    IPC classification: G06F7/00

    Abstract: A clustering architecture that dynamically groups the search result documents into clusters labeled by phrases extracted from the search result snippets. Documents related to the same topic usually share a common vocabulary. The words are first clustered based on their co-occurrences and each cluster forms a potentially interesting topic. Keywords are chosen and then clustered by counting co-occurrences of pairs of keywords. Documents are assigned to relevant topics based on the feature vectors of the clusters.
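
The co-occurrence counting described in this abstract can be illustrated with a short sketch. It rests on several assumptions that are not in the source: keywords are simply non-stopword, non-query words; two keywords are linked when they co-occur in at least `min_cooccurrence` snippets; topics are the connected components of those links; and a snippet is assigned by plain keyword overlap rather than by the feature vectors the abstract mentions. All names and sample snippets are hypothetical.

```python
import re
from collections import Counter
from itertools import combinations

STOPWORDS = {"the", "a", "an", "of", "for", "and", "in", "to", "is", "on"}


def keywords_of(snippet, query=""):
    """Words of a snippet, minus stopwords and the query terms themselves."""
    excluded = STOPWORDS | set(query.lower().split())
    return {w for w in re.findall(r"[a-z]+", snippet.lower()) if w not in excluded}


def cluster_keywords(snippets, query="", min_cooccurrence=2):
    """Group keywords that co-occur in at least `min_cooccurrence` snippets."""
    pair_counts = Counter()
    for snippet in snippets:
        for pair in combinations(sorted(keywords_of(snippet, query)), 2):
            pair_counts[pair] += 1

    parent = {}  # union-find forest over linked keywords

    def find(word):
        parent.setdefault(word, word)
        while parent[word] != word:
            parent[word] = parent[parent[word]]  # path halving
            word = parent[word]
        return word

    for (first, second), count in pair_counts.items():
        if count >= min_cooccurrence:
            parent[find(first)] = find(second)

    clusters = {}
    for word in list(parent):
        clusters.setdefault(find(word), set()).add(word)
    return list(clusters.values())


def assign_to_topic(snippet, clusters, query=""):
    """Index of the cluster sharing the most keywords with the snippet, or None."""
    words = keywords_of(snippet, query)
    best, best_overlap = None, 0
    for index, cluster in enumerate(clusters):
        overlap = len(words & cluster)
        if overlap > best_overlap:
            best, best_overlap = index, overlap
    return best


# Hypothetical snippets returned for the query "jaguar".
SNIPPETS = [
    "jaguar car dealer prices",
    "jaguar car dealer reviews",
    "jaguar cat habitat rainforest",
    "jaguar cat habitat diet",
]

if __name__ == "__main__":
    topics = cluster_keywords(SNIPPETS, query="jaguar")
    print(topics)
    print(assign_to_topic("used jaguar car for sale", topics, query="jaguar"))
```

Excluding the query terms matters here: "jaguar" appears in every snippet and would otherwise link all keywords into a single topic.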

    METHOD AND SYSTEM FOR MINING INFORMATION BASED ON RELATIONSHIPS
    63.
    Patent application
    Status: in force

    Publication No.: US20090228452A1

    Publication date: 2009-09-10

    Application No.: US12406039

    Filing date: 2009-03-17

    IPC classification: G06F17/30

    Abstract: A method and system for identifying information about people is provided. The information system identifies groups of people that have relationships based on their relationships to documents or more generally to objects. The information system initially is provided with an indication of which people have which relationships to which documents. The information system then identifies clusters of people based on having a relationship to the same objects. The information system may also identify clusters of related objects associated with a cluster of people. When a user wants to identify information about a person, the user can provide the name of that person to the information system. The information system then can retrieve and display the names of the other people who are in the same cluster as the person.
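
One simple way to picture the people clustering in this abstract is shown below. The sketch assumes that two people belong to the same cluster whenever they are related to a common object, directly or through a chain of other people; the abstract does not specify the clustering rule, so this transitive grouping, the function names, and the sample data are all assumptions.

```python
from collections import defaultdict


def cluster_people(relationships):
    """Map each person to the set of people in their cluster.

    `relationships` is an iterable of (person, object_id) pairs. People who
    share an object are merged into one cluster, transitively.
    """
    people_by_object = defaultdict(set)
    for person, object_id in relationships:
        people_by_object[object_id].add(person)

    cluster_of = {}
    for members in people_by_object.values():
        merged = set(members)
        for person in members:
            merged |= cluster_of.get(person, set())
        for person in merged:
            cluster_of[person] = merged  # everyone points at the same merged set
    return cluster_of


def people_related_to(name, cluster_of):
    """Names of the other people in the same cluster as `name`."""
    return sorted(cluster_of.get(name, set()) - {name})


# Hypothetical (person, document) relationships, e.g. authorship.
RELATIONSHIPS = [
    ("Alice", "paper1"),
    ("Bob", "paper1"),
    ("Bob", "paper2"),
    ("Carol", "paper2"),
    ("Dave", "memo"),
]

if __name__ == "__main__":
    clusters = cluster_people(RELATIONSHIPS)
    print(people_related_to("Alice", clusters))
    print(people_related_to("Dave", clusters))
```

Looking up a name then corresponds to the retrieval step at the end of the abstract: the caller supplies a person and gets back the other members of that person's cluster.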

    Implicit links search enhancement system and method for search engines using implicit links generated by mining user access patterns
    64.
    Granted patent
    Status: in force

    Publication No.: US07584181B2

    Publication date: 2009-09-01

    Application No.: US10676794

    Filing date: 2003-09-30

    IPC classification: G06F7/00 G06F17/30

    Abstract: An implicit links enhancement system and method for search engines that generates implicit links obtained from mining user access logs to facilitate enhanced local searching of web sites and intranets. The implicit links search enhancement system and method includes extracting implicit links by mining users' access patterns and then using a modified link analysis algorithm to re-rank search results obtained from traditional search engines. More specifically, the implicit links search enhancement method includes extracting implicit links from a user access log, generating an implicit links graph from the extracted implicit links, and computing page rankings using the implicit links graph. The implicit links are extracted from the log using a two-item sequential pattern mining technique. Search results obtained from a search engine are re-ranked based on an implicit links analysis performed using an updated implicit links graph, a modified re-ranking formula, and at least one re-ranking technique.
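
The pipeline in this abstract (mine two-item sequences from access logs, build a graph, rank pages) can be sketched roughly as below. The sketch makes its own assumptions: a session is an ordered list of page names, an ordered pair counts once per session when the second page follows the first within `window` clicks, pairs below `min_support` are dropped, and ranking uses an ordinary weighted PageRank in place of the modified link analysis algorithm and re-ranking formula, which the abstract does not spell out. Names and data are hypothetical.

```python
from collections import Counter, defaultdict


def extract_implicit_links(sessions, window=2, min_support=2):
    """Return {(src, dst): support} for ordered page pairs seen in enough sessions."""
    support = Counter()
    for pages in sessions:
        seen = set()
        for i, src in enumerate(pages):
            for dst in pages[i + 1 : i + 1 + window]:
                if src != dst:
                    seen.add((src, dst))
        support.update(seen)  # each pair counts at most once per session
    return {pair: n for pair, n in support.items() if n >= min_support}


def page_rank(links, damping=0.85, iterations=50):
    """Weighted PageRank over the implicit links graph (power iteration)."""
    out = defaultdict(dict)
    nodes = set()
    for (src, dst), weight in links.items():
        out[src][dst] = weight
        nodes.update((src, dst))
    if not nodes:
        return {}
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        new = {n: (1.0 - damping) / len(nodes) for n in nodes}
        for src in nodes:
            targets = out.get(src)
            if targets:
                total = sum(targets.values())
                for dst, weight in targets.items():
                    new[dst] += damping * rank[src] * weight / total
            else:  # dangling page: spread its rank evenly
                for n in nodes:
                    new[n] += damping * rank[src] / len(nodes)
        rank = new
    return rank


# Hypothetical sessions reconstructed from an access log.
SESSIONS = [
    ["home", "products", "pricing"],
    ["home", "products", "support"],
    ["blog", "products", "pricing"],
]

if __name__ == "__main__":
    links = extract_implicit_links(SESSIONS)
    for page, score in sorted(page_rank(links).items(), key=lambda kv: -kv[1]):
        print(f"{page}: {score:.3f}")
```

A re-ranking step could then combine these scores with the search engine's original result order; how the patent combines them is not stated in the abstract, so that step is left out.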

    Method and system for classifying display pages using summaries
    65.
    Granted patent
    Status: in force

    Publication No.: US07392474B2

    Publication date: 2008-06-24

    Application No.: US10836319

    Filing date: 2004-04-30

    IPC classification: G06F17/00

    CPC classification: G06F17/30719 G06F17/30864

    Abstract: A method and system for classifying display pages based on automatically generated summaries of display pages. A web page classification system uses a web page summarization system to generate summaries of web pages. The summary of a web page may include the sentences of the web page that are most closely related to the primary topic of the web page. The summarization system may combine the benefits of multiple summarization techniques to identify the sentences of a web page that represent the primary topic of the web page. Once the summary is generated, the classification system may apply conventional classification techniques to the summary to classify the web page. The classification system may use conventional classification techniques such as a Naïve Bayesian classifier or a support vector machine to identify the classifications of a web page based on the summary generated by the summarization system.
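
The summarize-then-classify idea in this abstract can be sketched as follows. The summarizer here is a single word-frequency heuristic, standing in for the combination of summarization techniques the abstract refers to, and the classifier is a small multinomial Naive Bayes with add-one smoothing. The stopword list, class names, and training texts are hypothetical.

```python
import math
import re
from collections import Counter, defaultdict

STOPWORDS = {"the", "a", "an", "and", "of", "to", "with", "our", "in", "is", "here"}


def tokenize(text):
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


def summarize(page_text, max_sentences=2):
    """Keep the sentences whose words are most frequent on the page overall."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", page_text) if s.strip()]
    frequency = Counter(tokenize(page_text))

    def score(sentence):
        words = tokenize(sentence)
        return sum(frequency[w] for w in words) / len(words) if words else 0.0

    top = sorted(sentences, key=score, reverse=True)[:max_sentences]
    return " ".join(s for s in sentences if s in top)  # keep original order


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)
        self.doc_counts = Counter()
        self.vocabulary = set()

    def train(self, text, label):
        words = tokenize(text)
        self.word_counts[label].update(words)
        self.doc_counts[label] += 1
        self.vocabulary.update(words)

    def classify(self, text):
        words = tokenize(text)
        total_docs = sum(self.doc_counts.values())

        def log_probability(label):
            counts = self.word_counts[label]
            denominator = sum(counts.values()) + len(self.vocabulary)
            prior = math.log(self.doc_counts[label] / total_docs)
            return prior + sum(math.log((counts[w] + 1) / denominator) for w in words)

        return max(self.doc_counts, key=log_probability)


# Hypothetical page: two on-topic sentences followed by navigation noise.
PAGE = (
    "The team won the match with a late goal. "
    "The team celebrated the goal with fans. "
    "Click here to subscribe to our newsletter."
)

if __name__ == "__main__":
    classifier = NaiveBayes()
    classifier.train("football match goal team league", "sports")
    classifier.train("stock market shares investors profit", "finance")
    summary = summarize(PAGE)
    print(summary)
    print(classifier.classify(summary))
```

Classifying the summary instead of the full page keeps boilerplate such as the subscription prompt out of the classifier's input, which is the benefit the abstract points to.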

    Method and system for calculating document importance using document classifications
    67.
    Patent application
    Status: in force

    Publication No.: US20060004809A1

    Publication date: 2006-01-05

    Application No.: US10881812

    Filing date: 2004-06-30

    IPC classification: G06F17/00

    Abstract: A system for calculating the importance of web pages is provided. The web pages are organized hierarchically into collections. The system calculates the importance of each collection based on inter-collection links from a web page in one collection to a web page in another collection. The system then calculates the importance of web pages in the collections with a high calculated importance based on links between the web pages in those collections using, for example, a conventional page rank algorithm. The system may also calculate the importance of web pages in each collection with a low calculated importance separately based on the links between the web pages in the collection using, for example, a conventional page rank algorithm.
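
The two-level calculation in this abstract can be sketched as below. The abstract does not give the formula for collection importance, so this sketch assumes it is simply the number of incoming inter-collection links, with a hypothetical `high_threshold` separating high from low. Pages in high-importance collections are ranked together and each low-importance collection is ranked on its own, using a basic PageRank. All names and data are illustrative.

```python
from collections import Counter, defaultdict


def page_rank(pages, links, damping=0.85, iterations=50):
    """Basic PageRank over `pages`; `links` must only connect pages in `pages`."""
    pages = sorted(pages)
    if not pages:
        return {}
    out = defaultdict(list)
    for src, dst in links:
        out[src].append(dst)
    rank = {p: 1.0 / len(pages) for p in pages}
    for _ in range(iterations):
        dangling = sum(rank[p] for p in pages if not out[p])
        base = (1.0 - damping) / len(pages) + damping * dangling / len(pages)
        new = {p: base for p in pages}
        for src in pages:
            for dst in out[src]:
                new[dst] += damping * rank[src] / len(out[src])
        rank = new
    return rank


def rank_by_collection(collection_of, links, high_threshold=1):
    """Return (collection importance, ranks for high collections, ranks per low collection)."""
    # Step 1: importance of a collection = incoming links from other collections.
    importance = Counter({c: 0 for c in set(collection_of.values())})
    for src, dst in links:
        if collection_of[src] != collection_of[dst]:
            importance[collection_of[dst]] += 1
    high = {c for c, n in importance.items() if n >= high_threshold}

    def internal(pages):
        return [(s, d) for s, d in links if s in pages and d in pages]

    # Step 2: pages of all high-importance collections are ranked together.
    high_pages = {p for p, c in collection_of.items() if c in high}
    high_ranks = page_rank(high_pages, internal(high_pages))

    # Step 3: each low-importance collection is ranked separately.
    low_ranks = {}
    for collection in importance.keys() - high:
        pages = {p for p, c in collection_of.items() if c == collection}
        low_ranks[collection] = page_rank(pages, internal(pages))
    return importance, high_ranks, low_ranks


# Hypothetical pages, collections, and links.
COLLECTION_OF = {"a1": "A", "a2": "A", "b1": "B", "b2": "B", "c1": "C", "c2": "C"}
LINKS = [("c1", "a1"), ("b1", "a1"), ("a1", "a2"), ("a1", "b1"), ("c1", "c2")]

if __name__ == "__main__":
    importance, high_ranks, low_ranks = rank_by_collection(COLLECTION_OF, LINKS)
    print(dict(importance))
    print(high_ranks)
    print(low_ranks)
```

Ranking low-importance collections separately keeps each PageRank run small, at the cost of ignoring links that cross into or out of those collections.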

    Implicit links search enhancement system and method for search engines using implicit links generated by mining user access patterns
    69.
    Patent application
    Status: in force

    Publication No.: US20050071465A1

    Publication date: 2005-03-31

    Application No.: US10676794

    Filing date: 2003-09-30

    IPC classification: G06F15/173 G06F17/30

    Abstract: An implicit links enhancement system and method for search engines that generates implicit links obtained from mining user access logs to facilitate enhanced local searching of web sites and intranets. The implicit links search enhancement system and method includes extracting implicit links by mining users' access patterns and then using a modified link analysis algorithm to re-rank search results obtained from traditional search engines. More specifically, the implicit links search enhancement method includes extracting implicit links from a user access log, generating an implicit links graph from the extracted implicit links, and computing page rankings using the implicit links graph. The implicit links are extracted from the log using a two-item sequential pattern mining technique. Search results obtained from a search engine are re-ranked based on an implicit links analysis performed using an updated implicit links graph, a modified re-ranking formula, and at least one re-ranking technique.

    Method and system for detecting when an outgoing communication contains certain content
    70.
    Granted patent
    Status: in force

    Publication No.: US08782805B2

    Publication date: 2014-07-15

    Application No.: US12510186

    Filing date: 2009-07-27

    IPC classification: H04L29/06

    Abstract: A method and system for detecting whether an outgoing communication contains confidential information or other target information is provided. The detection system is provided with a collection of documents that contain confidential information, referred to as “confidential documents.” When the detection system is provided with an outgoing communication, it compares the content of the outgoing communication to the content of the confidential documents. If the outgoing communication contains confidential information, then the detection system may prevent the outgoing communication from being sent outside the organization. The detection system detects confidential information based on the similarity between the content of an outgoing communication and the content of confidential documents that are known to contain confidential information.
