Content propagation for enhanced document retrieval
    1.
    Type: granted patent
    Status: no longer in force

    Publication No.: US07305389B2

    Publication date: 2007-12-04

    Application No.: US10826161

    Filing date: 2004-04-15

    IPC classification: G06F17/30

    Abstract: Systems and methods providing computer-implemented content propagation for enhanced document retrieval are described. In one aspect, reference information directed to one or more documents is identified. The reference information is identified from one or more sources of data that are independent of a data source that includes the one or more documents. Metadata that is proximally located to the reference information is extracted from the one or more sources of data. Relevance between respective features of the metadata to content of associated ones of the one or more documents is calculated. For each document of the one or more documents, associated portions of the metadata is indexed with the relevance of features from the respective portions into original content of the document. The indexing generates one or more enhanced documents.
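
The flow described in this abstract (find references to a document in independent sources, pull the text near each reference, score it against the document, and index it alongside the original content) can be sketched roughly as follows. This is an illustrative sketch only, not the patented method: the function names, the character window, the use of a plain document id as the "reference", and cosine similarity as the relevance measure are all assumptions. It also scores each nearby snippet as a whole rather than individual features.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split text into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two term-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def propagate(documents, external_sources, window=40):
    """Build enhanced documents from text found near external references.

    documents: dict mapping a document id to its original text (assumed shape).
    external_sources: texts that mention document ids but are stored
    outside the document collection itself.
    """
    enhanced = {}
    for doc_id, content in documents.items():
        doc_vec = Counter(tokenize(content))
        weighted = []
        for source in external_sources:
            for match in re.finditer(re.escape(doc_id), source):
                # Metadata = text within `window` characters of the reference.
                start = max(0, match.start() - window)
                nearby = source[start:match.end() + window].replace(doc_id, " ")
                score = cosine(Counter(tokenize(nearby)), doc_vec)
                weighted.append((nearby.strip(), score))
        # The enhanced document keeps the original content plus scored metadata.
        enhanced[doc_id] = {"content": content, "metadata": weighted}
    return enhanced
```

A retrieval index built over both `content` and the scored `metadata` could then match queries phrased in the vocabulary of the referring sources, which is the apparent motivation for propagating that text.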

    Content propagation for enhanced document retrieval
    3.
    Type: patent application
    Status: no longer in force

    Publication No.: US20050234952A1

    Publication date: 2005-10-20

    Application No.: US10826161

    Filing date: 2004-04-15

    IPC classification: G06F19/00 G06F17/30

    Abstract: Systems and methods providing computer-implemented content propagation for enhanced document retrieval are described. In one aspect, reference information directed to one or more documents is identified. The reference information is identified from one or more sources of data that are independent of a data source that includes the one or more documents. Metadata that is proximally located to the reference information is extracted from the one or more sources of data. Relevance between respective features of the metadata to content of associated ones of the one or more documents is calculated. For each document of the one or more documents, associated portions of the metadata is indexed with the relevance of features from the respective portions into original content of the document. The indexing generates one or more enhanced documents.

    Mining service requests for product support
    5.
    Type: patent application
    Status: pending (published)

    Publication No.: US20050234973A1

    Publication date: 2005-10-20

    Application No.: US10826160

    Filing date: 2004-04-15

    CPC classification: G06N5/00 G06N5/02

    Abstract: Systems and methods for mining service requests for product support are described. In one aspect, unstructured service requests are converted to one or more structured answer objects. Each structured answer object includes hierarchically structured historic problem diagnosis data. In view of a product problem description, a set of the one or more structured answer data objects is identified. Each structured solution data object in the set includes term(s) and/or phrase(s) related to the product problem description. Historic and hierarchically structured problem diagnosis data from the set is provided to an end-user for product problem diagnosis.
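
The lookup step in this abstract (matching a problem description against structured answer objects and returning their hierarchical diagnosis data) might look something like the sketch below. It assumes the unstructured service requests have already been converted. The `AnswerObject` fields, the product > symptom > cause > resolution hierarchy, and the term-overlap threshold are hypothetical choices made for illustration, not details taken from the patent.

```python
import re
from dataclasses import dataclass, field


def terms_of(text):
    """Lowercase alphanumeric terms of a text, as a set."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


@dataclass
class AnswerObject:
    """Hypothetical structured answer object built from one service request."""
    product: str
    symptom: str
    cause: str
    resolution: str
    terms: set = field(init=False)

    def __post_init__(self):
        # Index every term that appears in the diagnosis fields.
        self.terms = terms_of(
            " ".join([self.product, self.symptom, self.cause, self.resolution])
        )

    def as_hierarchy(self):
        """Nested product > symptom > cause > resolution view of the record."""
        return {self.product: {self.symptom: {self.cause: self.resolution}}}


def find_answers(problem_description, answers, min_overlap=2):
    """Return hierarchies of answers sharing enough terms with the description."""
    query = terms_of(problem_description)
    scored = [(len(query & a.terms), a) for a in answers]
    matches = [(s, a) for s, a in scored if s >= min_overlap]
    # Best-overlapping answers first; ties keep their original order.
    matches.sort(key=lambda pair: pair[0], reverse=True)
    return [a.as_hierarchy() for _, a in matches]
```

Plain term overlap misses inflected forms ("crash" versus "crashes"); a real system would presumably normalize terms or match phrases, as the abstract's "term(s) and/or phrase(s)" suggests.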

    Method and system for prioritizing communications based on sentence classifications
    7.
    Type: granted patent
    Status: in force

    Publication No.: US08112268B2

    Publication date: 2012-02-07

    Application No.: US12254796

    Filing date: 2008-10-20

    IPC classification: G06F17/28

    CPC classification: G06F17/30

    Abstract: A method and system for prioritizing communications based on classifications of sentences within the communications is provided. A sentence classification system may classify sentences of communications according to various classifications such as “sentence mode.” The sentence classification system trains a sentence classifier using training data and then classifies sentences using the trained sentence classifier. After the sentences of a communication are classified, a document ranking system may generate a rank for the communication based on the classifications of the sentences within the communication. The document ranking system trains a document rank classifier using training data and then calculates the rank of communications using the trained document rank classifier.
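
The two-stage idea in this abstract (classify each sentence, then rank the communication from those classifications) can be illustrated with the sketch below. It is a simplification under stated assumptions: the sentence classifier is a tiny naive Bayes model chosen for brevity, the mode labels are invented for the example, and the document rank is a hand-weighted sum standing in for the trained document rank classifier that the abstract describes.

```python
import math
import re
from collections import Counter, defaultdict


def tokens(sentence):
    # Keep "?" as its own token because it is a strong cue for sentence mode.
    return re.findall(r"[a-z']+|\?", sentence.lower())


class SentenceClassifier:
    """Tiny multinomial naive Bayes classifier over sentence tokens."""

    def fit(self, labeled_sentences):
        self.word_counts = defaultdict(Counter)
        self.label_counts = Counter()
        for sentence, label in labeled_sentences:
            self.label_counts[label] += 1
            self.word_counts[label].update(tokens(sentence))
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, sentence):
        words = tokens(sentence)
        n_examples = sum(self.label_counts.values())

        def log_prob(label):
            counts = self.word_counts[label]
            # Laplace smoothing so unseen words do not zero out a label.
            total = sum(counts.values()) + len(self.vocab)
            prior = math.log(self.label_counts[label] / n_examples)
            return prior + sum(math.log((counts[w] + 1) / total) for w in words)

        return max(self.label_counts, key=log_prob)


def rank_communication(text, classifier, weights):
    """Score a communication from the modes of its sentences.

    `weights` maps a sentence mode to a hand-chosen importance value; this
    stands in for a trained rank classifier.
    """
    sentences = [s for s in re.split(r"(?<=[.?!])\s+", text.strip()) if s]
    modes = Counter(classifier.predict(s) for s in sentences)
    return sum(weights.get(mode, 0.0) * n for mode, n in modes.items())
```

With weights that favor requests and questions, messages asking the recipient to act would rank above purely informational ones.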

    Method and system for ranking documents of a search result to improve diversity and information richness
    8.
    Type: granted patent
    Status: no longer in force

    Publication No.: US07664735B2

    Publication date: 2010-02-16

    Application No.: US10837540

    Filing date: 2004-04-30

    IPC classification: G06F17/00

    Abstract: A method and system for ranking documents of search results based on information richness and diversity of topics. A ranking system determines the information richness of each document within a search result. The ranking system groups documents of a search result based on their relatedness, meaning that they are directed to similar topics. The ranking system ranks the documents to ensure that the highest ranking documents may include at least one document covering each topic, that is, one document from each of the groups. The ranking system selects the document from each group that has the highest information richness of the documents within the group. When the documents are presented to a user in rank order, the user will likely find on the first page of the search result documents that cover a variety of topics, rather than just a single popular topic.
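
The reordering this abstract describes (group related results, then promote the richest document of each group) can be sketched as below. The sketch takes information richness as a given score per document and uses greedy grouping with word-overlap (Jaccard) similarity; both the grouping strategy and the `threshold` value are assumptions for illustration, since the abstract does not say how relatedness or richness is computed.

```python
def jaccard(a, b):
    """Word-overlap similarity between two result texts."""
    wa, wb = set(a["text"].lower().split()), set(b["text"].lower().split())
    return len(wa & wb) / len(wa | wb) if wa | wb else 0.0


def diversify(results, richness, similarity=jaccard, threshold=0.5):
    """Reorder results so each topic group's richest document comes first."""
    # Greedy grouping: a result joins the first group whose seed it resembles.
    groups = []
    for doc in results:
        for group in groups:
            if similarity(doc, group[0]) >= threshold:
                group.append(doc)
                break
        else:
            groups.append([doc])

    # The richest document of each group is promoted ahead of everything else.
    leaders = sorted(
        (max(g, key=richness) for g in groups), key=richness, reverse=True
    )
    promoted = {id(d) for d in leaders}
    rest = sorted(
        (d for d in results if id(d) not in promoted), key=richness, reverse=True
    )
    return leaders + rest
```

Sorting by richness alone would let one popular topic fill the top positions; promoting one leader per group places every topic near the top before any topic appears twice.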

    Method and system for detecting when an outgoing communication contains certain content
    9.
    Type: patent application
    Status: in force

    Publication No.: US20090313706A1

    Publication date: 2009-12-17

    Application No.: US12510186

    Filing date: 2009-07-27

    Abstract: A method and system for detecting whether an outgoing communication contains confidential information or other target information is provided. The detection system is provided with a collection of documents that contain confidential information, referred to as “confidential documents.” When the detection system is provided with an outgoing communication, it compares the content of the outgoing communication to the content of the confidential documents. If the outgoing communication contains confidential information, then the detection system may prevent the outgoing communication from being sent outside the organization. The detection system detects confidential information based on the similarity between the content of an outgoing communication and the content of confidential documents that are known to contain confidential information.
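
The comparison this abstract describes (checking an outgoing message against documents known to be confidential) can be illustrated with the sketch below. The abstract says only that detection is based on content similarity; the specific measure used here, the fraction of the message's word triples that also occur in a confidential document, is an assumption, as are the function names and the threshold.

```python
import re


def shingles(text, k=3):
    """Set of k-word sequences (shingles) found in the text."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def max_containment(message, confidential_docs, k=3):
    """Largest fraction of the message's shingles found in any one document."""
    msg = shingles(message, k)
    if not msg:
        return 0.0  # Message too short to compare at this shingle size.
    return max(
        (len(msg & shingles(doc, k)) / len(msg) for doc in confidential_docs),
        default=0.0,
    )


def should_block(message, confidential_docs, threshold=0.3, k=3):
    """Flag the message when enough of it overlaps a confidential document."""
    return max_containment(message, confidential_docs, k) >= threshold
```

Containment is measured relative to the message rather than the document, so a short excerpt copied from a long confidential document still scores highly.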

    Query-based snippet clustering for search result grouping
    10.
    Type: granted patent
    Status: in force

    Publication No.: US07617176B2

    Publication date: 2009-11-10

    Application No.: US10889841

    Filing date: 2004-07-13

    IPC classification: G06F7/00

    Abstract: A clustering architecture that dynamically groups the search result documents into clusters labeled by phrases extracted from the search result snippets. Documents related to the same topic usually share a common vocabulary. The words are first clustered based on their co-occurrences and each cluster forms a potentially interesting topic. Keywords are chosen and then clustered by counting co-occurrences of pairs of keywords. Documents are assigned to relevant topics based on the feature vectors of the clusters.
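
The steps named in this abstract (choose keywords, cluster them by pairwise co-occurrence, assign documents to the resulting topics) can be sketched as follows. This is a rough illustration under several assumptions: keywords are words that occur in at least `min_df` snippets, two keywords are linked when they co-occur in at least `min_cooccur` snippets, clusters are the connected components of those links, and each topic is represented by its keyword set rather than by an extracted phrase label. Stop-word removal is left out apart from dropping the query terms.

```python
import re
from collections import Counter
from itertools import combinations


def cluster_snippets(snippets, query, min_df=2, min_cooccur=2):
    """Group snippets by topics formed from co-occurring keywords."""
    query_terms = set(re.findall(r"[a-z0-9]+", query.lower()))
    bags = [set(re.findall(r"[a-z0-9]+", s.lower())) - query_terms for s in snippets]

    # Keywords: words that appear in at least `min_df` snippets.
    doc_freq = Counter(w for bag in bags for w in bag)
    keywords = {w for w, n in doc_freq.items() if n >= min_df}

    # Count how many snippets contain each pair of keywords.
    pair_counts = Counter()
    for bag in bags:
        pair_counts.update(combinations(sorted(bag & keywords), 2))

    # Union-find: merge keywords that co-occur often enough.
    parent = {w: w for w in keywords}

    def find(w):
        while parent[w] != w:
            parent[w] = parent[parent[w]]
            w = parent[w]
        return w

    for (a, b), n in pair_counts.items():
        if n >= min_cooccur:
            parent[find(a)] = find(b)

    clusters = {}
    for w in keywords:
        clusters.setdefault(find(w), set()).add(w)
    topics = [frozenset(c) for c in clusters.values()]

    # Assign each snippet to the topic sharing the most keywords with it.
    assignment = {}
    for i, bag in enumerate(bags):
        best = max(topics, key=lambda t: len(t & bag), default=None)
        assignment[i] = best if best is not None and best & bag else None
    return topics, assignment
```

Snippets that share no keyword with any topic are mapped to `None`, so they can be shown in an "other" group rather than forced into an unrelated cluster.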
