Generating visualizations of facet values for facets defined over a collection of objects
    1.
    Granted patent
    Status: In force

    Publication No.: US09360982B2

    Publication date: 2016-06-07

    Application No.: US13461650

    Filing date: 2012-05-01

    IPC classification: G06F3/048 G06F3/0481

    Abstract: Provided are a computer program product, system, and method for generating visualizations of facet values for facets defined over a collection of objects. The objects are processed to determine facet values for the objects for a specified facet. A first visualization is generated of representations of the determined facet values for the objects. User selection is received of one of the facet values represented in the generated first visualization. A determination is made of objects having the user selected facet value and a determination is made of at least one facet value for the specified facet for each of the determined objects having the selected facet value. A second visualization of representations of the determined at least one facet value is generated.

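The two-step flow in this abstract (count facet values, let the user pick one, then recount over the matching objects) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: objects are modeled as dictionaries mapping a facet name to a list of values, the function names are hypothetical, and a list of text bars stands in for a real visualization. It is not the patented implementation.

```python
from collections import Counter

def facet_counts(objects, facet):
    """Count how many objects carry each value of the given facet."""
    counts = Counter()
    for obj in objects:
        # set() so an object listing a value twice is counted once
        counts.update(set(obj.get(facet, [])))
    return counts

def drill_down(objects, facet, selected_value):
    """Keep only objects having the selected value, then recount the facet."""
    matching = [o for o in objects if selected_value in o.get(facet, [])]
    return facet_counts(matching, facet)

def render(counts):
    """Text stand-in for a visualization: one bar per facet value."""
    return [f"{value}: {'#' * n}" for value, n in sorted(counts.items())]

# Hypothetical collection with a single multi-valued facet, "topic".
docs = [
    {"topic": ["search", "ui"]},
    {"topic": ["search"]},
    {"topic": ["ui", "charts"]},
]
first = facet_counts(docs, "topic")      # basis of the first visualization
second = drill_down(docs, "topic", "ui") # basis of the second visualization
print(render(second))
```

Because the facet is multi-valued in this sketch, the second view still shows other values of the same facet that co-occur with the selected one.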

    GENERATING VISUALIZATIONS OF A DISPLAY GROUP OF TAGS REPRESENTING CONTENT INSTANCES IN OBJECTS SATISFYING A SEARCH CRITERIA

    Publication No.: US20130212093A1

    Publication date: 2013-08-15

    Application No.: US13397596

    Filing date: 2012-02-15

    IPC classification: G06F17/30

    Abstract: Provided are a computer program product, method, and system for rendering search results. A search request is received having a search criteria to perform with respect to objects having content instances. A determination is made of the objects having qualifying content instances that satisfy the search criteria, an attribute value of the qualifying content instances for a specified attribute, and appearance settings for the qualifying content instances based on the determined attribute values. The appearance settings vary based on the attribute values. Tags are generated indicating the content instances and appearance settings for the content instances. A visualization of the tags in a display group are generated to provide visualization of the qualifying content instances in the objects according to the appearance settings, wherein visualizations of the tags is varied based on the determined appearance settings.
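
As a rough illustration of the tag generation described in this abstract, the Python sketch below builds one tag per qualifying content instance and derives an appearance setting from an attribute value. The choice of a confidence number as the attribute, font size as the appearance setting, substring matching as the search criteria, and all names are assumptions for illustration only.

```python
from dataclasses import dataclass

@dataclass
class Tag:
    text: str
    font_size: int  # assumed appearance setting

def font_size_for(confidence):
    """Assumed appearance rule: a higher attribute value gives a larger font."""
    return 10 + round(confidence * 10)

def build_tags(objects, term):
    """Create one tag per content instance whose text contains the search term."""
    tags = []
    for instances in objects.values():
        for text, confidence in instances:
            if term in text.lower():
                tags.append(Tag(text, font_size_for(confidence)))
    return tags

# Hypothetical objects, each holding (content instance, attribute value) pairs.
objects = {
    "report.pdf": [("Network outage", 0.9), ("Budget review", 0.4)],
    "notes.txt": [("Outage postmortem", 0.5)],
}
tags = build_tags(objects, "outage")
print(tags)
```

A display layer could then draw each tag at its `font_size`, so that instances with higher attribute values stand out in the display group.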

    Detecting duplicate documents using classification
    3.
    Granted patent
    Status: Lapsed

    Publication No.: US08180773B2

    Publication date: 2012-05-15

    Application No.: US12472758

    Filing date: 2009-05-27

    IPC classification: G06F7/00 G06F17/30

    CPC classification: G06F17/30707 G06F17/3071

    Abstract: Systems, methods and articles of manufacture are disclosed for detecting a duplicate document. A plurality of documents may be assigned to categories, each category corresponding to a collection of duplicates, or near duplicate documents. A new document may be received. The new document may be evaluated against each category to determine a similarity score between the new document and each category. The new document may be identified as a duplicate based on the similarity scores and thresholds for each category. An action may then be performed on the duplicate based on duplication rules. The thresholds and duplication rules may be customized by a user.

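The scoring and threshold logic in this abstract can be sketched as follows. The abstract does not name a similarity measure, so Jaccard similarity over word sets is swapped in here as an assumption; the category names, thresholds, and rule strings are likewise hypothetical.

```python
def tokens(text):
    """Lowercased word set of a document."""
    return set(text.lower().split())

def jaccard(a, b):
    """Jaccard similarity of two token sets (an assumed similarity measure)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

def score_against_categories(doc, categories):
    """Best similarity of the new document to any member of each category."""
    doc_tokens = tokens(doc)
    return {
        name: max(jaccard(doc_tokens, tokens(member)) for member in members)
        for name, members in categories.items()
    }

def find_duplicate_category(doc, categories, thresholds):
    """Return the best-scoring category whose own threshold is met, or None."""
    scores = score_against_categories(doc, categories)
    hits = [(s, name) for name, s in scores.items() if s >= thresholds[name]]
    return max(hits)[1] if hits else None

# Each category is a collection of known duplicates or near duplicates.
categories = {
    "invoice-123": ["invoice 123 for acme corp due march"],
    "newsletter": ["weekly newsletter product updates and tips"],
}
thresholds = {"invoice-123": 0.8, "newsletter": 0.6}          # user-customizable
rules = {"invoice-123": "discard", "newsletter": "link-to-original"}

new_doc = "Invoice 123 for ACME Corp due March"
category = find_duplicate_category(new_doc, categories, thresholds)
action = rules.get(category, "store")  # "store" when the document is not a duplicate
print(category, action)
```

Keeping thresholds and rules in plain dictionaries mirrors the abstract's point that both may be customized per category by a user.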

    CLASSIFICATION OF ELECTRONIC MESSAGES BASED ON CONTENT
    4.
    Patent application
    Status: In force

    Publication No.: US20100235367A1

    Publication date: 2010-09-16

    Application No.: US12404716

    Filing date: 2009-03-16

    IPC classification: G06F17/30 G06F7/00

    CPC classification: G06F17/30707

    Abstract: Classifying electronic mail (e-mail) based on content and predefined categories. Content of a received e-mail may be analyzed to determine one of a plurality of predefined categories into which the e-mail is classified. A relevancy score may also be calculated to indicate the strength of correlation between the e-mail and the category. A user may be allowed to sort e-mails in an e-mail box based on the category names and/or relevancy scores.

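A toy Python sketch of the classify-then-sort idea in this abstract follows. The keyword sets, the overlap-based relevancy score, and the function names are assumptions chosen for brevity; the abstract does not specify how content is analyzed.

```python
# Hypothetical predefined categories, each described by a few keywords.
CATEGORY_KEYWORDS = {
    "billing": {"invoice", "payment", "refund"},
    "support": {"error", "crash", "help"},
}

def classify(body):
    """Return (category, relevancy), using keyword overlap as the score."""
    words = set(body.lower().split())
    scored = {
        name: len(words & keywords) / len(keywords)
        for name, keywords in CATEGORY_KEYWORDS.items()
    }
    best = max(scored, key=scored.get)
    return best, scored[best]

inbox = [
    "please help the app shows an error after the crash",
    "refund for invoice 42",
    "payment received",
]
labelled = [(body, *classify(body)) for body in inbox]

# Sort by category name, then by descending relevancy within a category.
by_category_then_score = sorted(labelled, key=lambda row: (row[1], -row[2]))
for body, category, relevancy in by_category_then_score:
    print(f"{category:8} {relevancy:.2f}  {body}")
```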

    GENERATING VISUALIZATIONS OF A DISPLAY GROUP OF TAGS REPRESENTING CONTENT INSTANCES IN OBJECTS SATISFYING A SEARCH CRITERIA
    5.
    Patent application
    Status: In force

    Publication No.: US20130212087A1

    Publication date: 2013-08-15

    Application No.: US13463318

    Filing date: 2012-05-03

    IPC classification: G06F17/30

    Abstract: Provided is a method for rendering search results. A search request is received having a search criteria to perform with respect to objects having content instances. A determination is made of the objects having qualifying content instances that satisfy the search criteria, an attribute value of the qualifying content instances for a specified attribute, and appearance settings for the qualifying content instances based on the determined attribute values. The appearance settings vary based on the attribute values. Tags are generated indicating the content instances and appearance settings for the content instances. A visualization of the tags in a display group are generated to provide visualization of the qualifying content instances in the objects according to the appearance settings, wherein visualizations of the tags is varied based on the determined appearance settings.


    IDENTIFYING TRAINING DOCUMENTS FOR A CONTENT CLASSIFIER
    6.
    Patent application
    Status: In force

    Publication No.: US20110004573A1

    Publication date: 2011-01-06

    Application No.: US12497467

    Filing date: 2009-07-02

    IPC classification: G06F15/18

    CPC classification: G06N99/005 G06F17/30707

    Abstract: Systems, methods and articles of manufacture are disclosed for identifying a training document for a content classifier. One or more thresholds may be defined for designating a document as a training document for a content classifier. A plurality of documents may be evaluated to compute a score for each respective document. The score may represent suitability of a document for training the content classifier with respect to a category. The score may be computed based on content of the plurality of documents, metadata of the plurality of documents, link structure of the plurality of documents, user feedback (e.g., user supplied document tags) received for the plurality of documents, and document metrics received for the plurality of documents. Based on the computed scores, a training document may be selected. The content classifier may be trained using the selected training document.

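One simple way to combine the five signals named in this abstract into a suitability score is a weighted sum, sketched below in Python. The weights, the assumption that each signal is already normalized to the range 0 to 1, and the function names are all hypothetical; the abstract does not say how the signals are combined.

```python
# Assumed weights for the five signals listed in the abstract (sum to 1.0).
WEIGHTS = {
    "content": 0.4,
    "metadata": 0.1,
    "links": 0.2,
    "feedback": 0.2,
    "metrics": 0.1,
}

def suitability(signals):
    """Weighted sum of per-signal scores in [0, 1]; missing signals count as 0."""
    return sum(weight * signals.get(name, 0.0) for name, weight in WEIGHTS.items())

def select_training_documents(documents, threshold):
    """Return ids of documents scoring at or above the threshold, best first."""
    scored = [(suitability(signals), doc_id) for doc_id, signals in documents.items()]
    return [doc_id for score, doc_id in sorted(scored, reverse=True) if score >= threshold]

# Hypothetical per-document signals for one category.
documents = {
    "a": {"content": 0.9, "metadata": 1.0, "links": 0.8, "feedback": 1.0, "metrics": 0.5},
    "b": {"content": 0.2, "feedback": 0.1},
    "c": {"content": 0.8, "links": 0.5, "feedback": 0.9},
}
selected = select_training_documents(documents, threshold=0.5)
print(selected)
```

The selected documents would then be passed to whatever training routine the content classifier exposes.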

    Monitoring content repositories, identifying misclassified content objects, and suggesting reclassification
    7.
    Granted patent
    Status: In force

    Publication No.: US08954437B2

    Publication date: 2015-02-10

    Application No.: US13444692

    Filing date: 2012-04-11

    IPC classification: G06F7/00 G06F17/30

    CPC classification: G06F17/30707

    Abstract: Provided is a technique for organizing content objects in an enterprise content management system. Auditing of the content objects is performed to identify one or more content objects that are to be re-classified. A content object is selected. A first category associated with the content object is obtained. A relevancy score is obtained for the first category. A list of candidate categories and relevancy scores for each of the candidate categories are obtained. In response to determining that the first category does not correspond to a candidate category or that the relevancy score does not exceed a threshold, the content object is identified as improperly categorized, and the candidate categories that have associated relevancy scores that exceed the threshold are provided in an audit report.

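The audit condition in this abstract maps onto a short loop. In the Python sketch below, each content object is represented, as an assumption, by its current category plus a dictionary of candidate categories and relevancy scores that some classifier is presumed to have produced; the names and data are hypothetical.

```python
def audit(content_objects, threshold):
    """Return {object_id: suggested categories} for objects that look misfiled.

    An object is flagged when its current category is not among the
    candidates, or when that category's relevancy does not exceed the threshold.
    """
    report = {}
    for obj_id, (current, candidates) in content_objects.items():
        current_score = candidates.get(current)
        if current_score is None or current_score <= threshold:
            report[obj_id] = sorted(
                name for name, score in candidates.items() if score > threshold
            )
    return report

# Hypothetical repository: id -> (current category, {candidate: relevancy}).
repository = {
    "doc-1": ("contracts", {"contracts": 0.92, "invoices": 0.10}),
    "doc-2": ("contracts", {"invoices": 0.88, "receipts": 0.71}),
    "doc-3": ("memos", {"memos": 0.30, "policies": 0.81}),
}
audit_report = audit(repository, threshold=0.7)
print(audit_report)
```

Here `doc-2` is flagged because its current category is not a candidate at all, and `doc-3` because its current category scores below the threshold.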

    MONITORING CONTENT REPOSITORIES, IDENTIFYING MISCLASSIFIED CONTENT OBJECTS, AND SUGGESTING RECLASSIFICATION
    8.
    Patent application
    Status: In force

    Publication No.: US20130198193A1

    Publication date: 2013-08-01

    Application No.: US13358738

    Filing date: 2012-01-26

    IPC classification: G06F17/30

    CPC classification: G06F17/30707

    Abstract: Provided are a computer implemented method, computer program product, and system for organizing content objects in an enterprise content management system. Auditing of the content objects is performed to identify one or more content objects that are to be re-classified. A content object is selected. A first category associated with the content object is obtained. A relevancy score is obtained for the first category. A list of candidate categories and relevancy scores for each of the candidate categories are obtained. In response to determining that the first category does not correspond to a candidate category or that the relevancy score does not exceed a threshold, the content object is identified as improperly categorized, and the candidate categories that have associated relevancy scores that exceed the threshold are provided in an audit report.


    Generating mappings between a plurality of taxonomies
    9.
    Granted patent
    Status: In force

    Publication No.: US09262506B2

    Publication date: 2016-02-16

    Application No.: US13474871

    Filing date: 2012-05-18

    Applicant: Barton W. Emanuel

    Inventor: Barton W. Emanuel

    IPC classification: G06F17/30

    Abstract: A method, a system and a computer program product create mappings between taxonomies in which documents are classified from a category of a taxonomy to one or more categories within a master taxonomy based on a statistical model and classification score values. The document classifications are analyzed to determine a mapping between the taxonomy category and a corresponding category of the master taxonomy, where the category is mapped to the corresponding category in the master taxonomy in response to sufficient classification score values for the documents.

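The mapping step in this abstract can be sketched in Python as an aggregation over per-document classification scores. The rule used here (map a source category to the master category with the highest total score, provided its mean score per document reaches a minimum) is an assumption standing in for the abstract's "sufficient classification score values"; the statistical model that produces the scores is out of scope, and all names and data are hypothetical.

```python
from collections import defaultdict

def map_category(doc_scores, min_mean_score):
    """Pick the master category for one source category, or None.

    doc_scores holds one {master_category: score} dict per document
    that belongs to the source category.
    """
    totals = defaultdict(float)
    for scores in doc_scores:
        for master_category, score in scores.items():
            totals[master_category] += score
    if not totals:
        return None
    best = max(totals, key=totals.get)
    mean = totals[best] / len(doc_scores)
    return best if mean >= min_mean_score else None

# Hypothetical classifier output for documents in two source categories.
source_taxonomy = {
    "Autos": [
        {"Vehicles": 0.9, "Finance": 0.2},
        {"Vehicles": 0.8},
        {"Vehicles": 0.7, "Travel": 0.4},
    ],
    "Misc": [{"Finance": 0.3}, {"Travel": 0.2}],
}
mapping = {
    source: map_category(docs, min_mean_score=0.6)
    for source, docs in source_taxonomy.items()
}
print(mapping)
```

Source categories that map to `None` would be left for manual review in this sketch.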

    MONITORING CONTENT REPOSITORIES, IDENTIFYING MISCLASSIFIED CONTENT OBJECTS, AND SUGGESTING RECLASSIFICATION
    10.
    Patent application
    Status: In force

    Publication No.: US20130198161A1

    Publication date: 2013-08-01

    Application No.: US13444692

    Filing date: 2012-04-11

    IPC classification: G06F17/30

    CPC classification: G06F17/30707

    Abstract: Provided is a technique for organizing content objects in an enterprise content management system. Auditing of the content objects is performed to identify one or more content objects that are to be re-classified. A content object is selected. A first category associated with the content object is obtained. A relevancy score is obtained for the first category. A list of candidate categories and relevancy scores for each of the candidate categories are obtained. In response to determining that the first category does not correspond to a candidate category or that the relevancy score does not exceed a threshold, the content object is identified as improperly categorized, and the candidate categories that have associated relevancy scores that exceed the threshold are provided in an audit report.
