CLUSTERING A COLLECTION USING AN INVERTED INDEX OF FEATURES
    1.
    Invention application
    CLUSTERING A COLLECTION USING AN INVERTED INDEX OF FEATURES (under examination, published)

    Publication No.: US20120150867A1

    Publication date: 2012-06-14

    Application No.: US12966698

    Filing date: 2010-12-13

    IPC classification: G06F17/30

    Abstract: Provided are techniques for creating an inverted index for features of a set of data elements, wherein each of the data elements is represented by a vector of features, and wherein the inverted index, when queried with a feature, outputs one or more data elements containing the feature. The features of the set of data elements are ranked. For each feature in the ranked list, the inverted index is queried for data elements having the feature and not having any previously selected feature, and a cluster of the data elements is created based on results returned in response to the query.
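The procedure in this abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names are hypothetical, feature vectors are simplified to sets of features, and the ranking (most frequent feature first, ties broken alphabetically) is an assumption, since the abstract does not say how features are ranked.

```python
from collections import defaultdict


def build_inverted_index(elements):
    """Map each feature to the set of element ids that contain it."""
    index = defaultdict(set)
    for elem_id, features in elements.items():
        for feature in features:
            index[feature].add(elem_id)
    return index


def cluster_by_ranked_features(elements, ranked_features=None):
    """Create one cluster per ranked feature from elements not yet covered."""
    index = build_inverted_index(elements)
    if ranked_features is None:
        # Assumed ranking: descending frequency, then alphabetical order.
        ranked_features = sorted(index, key=lambda f: (-len(index[f]), f))

    covered = set()  # elements that have some previously selected feature
    clusters = {}
    for feature in ranked_features:
        # Query: has this feature AND has no previously selected feature.
        members = index.get(feature, set()) - covered
        if members:
            clusters[feature] = sorted(members)
            covered |= members
    return clusters


elements = {
    "d1": {"apple", "pie"},
    "d2": {"apple", "tart"},
    "d3": {"pie", "crust"},
    "d4": {"tart"},
}
clusters = cluster_by_ranked_features(elements)
print(clusters)
```

Because every element containing an earlier feature is already covered by the time a later feature is queried, each element lands in exactly one cluster, labeled by its highest-ranked feature.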


    Text search quality by exploiting organizational information
    2.
    Invention application
    Text search quality by exploiting organizational information (under examination, published)

    Publication No.: US20060129538A1

    Publication date: 2006-06-15

    Application No.: US11295397

    Filing date: 2005-12-05

    IPC classification: G06F17/30

    CPC classification: G06F16/951

    Abstract: Techniques are provided for electronic Information Retrieval (IR) applied to an electronic search in a search environment. At indexing time, a searched document is mapped to at least one element of an organizational structure of an enterprise associated with the search environment. At query time, a querying user is associated with at least one element of the organizational structure of the enterprise. The organizational information of the searched document and that of the querying user are compared. Based on the compared organizational information, a searched document with a closer organizational relation to the querying user is given a higher rank than other searched documents with a less close relation.
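One way to picture the comparison step is the sketch below. Everything specific in it is an assumption made for illustration: the organizational structure is modeled as a tree given by a child-to-parent map, closeness is measured as tree distance, and the boost formula is invented; the abstract does not specify any of these.

```python
def path_to_root(unit, parent):
    """Return the chain of units from `unit` up to the root of the org tree."""
    path = [unit]
    while unit in parent:
        unit = parent[unit]
        path.append(unit)
    return path


def org_distance(a, b, parent):
    """Number of tree edges between two units, or None if they are unrelated."""
    depth_from_b = {u: i for i, u in enumerate(path_to_root(b, parent))}
    for i, unit in enumerate(path_to_root(a, parent)):
        if unit in depth_from_b:
            return i + depth_from_b[unit]
    return None


def rerank(results, user_unit, doc_unit, parent):
    """Reorder (doc_id, score) pairs, boosting organizationally close documents."""
    def boosted(item):
        doc_id, score = item
        d = org_distance(user_unit, doc_unit[doc_id], parent)
        closeness = 1.0 / (1 + d) if d is not None else 0.0
        return score * (1 + closeness)  # hypothetical boost formula
    return sorted(results, key=boosted, reverse=True)


# Hypothetical enterprise structure: child unit -> parent unit.
parent = {
    "search-team": "research",
    "nlp-team": "research",
    "research": "company",
    "sales": "company",
}
# Indexing time: each document is mapped to an organizational unit.
doc_unit = {"a": "sales", "b": "nlp-team", "c": "search-team"}
# Query time: the user belongs to "search-team"; base text scores are equal.
ranked = rerank([("a", 1.0), ("b", 1.0), ("c", 1.0)], "search-team", doc_unit, parent)
print([doc_id for doc_id, _ in ranked])
```

With equal text scores, the document from the user's own team comes first, then the sibling team's document, then the one from a different branch.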


    Enhanced content web browsing
    4.
    Granted invention patent
    Enhanced content web browsing (lapsed)

    Publication No.: US08543571B2

    Publication date: 2013-09-24

    Application No.: US12350498

    Filing date: 2009-01-08

    IPC classification: G06F17/00

    CPC classification: G06F17/30905

    Abstract: An embodiment of a method for enhanced content browsing includes loading a web page in a user interface; detecting entities of a first specified type in the web page by an analysis service; tagging the detected entities in the web page; calling an action service associated with the analysis service when a detected entity is activated; and displaying a result of the action service in the user interface. Embodiments of systems for enhanced content browsing are also provided.
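The detect, tag, and activate steps can be sketched as below. The `AnalysisService` class, the regex-based detector, the `<span>` markup, and the lookup action are all hypothetical stand-ins chosen for illustration; the abstract does not describe how entities are detected or how tags are rendered, and the sketch does not escape HTML.

```python
import re
from dataclasses import dataclass
from typing import Callable


@dataclass
class AnalysisService:
    """Hypothetical analysis service for one entity type, with its action service."""
    entity_type: str
    pattern: re.Pattern
    action: Callable[[str], str]

    def detect(self, text):
        """Return (start, end, value) for each detected entity."""
        return [(m.start(), m.end(), m.group(0)) for m in self.pattern.finditer(text)]


def tag_entities(text, service):
    """Wrap each detected entity in a span carrying its entity type."""
    parts, last = [], 0
    for start, end, value in service.detect(text):
        parts.append(text[last:start])
        parts.append(f'<span data-entity="{service.entity_type}">{value}</span>')
        last = end
    parts.append(text[last:])
    return "".join(parts)


def activate(value, service):
    """Simulate the user activating a tagged entity: call the action service."""
    return service.action(value)


service = AnalysisService(
    entity_type="patent-number",
    pattern=re.compile(r"US\d{7,11}[AB]\d"),
    action=lambda value: f"lookup:{value}",  # stand-in for a real action service
)
page_text = "See US20100174713A1 for details."
tagged = tag_entities(page_text, service)
result = activate("US20100174713A1", service)
print(tagged)
print(result)
```

In a browser setting the result of `activate` would be shown in the user interface; here it is simply printed.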


    SYSTEM AND METHOD FOR SOCIAL BOOKMARKING/TAGGING AT A SUB-DOCUMENT AND CONCEPT LEVEL
    5.
    Invention application
    SYSTEM AND METHOD FOR SOCIAL BOOKMARKING/TAGGING AT A SUB-DOCUMENT AND CONCEPT LEVEL (under examination, published)

    Publication No.: US20100306307A1

    Publication date: 2010-12-02

    Application No.: US12475550

    Filing date: 2009-05-31

    IPC classification: G06F17/00 G06F15/16 G06F17/30

    CPC classification: G06F16/986

    Abstract: According to one embodiment of the present invention, a method for social bookmarking and tagging documents is provided. The method comprises receiving a new document in a tagging server having a storage unit with stored tags associated with a preexisting document, and comparing the new document with the tags using a processor to find matching instances between parts of the new document and the tags. Each matching instance in the new document is marked with tag information. The marked-up new document is delivered for display on a display unit.
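The matching and marking steps might look like the sketch below. It assumes, purely for illustration, that each stored tag is keyed by the phrase it was attached to in the preexisting document, that matching is case-insensitive whole-phrase matching, and that tag information is rendered inline in square brackets; none of these details come from the abstract.

```python
import re


def mark_up(document, stored_tags):
    """Mark every occurrence of a tagged phrase in `document` with its tag.

    `stored_tags` maps a phrase (from a preexisting document) to its tag label.
    """
    if not stored_tags:
        return document

    # Try longer phrases first so they win over phrases they contain.
    phrases = sorted(stored_tags, key=len, reverse=True)
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(p) for p in phrases) + r")\b",
        re.IGNORECASE,
    )
    lookup = {phrase.lower(): tag for phrase, tag in stored_tags.items()}

    def annotate(match):
        text = match.group(0)
        return f"[{text}|tag={lookup[text.lower()]}]"

    return pattern.sub(annotate, document)


stored_tags = {"inverted index": "search", "clustering": "ml"}
marked = mark_up("Clustering with an inverted index.", stored_tags)
print(marked)
```

A real tagging server would also need to persist the tags and deliver the marked-up document to a display unit; the sketch covers only the comparison and marking.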


    Architecture of a framework for information extraction from natural language documents
    6.
    Granted invention patent
    Architecture of a framework for information extraction from natural language documents (lapsed)

    Publication No.: US06553385B2

    Publication date: 2003-04-22

    Application No.: US09145408

    Filing date: 1998-09-01

    IPC classification: G06F17/00

    Abstract: A framework for information extraction from natural language documents is application independent and provides a high degree of reusability. The framework integrates different Natural Language/Machine Learning techniques, such as parsing and classification. The architecture of the framework is integrated in an easy-to-use access layer. The framework performs general information extraction, classification/categorization of natural language documents, automated electronic data transmission (e.g., E-mail and facsimile) processing and routing, and plain parsing. Inside the framework, requests for information extraction are passed to the actual extractors. The framework can handle both pre- and post-processing of the application data, control the extractors, and enrich the information extracted by the extractors. The framework can also suggest necessary actions the application should take on the data. To achieve the goal of easy integration and extension, the framework provides an integration (outside) application program interface (API) and an extractor (inside) API.


    Enhanced Content Web Browsing
    7.
    Invention application
    Enhanced Content Web Browsing (lapsed)

    Publication No.: US20100174713A1

    Publication date: 2010-07-08

    Application No.: US12350498

    Filing date: 2009-01-08

    IPC classification: G06F7/06 G06F3/00 G06F17/30

    CPC classification: G06F17/30905

    Abstract: An embodiment of a method for enhanced content browsing includes loading a web page in a user interface; detecting entities of a first specified type in the web page by an analysis service; tagging the detected entities in the web page; calling an action service associated with the analysis service when a detected entity is activated; and displaying a result of the action service in the user interface. Embodiments of systems for enhanced content browsing are also provided.
