System and method for extraction of factoids from textual repositories
    1.
    Granted invention patent; legal status: no longer in force

    Publication number: US08706730B2

    Publication date: 2014-04-22

    Application number: US11321177

    Filing date: 2005-12-29

    CPC classification number: G06F17/30864 G06F17/30705

    Abstract: A method (400) is disclosed for extracting factoids from text repositories, the factoids being associated with a given factoid category. The method (400) starts by training a classifier (230) to recognize factoids relevant to that factoid category. Documents or document summaries relevant to the given factoid category are next collected (410) from the text repositories. Sentences having a predetermined association with the given factoid category are extracted (420) from the documents or document summaries. Those sentences are classified (440), in a noisy environment, using the classifier (230) to extract snippets containing phrases relevant to the given factoid category. The extracted snippets are the factoids associated with the given factoid category.
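
The four steps named in the abstract (train, collect, extract sentences, classify) can be illustrated with a minimal Python sketch. The abstract does not say what kind of classifier (230) is used, so the token-count scorer below is only a stand-in assumption, and every function name, parameter, and sample sentence is hypothetical.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase word tokens; a deliberately simple tokenizer."""
    return re.findall(r"[a-z0-9]+", text.lower())

def train_classifier(labeled_sentences):
    """Stand-in classifier: each token scores +1 per positive example
    and -1 per negative example it appears in (an assumption, not the
    patented classifier)."""
    scores = Counter()
    for sentence, is_factoid in labeled_sentences:
        for tok in set(tokenize(sentence)):
            scores[tok] += 1 if is_factoid else -1
    return scores

def collect_documents(repository, category_terms):
    """Step 410: keep documents that mention any category term."""
    return [d for d in repository if set(tokenize(d)) & set(category_terms)]

def extract_sentences(documents, cue_terms):
    """Step 420: keep sentences that contain any cue term."""
    kept = []
    for doc in documents:
        for sentence in re.split(r"(?<=[.!?])\s+", doc):
            if set(tokenize(sentence)) & set(cue_terms):
                kept.append(sentence.strip())
    return kept

def classify(sentences, scores, threshold=1):
    """Step 440: keep sentences whose summed token score meets the threshold."""
    return [s for s in sentences
            if sum(scores[t] for t in set(tokenize(s))) >= threshold]

def extract_factoids(repository, labeled, category_terms, cue_terms):
    scores = train_classifier(labeled)
    docs = collect_documents(repository, category_terms)
    return classify(extract_sentences(docs, cue_terms), scores)

# Hypothetical training data and repository for an "acquisition" category.
LABELED = [
    ("Acme acquired Bolt in 2004.", True),
    ("Globex acquired Initech for cash.", True),
    ("The weather was mild in 2004.", False),
    ("Analysts met for lunch.", False),
]
REPOSITORY = [
    "Umbrella acquired Hooli last year. The deal was a surprise.",
    "Rain is expected. The weather was cold.",
]

if __name__ == "__main__":
    print(extract_factoids(REPOSITORY, LABELED, {"acquired"}, {"acquired", "deal"}))
```

In this toy run, the second sentence of the first document passes the cue-term filter but is rejected by the scorer, which is the role the abstract assigns to classification in a noisy environment.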

    Automated process for identifying and delivering domain specific unstructured content for advanced business analysis
    2.
    Granted invention patent; legal status: in force

    Publication number: US07657585B2

    Publication date: 2010-02-02

    Application number: US11257880

    Filing date: 2005-10-25

    CPC classification number: G06F17/30616 Y10S707/99956

    Abstract: A cost-efficient solution for supporting and deploying custom text analytics applications is to provide third-party application developers with a sandboxed application development environment, such as an appliance computer system, allowing users to leverage data integration, indexing, and pre-existing mining platform capabilities for domain-specific data. Thus, embodiments herein present a system, method, etc. for identifying and delivering domain-specific unstructured content for advanced business analysis. The system generally comprises a cluster computer system, a gateway computer system, and an appliance computer system.

    Method for database assisted file system restore
    3.
    Granted invention patent; legal status: in force

    Publication number: US06496944B1

    Publication date: 2002-12-17

    Application number: US09413368

    Filing date: 1999-10-06

    CPC classification number: G06F11/1435 G06F11/1471

    Abstract: Recovery of a filesystem directory structure is performed to restore it to any point in time and also to synchronize a database restore and a filesystem restore to bring the two restores to a database consistent state. A database management system (DBMS) manages external files and hierarchical directory structures to enable recovery and reconciliation of the files and filesystems, under DBMS control, after filesystem crashes. First, a database table, which recorded previous directory creations and deletions, is used to rebuild a filesystem's directory structure to any previous database state and then external file link information is used to restore files to that same state.
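
The idea of rebuilding a directory structure from a table of recorded creations and deletions can be sketched as a log replay. This is a minimal Python illustration under stated assumptions: the row layout `(timestamp, operation, path)`, the function names, and the sample log are all hypothetical, and the abstract's second phase (restoring files from external file link information) is not modeled.

```python
import os

def directories_at(log, point_in_time):
    """Replay recorded directory creations and deletions up to and
    including point_in_time; return the directories that exist then."""
    live = set()
    for timestamp, operation, path in sorted(log, key=lambda row: row[0]):
        if timestamp > point_in_time:
            break
        if operation == "create":
            live.add(path)
        elif operation == "delete":
            live.discard(path)
    # Sorted order lists a parent directory before its children.
    return sorted(live)

def rebuild(root, directories):
    """Recreate the given directories under root (hypothetical helper)."""
    for directory in directories:
        os.makedirs(os.path.join(root, directory.lstrip("/")), exist_ok=True)

# Hypothetical log rows: (timestamp, operation, path).
LOG = [
    (1, "create", "/data"),
    (2, "create", "/data/a"),
    (3, "delete", "/data/a"),
    (4, "create", "/data/b"),
]

if __name__ == "__main__":
    for t in (2, 3, 4):
        print(t, directories_at(LOG, t))
```

Because the state is derived purely from the log, the same replay works for any earlier point in time, which matches the abstract's claim of restoring to any previous database state.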

    System and method for extraction of factoids from textual repositories
    4.
    Invention patent application; legal status: no longer in force

    Publication number: US20070162447A1

    Publication date: 2007-07-12

    Application number: US11321177

    Filing date: 2005-12-29

    CPC classification number: G06F17/30864 G06F17/30705

    Abstract: A method (400) is disclosed for extracting factoids from text repositories, the factoids being associated with a given factoid category. The method (400) starts by training a classifier (230) to recognise factoids relevant to that factoid category. Documents or document summaries relevant to the given factoid category are next collected (410) from the text repositories. Sentences having a predetermined association with the given factoid category are extracted (420) from the documents or document summaries. Those sentences are classified (440), in a noisy environment, using the classifier (230) to extract snippets containing phrases relevant to the given factoid category. The extracted snippets are the factoids associated with the given factoid category.

    System and method for parallelizing file archival and retrieval
    5.
    Granted invention patent; legal status: no longer in force

    Publication number: US06772177B2

    Publication date: 2004-08-03

    Application number: US09872088

    Filing date: 2001-06-01

    Abstract: A database management system and associated methods for parallelizing file archival and retrieval in an extended database management system. The system includes a set of copy agents that selectively acquire backup tasks from a copy queue, and a set of retrieval agents that selectively acquire restore tasks from a restore queue. The chance of contention between any two copy agents or any two retrieval agents acquiring the same copy or restore task is significantly reduced. Once specific copy agents are assigned backup tasks, the backup process determines the optimal way to write the backup files to one or more targets in parallel. In addition, the system enables efficient and expeditious retrieval of the desired files without having to search all the targets.
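
The pattern of several copy agents acquiring backup tasks from a shared copy queue can be sketched with Python's standard `queue` and `threading` modules. The abstract does not describe how contention is avoided, so the thread-safe `queue.Queue` used here is only an assumed stand-in, and `run_copy_agents` and the file names are hypothetical.

```python
import queue
import threading

def run_copy_agents(files, num_agents=3):
    """Let num_agents worker threads drain a shared copy queue.
    Returns (agent_id, file) pairs in completion order."""
    copy_queue = queue.Queue()
    for name in files:
        copy_queue.put(name)

    completed = []
    completed_lock = threading.Lock()

    def agent(agent_id):
        while True:
            try:
                # get_nowait hands each queued task to exactly one agent.
                task = copy_queue.get_nowait()
            except queue.Empty:
                return
            # A real agent would write the file to a backup target here.
            with completed_lock:
                completed.append((agent_id, task))

    threads = [threading.Thread(target=agent, args=(i,)) for i in range(num_agents)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return completed

if __name__ == "__main__":
    for agent_id, task in run_copy_agents(["a.dat", "b.dat", "c.dat", "d.dat"]):
        print(f"agent {agent_id} copied {task}")
```

Which agent handles which file varies from run to run; the property that holds is that every task is processed exactly once. A restore queue served by retrieval agents would follow the same shape.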

    Automated process for identifying and delivering domain specific unstructured content for advanced business analysis
    6.
    Invention patent application; legal status: in force

    Publication number: US20070100914A1

    Publication date: 2007-05-03

    Application number: US11257880

    Filing date: 2005-10-25

    CPC classification number: G06F17/30616 Y10S707/99956

    Abstract: A cost-efficient solution for supporting and deploying custom text analytics applications is to provide third-party application developers with a sandboxed application development environment, such as an appliance computer system, allowing users to leverage data integration, indexing, and pre-existing mining platform capabilities for domain-specific data. Thus, embodiments herein present a system, method, etc. for identifying and delivering domain-specific unstructured content for advanced business analysis. The system generally comprises a cluster computer system, a gateway computer system, and an appliance computer system.

    System and method for on-demand analysis of unstructured text data returned from a database
    7.
    Invention patent application; legal status: under examination (published)

    Publication number: US20060248087A1

    Publication date: 2006-11-02

    Application number: US11118538

    Filing date: 2005-04-29

    CPC classification number: G06F16/3344

    Abstract: A system and method of retrieving data from a database comprising unstructured data comprises specifying a text analytic component in an unstructured text query at query runtime; submitting the unstructured text query to a web service database; filtering unstructured text data in the web service database based on constraints defined in the text analytic component in the query; and receiving the filtered unstructured text data based on the submitted query from the web service database, wherein the text analytic component comprises metadata requirements. Preferably, the constraints comprise any of positive sentiments regarding an unstructured text document and negative sentiments regarding the unstructured text document. Alternatively, the constraints may comprise any of name spotting constraints, address spotting constraints, date spotting constraints, and entity spotting constraints. The filtering preferably occurs using a web-based callback service specified in a WFQL XML document. The database is preferably run on a WebFountain platform.
