Architecture of a framework for information extraction from natural language documents

Granted invention patent

US06553385B2 Architecture of a framework for information extraction from natural language documents (lapsed)

Patent title: Architecture of a framework for information extraction from natural language documents
Application number: US09145408

Filing date: 1998-09-01
Publication number: US06553385B2

Publication date: 2003-04-22
Inventors: David E. Johnson, Thomas Hampp-Bahnmueller
Applicants: David E. Johnson, Thomas Hampp-Bahnmueller
Main classification: G06F1700
IPC classification: G06F1700

Abstract:

A framework for information extraction from natural language documents is application independent and provides a high degree of reusability. The framework integrates different Natural Language/Machine Learning techniques, such as parsing and classification. The architecture of the framework is integrated in an easy-to-use access layer. The framework performs general information extraction, classification/categorization of natural language documents, automated processing and routing of electronic data transmissions (e.g., e-mail and facsimile), and plain parsing. Inside the framework, requests for information extraction are passed to the actual extractors. The framework can handle pre- and post-processing of the application data, control the extractors, and enrich the information they extract. The framework can also suggest necessary actions the application should take on the data. To achieve the goal of easy integration and extension, the framework provides an integration (outside) application program interface (API) and an extractor (inside) API.
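
The two-API split described in the abstract can be illustrated with a minimal Python sketch. It is not taken from the patent: every name here (`Framework`, `Extractor`, `KeywordClassifier`, `ExtractionResult`, the `route_to_support` action) is a hypothetical stand-in, and the toy keyword classifier only takes the place of the parsing and classification techniques the abstract mentions.

```python
from dataclasses import dataclass, field
from typing import Protocol


@dataclass
class ExtractionResult:
    """What the application gets back (hypothetical shape)."""
    fields: dict[str, str] = field(default_factory=dict)
    suggested_actions: list[str] = field(default_factory=list)


class Extractor(Protocol):
    """Extractor (inside) API: the interface a parser or classifier implements."""

    def extract(self, text: str) -> dict[str, str]: ...


class KeywordClassifier:
    """Toy extractor that assigns a category by keyword lookup (assumption)."""

    def __init__(self, keywords: dict[str, str]) -> None:
        self.keywords = keywords

    def extract(self, text: str) -> dict[str, str]:
        for keyword, category in self.keywords.items():
            if keyword in text:
                return {"category": category}
        return {"category": "unknown"}


class Framework:
    """Integration (outside) API: the single entry point the application calls."""

    def __init__(self, extractors: list[Extractor]) -> None:
        self.extractors = extractors

    def process(self, document: str) -> ExtractionResult:
        # Pre-processing: lowercase and collapse whitespace.
        text = " ".join(document.lower().split())
        result = ExtractionResult()
        # Control of the extractors: pass the request to each one in turn.
        for extractor in self.extractors:
            result.fields.update(extractor.extract(text))
        # Post-processing: enrich the extracted information.
        result.fields["length"] = str(len(text))
        # Suggest an action the application could take on the data.
        if result.fields.get("category") == "complaint":
            result.suggested_actions.append("route_to_support")
        return result
```

In this sketch an application would build `Framework([KeywordClassifier({"refund": "complaint"})])` and call `process()` on each incoming message. A new extractor can be added by implementing `extract()` without touching the application-facing side, which is the kind of separation the two APIs are meant to provide.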

Published/granted documents

US20020007358A1 ARCHITECTURE OF A FRAMEWORK FOR INFORMATION EXTRACTION FROM NATURAL LANGUAGE DOCUMENTS (publication/grant date: 2002-01-17)