Publication No.: US07987416B2
Publication date: 2011-07-26
Application No.: US11939794
Filing date: 2007-11-14
IPC classification: G06F17/00
CPC classification: G06F17/30864, G06F17/241
Abstract: Embodiments of the present invention include a computer-implemented method of extracting information. In one embodiment, the present invention comprises defining a plurality of reusable operators, wherein each operator performs a predefined information extraction task different from the other operators. Composite annotators may be created by specifying a composition of the reusable operators. Each operator may receive a searchable item, such as a web page or an annotation, and may generate one or more output annotations. The output annotations may be further processed by other reusable operators and the annotations may be stored in a repository for use during a search.
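The composition described in the abstract can be illustrated with a minimal Python sketch. Everything named here (`Item`, `compose`, `Repository`, and the two example operators) is a hypothetical illustration and does not come from the patent; the sketch only assumes that an operator maps one searchable item to zero or more annotations and that a composite annotator chains operators in sequence.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class Item:
    """A searchable item: a web page or an annotation (hypothetical type)."""
    kind: str
    text: str


# Assumption: an operator takes one item and returns zero or more annotations.
Operator = Callable[[Item], List[Item]]


def extract_sentences(item: Item) -> List[Item]:
    """Hypothetical operator: split an item's text into sentence annotations."""
    parts = re.split(r"(?<=[.!?])\s+", item.text.strip())
    return [Item("sentence", p) for p in parts if p]


def extract_prices(item: Item) -> List[Item]:
    """Hypothetical operator: annotate dollar amounts found in an item's text."""
    return [Item("price", m) for m in re.findall(r"\$\d+(?:\.\d{2})?", item.text)]


def compose(*operators: Operator) -> Operator:
    """Build a composite annotator: each stage processes the previous stage's output."""
    def composite(item: Item) -> List[Item]:
        current = [item]
        for op in operators:
            current = [out for it in current for out in op(it)]
        return current
    return composite


@dataclass
class Repository:
    """Hypothetical in-memory store of annotations for later search."""
    annotations: List[Item] = field(default_factory=list)

    def store(self, items: Iterable[Item]) -> None:
        self.annotations.extend(items)

    def search(self, kind: str) -> List[Item]:
        return [a for a in self.annotations if a.kind == kind]


# Composite annotator built from two reusable operators.
price_annotator = compose(extract_sentences, extract_prices)

page = Item("web_page", "The camera costs $299.99 today. Shipping is free. The case is $15.")
repo = Repository()
repo.store(price_annotator(page))
print([a.text for a in repo.search("price")])
```

Because each operator has the same input and output shape, the same `extract_prices` operator could be reused in a different composition without modification, which is the reuse property the abstract emphasizes.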