Publication number: US20050108267A1
Publication date: 2005-05-19
Application number: US10714541
Filing date: 2003-11-14
Applicant: Alexander Gibson, Anne Schur, James Brown, Wendy Cowley, Nicholas Cramer, Dennis McQuerry, Patricia Medvick, Mark Whiting, Marie Whyatt
Inventor: Alexander Gibson, Anne Schur, James Brown, Wendy Cowley, Nicholas Cramer, Dennis McQuerry, Patricia Medvick, Mark Whiting, Marie Whyatt
CPC classification number: G06F17/30569
Abstract: A system and method for extracting a plurality of structured data from one or more information sources. The method comprises receiving the information sources, receiving at least one pattern descriptor selected from a graphical user interface, and receiving one or more templates, each template having at least one pattern descriptor. The method then applies the one or more templates to the information sources. The method generates the plurality of structured data in a common format by parsing the information sources with the templates. The method stores the structured data in the common format.
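The flow described in the abstract (receive sources, receive templates built from pattern descriptors, apply the templates, emit structured data in a common format, store it) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the patented implementation: the abstract does not say how a pattern descriptor is represented, so regular expressions are used here as a stand-in, and the names `PatternDescriptor`, `Template`, `apply_templates`, and `store` are hypothetical. JSON is likewise an assumed choice for the common format, and the graphical selection of descriptors is not modeled.

```python
import json
import re
from dataclasses import dataclass


@dataclass
class PatternDescriptor:
    """Hypothetical descriptor: a field name plus a regex with one capture group."""
    field: str
    regex: str


@dataclass
class Template:
    """Hypothetical template: a named group of pattern descriptors."""
    name: str
    descriptors: list


def apply_templates(sources, templates):
    """Parse every source with every template; return records in one common shape."""
    records = []
    for source_id, text in sources.items():
        for template in templates:
            fields = {}
            for descriptor in template.descriptors:
                match = re.search(descriptor.regex, text)
                if match:
                    fields[descriptor.field] = match.group(1)
            # Only emit a record when the template matched something.
            if fields:
                records.append(
                    {"source": source_id, "template": template.name, "fields": fields}
                )
    return records


def store(records):
    """Serialize the records to JSON, standing in for the storage step."""
    return json.dumps(records, sort_keys=True)


# Made-up sample input, for illustration only.
sources = {"memo.txt": "From: A. Example\nDate: 2003-11-14\nSubject: Review"}
templates = [
    Template(
        "header",
        [
            PatternDescriptor("sender", r"From: (.+)"),
            PatternDescriptor("date", r"Date: (\d{4}-\d{2}-\d{2})"),
        ],
    )
]

records = apply_templates(sources, templates)
stored = store(records)
```

Keeping the descriptors as plain data separate from the parsing loop mirrors the abstract's split between selecting descriptors and applying templates, so new templates can be added without changing the extraction code.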