-
Publication No.: US20190286692A1
Publication Date: 2019-09-19
Application No.: US16270621
Filing Date: 2019-02-08
Applicant: Hitachi, Ltd.
Inventor: Ryosuke ODATE , Hiroshi SHINJO , Yasufumi SUZUKI , Masahiro MOTOBAYASHI
Abstract: A computing machine managing a template manages a document format, a template, and a cluster generated on the basis of a classification result based on a position of the document in a feature space in such a manner that they correspond to one another; determines whether a cluster to which a target document can belong is present on the basis of a position, in the feature space, of the target document in a case of detecting an opportunity of generating a template of the target document; registers the template of the target document in a case of determining that the cluster to which the target document can belong is not present; generates a cluster corresponding to the registered template; and manages a document format of the target document, the registered template, and the generated cluster in such a manner that they correspond to one another.
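The registration flow in this abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the `Cluster` and `TemplateRegistry` names, the Euclidean distance test, and the fixed `radius` membership threshold are all assumptions introduced here, since the abstract does not say how cluster membership is decided.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Cluster:
    # Hypothetical record tying a cluster to its template and document format.
    centroid: tuple
    radius: float
    template_id: str
    document_format: str


@dataclass
class TemplateRegistry:
    # Assumption: a document belongs to a cluster if it lies within `radius`
    # of the cluster centroid in the feature space.
    radius: float = 1.0
    clusters: list = field(default_factory=list)

    def find_cluster(self, features):
        """Return the first cluster the document can belong to, or None."""
        for cluster in self.clusters:
            if math.dist(cluster.centroid, features) <= cluster.radius:
                return cluster
        return None

    def register_if_new(self, features, document_format, template_id):
        """Register a template and a new cluster only when no cluster fits.

        Returns (cluster, created), where `created` is True if a new
        cluster/template pair was registered.
        """
        existing = self.find_cluster(features)
        if existing is not None:
            return existing, False
        cluster = Cluster(tuple(features), self.radius, template_id, document_format)
        self.clusters.append(cluster)
        return cluster, True
```

Calling `register_if_new` for each target document keeps the document format, template, and cluster linked in a single record, which mirrors the correspondence the abstract describes.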
-
Publication No.: US20180349693A1
Publication Date: 2018-12-06
Application No.: US15918830
Filing Date: 2018-03-12
Applicant: HITACHI, LTD.
Inventor: Yasuo WATANABE , Toshio OKOCHI , Hiroshi SHINJO , Masahiro MOTOBAYASHI , Yasufumi SUZUKI
IPC: G06K9/00
CPC classification number: G06K9/00469 , G06K9/00463 , G06K9/00483 , G06K9/6202 , G06K2209/01 , G06Q30/04
Abstract: A computer configured to extract an attribute, being a character string indicating a feature of a paper-based document, stores template information and dictionary information. The computer is configured to: execute character recognition processing on image data on the paper-based document; extract an attribute corresponding to each of the at least one type of attribute, which is defined in each of the plurality of templates, through use of a result of the character recognition processing and the plurality of templates; calculate a score regarding the extracted attribute for each of the plurality of templates; select one of the plurality of templates that has the highest extraction accuracy of the attribute based on the score; and generate output information through use of the selected template.
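The extract, score, and select steps in this abstract could look something like the sketch below. Everything specific here is an assumption made for illustration: templates are modeled as regular-expression patterns per attribute type, the dictionary as a set of known values per attribute type, and the score as one point per extracted attribute plus one point when the value appears in the dictionary. The abstract does not define the actual scoring rule.

```python
import re


def extract(ocr_text, template):
    """Extract one value per attribute type defined in the template."""
    extracted = {}
    for attr, pattern in template["patterns"].items():
        match = re.search(pattern, ocr_text)
        if match:
            extracted[attr] = match.group(1)
    return extracted


def score(extracted, template, dictionary):
    """Hypothetical score: 1 per extracted attribute, +1 if the dictionary knows the value."""
    total = 0.0
    for attr in template["patterns"]:
        value = extracted.get(attr)
        if value is None:
            continue
        total += 1.0
        if value in dictionary.get(attr, set()):
            total += 1.0
    return total


def select_template(ocr_text, templates, dictionary):
    """Return (template name, extracted attributes, score) for the best-scoring template."""
    best = None
    for template in templates:
        extracted = extract(ocr_text, template)
        template_score = score(extracted, template, dictionary)
        if best is None or template_score > best[2]:
            best = (template["name"], extracted, template_score)
    return best
```

The attributes extracted with the winning template would then be used to build the output information; ties go to the template listed first.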
-