-
Publication No.: US20140281878A1
Publication date: 2014-09-18
Application No.: US14354115
Filing date: 2011-10-27
Applicant: Shahar Golan , Omer Barkol , Ruth Bergman , Ira Cohen , Gal Noy
Inventor: Shahar Golan , Omer Barkol , Ruth Bergman , Ira Cohen , Gal Noy
IPC classification: G06F17/24
CPC classification: G06F17/241 , G06F16/355
Abstract: Methods and systems of aligning annotation of fields of documents are provided. Training information that includes first measurement information pertaining to features of each of a plurality of fields associated with training clusters for documents of a document type is accessed. A first training cluster is annotated with a first name and the second training cluster is annotated with a second name. An electronic classification model is generated based on the training information. Second measurement information for features of fields associated with new clusters of a new document is accessed. Each of the new clusters is automatically annotated based on the second measurement information using the classification model. For example, a first new cluster that has fields of the first field type is annotated with the first name and a second new cluster that has fields of the second field type is annotated with the second name.
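The flow described in the abstract (train on annotated clusters, then annotate the clusters of a new document) could be sketched roughly as follows. This is only an illustration under stated assumptions, not the method claimed in the patent: the nearest-centroid classifier, the three per-field features, and the names `invoice_number` and `customer_name` are all hypothetical choices made for the example.

```python
from math import dist
from statistics import fmean


def centroid(vectors):
    """Component-wise mean of equal-length feature vectors."""
    return [fmean(column) for column in zip(*vectors)]


def train(training_clusters):
    """Build a toy classification model: one centroid per annotated name.

    training_clusters maps an annotation name to the feature vectors
    (the "first measurement information") of the fields in that cluster.
    """
    return {name: centroid(fields) for name, fields in training_clusters.items()}


def annotate(model, new_clusters):
    """Annotate each new cluster with the name of the nearest training centroid."""
    names = []
    for fields in new_clusters:
        c = centroid(fields)
        names.append(min(model, key=lambda name: dist(model[name], c)))
    return names


# Hypothetical features per field:
# (horizontal position, width, fraction of digit characters)
training = {
    "invoice_number": [[0.80, 0.10, 0.90], [0.82, 0.12, 1.00]],
    "customer_name": [[0.10, 0.40, 0.00], [0.12, 0.35, 0.05]],
}
model = train(training)

# Clusters of fields from a new document (the "second measurement information").
new_document = [
    [[0.11, 0.38, 0.02], [0.09, 0.41, 0.00]],
    [[0.79, 0.11, 0.95]],
]
labels = annotate(model, new_document)
print(labels)
```

Nearest-centroid matching is used here only because it is short and dependency-free; the abstract does not say which kind of classification model is generated.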
-
Publication No.: US10402484B2
Publication date: 2019-09-03
Application No.: US14354115
Filing date: 2011-10-27
Applicant: Shahar Golan , Omer Barkol , Ruth Bergman , Ira Cohen , Gal Noy
Inventor: Shahar Golan , Omer Barkol , Ruth Bergman , Ira Cohen , Gal Noy
Abstract: Methods and systems of aligning annotation of fields of documents are provided. Training information that includes first measurement information pertaining to features of each of a plurality of fields associated with training clusters for documents of a document type is accessed. A first training cluster is annotated with a first name and the second training cluster is annotated with a second name. An electronic classification model is generated based on the training information. Second measurement information for features of fields associated with new clusters of a new document is accessed. Each of the new clusters is automatically annotated based on the second measurement information using the classification model. For example, a first new cluster that has fields of the first field type is annotated with the first name and a second new cluster that has fields of the second field type is annotated with the second name.
-