Aligning Annotation of Fields of Documents
    1.
    Patent application
    Aligning Annotation of Fields of Documents (under examination, published)

    Publication No.: US20140281878A1

    Publication date: 2014-09-18

    Application No.: US14354115

    Filing date: 2011-10-27

    IPC classification: G06F17/24

    CPC classification: G06F17/241 G06F16/355

    Abstract: Methods and systems of aligning annotation of fields of documents are provided. Training information that includes first measurement information pertaining to features of each of a plurality of fields associated with training clusters for documents of a document type is accessed. A first training cluster is annotated with a first name and a second training cluster is annotated with a second name. An electronic classification model is generated based on the training information. Second measurement information for features of fields associated with new clusters of a new document is accessed. Each of the new clusters is automatically annotated based on the second measurement information using the classification model. For example, a first new cluster that has fields of the first field type is annotated with the first name and a second new cluster that has fields of the second field type is annotated with the second name.

    Aligning annotation of fields of documents

    Publication No.: US10402484B2

    Publication date: 2019-09-03

    Application No.: US14354115

    Filing date: 2011-10-27

    IPC classification: G06F17/00 G06F17/24 G06F17/30

    Abstract: Methods and systems of aligning annotation of fields of documents are provided. Training information that includes first measurement information pertaining to features of each of a plurality of fields associated with training clusters for documents of a document type is accessed. A first training cluster is annotated with a first name and a second training cluster is annotated with a second name. An electronic classification model is generated based on the training information. Second measurement information for features of fields associated with new clusters of a new document is accessed. Each of the new clusters is automatically annotated based on the second measurement information using the classification model. For example, a first new cluster that has fields of the first field type is annotated with the first name and a second new cluster that has fields of the second field type is annotated with the second name.
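
    Illustrative sketch (not taken from the patent): the abstract describes training a classification model on feature measurements of fields in named training clusters, then using it to name the clusters of a new document. The abstract does not say which features or which classifier are used, so the Python below assumes a nearest-centroid classifier and two hypothetical features per field (fraction of digit characters and a normalized text length). The function names and cluster names are likewise made up for illustration.

```python
from math import dist
from statistics import fmean


def centroid(vectors):
    """Component-wise mean of equally sized feature vectors."""
    return [fmean(component) for component in zip(*vectors)]


def train_model(training_clusters):
    """Build a {name: centroid} model from annotated training clusters.

    training_clusters maps an annotation name to the feature vectors of
    the fields in that cluster (the "first measurement information").
    Nearest-centroid is an assumed stand-in for the unspecified
    classification model.
    """
    return {name: centroid(fields) for name, fields in training_clusters.items()}


def annotate_clusters(model, new_clusters):
    """Return, for each new cluster, the name of the nearest trained centroid."""
    names = []
    for fields in new_clusters:
        c = centroid(fields)  # summary of the "second measurement information"
        names.append(min(model, key=lambda name: dist(model[name], c)))
    return names


# Hypothetical features per field: (fraction of digits, text length / 20).
training = {
    "invoice_number": [(0.9, 0.40), (1.0, 0.35)],
    "customer_name": [(0.0, 0.80), (0.1, 0.70)],
}
model = train_model(training)

# Unlabeled clusters extracted from a new document of the same type.
new_doc_clusters = [
    [(0.05, 0.75), (0.0, 0.85)],
    [(0.95, 0.30)],
]
labels = annotate_clusters(model, new_doc_clusters)
print(labels)
```

    With these made-up values the first new cluster receives the name "customer_name" and the second "invoice_number", mirroring the abstract's example of new clusters inheriting the names given to training clusters of the same field type.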