Unsupervised domain adaptation from generic forms for new OCR forms

    Publication No.: US11055560B2

    Publication date: 2021-07-06

    Application No.: US16413244

    Application date: 2019-05-15

    Abstract: The disclosed technology is generally directed to optical text recognition for forms. In one example of the technology, line grouping rules are generated based on the generic forms and a ground truth for the generic forms. Line groupings are applied to the generic forms based on the line grouping rules. Feature extraction rules are generated. Features are extracted from the generic forms based on the feature extraction rules. A key-value classifier model is generated, such that the key-value classifier model is configured to determine, for each line of a form: a probability that the line is a value, and a probability that the line is a key. A key-value pairing model is generated, such that the key-value pairing model is configured to predict, for each key in a form, which value in the form corresponds to the key.
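
The last two stages named in the abstract (a per-line key/value classifier followed by a key-value pairing step) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the patent: the `Line` record, the two features, the hand-picked weights, and the nearest-neighbour pairing rule are hypothetical stand-ins for the trained models the abstract describes. The sketch also simplifies by treating the value probability as the complement of the key probability.

```python
import math
from dataclasses import dataclass


@dataclass
class Line:
    """One OCR text line with the top-left corner of its bounding box (hypothetical record)."""
    text: str
    x: float
    y: float


def extract_features(line):
    """Toy feature extraction: trailing colon and fraction of digit characters."""
    text = line.text.strip()
    digits = sum(ch.isdigit() for ch in text)
    return {
        "ends_with_colon": 1.0 if text.endswith(":") else 0.0,
        "digit_fraction": digits / len(text) if text else 0.0,
    }


def classify(line):
    """Return toy probabilities that a line is a key or a value.

    The weights are hand-picked for illustration; a real classifier
    would learn them from generic forms and their ground truth.
    """
    f = extract_features(line)
    score = 4.0 * f["ends_with_colon"] - 3.0 * f["digit_fraction"] - 1.0
    p_key = 1.0 / (1.0 + math.exp(-score))
    return {"key": p_key, "value": 1.0 - p_key}


def pair_keys_with_values(lines, threshold=0.5):
    """Pair each predicted key with the nearest predicted value to its right or below.

    This geometric rule is an assumed stand-in for a learned pairing model.
    """
    probs = [classify(line) for line in lines]
    keys = [l for l, p in zip(lines, probs) if p["key"] >= threshold]
    values = [l for l, p in zip(lines, probs) if p["key"] < threshold]
    pairs = {}
    for k in keys:
        candidates = [v for v in values if v.x >= k.x and v.y >= k.y]
        if candidates:
            best = min(candidates, key=lambda v: math.hypot(v.x - k.x, v.y - k.y))
            pairs[k.text] = best.text
    return pairs


form = [
    Line("Name:", 0, 0),
    Line("Jane Doe", 50, 0),
    Line("Invoice No:", 0, 20),
    Line("12345", 50, 20),
]
print(pair_keys_with_values(form))
```

On this four-line toy form, the two lines ending in a colon are scored as keys, and each is paired with the value on the same row. The point of the sketch is the separation the abstract describes: classification produces per-line probabilities, and pairing consumes them as a distinct step.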
