Machine Learning Based Document Visual Element Extraction

    Publication number: US20230419020A1

    Publication date: 2023-12-28

    Application number: US17808293

    Application date: 2022-06-22

    Applicant: Google LLC

    CPC classification number: G06F40/174 G06V30/412 G06V30/19

    Abstract: A method includes obtaining a document with textual fields and a visual element. For each textual field, the method includes determining a textual offset for the textual field that indicates a location of the textual field relative to each other textual field in the document. The method includes detecting, using a machine learning vision model, the visual element and determining a visual element offset indicating a location of the visual element relative to each textual field in the document. The method includes assigning the visual element a visual element anchor token and inserting the visual element anchor token into the textual fields in an order based on the visual element offset and the respective textual offsets. The method also includes, after inserting the visual element anchor token, extracting, using a text-based extraction model, from the textual fields, structured entities representing the series of textual fields and the visual element.
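The anchor-token insertion step in the abstract can be illustrated with a minimal Python sketch. It is not the claimed implementation. It assumes each textual field and the detected visual element carry absolute top-left page coordinates, and it approximates the relative "offsets" by a simple top-to-bottom, left-to-right reading order. The names `TextField`, `insert_anchor_token`, and the token string `[VISUAL_0]` are hypothetical. The vision model and the text-based extraction model are outside the scope of the sketch, so the visual element's position is supplied directly.

```python
import bisect
from dataclasses import dataclass


@dataclass
class TextField:
    """A textual field with an assumed top-left page position."""
    text: str
    x: float
    y: float


def reading_key(x: float, y: float) -> tuple[float, float]:
    # Assumed ordering: rows first (y), then columns within a row (x).
    return (y, x)


def insert_anchor_token(
    fields: list[TextField],
    visual_x: float,
    visual_y: float,
    anchor: str = "[VISUAL_0]",  # hypothetical anchor token
) -> list[str]:
    """Return the field texts in reading order with the anchor token
    inserted at the position implied by the visual element's location."""
    ordered = sorted(fields, key=lambda f: reading_key(f.x, f.y))
    keys = [reading_key(f.x, f.y) for f in ordered]
    tokens = [f.text for f in ordered]
    # Find where the visual element falls among the ordered fields.
    position = bisect.bisect_left(keys, reading_key(visual_x, visual_y))
    tokens.insert(position, anchor)
    return tokens


fields = [
    TextField("Subscribe", x=20, y=10),
    TextField("Name:", x=0, y=0),
    TextField("Jane Doe", x=50, y=0),
]
# A checkbox detected to the left of "Subscribe" on the second row.
sequence = insert_anchor_token(fields, visual_x=0, visual_y=10)
print(sequence)
```

With the anchor token placed next to the field it belongs to, a downstream text-only extraction model can treat the visual element as one more token in the sequence, which is the motivation the abstract gives for inserting it before extraction.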
