Publication number: US20230419020A1
Publication date: 2023-12-28
Application number: US17808293
Filing date: 2022-06-22
Applicant: Google LLC
Inventor: Nikolay Glushnev, Qingze Wang, Emmanouil Koukoumidis, Henry Wahyudi Setiawan, Lauro Ivo Beltrao Colaco Costa, Vincent Perot
IPC: G06F40/174, G06V30/412, G06V30/19
CPC classification number: G06F40/174, G06V30/412, G06V30/19
Abstract: A method includes obtaining a document with a series of textual fields and a visual element. For each textual field, the method includes determining a textual offset for the textual field that indicates a location of the textual field relative to each other textual field in the document. The method includes detecting, using a machine learning vision model, the visual element and determining a visual element offset indicating a location of the visual element relative to each textual field in the document. The method includes assigning the visual element a visual element anchor token and inserting the visual element anchor token into the series of textual fields in an order based on the visual element offset and the respective textual offsets. The method also includes, after inserting the visual element anchor token, extracting, using a text-based extraction model, from the series of textual fields, structured entities representing the series of textual fields and the visual element.
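
The anchor-token insertion step described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes that offsets are derived from top-left coordinates in top-to-bottom, left-to-right reading order, which the abstract does not specify. The names `Box`, `textual_offsets`, `visual_offset`, `insert_anchor`, and the token `[CHECKBOX_CHECKED]` are hypothetical, and both the vision model that detects the visual element and the text-based extraction model are left out.

```python
from dataclasses import dataclass


@dataclass
class Box:
    """A detected item on the page (hypothetical structure)."""
    text: str
    top: float   # assumed: y coordinate of the top-left corner
    left: float  # assumed: x coordinate of the top-left corner


def textual_offsets(fields):
    """Give each field its rank in the assumed reading order."""
    order = sorted(range(len(fields)),
                   key=lambda i: (fields[i].top, fields[i].left))
    offsets = [0] * len(fields)
    for rank, index in enumerate(order):
        offsets[index] = rank
    return offsets


def visual_offset(fields, visual):
    """Count the fields that precede the visual element in reading order."""
    position = (visual.top, visual.left)
    return sum((f.top, f.left) < position for f in fields)


def insert_anchor(fields, visual, anchor_token):
    """Order the field texts by offset and splice in the anchor token."""
    offsets = textual_offsets(fields)
    ranked = sorted(zip(offsets, fields), key=lambda pair: pair[0])
    ordered = [field.text for _, field in ranked]
    slot = visual_offset(fields, visual)
    return ordered[:slot] + [anchor_token] + ordered[slot:]


# Fields arrive in arbitrary order; the checkbox position stands in for
# the output of a vision model, which this sketch does not run.
fields = [
    Box("Date:", 70, 10),
    Box("Name:", 10, 10),
    Box("Approved:", 40, 10),
    Box("Jane Doe", 10, 80),
]
checkbox = Box("", 40, 90)
sequence = insert_anchor(fields, checkbox, "[CHECKBOX_CHECKED]")
print(sequence)
```

Under these assumptions the token lands directly after "Approved:", so a text-only extraction model reading the resulting sequence would see the checkbox state next to the field it belongs to.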