Publication number: US20240257550A1
Publication date: 2024-08-01
Application number: US18686233
Filing date: 2022-08-25
Applicant: Google LLC
Inventors: Henri Rebecq, Federico Tombari, Diego Martin Arroyo
IPC: G06V30/416, G06V10/44, G06V10/82, G06V30/412
CPC classification number: G06V30/416, G06V10/44, G06V10/82, G06V30/412
Abstract: A method including receiving an image representing a document including a plurality of layout components, identifying textual information associated with the plurality of layout components, identifying visual information associated with the plurality of layout components, combining the textual information with the visual information, and predicting a reading order of the plurality of layout components based on the combined textual information and visual information using a self-attention encoder/decoder.
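The pipeline named in the abstract (combine per-component textual and visual information, apply self-attention, predict an order) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the patent: the feature vectors are hand-made, the combination is plain concatenation, the attention is a single head with identity projections, and the decoder is replaced by a linear score followed by a sort. The names `combine`, `self_attention`, `predict_reading_order`, and `scorer` are hypothetical.

```python
import math
from typing import List

Vector = List[float]


def combine(text_feats: List[Vector], visual_feats: List[Vector]) -> List[Vector]:
    """Concatenate each component's textual and visual feature vectors (assumed fusion)."""
    return [t + v for t, v in zip(text_feats, visual_feats)]


def softmax(xs: Vector) -> Vector:
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(feats: List[Vector]) -> List[Vector]:
    """Single-head scaled dot-product self-attention with identity projections."""
    d = len(feats[0])
    out = []
    for q in feats:
        # Similarity of this component to every component, scaled by sqrt(d).
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in feats]
        weights = softmax(scores)
        # Each output vector is the attention-weighted average of all inputs.
        out.append([sum(w * f[i] for w, f in zip(weights, feats)) for i in range(d)])
    return out


def predict_reading_order(
    text_feats: List[Vector], visual_feats: List[Vector], scorer: Vector
) -> List[int]:
    """Return component indices sorted by a linear score (stand-in for a decoder)."""
    contextual = self_attention(combine(text_feats, visual_feats))
    scores = [sum(w * x for w, x in zip(scorer, c)) for c in contextual]
    return sorted(range(len(scores)), key=lambda i: scores[i])


# Three hypothetical layout components: 2-d text features and 2-d visual features.
text_feats = [[0.1, 0.0], [0.9, 0.2], [0.5, 0.1]]
visual_feats = [[0.0, 0.1], [0.0, 0.9], [0.0, 0.5]]
scorer = [0.0, 0.0, 0.0, 1.0]  # assumed weights; a real model would learn these
order = predict_reading_order(text_feats, visual_feats, scorer)
print(order)
```

In a trained system the projections, the scoring weights, and the decoding step would be learned from annotated documents; the sort here only shows where an ordering decision would be made.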