1.
Publication No.: US20240177515A1
Publication Date: 2024-05-30
Application No.: US17994977
Filing Date: 2022-11-28
Applicant: SAP SE
Inventor: SOHYEONG KIM, Xiang YU
IPC: G06V30/414, G06V30/19
CPC classification number: G06V30/414, G06V30/19007
Abstract: Embodiments are described for a system comprising a memory and at least one processor coupled to the memory. The at least one processor is configured to receive optical character recognition (OCR) information of a document and to determine beginning, inside, and outside (BIO) tags and labels of one or more word boxes based on the OCR information. The at least one processor is further configured to group a first word box and a second word box based on the BIO tags of the first and second word boxes, and to merge the first and second word boxes into a combined word box when the label of the first word box matches the label of the second word box. Finally, the at least one processor is configured to output the combined word box and the label of the first word box.
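The grouping and merging steps described in the abstract can be illustrated with a minimal Python sketch. It is not the claimed implementation: the `WordBox` structure, the `merge_word_boxes` function, the coordinate fields, and the example labels are all hypothetical names introduced here for illustration. The sketch also assumes that the BIO tags and labels have already been predicted upstream, and that a combined box is the smallest rectangle enclosing both inputs.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class WordBox:
    """Hypothetical word box: OCR text, bounding box, BIO tag, and label."""
    text: str
    x0: float
    y0: float
    x1: float
    y1: float
    bio: str    # "B" (beginning), "I" (inside), or "O" (outside)
    label: str  # entity label; unused when bio == "O"


def _merge_pair(a: WordBox, b: WordBox) -> WordBox:
    """Combine two boxes into the smallest rectangle enclosing both."""
    return WordBox(
        text=f"{a.text} {b.text}",
        x0=min(a.x0, b.x0),
        y0=min(a.y0, b.y0),
        x1=max(a.x1, b.x1),
        y1=max(a.y1, b.y1),
        bio="B",
        label=a.label,  # the combined box keeps the first box's label
    )


def merge_word_boxes(boxes: List[WordBox]) -> List[WordBox]:
    """Group boxes in reading order by BIO tag and merge matching labels.

    Assumption: an "I" box continues the current group only if its label
    matches; otherwise it starts a new group. "O" boxes are dropped.
    """
    merged: List[WordBox] = []
    current: Optional[WordBox] = None
    for box in boxes:
        if box.bio == "I" and current is not None and box.label == current.label:
            current = _merge_pair(current, box)
            continue
        if current is not None:
            merged.append(current)
            current = None
        if box.bio in ("B", "I"):
            current = WordBox(box.text, box.x0, box.y0, box.x1, box.y1, "B", box.label)
    if current is not None:
        merged.append(current)
    return merged
```

For example, given a "B" box and an "I" box that share a hypothetical `vendor` label, the function returns a single combined box spanning both, together with that label. Boxes tagged "O" are left out of the output in this sketch.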