1.
Publication number: US20240363207A1
Publication date: 2024-10-31
Application number: US18307481
Filing date: 2023-04-26
Applicant: Google LLC
Inventor: Itay Laish, Natan Potikha, Rachana Fellinger, Thidanun Saensuksopa, Eran Ofek, Ayelet Benjamini
IPC: G16H10/60, G06F16/215, G06F16/23
CPC classification number: G16H10/60, G06F16/215, G06F16/2365
Abstract: Systems and methods for detecting duplications in electronic record systems are provided. A computing system can include one or more processors and a non-transitory computer-readable memory that stores instructions that, when executed by the one or more processors, cause the computing system to perform operations including accessing one or more scanned documents; converting each document of the one or more scanned documents into one or more text streams; determining one or more characteristics of each document of the one or more scanned documents; responsive to determining the one or more characteristics, generating respective embeddings associated with each document of the one or more scanned documents; and determining a respective similarity score for each document of the one or more scanned documents based, at least in part, on a similarity metric.
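The last two operations in the abstract (generating an embedding per document, then scoring each document with a similarity metric) can be illustrated with a minimal sketch. It is not the method claimed in the application. It assumes the scanned documents have already been converted to text streams, skips the step of determining document characteristics, and swaps in a hashed bag-of-words vector for whatever embedding model the applicant uses. The names `embed`, `cosine`, `similarity_scores`, and `DIM` are hypothetical, and cosine similarity is only one possible similarity metric.

```python
import hashlib
import math
import re

DIM = 256  # hypothetical embedding size, chosen only for illustration


def embed(text: str) -> list[float]:
    """Toy stand-in embedding: hashed bag-of-words, L2-normalized."""
    vec = [0.0] * DIM
    for token in re.findall(r"[a-z0-9]+", text.lower()):
        digest = hashlib.md5(token.encode("utf-8")).digest()
        vec[int.from_bytes(digest[:4], "big") % DIM] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two vectors that are already unit length (or zero)."""
    return sum(x * y for x, y in zip(a, b))


def similarity_scores(texts: list[str]) -> list[float]:
    """For each document, return its highest similarity to any other document."""
    embeddings = [embed(t) for t in texts]
    scores = []
    for i, e in enumerate(embeddings):
        others = [cosine(e, o) for j, o in enumerate(embeddings) if j != i]
        scores.append(max(others, default=0.0))
    return scores


if __name__ == "__main__":
    docs = [
        "Patient John Doe, lab results 2023-01-05",
        "patient john doe lab results 2023-01-05",
        "Discharge summary for Jane Roe",
    ]
    for doc, score in zip(docs, similarity_scores(docs)):
        print(f"{score:.3f}  {doc}")
```

In this sketch a document whose score exceeds some chosen threshold would be flagged as a likely duplicate. The threshold, like the embedding, is an assumption that the abstract does not specify.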