-
Publication (Announcement) No.: US12198459B2
Publication (Announcement) Date: 2025-01-14
Application No.: US17534744
Application Date: 2021-11-24
Applicant: Adobe Inc.
Inventor: Natwar Modani , Vaidehi Ramesh Patil , Inderjeet Jayakumar Nair , Gaurav Verma , Anurag Maurya , Anirudh Kanfade
IPC: G06K9/34 , G06V30/19 , G06V30/262 , G06V30/413 , G06V30/414 , G06V30/418
Abstract: In implementations of systems for generating indications of relationships between electronic documents, a processing device implements a relationship system to segment text of electronic documents included in a document corpus into segments. The relationship system determines a subset of the electronic documents that includes electronic document pairs having a number of similar segments that is greater than a threshold number. The similar segments are identified using locality sensitive hashing. The electronic document pairs are classified as related documents or unrelated documents using a machine learning model that receives a pair of electronic documents as an input and generates an indication of a classification for the pair of electronic documents as an output. Indications of relationships between particular electronic documents included in the subset are generated based at least partially on the electronic document pairs that are classified as related documents.
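Illustrative sketch (not part of the patent text): the candidate-selection step described in the abstract, where document pairs are kept only if they share more than a threshold number of similar segments found by locality sensitive hashing, might look roughly like the following. The sentence-based segmentation, word 3-gram shingles, MinHash signatures with banding, the constants `NUM_HASHES` and `BANDS`, and all function names are assumptions chosen for illustration; the abstract does not specify any of them. The machine learning classifier that labels the surviving pairs as related or unrelated is left out.

```python
import hashlib
import re
from collections import defaultdict
from itertools import combinations

NUM_HASHES = 32  # assumed MinHash signature length
BANDS = 8        # assumed number of LSH bands (4 rows per band)


def segment(text):
    """Split text into sentence-like segments (an assumed segmentation rule)."""
    return [s.strip().lower() for s in re.split(r"[.!?]\s*", text) if s.strip()]


def shingles(seg, k=3):
    """Return the set of word k-grams of a segment."""
    words = seg.split()
    if len(words) < k:
        return {" ".join(words)}
    return {" ".join(words[i:i + k]) for i in range(len(words) - k + 1)}


def _hash(seed, token):
    """Deterministic 64-bit hash of a token under a given seed."""
    data = f"{seed}:{token}".encode("utf-8")
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def minhash(tokens):
    """MinHash signature: the minimum hash of the token set under each seed."""
    return tuple(min(_hash(seed, t) for t in tokens) for seed in range(NUM_HASHES))


def candidate_pairs(docs, threshold=1):
    """Map each document pair to its count of similar segment pairs,
    keeping only pairs whose count is greater than `threshold`."""
    rows = NUM_HASHES // BANDS
    # (band index, band values) -> {(doc_id, segment index)}
    buckets = defaultdict(set)
    for doc_id, text in docs.items():
        for seg_idx, seg in enumerate(segment(text)):
            sig = minhash(shingles(seg))
            for b in range(BANDS):
                band = sig[b * rows:(b + 1) * rows]
                buckets[(b, band)].add((doc_id, seg_idx))

    # (doc_a, doc_b) -> set of (segment index in a, segment index in b)
    similar = defaultdict(set)
    for members in buckets.values():
        # Sorting keeps the document order within each pair consistent.
        for (da, sa), (db, sb) in combinations(sorted(members), 2):
            if da != db:
                similar[(da, db)].add((sa, sb))

    return {pair: len(m) for pair, m in similar.items() if len(m) > threshold}
```

Segments that land in the same bucket for at least one band are treated as similar, so the band and row counts control how strict the match is. In a fuller pipeline, the pairs returned by `candidate_pairs` would then be passed to a pairwise classifier, which is the role the abstract assigns to the machine learning model.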
-
Publication (Announcement) No.: US20230162518A1
Publication (Announcement) Date: 2023-05-25
Application No.: US17534744
Application Date: 2021-11-24
Applicant: Adobe Inc.
Inventor: Natwar Modani , Vaidehi Ramesh Patil , Inderjeet Jayakumar Nair , Gaurav Verma , Anurag Maurya , Anirudh Kanfade
IPC: G06V30/413 , G06V30/262 , G06V30/414 , G06V30/418
CPC classification number: G06V30/413 , G06V30/274 , G06V30/414 , G06V30/418
Abstract: In implementations of systems for generating indications of relationships between electronic documents, a processing device implements a relationship system to segment text of electronic documents included in a document corpus into segments. The relationship system determines a subset of the electronic documents that includes electronic document pairs having a number of similar segments that is greater than a threshold number. The similar segments are identified using locality sensitive hashing. The electronic document pairs are classified as related documents or unrelated documents using a machine learning model that receives a pair of electronic documents as an input and generates an indication of a classification for the pair of electronic documents as an output. Indications of relationships between particular electronic documents included in the subset are generated based at least partially on the electronic document pairs that are classified as related documents.
-