- Patent title: Systems and methods for automatic clustering and canonical designation of related data in various data structures
- Application number: US17812984
- Filing date: 2022-07-15
- Publication number: US11704325B2
- Publication date: 2023-07-18
- Inventors: Lawrence Manning, Rahul Mehta, Daniel Erenrich, Guillem Palou Visa, Roger Hu, Xavier Falco, Rowan Gilmore, Eli Bingham, Jason Prestinario, Yifei Huang, Daniel Fernandez, Jeremy Elser, Clayton Sader, Rahul Agarwal, Matthew Elkherj, Nicholas Latourette, Aleksandr Zamoshchin
- Applicant: Palantir Technologies Inc.
- Applicant address: US CA Palo Alto
- Patent holder: Palantir Technologies Inc.
- Current patent holder: Palantir Technologies Inc.
- Current patent holder address: US CO Denver
- Agent: Knobbe, Martens, Olson & Bear, LLP
- Main classification: G06F16/00
- IPC classifications: G06F16/00; G06F16/2457; G06F16/35; G06F16/9535; G06F16/28; G06F18/23
Abstract:
Computer implemented systems and methods are disclosed for automatically clustering and canonically identifying related data in various data structures. Data structures may include a plurality of records, wherein each record is associated with a respective entity. In accordance with some embodiments, the systems and methods further comprise identifying clusters of records associated with a respective entity by grouping the records into pairs, analyzing the respective pairs to determine a probability that both members of the pair relate to a common entity, and identifying a cluster of overlapping pairs to generate a collection of records relating to a common entity. Clusters may further be analyzed to determine canonical names or other properties for the respective entities by analyzing record fields and identifying similarities.
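The flow described in the abstract (pair the records, score each pair, merge overlapping pairs into clusters, then pick a canonical name) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the patent: the `pair_probability` function uses plain string similarity as a stand-in for whatever pair model an embodiment might use, the `0.8` threshold is arbitrary, the union-find merge is one possible way to combine overlapping pairs, and `canonical_name` simply takes the most frequent name in a cluster.

```python
from collections import Counter, defaultdict
from difflib import SequenceMatcher
from itertools import combinations


def pair_probability(a: dict, b: dict) -> float:
    """Hypothetical pair score: case-insensitive similarity of the 'name' field."""
    return SequenceMatcher(None, a["name"].lower(), b["name"].lower()).ratio()


def cluster_records(records: list[dict], threshold: float = 0.8) -> list[list[dict]]:
    """Group records whose pairwise score meets the (assumed) threshold."""
    parent = list(range(len(records)))  # union-find forest over record indices

    def find(i: int) -> int:
        # Walk up to the root, halving the path along the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Score every pair; pairs that share a record end up in the same cluster.
    for i, j in combinations(range(len(records)), 2):
        if pair_probability(records[i], records[j]) >= threshold:
            parent[find(i)] = find(j)

    groups = defaultdict(list)
    for i, record in enumerate(records):
        groups[find(i)].append(record)
    return list(groups.values())


def canonical_name(cluster: list[dict]) -> str:
    """Assumed rule: the most frequent exact name in the cluster."""
    return Counter(record["name"] for record in cluster).most_common(1)[0][0]


if __name__ == "__main__":
    sample = [
        {"name": "Acme Corp"},
        {"name": "Acme Corp"},
        {"name": "Acme Corp."},
        {"name": "Globex Inc"},
    ]
    for cluster in cluster_records(sample):
        print(canonical_name(cluster), len(cluster))
```

Scoring every pair is quadratic in the number of records, so a sketch like this only suits small inputs; a larger data set would need some way to limit which pairs are compared.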