-
Publication number: US11704325B2
Publication date: 2023-07-18
Application number: US17812984
Filing date: 2022-07-15
Applicant: Palantir Technologies Inc.
Inventor: Lawrence Manning , Rahul Mehta , Daniel Erenrich , Guillem Palou Visa , Roger Hu , Xavier Falco , Rowan Gilmore , Eli Bingham , Jason Prestinario , Yifei Huang , Daniel Fernandez , Jeremy Elser , Clayton Sader , Rahul Agarwal , Matthew Elkherj , Nicholas Latourette , Aleksandr Zamoshchin
IPC: G06F16/00 , G06F16/2457 , G06F16/35 , G06F16/9535 , G06F16/28 , G06F18/23
CPC classification number: G06F16/24578 , G06F16/285 , G06F16/35 , G06F16/9535 , G06F18/23
Abstract: Computer implemented systems and methods are disclosed for automatically clustering and canonically identifying related data in various data structures. Data structures may include a plurality of records, wherein each record is associated with a respective entity. In accordance with some embodiments, the systems and methods further comprise identifying clusters of records associated with a respective entity by grouping the records into pairs, analyzing the respective pairs to determine a probability that both members of the pair relate to a common entity, and identifying a cluster of overlapping pairs to generate a collection of records relating to a common entity. Clusters may further be analyzed to determine canonical names or other properties for the respective entities by analyzing record fields and identifying similarities.
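The pairing, scoring, and clustering steps in this abstract can be illustrated with a minimal sketch. Everything named here is an assumption for illustration only: the `match_probability` function uses plain string similarity as a stand-in for whatever pairwise model the patent contemplates, the 0.8 threshold is arbitrary, overlapping pairs are merged with a union-find structure, and the canonical name is simply the most frequent name in the cluster.

```python
from collections import Counter, defaultdict
from difflib import SequenceMatcher
from itertools import combinations


def match_probability(a, b):
    # Hypothetical stand-in for a pairwise model: string similarity of names.
    return SequenceMatcher(None, a["name"].lower(), b["name"].lower()).ratio()


def cluster_records(records, threshold=0.8):
    """Group records whose pairwise scores chain together above `threshold`."""
    parent = list(range(len(records)))

    def find(i):
        # Follow parent links to the cluster root, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Score every pair; a pair above the threshold merges its two clusters,
    # so overlapping pairs (A-B, B-C) end up in one cluster.
    for i, j in combinations(range(len(records)), 2):
        if match_probability(records[i], records[j]) >= threshold:
            parent[find(i)] = find(j)

    clusters = defaultdict(list)
    for i, record in enumerate(records):
        clusters[find(i)].append(record)
    return list(clusters.values())


def canonical_name(cluster):
    # Assumed rule: the most frequent name in the cluster is canonical.
    return Counter(r["name"] for r in cluster).most_common(1)[0][0]


records = [
    {"name": "Acme Corp"},
    {"name": "Acme Corp"},
    {"name": "ACME Corp."},
    {"name": "Globex"},
]
clusters = cluster_records(records)
```

Comparing all pairs is quadratic in the number of records; a practical system would likely restrict the candidate pairs first, which this sketch does not attempt.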
-
Publication number: US11256762B1
Publication date: 2022-02-22
Application number: US15669035
Filing date: 2017-08-04
Applicant: Palantir Technologies Inc.
Inventor: Xiangnong Wang , Yifei Huang , Michael Yang , Francis Chen , Andy Chen , Andre Frederico Cavalheiro Menck , Christopher Yu , Grace Garde , Mark Cinali , James Winchester , Peter Wang , Nitish Kulkarni
IPC: G06F16/951 , G06Q30/02 , G06F16/957
Abstract: Various systems and methods for aggregating data from disparate sources to determine an optimal package of data items are disclosed. For example, the system described herein can obtain data items from various sources, aggregate and/or organize the data items into an optimal package based on various criteria, and present, via an interactive user interface, the optimal package. Furthermore, the interactive user interface may enable a user to adjust the criteria used to aggregate and/or organize the data items. The system may interactively re-aggregate and re-organize the data items using the adjusted criteria as the user interacts with the package via the user interface. The system and user interface may thus enable the user to optimize the packages of data items based on multiple factors quickly and efficiently.
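One way to picture the "adjust the criteria, then re-aggregate" loop in this abstract is a weighted ranking that is recomputed whenever the weights change. This is a sketch under stated assumptions, not the patented method: the criterion names (`relevance`, `freshness`), the weighted-sum rule, and the `build_package` function are all hypothetical.

```python
def build_package(items, weights, size):
    """Return the `size` items with the highest weighted criterion sum."""

    def score(item):
        # Missing criteria count as 0.0 so partial items still rank.
        return sum(w * item.get(criterion, 0.0) for criterion, w in weights.items())

    return sorted(items, key=score, reverse=True)[:size]


items = [
    {"id": "a", "relevance": 0.9, "freshness": 0.1},
    {"id": "b", "relevance": 0.2, "freshness": 0.9},
    {"id": "c", "relevance": 0.5, "freshness": 0.5},
]

# Initial criteria favour relevance; the user then shifts weight to freshness.
by_relevance = build_package(items, {"relevance": 1.0, "freshness": 0.0}, size=2)
by_freshness = build_package(items, {"relevance": 0.0, "freshness": 1.0}, size=2)
```

Because the package is a pure function of the items and the weights, a user interface could call `build_package` again on every adjustment and redraw the result.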
-
Publication number: US11216472B2
Publication date: 2022-01-04
Application number: US16570573
Filing date: 2019-09-13
Applicant: Palantir Technologies Inc.
Inventor: Yifei Huang , Grace Garde , Nikhita Singh , Sarah Gershkon , James Winchester , Laurynas Pliuskys
IPC: G06F16/2457 , G06F16/248 , G06N20/00 , G06N5/02
Abstract: Systems and user interfaces enable integration of data items from disparate sources to generate optimized packages of data items. For example, the systems described herein can obtain data items from various sources, score the data items, and present, via an interactive user interface, options for packaging the data items based on the scores. The systems may include artificial intelligence algorithms for selecting optimal combinations of data items for packaging. Further, the interactive user interfaces may enable a user to efficiently add data items to, and remove data items from, the data packages. The system may interactively re-calculate and update scores associated with the package of data items as the user interacts with the data package via the user interface. The systems and user interfaces may thus, according to various embodiments, enable the user to optimize the packages of data items based on multiple factors quickly and efficiently.
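The interactive part of this abstract (adding and removing items while the package score is recalculated) can be sketched as a small class. The additive score, the `Package` class, and its method names are assumptions made for illustration; the abstract does not say how scores are combined, and the selection algorithms it mentions are not modelled here.

```python
from dataclasses import dataclass, field


@dataclass
class Package:
    # Hypothetical input: item id -> precomputed score.
    scores: dict
    members: set = field(default_factory=set)

    def total(self):
        # Assumed rule: the package score is the sum of member scores.
        return sum(self.scores[item_id] for item_id in self.members)

    def add(self, item_id):
        self.members.add(item_id)
        return self.total()  # recalculated after every change

    def remove(self, item_id):
        self.members.discard(item_id)
        return self.total()

    def suggestions(self, n=3):
        # Highest-scoring items that are not yet in the package.
        rest = [i for i in self.scores if i not in self.members]
        return sorted(rest, key=self.scores.get, reverse=True)[:n]


package = Package({"a": 3.0, "b": 1.5, "c": 2.0})
```

Each call to `add` or `remove` returns the updated total, which is the value a user interface would display after the interaction.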
-
Publication number: US20210406247A1
Publication date: 2021-12-30
Application number: US17469767
Filing date: 2021-09-08
Applicant: Palantir Technologies Inc.
Inventor: Jonathan Lafleche , Justin Uang , Onur Satici , Yifei Huang , Ovidiu-Dan Sanduleac , Lawrence Manning
Abstract: Systems, methods, and non-transitory computer readable media are provided for managing expiration of modules. An expiry dataset may be maintained. The expiry dataset may include a set of identifiers corresponding to a set of modules, a set of expiry values for the set of modules, and a set of termination tasks for the set of modules. A request to refresh a module may be received from a client. Responsive to the reception of the request, an expiry value and a termination task for the module within the expiry dataset may be updated. The expiry value may be independent of a timestamp associated with the request.
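A minimal sketch of such an expiry dataset follows. It reads "independent of a timestamp associated with the request" as "computed from the server's own clock plus a fixed time-to-live, ignoring whatever timestamp the client sent"; that reading, the `ExpiryDataset` class, and the `reap` method are assumptions for illustration, not details taken from the patent.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class ExpiryEntry:
    expiry: float                         # expiry value for the module
    termination_task: Callable[[], None]  # what to run when it expires


class ExpiryDataset:
    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock  # injectable so the behaviour can be tested
        self.entries: Dict[str, ExpiryEntry] = {}

    def refresh(self, module_id, termination_task, request_timestamp=None):
        # request_timestamp is deliberately ignored: the new expiry value
        # depends only on the server-side clock and the configured TTL.
        self.entries[module_id] = ExpiryEntry(
            self.clock() + self.ttl, termination_task
        )

    def reap(self):
        """Run and drop the termination task of every expired module."""
        now = self.clock()
        expired = [m for m, e in self.entries.items() if e.expiry <= now]
        for module_id in expired:
            self.entries.pop(module_id).termination_task()
        return expired


# Usage with a fake clock so the example is deterministic.
fake_now = [0.0]
log = []
dataset = ExpiryDataset(ttl_seconds=10, clock=lambda: fake_now[0])
dataset.refresh("m1", lambda: log.append("stop m1"), request_timestamp=9999)
```

Ignoring the client-supplied timestamp means clock skew between client and server cannot extend or shorten a module's lifetime.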
-
Publication number: US11120007B2
Publication date: 2021-09-14
Application number: US16252363
Filing date: 2019-01-18
Applicant: Palantir Technologies Inc.
Inventor: Jonathan Lafleche , Justin Uang , Onur Satici , Yifei Huang , Ovidiu-Dan Sanduleac , Lawrence Manning
Abstract: Systems, methods, and non-transitory computer readable media are provided for managing expiration of modules. An expiry dataset may be maintained. The expiry dataset may include a set of identifiers corresponding to a set of modules, a set of expiry values for the set of modules, and a set of termination tasks for the set of modules. A request to refresh a module may be received from a client. Responsive to the reception of the request, an expiry value and a termination task for the module within the expiry dataset may be updated. The expiry value may be independent of a timestamp associated with the request.
-
Publication number: US20200167333A1
Publication date: 2020-05-28
Application number: US16252363
Filing date: 2019-01-18
Applicant: Palantir Technologies Inc.
Inventor: Jonathan Lafleche , Justin Uang , Onur Satici , Yifei Huang , Ovidiu-Dan Sanduleac , Lawrence Manning
Abstract: Systems, methods, and non-transitory computer readable media are provided for managing expiration of modules. An expiry dataset may be maintained. The expiry dataset may include a set of identifiers corresponding to a set of modules, a set of expiry values for the set of modules, and a set of termination tasks for the set of modules. A request to refresh a module may be received from a client. Responsive to the reception of the request, an expiry value and a termination task for the module within the expiry dataset may be updated. The expiry value may be independent of a timestamp associated with the request.
-
Publication number: US20220129508A1
Publication date: 2022-04-28
Application number: US17570271
Filing date: 2022-01-06
Applicant: Palantir Technologies Inc.
Inventor: Xiangnong Wang , Yifei Huang , Michael Yang , Francis Chen , Andy Chen , Andre Frederico Cavalheiro Menck , Christopher Yu , Grace Garde , Mark Cinali , James Winchester , Peter Wang , Nitish Kulkarni
IPC: G06F16/951 , G06F16/957 , G06Q30/02
Abstract: Various systems and methods for aggregating data from disparate sources to determine an optimal package of data items are disclosed. For example, the system described herein can obtain data items from various sources, aggregate and/or organize the data items into an optimal package based on various criteria, and present, via an interactive user interface, the optimal package. Furthermore, the interactive user interface may enable a user to adjust the criteria used to aggregate and/or organize the data items. The system may interactively re-aggregate and re-organize the data items using the adjusted criteria as the user interacts with the package via the user interface. The system and user interface may thus enable the user to optimize the packages of data items based on multiple factors quickly and efficiently.
-
Publication number: US11314698B2
Publication date: 2022-04-26
Application number: US16208435
Filing date: 2018-12-03
Applicant: Palantir Technologies, Inc.
Inventor: Hao Dang , Gustav Brodman , Yi Xue , Stacey Milspaw , Yifei Huang , Yanran Lu
IPC: G06F16/182 , G06F16/2455 , G06F16/25 , G06F16/23 , G06F9/455
Abstract: Techniques for automatically scheduling builds of derived datasets in a distributed database system that supports pipelined data transformations are described herein. In an embodiment, a data processing method comprises, in association with a distributed database system that implements one or more data transformation pipelines, each of the data transformation pipelines comprising at least a first dataset, a first transformation, a second derived dataset and dataset dependency and timing metadata, detecting an arrival of a new raw dataset or new derived dataset; in response to the detecting, obtaining from the dataset dependency and timing metadata a dataset subset comprising those datasets that depend on at least the new raw dataset or new derived dataset; for each member dataset in the dataset subset, determining if the member dataset has a dependency on any other dataset that is not yet arrived, and in response to determining that the member dataset does not have a dependency on any other dataset that is not yet arrived: initiating a build of a portion of the data transformation pipeline comprising the member dataset and all other datasets on which the member dataset is dependent, without waiting for arrival of other datasets.
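The arrival-triggered scheduling described in this abstract can be sketched as follows. The `BuildScheduler` class, the in-memory dependency map, and the dataset names are hypothetical. As a simplification, the sketch builds only the unblocked dataset itself (not the wider pipeline portion the abstract mentions) and never rebuilds a dataset that has already arrived.

```python
class BuildScheduler:
    def __init__(self, dependencies):
        # Hypothetical metadata: derived dataset -> its direct upstream datasets.
        self.dependencies = {d: set(ups) for d, ups in dependencies.items()}
        self.arrived = set()
        self.built = []  # build order, recorded for inspection

    def on_arrival(self, dataset):
        """Handle a newly arrived raw or derived dataset."""
        self.arrived.add(dataset)
        # Only datasets that depend on the new arrival need to be checked.
        dependents = [d for d, ups in self.dependencies.items() if dataset in ups]
        for dependent in dependents:
            already_built = dependent in self.arrived
            all_inputs_ready = self.dependencies[dependent] <= self.arrived
            if not already_built and all_inputs_ready:
                self.build(dependent)

    def build(self, dataset):
        self.built.append(dataset)
        # A finished build is itself an arrival and may unblock others.
        self.on_arrival(dataset)


scheduler = BuildScheduler({
    "clean": {"raw_a"},
    "joined": {"clean", "raw_b"},
    "report": {"joined"},
})
scheduler.on_arrival("raw_a")
built_after_first = list(scheduler.built)
scheduler.on_arrival("raw_b")
```

After `raw_a` arrives, only `clean` can be built; `joined` and `report` follow once `raw_b` arrives, without waiting on any unrelated dataset.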
-
Publication number: US10452673B1
Publication date: 2019-10-22
Application number: US15482443
Filing date: 2017-04-07
Applicant: Palantir Technologies Inc.
Inventor: Yifei Huang , Grace Garde , Nikhita Singh , Sarah Gershkon , James Winchester , Laurynas Pliuskys
IPC: G06F16/248 , G06N5/02 , G06F16/2457
Abstract: Systems and user interfaces enable integration of data items from disparate sources to generate optimized packages of data items. For example, the systems described herein can obtain data items from various sources, score the data items, and present, via an interactive user interface, options for packaging the data items based on the scores. The systems may include artificial intelligence algorithms for selecting optimal combinations of data items for packaging. Further, the interactive user interfaces may enable a user to efficiently add data items to, and remove data items from, the data packages. The system may interactively re-calculate and update scores associated with the package of data items as the user interacts with the data package via the user interface. The systems and user interfaces may thus, according to various embodiments, enable the user to optimize the packages of data items based on multiple factors quickly and efficiently.
-
Publication number: US10127289B2
Publication date: 2018-11-13
Application number: US15233149
Filing date: 2016-08-10
Applicant: Palantir Technologies Inc.
Inventor: Lawrence Manning , Rahul Mehta , Daniel Erenrich , Guillem Palou Visa , Roger Hu , Xavier Falco , Rowan Gilmore , Eli Bingham , Jason Prestinario , Yifei Huang , Daniel Fernandez , Jeremy Elser , Clayton Sader , Rahul Agarwal , Matthew Elkherj , Nicholas Latourette , Aleksandr Zamoshchin
IPC: G06F17/30
Abstract: Computer implemented systems and methods are disclosed for automatically clustering and canonically identifying related data in various data structures. Data structures may include a plurality of records, wherein each record is associated with a respective entity. In accordance with some embodiments, the systems and methods further comprise identifying clusters of records associated with a respective entity by grouping the records into pairs, analyzing the respective pairs to determine a probability that both members of the pair relate to a common entity, and identifying a cluster of overlapping pairs to generate a collection of records relating to a common entity. Clusters may further be analyzed to determine canonical names or other properties for the respective entities by analyzing record fields and identifying similarities.