Patent search ap:("Palantir Technologies Inc.") AND inv:"Yifei Huang" Page 1

1.

Invention publication
SYSTEMS AND METHODS FOR AUTOMATIC CLUSTERING AND CANONICAL DESIGNATION OF RELATED DATA IN VARIOUS DATA STRUCTURES (under examination, published)

Publication (announcement) number: US20240320227A1

Publication (announcement) date: 2024-09-26

Application number: US18731699

Application date: 2024-06-03

Applicant: Palantir Technologies Inc.

Inventor: Lawrence Manning , Rahul Mehta , Daniel Erenrich , Guillem Palou Visa , Roger Hu , Xavier Falco , Rowan Gilmore , Eli Bingham , Jason Prestinario , Yifei Huang , Daniel Fernandez , Jeremy Elser , Clayton Sader , Rahul Agarwal , Matthew Elkherj , Nicholas Latourette , Aleksandr Zamoshchin

IPC: G06F16/2457 , G06F16/28 , G06F16/35 , G06F16/9535 , G06F18/23

CPC classification number: G06F16/24578 , G06F16/285 , G06F16/35 , G06F16/9535 , G06F18/23

Abstract: Computer implemented systems and methods are disclosed for automatically clustering and canonically identifying related data in various data structures. Data structures may include a plurality of records, wherein each record is associated with a respective entity. In accordance with some embodiments, the systems and methods further comprise identifying clusters of records associated with a respective entity by grouping the records into pairs, analyzing the respective pairs to determine a probability that both members of the pair relate to a common entity, and identifying a cluster of overlapping pairs to generate a collection of records relating to a common entity. Clusters may further be analyzed to determine canonical names or other properties for the respective entities by analyzing record fields and identifying similarities.
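Illustration: the abstract above describes three steps (scoring record pairs, merging overlapping pairs into clusters, and choosing a canonical name). The sketch below is only one possible reading of those steps, not the patented method. The `name` field, the string-similarity scorer, the 0.8 threshold, and the "most frequent name wins" rule are all assumptions introduced for this example.

```python
from collections import Counter
from difflib import SequenceMatcher
from itertools import combinations


def pair_probability(a: dict, b: dict) -> float:
    """Hypothetical scorer: similarity of the 'name' fields, in [0, 1]."""
    return SequenceMatcher(None, a["name"].lower(), b["name"].lower()).ratio()


def cluster_records(records: list[dict], threshold: float = 0.8) -> list[list[int]]:
    """Link pairs scoring at or above the threshold, then merge overlapping pairs."""
    parent = list(range(len(records)))

    def find(i: int) -> int:
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(records)), 2):
        if pair_probability(records[i], records[j]) >= threshold:
            parent[find(i)] = find(j)

    clusters: dict[int, list[int]] = {}
    for i in range(len(records)):
        clusters.setdefault(find(i), []).append(i)
    return list(clusters.values())


def canonical_name(records: list[dict], cluster: list[int]) -> str:
    """Assumed rule: the most frequent name in the cluster is canonical."""
    return Counter(records[i]["name"] for i in cluster).most_common(1)[0][0]


records = [
    {"name": "Acme Corp"},
    {"name": "Acme Corp"},
    {"name": "ACME Corp."},
    {"name": "Globex"},
]
for cluster in cluster_records(records):
    print(cluster, canonical_name(records, cluster))
```

Comparing every pair is quadratic in the number of records, so a practical system would likely restrict the candidate pairs first; the abstract does not say how.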

2.

Invention grant
Systems and methods for automatic clustering and canonical designation of related data in various data structures (in force)

Publication (announcement) number: US12038933B2

Publication (announcement) date: 2024-07-16

Application number: US18325616

Application date: 2023-05-30

Applicant: Palantir Technologies Inc.

Inventor: Lawrence Manning , Rahul Mehta , Daniel Erenrich , Guillem Palou Visa , Roger Hu , Xavier Falco , Rowan Gilmore , Eli Bingham , Jason Prestinario , Yifei Huang , Daniel Fernandez , Jeremy Elser , Clayton Sader , Rahul Agarwal , Matthew Elkherj , Nicholas Latourette , Aleksandr Zamoshchin

IPC: G06F16/00 , G06F16/2457 , G06F16/28 , G06F16/35 , G06F16/9535 , G06F18/23

CPC classification number: G06F16/24578 , G06F16/285 , G06F16/35 , G06F16/9535 , G06F18/23

Abstract: Computer implemented systems and methods are disclosed for automatically clustering and canonically identifying related data in various data structures. Data structures may include a plurality of records, wherein each record is associated with a respective entity. In accordance with some embodiments, the systems and methods further comprise identifying clusters of records associated with a respective entity by grouping the records into pairs, analyzing the respective pairs to determine a probability that both members of the pair relate to a common entity, and identifying a cluster of overlapping pairs to generate a collection of records relating to a common entity. Clusters may further be analyzed to determine canonical names or other properties for the respective entities by analyzing record fields and identifying similarities.

3.

Invention publication
SYSTEMS AND METHODS FOR AUTOMATIC CLUSTERING AND CANONICAL DESIGNATION OF RELATED DATA IN VARIOUS DATA STRUCTURES (under examination, published)

Publication (announcement) number: US20230297582A1

Publication (announcement) date: 2023-09-21

Application number: US18325616

Application date: 2023-05-30

Applicant: Palantir Technologies Inc.

Inventor: Lawrence Manning , Rahul Mehta , Daniel Erenrich , Guillem Palou Visa , Roger Hu , Xavier Falco , Rowan Gilmore , Eli Bingham , Jason Prestinario , Yifei Huang , Daniel Fernandez , Jeremy Elser , Clayton Sader , Rahul Agarwal , Matthew Elkherj , Nicholas Latourette , Aleksandr Zamoshchin

IPC: G06F16/2457 , G06F16/35 , G06F16/9535 , G06F16/28 , G06F18/23

CPC classification number: G06F16/24578 , G06F16/35 , G06F16/9535 , G06F16/285 , G06F18/23

Abstract: Computer implemented systems and methods are disclosed for automatically clustering and canonically identifying related data in various data structures. Data structures may include a plurality of records, wherein each record is associated with a respective entity. In accordance with some embodiments, the systems and methods further comprise identifying clusters of records associated with a respective entity by grouping the records into pairs, analyzing the respective pairs to determine a probability that both members of the pair relate to a common entity, and identifying a cluster of overlapping pairs to generate a collection of records relating to a common entity. Clusters may further be analyzed to determine canonical names or other properties for the respective entities by analyzing record fields and identifying similarities.

4.

Invention grant
Dynamically performing data processing in a data pipeline system (in force)

Publication (announcement) number: US10176217B1

Publication (announcement) date: 2019-01-08

Application number: US15698574

Application date: 2017-09-07

Applicant: Palantir Technologies, Inc.

Inventor: Hao Dang , Gustav Brodman , Yi Xue , Stacey Milspaw , Yifei Huang , Yanran Lu

IPC: G06F17/30 , G06F9/455

Abstract: Techniques for automatically scheduling builds of derived datasets in a distributed database system that supports pipelined data transformations are described herein. In an embodiment, a data processing method comprises, in association with a distributed database system that implements one or more data transformation pipelines, each of the data transformation pipelines comprising at least a first dataset, a first transformation, a second derived dataset and dataset dependency and timing metadata, detecting an arrival of a new raw dataset or new derived dataset; in response to the detecting, obtaining from the dataset dependency and timing metadata a dataset subset comprising those datasets that depend on at least the new raw dataset or new derived dataset; for each member dataset in the dataset subset, determining if the member dataset has a dependency on any other dataset that is not yet arrived, and in response to determining that the member dataset does not have a dependency on any other dataset that is not yet arrived: initiating a build of a portion of the data transformation pipeline comprising the member dataset and all other datasets on which the member dataset is dependent, without waiting for arrival of other datasets.
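Illustration: the abstract above describes triggering a build as soon as every input of a dependent dataset has arrived. A minimal sketch of that check follows, assuming the dependency metadata is a plain mapping from each derived dataset to its direct inputs. The function names and dataset names are hypothetical, and the timing metadata mentioned in the abstract is not modeled.

```python
def datasets_ready_to_build(
    new_dataset: str,
    dependencies: dict[str, set[str]],
    arrived: set[str],
) -> list[str]:
    """Return dependents of new_dataset whose direct inputs have all arrived."""
    arrived = arrived | {new_dataset}
    dependents = [d for d, inputs in dependencies.items() if new_dataset in inputs]
    return sorted(d for d in dependents if dependencies[d] <= arrived)


def build_portion(target: str, dependencies: dict[str, set[str]]) -> set[str]:
    """Return the target plus every dataset it transitively depends on."""
    seen: set[str] = set()
    stack = [target]
    while stack:
        dataset = stack.pop()
        if dataset not in seen:
            seen.add(dataset)
            # Raw datasets have no entry in the mapping, so they add nothing.
            stack.extend(dependencies.get(dataset, ()))
    return seen


# Hypothetical pipeline: two raw datasets feeding two derived datasets.
deps = {
    "clean_orders": {"raw_orders"},
    "report": {"clean_orders", "raw_customers"},
}
print(datasets_ready_to_build("raw_orders", deps, set()))
print(datasets_ready_to_build("clean_orders", deps, {"raw_orders"}))
print(build_portion("report", deps))
```

In this example `report` is not built when `clean_orders` arrives, because `raw_customers` is still missing; it becomes ready only once that last input arrives.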

5.

Invention application
SYSTEMS AND METHODS FOR AUTOMATIC CLUSTERING AND CANONICAL DESIGNATION OF RELATED DATA IN VARIOUS DATA STRUCTURES (under examination, published)

Publication (announcement) number: US20170052958A1

Publication (announcement) date: 2017-02-23

Application number: US15233149

Application date: 2016-08-10

Applicant: Palantir Technologies Inc.

Inventor: Lawrence Manning , Rahul Mehta , Daniel Erenrich , Guillem Palou Visa , Roger Hu , Xavier Falco , Rowan Gilmore , Eli Bingham , Jason Prestinario , Yifei Huang , Daniel Fernandez , Jeremy Elser , Clayton Sader , Rahul Agarwal , Matthew Elkherj , Nicholas Latourette , Aleksandr Zamoshchin

IPC: G06F17/30

CPC classification number: G06F17/3053 , G06F17/30705 , G06F17/30867

Abstract: Computer implemented systems and methods are disclosed for automatically clustering and canonically identifying related data in various data structures. Data structures may include a plurality of records, wherein each record is associated with a respective entity. In accordance with some embodiments, the systems and methods further comprise identifying clusters of records associated with a respective entity by grouping the records into pairs, analyzing the respective pairs to determine a probability that both members of the pair relate to a common entity, and identifying a cluster of overlapping pairs to generate a collection of records relating to a common entity. Clusters may further be analyzed to determine canonical names or other properties for the respective entities by analyzing record fields and identifying similarities.

6.

Invention grant
Module expiration management (in force)

Publication (announcement) number: US11995064B2

Publication (announcement) date: 2024-05-28

Application number: US17469767

Application date: 2021-09-08

Applicant: Palantir Technologies Inc.

Inventor: Jonathan Lafleche , Justin Uang , Onur Satici , Yifei Huang , Ovidiu-Dan Sanduleac , Lawrence Manning

IPC: G06F9/48 , G06F16/21 , G06F16/23

CPC classification number: G06F16/2365 , G06F9/485 , G06F16/219

Abstract: Systems, methods, and non-transitory computer readable media are provided for managing expiration of modules. An expiry dataset may be maintained. The expiry dataset may include a set of identifiers corresponding to a set of modules, a set of expiry values for the set of modules, and a set of termination tasks for the set of modules. A request to refresh a module may be received from a client. Responsive to the reception of the request, an expiry value and a termination task for the module within the expiry dataset may be updated. The expiry value may be independent of a timestamp associated with the request.
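Illustration: the abstract above describes an expiry dataset that stores, per module, an expiry value and a termination task, both updated on a client refresh. The sketch below is a rough in-memory reading of that idea. The class and method names are hypothetical, and using a server-side logical tick counter is only one assumed way to keep the expiry value independent of the request's timestamp.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ExpiryEntry:
    expiry_value: int                      # logical tick at which the module expires
    termination_task: Callable[[], None]   # run when the module expires


class ExpiryDataset:
    """Hypothetical in-memory expiry dataset keyed by module identifier."""

    def __init__(self, ttl_ticks: int = 3) -> None:
        self.ttl_ticks = ttl_ticks
        self.tick = 0  # server-side logical clock, never read from a request
        self.entries: dict[str, ExpiryEntry] = {}

    def refresh(self, module_id: str, termination_task: Callable[[], None]) -> None:
        """Update the module's expiry value and termination task on a refresh."""
        self.entries[module_id] = ExpiryEntry(self.tick + self.ttl_ticks, termination_task)

    def advance(self) -> list[str]:
        """Advance the clock by one tick and terminate modules that have expired."""
        self.tick += 1
        expired = [m for m, e in self.entries.items() if e.expiry_value <= self.tick]
        for module_id in expired:
            self.entries.pop(module_id).termination_task()
        return expired


terminated: list[str] = []
dataset = ExpiryDataset(ttl_ticks=2)
dataset.refresh("module-a", lambda: terminated.append("module-a"))
dataset.advance()                                                   # tick 1, still alive
dataset.refresh("module-a", lambda: terminated.append("module-a"))  # expiry moves to tick 3
dataset.advance()                                                   # tick 2, still alive
dataset.advance()                                                   # tick 3, terminated
print(terminated)
```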

7.

Invention application
SYSTEMS AND METHODS FOR AUTOMATIC CLUSTERING AND CANONICAL DESIGNATION OF RELATED DATA IN VARIOUS DATA STRUCTURES (in force)

Publication (announcement) number: US20220374454A1

Publication (announcement) date: 2022-11-24

Application number: US17812984

Application date: 2022-07-15

Applicant: Palantir Technologies Inc.

Inventor: Lawrence Manning , Rahul Mehta , Daniel Erenrich , Guillem Palou Visa , Roger Hu , Xavier Falco , Rowan Gilmore , Eli Bingham , Jason Prestinario , Yifei Huang , Daniel Fernandez , Jeremy Elser , Clayton Sader , Rahul Agarwal , Matthew Elkherj , Nicholas Latourette , Aleksandr Zamoshchin

IPC: G06F16/28 , G06K9/62

Abstract: Computer implemented systems and methods are disclosed for automatically clustering and canonically identifying related data in various data structures. Data structures may include a plurality of records, wherein each record is associated with a respective entity. In accordance with some embodiments, the systems and methods further comprise identifying clusters of records associated with a respective entity by grouping the records into pairs, analyzing the respective pairs to determine a probability that both members of the pair relate to a common entity, and identifying a cluster of overlapping pairs to generate a collection of records relating to a common entity. Clusters may further be analyzed to determine canonical names or other properties for the respective entities by analyzing record fields and identifying similarities.

8.

Invention application
SYSTEMS AND USER INTERFACES FOR DATA ANALYSIS INCLUDING ARTIFICIAL INTELLIGENCE ALGORITHMS FOR GENERATING OPTIMIZED PACKAGES OF DATA ITEMS (under examination, published)

Publication (announcement) number: US20200004743A1

Publication (announcement) date: 2020-01-02

Application number: US16570573

Application date: 2019-09-13

Applicant: Palantir Technologies Inc.

Inventor: Yifei Huang , Grace Garde , Nikhita Singh , Sarah Gershkon , James Winchester , Laurynas Pliuskys

IPC: G06F16/248 , G06F16/2457 , G06N5/02

Abstract: Systems and user interfaces enable integration of data items from disparate sources to generate optimized packages of data items. For example, the systems described herein can obtain data items from various sources, score the data items, and present, via an interactive user interface, options for packaging the data items based on the scores. The systems may include artificial intelligence algorithms for selecting optimal combinations of data items for packaging. Further, the interactive user interfaces may enable a user to efficiently add data items to, and remove data items from, the data packages. The system may interactively re-calculate and update scores associated with the package of data items as the user interacts with the data package via the user interface. The systems and user interfaces may thus, according to various embodiments, enable the user to optimize the packages of data items based on multiple factors quickly and efficiently.
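Illustration: the abstract above describes scoring data items, proposing combinations, and re-calculating the package score as items are added or removed. The abstract does not specify the scoring or selection algorithm, so the sketch below substitutes simple stand-ins: a mean-of-item-scores package score and an exhaustive search over fixed-size combinations. All names and both rules are assumptions made for this example.

```python
from itertools import combinations


def package_score(items: dict[str, float]) -> float:
    """Assumed package score: mean of the item scores (0.0 for an empty package)."""
    return sum(items.values()) / len(items) if items else 0.0


class Package:
    """Hypothetical package whose score is recomputed after every change."""

    def __init__(self) -> None:
        self.items: dict[str, float] = {}

    def add(self, name: str, score: float) -> float:
        self.items[name] = score
        return package_score(self.items)

    def remove(self, name: str) -> float:
        self.items.pop(name, None)
        return package_score(self.items)


def best_package(candidates: dict[str, float], size: int) -> tuple[str, ...]:
    """Stand-in selector: exhaustive search for the highest total score."""
    return max(
        combinations(sorted(candidates), size),
        key=lambda combo: sum(candidates[name] for name in combo),
    )


candidates = {"a": 0.9, "b": 0.4, "c": 0.7}
print(best_package(candidates, 2))

package = Package()
package.add("a", 0.9)
print(package.add("b", 0.4))   # score drops after adding a weaker item
print(package.remove("b"))     # score recovers after removing it
```

Exhaustive search is only workable for small candidate sets; it is used here purely to keep the example short.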

9.

Invention application
DYNAMICALLY PERFORMING DATA PROCESSING IN A DATA PIPELINE SYSTEM (under examination, published)

Publication (announcement) number: US20190114289A1

Publication (announcement) date: 2019-04-18

Application number: US16208435

Application date: 2018-12-03

Applicant: Palantir Technologies, Inc.

Inventor: Hao Dang , Gustav Brodman , Yi Xue , Stacey Milspaw , Yifei Huang , Yanran Lu

IPC: G06F16/182 , G06F9/455

Abstract: Techniques for automatically scheduling builds of derived datasets in a distributed database system that supports pipelined data transformations are described herein. In an embodiment, a data processing method comprises, in association with a distributed database system that implements one or more data transformation pipelines, each of the data transformation pipelines comprising at least a first dataset, a first transformation, a second derived dataset and dataset dependency and timing metadata, detecting an arrival of a new raw dataset or new derived dataset; in response to the detecting, obtaining from the dataset dependency and timing metadata a dataset subset comprising those datasets that depend on at least the new raw dataset or new derived dataset; for each member dataset in the dataset subset, determining if the member dataset has a dependency on any other dataset that is not yet arrived, and in response to determining that the member dataset does not have a dependency on any other dataset that is not yet arrived: initiating a build of a portion of the data transformation pipeline comprising the member dataset and all other datasets on which the member dataset is dependent, without waiting for arrival of other datasets.

10.

Invention application
SYSTEMS AND METHODS FOR AUTOMATIC CLUSTERING AND CANONICAL DESIGNATION OF RELATED DATA IN VARIOUS DATA STRUCTURES (under examination, published)

Publication (announcement) number: US20190079937A1

Publication (announcement) date: 2019-03-14

Application number: US16189040

Application date: 2018-11-13

Applicant: Palantir Technologies Inc.

Inventor: Lawrence Manning , Rahul Mehta , Daniel Erenrich , Guillem Palou Visa , Roger Hu , Xavier Falco , Rowan Gilmore , Eli Bingham , Jason Prestinario , Yifei Huang , Daniel Fernandez , Jeremy Elser , Clayton Sader , Rahul Agarwal , Matthew Elkherj , Nicholas Latourette , Aleksandr Zamoshchin

IPC: G06F17/30

Abstract: Computer implemented systems and methods are disclosed for automatically clustering and canonically identifying related data in various data structures. Data structures may include a plurality of records, wherein each record is associated with a respective entity. In accordance with some embodiments, the systems and methods further comprise identifying clusters of records associated with a respective entity by grouping the records into pairs, analyzing the respective pairs to determine a probability that both members of the pair relate to a common entity, and identifying a cluster of overlapping pairs to generate a collection of records relating to a common entity. Clusters may further be analyzed to determine canonical names or other properties for the respective entities by analyzing record fields and identifying similarities.
