-
Publication No.: US20230185546A1
Publication date: 2023-06-15
Application No.: US18165780
Application date: 2023-02-07
Applicant: Palantir Technologies Inc.
Inventor: Robert Fink, Matthew Cheah, Mingyu Kim, Lynn Cuthriell, Divyanshu Arora, Justin Uang, Jared Newman, Jakob Juelich, Kevin Chen, Mark Elliot, Michael Nazario
Abstract: A computer-implemented method comprises obtaining a first build task for building first source code in a first programming language of a plurality of programming languages; retrieving, by the processor, the first source code based on the first build task; building the first source code into one or more artifacts and one or more job specifications; storing the one or more artifacts in a cache shared across a cluster; and initializing an application module on the cluster based on the first programming language, the application module configured to receive a job specification of the one or more job specifications and execute a data transformation job using a reference to a location in the cache.
-
Publication No.: US10261763B2
Publication date: 2019-04-16
Application No.: US15839680
Application date: 2017-12-12
Applicant: Palantir Technologies Inc.
Inventor: Robert Fink, Matthew Cheah, Mingyu Kim, Lynn Cuthriell, Divyanshu Arora, Justin Uang, Jared Newman, Jakob Juelich, Kevin Chen, Mark Elliot, Michael Nazario
Abstract: Data transformation in a distributed system of applications and data repositories is described. The subsystems for the overall framework are distributed, thereby allowing for customization to require only isolated changes to one or more subsystems. In one embodiment, a source code repository is used to receive and store source code. A build subsystem can retrieve source code from the source code repository and build it, using one or more criteria. By building the source code, the build subsystem can generate an artifact, which is executable code, such as a JAR or SQL file. Likewise, by building the source code, the build subsystem can generate one or more job specifications for executing the executable code. In one embodiment, the artifact and job specification may be used to launch an application server in a cluster. The application server can then receive data transformation instructions and execute the data transformation instructions.
-
Publication No.: US10191926B2
Publication date: 2019-01-29
Application No.: US15914215
Application date: 2018-03-07
Applicant: Palantir Technologies, Inc.
Inventor: Jacob Meacham, Michael Harris, Gustav Brodman, Lynn Cuthriell, Hannah Korus, Brian Toth, Jonathan Hsiao, Mark Elliot, Brian Schimpf, Michael Garland, Evelyn Nguyen
Abstract: A history preserving data pipeline computer system and method. In one aspect, the history preserving data pipeline system provides immutable and versioned datasets. Because datasets are immutable and versioned, the system makes it possible to determine the data in a dataset at a point in time in the past, even if that data is no longer in the current version of the dataset.
-
Publication No.: US12061884B2
Publication date: 2024-08-13
Application No.: US18165780
Application date: 2023-02-07
Applicant: Palantir Technologies Inc.
Inventor: Robert Fink, Matthew Cheah, Mingyu Kim, Lynn Cuthriell, Divyanshu Arora, Justin Uang, Jared Newman, Jakob Juelich, Kevin Chen, Mark Elliot, Michael Nazario
Abstract: A computer-implemented method comprises obtaining a first build task for building first source code in a first programming language of a plurality of programming languages; retrieving, by the processor, the first source code based on the first build task; building the first source code into one or more artifacts and one or more job specifications; storing the one or more artifacts in a cache shared across a cluster; and initializing an application module on the cluster based on the first programming language, the application module configured to receive a job specification of the one or more job specifications and execute a data transformation job using a reference to a location in the cache.
-
Publication No.: US10853338B2
Publication date: 2020-12-01
Application No.: US16240507
Application date: 2019-01-04
Applicant: Palantir Technologies, Inc.
Inventor: Jacob Meacham, Michael Harris, Gustav Brodman, Lynn Cuthriell, Hannah Korus, Brian Toth, Jonathan Hsiao, Mark Elliot, Brian Schimpf, Michael Garland, Evelyn Nguyen
Abstract: A history preserving data pipeline computer system and method. In one aspect, the history preserving data pipeline system provides immutable and versioned datasets. Because datasets are immutable and versioned, the system makes it possible to determine the data in a dataset at a point in time in the past, even if that data is no longer in the current version of the dataset.
-
Publication No.: US20180165072A1
Publication date: 2018-06-14
Application No.: US15839680
Application date: 2017-12-12
Applicant: Palantir Technologies Inc.
Inventor: Robert Fink, Matthew Cheah, Mingyu Kim, Lynn Cuthriell, Divyanshu Arora, Justin Uang, Jared Newman, Jakob Juelich, Kevin Chen, Mark Elliot, Michael Nazario
Abstract: Data transformation in a distributed system of applications and data repositories is described. The subsystems for the overall framework are distributed, thereby allowing for customization to require only isolated changes to one or more subsystems. In one embodiment, a source code repository is used to receive and store source code. A build subsystem can retrieve source code from the source code repository and build it, using one or more criteria. By building the source code, the build subsystem can generate an artifact, which is executable code, such as a JAR or SQL file. Likewise, by building the source code, the build subsystem can generate one or more job specifications for executing the executable code. In one embodiment, the artifact and job specification may be used to launch an application server in a cluster. The application server can then receive data transformation instructions and execute the data transformation instructions.
-
Publication No.: US20170097950A1
Publication date: 2017-04-06
Application No.: US15287715
Application date: 2016-10-06
Applicant: Palantir Technologies, Inc.
Inventor: Jacob Meacham, Michael Harris, Gustav Brodman, Lynn Cuthriell, Hannah Korus, Brian Toth, Jonathan Hsiao, Mark Elliot, Brian Schimpf, Michael Garland, Evelyn Nguyen
IPC: G06F17/30
CPC classification number: G06F17/30309, G06F11/1451, G06F17/30227, G06F17/3023, G06F17/30292, G06F17/30371, G06F17/3038, G06F17/30563
Abstract: A history preserving data pipeline computer system and method. In one aspect, the history preserving data pipeline system provides immutable and versioned datasets. Because datasets are immutable and versioned, the system makes it possible to determine the data in a dataset at a point in time in the past, even if that data is no longer in the current version of the dataset.
-
Publication No.: US20160125000A1
Publication date: 2016-05-05
Application No.: US14879916
Application date: 2015-10-09
Applicant: Palantir Technologies, Inc.
Inventor: Jacob Meacham, Michael Harris, Gustav Brodman, Lynn Cuthriell, Hannah Korus, Brian Toth, Jonathan Hsiao, Mark Elliot, Brian Schimpf, Michael Garland, Evelyn Nguyen
CPC classification number: G06F17/30309, G06F11/1451, G06F17/30227, G06F17/3023, G06F17/30292, G06F17/30371, G06F17/3038, G06F17/30563
Abstract: A history preserving data pipeline computer system and method. In one aspect, the history preserving data pipeline system provides immutable and versioned datasets. Because datasets are immutable and versioned, the system makes it possible to determine the data in a dataset at a point in time in the past, even if that data is no longer in the current version of the dataset.
-
Publication No.: US11841835B2
Publication date: 2023-12-12
Application No.: US17463345
Application date: 2021-08-31
Applicant: PALANTIR TECHNOLOGIES INC.
Inventor: Robert Fink, Lynn Cuthriell, Adam Anderson, Adam Borochoff, Catherine Lu, Joseph Rafidi, Karanveer Mohan, Matthew Jenny, Matthew Maclean, Michelle Guo, Parvathy Menon, Ryan Rowe
IPC: G06F16/18, G06F16/182, G06F16/21, G06F16/23
CPC classification number: G06F16/1873, G06F16/182, G06F16/219, G06F16/2379
Abstract: A computer-implemented system and method for data revision control in large-scale data analytic systems. In one embodiment, for example, a computer-implemented method comprises the operations of storing a first version of a dataset that is derived by executing a first version of a driver program associated with the dataset; and storing a first build catalog entry comprising an identifier of the first version of the dataset and comprising an identifier of the first version of the driver program.
-
Publication No.: US11106638B2
Publication date: 2021-08-31
Application No.: US16018777
Application date: 2018-06-26
Applicant: Palantir Technologies, Inc.
Inventor: Robert Fink, Lynn Cuthriell, Adam Anderson, Adam Borochoff, Catherine Lu, Joseph Rafidi, Karanveer Mohan, Matthew Jenny, Matthew Maclean, Michelle Guo, Parvathy Menon, Ryan Rowe
IPC: G06F16/18, G06F16/182, G06F16/21, G06F16/23
Abstract: A computer-implemented system and method for data revision control in large-scale data analytic systems. In one embodiment, for example, a computer-implemented method comprises the operations of storing a first version of a dataset that is derived by executing a first version of a driver program associated with the dataset; and storing a first build catalog entry comprising an identifier of the first version of the dataset and comprising an identifier of the first version of the driver program.