-
Publication number: US10860299B2
Publication date: 2020-12-08
Application number: US16384691
Application date: 2019-04-15
Applicant: Palantir Technologies Inc.
Inventor: Robert Fink, Matthew Cheah, Mingyu Kim, Lynn Cuthriell, Divyanshu Arora, Justin Uang, Jared Newman, Jakob Juelich, Kevin Chen, Mark Elliot, Michael Nazario
Abstract: Data transformation in a distributed system of applications and data repositories is described. The subsystems for the overall framework are distributed, thereby allowing for customization to require only isolated changes to one or more subsystems. In one embodiment, a source code repository is used to receive and store source code. A build subsystem can retrieve source code from the source code repository and build it, using one or more criteria. By building the source code, the build subsystem can generate an artifact, which is executable code, such as a JAR or SQL file. Likewise, by building the source code, the build subsystem can generate one or more job specifications for executing the executable code. In one embodiment, the artifact and job specification may be used to launch an application server in a cluster. The application server can then receive data transformation instructions and execute the data transformation instructions.
-
Publication number: US20180341651A1
Publication date: 2018-11-29
Application number: US16018777
Application date: 2018-06-26
Applicant: Palantir Technologies, Inc.
Inventor: Robert Fink, Lynn Cuthriell, Adam Anderson, Adam Borochoff, Catherine Lu, Joseph Rafidi, Karanveer Mohan, Matthew Jenny, Matthew Maclean, Michelle Guo, Parvathy Menon, Ryan Rowe
IPC: G06F17/30
Abstract: A computer-implemented system and method for data revision control in large-scale data analytic systems. In one embodiment, for example, a computer-implemented method comprises the operations of storing a first version of a dataset that is derived by executing a first version of a driver program associated with the dataset; and storing a first build catalog entry comprising an identifier of the first version of the dataset and an identifier of the first version of the driver program.
-
Publication number: US09946738B2
Publication date: 2018-04-17
Application number: US15287715
Application date: 2016-10-06
Applicant: Palantir Technologies, Inc.
Inventor: Jacob Meacham, Michael Harris, Gustav Brodman, Lynn Cuthriell, Hannah Korus, Brian Toth, Jonathan Hsiao, Mark Elliot, Brian Schimpf, Michael Garland, Evelyn Nguyen
IPC: G06F17/30
CPC classification number: G06F17/30309, G06F11/1451, G06F17/30227, G06F17/3023, G06F17/30292, G06F17/30371, G06F17/3038, G06F17/30563
Abstract: A history preserving data pipeline computer system and method. In one aspect, the history preserving data pipeline system provides immutable and versioned datasets. Because datasets are immutable and versioned, the system makes it possible to determine the data in a dataset at a point in time in the past, even if that data is no longer in the current version of the dataset.
-
Publication number: US20170357648A1
Publication date: 2017-12-14
Application number: US15262207
Application date: 2016-09-12
Applicant: Palantir Technologies, Inc.
Inventor: Robert Fink, Lynn Cuthriell, Adam Anderson
IPC: G06F17/30
CPC classification number: G06F17/3023, G06F17/30194, G06F17/30309, G06F17/30377
Abstract: A computer-implemented system and method for data revision control in large-scale data analytic systems. In one embodiment, for example, a computer-implemented method comprises the operations of storing a first version of a dataset that is derived by executing a first version of a driver program associated with the dataset; and storing a first build catalog entry comprising an identifier of the first version of the dataset and an identifier of the first version of the driver program.
-
Publication number: US11573776B1
Publication date: 2023-02-07
Application number: US17091912
Application date: 2020-11-06
Applicant: Palantir Technologies Inc.
Inventor: Robert Fink, Matthew Cheah, Mingyu Kim, Lynn Cuthriell, Divyanshu Arora, Justin Uang, Jared Newman, Jakob Juelich, Kevin Chen, Mark Elliot, Michael Nazario
Abstract: Data transformation in a distributed system of applications and data repositories is described. The subsystems for the overall framework are distributed, thereby allowing for customization to require only isolated changes to one or more subsystems. In one embodiment, a source code repository is used to receive and store source code. A build subsystem can retrieve source code from the source code repository and build it, using one or more criteria. By building the source code, the build subsystem can generate an artifact, which is executable code, such as a JAR or SQL file. Likewise, by building the source code, the build subsystem can generate one or more job specifications for executing the executable code. In one embodiment, the artifact and job specification may be used to launch an application server in a cluster. The application server can then receive data transformation instructions and execute the data transformation instructions.
-
Publication number: US20220058163A1
Publication date: 2022-02-24
Application number: US17463345
Application date: 2021-08-31
Applicant: PALANTIR TECHNOLOGIES INC.
Inventor: Robert Fink, Lynn Cuthriell, Adam Anderson, Adam Borochoff, Catherine Lu, Joseph Rafidi, Karanveer Mohan, Matthew Jenny, Matthew Maclean, Michelle Guo, Parvathy Menon, Ryan Rowe
IPC: G06F16/18, G06F16/182, G06F16/21, G06F16/23
Abstract: A computer-implemented system and method for data revision control in large-scale data analytic systems. In one embodiment, for example, a computer-implemented method comprises the operations of storing a first version of a dataset that is derived by executing a first version of a driver program associated with the dataset; and storing a first build catalog entry comprising an identifier of the first version of the dataset and an identifier of the first version of the driver program.
-
Publication number: US20190138508A1
Publication date: 2019-05-09
Application number: US16240507
Application date: 2019-01-04
Applicant: Palantir Technologies, Inc.
Inventor: Jacob Meacham, Michael Harris, Gustav Brodman, Lynn Cuthriell, Hannah Korus, Brian Toth, Jonathan Hsiao, Mark Elliot, Brian Schimpf, Michael Garland, Evelyn Nguyen
Abstract: A history preserving data pipeline computer system and method. In one aspect, the history preserving data pipeline system provides immutable and versioned datasets. Because datasets are immutable and versioned, the system makes it possible to determine the data in a dataset at a point in time in the past, even if that data is no longer in the current version of the dataset.
-
Publication number: US20180196838A1
Publication date: 2018-07-12
Application number: US15914215
Application date: 2018-03-07
Applicant: Palantir Technologies, Inc.
Inventor: Jacob Meacham, Michael Harris, Gustav Brodman, Lynn Cuthriell, Hannah Korus, Brian Toth, Jonathan Hsiao, Mark Elliot, Brian Schimpf, Michael Garland, Evelyn Nguyen
IPC: G06F17/30
CPC classification number: G06F17/30309, G06F11/1451, G06F17/30227, G06F17/3023, G06F17/30292, G06F17/30371, G06F17/3038, G06F17/30563
Abstract: A history preserving data pipeline computer system and method. In one aspect, the history preserving data pipeline system provides immutable and versioned datasets. Because datasets are immutable and versioned, the system makes it possible to determine the data in a dataset at a point in time in the past, even if that data is no longer in the current version of the dataset.
-
Publication number: US10007674B2
Publication date: 2018-06-26
Application number: US15262207
Application date: 2016-09-12
Applicant: Palantir Technologies, Inc.
Inventor: Robert Fink, Lynn Cuthriell, Adam Anderson, Adam Borochoff, Catherine Lu, Joseph Rafidi, Karanveer Mohan, Matthew Jenny, Matthew Maclean, Michelle Guo, Parvathy Menon, Ryan Rowe
IPC: G06F17/30
CPC classification number: G06F16/1873, G06F16/182, G06F16/219, G06F16/2379
Abstract: A computer-implemented system and method for data revision control in large-scale data analytic systems. In one embodiment, for example, a computer-implemented method comprises the operations of storing a first version of a dataset that is derived by executing a first version of a driver program associated with the dataset; and storing a first build catalog entry comprising an identifier of the first version of the dataset and an identifier of the first version of the driver program.
-
Publication number: US09483506B2
Publication date: 2016-11-01
Application number: US14879916
Application date: 2015-10-09
Applicant: Palantir Technologies, Inc.
Inventor: Jacob Meacham, Michael Harris, Gustav Brodman, Lynn Cuthriell, Hannah Korus, Brian Toth, Jonathan Hsiao, Mark Elliot, Brian Schimpf, Michael Garland, Evelyn Nguyen
CPC classification number: G06F17/30309, G06F11/1451, G06F17/30227, G06F17/3023, G06F17/30292, G06F17/30371, G06F17/3038, G06F17/30563
Abstract: A history preserving data pipeline computer system and method. In one aspect, the history preserving data pipeline system provides immutable and versioned datasets. Because datasets are immutable and versioned, the system makes it possible to determine the data in a dataset at a point in time in the past, even if that data is no longer in the current version of the dataset.