-
Publication number: US10860299B2
Publication date: 2020-12-08
Application number: US16384691
Filing date: 2019-04-15
Applicant: Palantir Technologies Inc.
Inventor: Robert Fink , Matthew Cheah , Mingyu Kim , Lynn Cuthriell , Divyanshu Arora , Justin Uang , Jared Newman , Jakob Juelich , Kevin Chen , Mark Elliot , Michael Nazario
Abstract: Data transformation in a distributed system of applications and data repositories is described. The subsystems for the overall framework are distributed, thereby allowing for customization to require only isolated changes to one or more subsystems. In one embodiment, a source code repository is used to receive and store source code. A build subsystem can retrieve source code from the source code repository and build it, using one or more criteria. By building the source code, the build subsystem can generate an artifact, which is executable code, such as a JAR or SQL file. Likewise, by building the source code, the build subsystem can generate one or more job specifications for executing the executable code. In one embodiment, the artifact and job specification may be used to launch an application server in a cluster. The application server can then receive data transformation instructions and execute the data transformation instructions.
-
Publication number: US20230185546A1
Publication date: 2023-06-15
Application number: US18165780
Filing date: 2023-02-07
Applicant: Palantir Technologies Inc.
Inventor: Robert Fink , Matthew Cheah , Mingyu Kim , Lynn Cuthriell , Divyanshu Arora , Justin Uang , Jared Newman , Jakob Juelich , Kevin Chen , Mark Elliot , Michael Nazario
Abstract: A computer-implemented method comprises obtaining a first build task for building first source code in a first programming language of a plurality of programming languages; retrieving, by the processor, the first source code based on the first build task; building the first source code into one or more artifacts and one or more job specifications; storing the one or more artifacts in a cache shared across a cluster; and initializing an application module on the cluster based on the first programming language, the application module configured to receive a job specification of the one or more job specifications and execute a data transformation job using a reference to a location in the cache.
-
Publication number: US10261763B2
Publication date: 2019-04-16
Application number: US15839680
Filing date: 2017-12-12
Applicant: Palantir Technologies Inc.
Inventor: Robert Fink , Matthew Cheah , Mingyu Kim , Lynn Cuthriell , Divyanshu Arora , Justin Uang , Jared Newman , Jakob Juelich , Kevin Chen , Mark Elliot , Michael Nazario
Abstract: Data transformation in a distributed system of applications and data repositories is described. The subsystems for the overall framework are distributed, thereby allowing for customization to require only isolated changes to one or more subsystems. In one embodiment, a source code repository is used to receive and store source code. A build subsystem can retrieve source code from the source code repository and build it, using one or more criteria. By building the source code, the build subsystem can generate an artifact, which is executable code, such as a JAR or SQL file. Likewise, by building the source code, the build subsystem can generate one or more job specifications for executing the executable code. In one embodiment, the artifact and job specification may be used to launch an application server in a cluster. The application server can then receive data transformation instructions and execute the data transformation instructions.
-
Publication number: US09576015B1
Publication date: 2017-02-21
Application number: US14874690
Filing date: 2015-10-05
Applicant: Palantir Technologies, Inc.
Inventor: David Tolnay , Punyashloka Biswal , Andrew Colombi , Yupeng Fu , Ashar Fuadi , Mingyu Kim , Paul Nepywoda , Akshay Pundle , Juan Tamayo
IPC: G06F17/30
CPC classification number: G06F17/30569 , G06F17/30339 , G06F17/30345 , G06F17/30457 , G06F17/30563 , G06F17/30958 , G06F17/30961
Abstract: Techniques related to a domain-specific language for dataset transformations are disclosed. A server computer may process a table definition composed in a domain-specific language. The table definition may include a sequence of one or more dataset transformations to be performed on one or more source tables to generate a target table. The sequence may include a customized transformation. A source dataset may be provided as input to an implementation of the customized transformation. An output dataset may be generated as a result of executing the implementation. An intermediate table may be generated based on performing at least one dataset transformation on a particular source table. A supplemental portion for the intermediate table may be generated based on performing the at least one dataset transformation on an appended portion of the particular source table. The target table may be generated based on combining the supplemental portion with the intermediate table.
-
Publication number: US11573776B1
Publication date: 2023-02-07
Application number: US17091912
Filing date: 2020-11-06
Applicant: Palantir Technologies Inc.
Inventor: Robert Fink , Matthew Cheah , Mingyu Kim , Lynn Cuthriell , Divyanshu Arora , Justin Uang , Jared Newman , Jakob Juelich , Kevin Chen , Mark Elliot , Michael Nazario
Abstract: Data transformation in a distributed system of applications and data repositories is described. The subsystems for the overall framework are distributed, thereby allowing for customization to require only isolated changes to one or more subsystems. In one embodiment, a source code repository is used to receive and store source code. A build subsystem can retrieve source code from the source code repository and build it, using one or more criteria. By building the source code, the build subsystem can generate an artifact, which is executable code, such as a JAR or SQL file. Likewise, by building the source code, the build subsystem can generate one or more job specifications for executing the executable code. In one embodiment, the artifact and job specification may be used to launch an application server in a cluster. The application server can then receive data transformation instructions and execute the data transformation instructions.
-
Publication number: US12061884B2
Publication date: 2024-08-13
Application number: US18165780
Filing date: 2023-02-07
Applicant: Palantir Technologies Inc.
Inventor: Robert Fink , Matthew Cheah , Mingyu Kim , Lynn Cuthriell , Divyanshu Arora , Justin Uang , Jared Newman , Jakob Juelich , Kevin Chen , Mark Elliot , Michael Nazario
Abstract: A computer-implemented method comprises obtaining a first build task for building first source code in a first programming language of a plurality of programming languages; retrieving, by the processor, the first source code based on the first build task; building the first source code into one or more artifacts and one or more job specifications; storing the one or more artifacts in a cache shared across a cluster; and initializing an application module on the cluster based on the first programming language, the application module configured to receive a job specification of the one or more job specifications and execute a data transformation job using a reference to a location in the cache.
-
Publication number: US11080296B2
Publication date: 2021-08-03
Application number: US15913721
Filing date: 2018-03-06
Applicant: Palantir Technologies, Inc.
Inventor: David Tolnay , Punyashloka Biswal , Andrew Colombi , Yupeng Fu , Ashar Fuadi , Mingyu Kim , Paul Nepywoda , Akshay Pundle , Juan Tamayo
IPC: G06F16/20 , G06F16/25 , G06F16/23 , G06F16/22 , G06F16/901 , G06F16/2453
Abstract: Techniques related to a domain-specific language for transformations are disclosed. A server computer may process a table definition composed in a domain-specific language. The table definition may include a sequence of one or more transformations to be performed on one or more source tables to generate a target table. The sequence may include a customized transformation. A source dataset may be provided as input to an implementation of the customized transformation. An output dataset may be generated as a result of executing the implementation. An intermediate table may be generated based on performing at least one transformation on a particular source table. A supplemental portion for the intermediate table may be generated based on performing the at least one transformation on an appended portion of the particular source table. The target table may be generated based on combining the supplemental portion with the intermediate table.
-
Publication number: US20180165072A1
Publication date: 2018-06-14
Application number: US15839680
Filing date: 2017-12-12
Applicant: Palantir Technologies Inc.
Inventor: Robert Fink , Matthew Cheah , Mingyu Kim , Lynn Cuthriell , Divyanshu Arora , Justin Uang , Jared Newman , Jakob Juelich , Kevin Chen , Mark Elliot , Michael Nazario
Abstract: Data transformation in a distributed system of applications and data repositories is described. The subsystems for the overall framework are distributed, thereby allowing for customization to require only isolated changes to one or more subsystems. In one embodiment, a source code repository is used to receive and store source code. A build subsystem can retrieve source code from the source code repository and build it, using one or more criteria. By building the source code, the build subsystem can generate an artifact, which is executable code, such as a JAR or SQL file. Likewise, by building the source code, the build subsystem can generate one or more job specifications for executing the executable code. In one embodiment, the artifact and job specification may be used to launch an application server in a cluster. The application server can then receive data transformation instructions and execute the data transformation instructions.
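Illustrative note: the build step described in this abstract (source code in, an executable artifact plus job specifications out) can be sketched roughly as follows. This is a minimal sketch and not the patented implementation. The names `Artifact`, `JobSpec`, `build`, and the `output_dataset` field are hypothetical, and the sketch assumes that "building" a SQL source simply packages the file unchanged.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Artifact:
    """Hypothetical executable artifact, e.g. a SQL file."""
    name: str
    kind: str
    content: bytes


@dataclass(frozen=True)
class JobSpec:
    """Hypothetical job specification pointing at the artifact to execute."""
    artifact_name: str
    output_dataset: str


def build(sources: Dict[str, str]) -> Tuple[List[Artifact], List[JobSpec]]:
    """Turn source files into artifacts and matching job specifications.

    Assumption: only ``.sql`` sources are handled; every other file is skipped.
    """
    artifacts: List[Artifact] = []
    specs: List[JobSpec] = []
    for path, text in sorted(sources.items()):
        if not path.endswith(".sql"):
            continue
        payload = text.encode("utf-8")
        # A content digest in the name keeps distinct builds distinguishable.
        digest = hashlib.sha256(payload).hexdigest()[:12]
        artifact = Artifact(name=f"{path}@{digest}", kind="sql", content=payload)
        artifacts.append(artifact)
        # Assumption: the output dataset is named after the source file stem.
        specs.append(JobSpec(artifact_name=artifact.name,
                             output_dataset=path[: -len(".sql")]))
    return artifacts, specs


artifacts, specs = build({"clean_orders.sql": "SELECT 1", "README.md": "notes"})
```

In this sketch each job specification refers to its artifact by name only, which loosely mirrors the abstract's separation between the executable code and the instructions for running it.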
-
Publication number: US09965534B2
Publication date: 2018-05-08
Application number: US15369753
Filing date: 2016-12-05
Applicant: Palantir Technologies, Inc.
Inventor: David Tolnay , Punyashloka Biswal , Andrew Colombi , Yupeng Fu , Ashar Fuadi , Mingyu Kim , Paul Nepywoda , Akshay Pundle , Juan Tamayo
IPC: G06F17/30
CPC classification number: G06F17/30569 , G06F17/30339 , G06F17/30345 , G06F17/30457 , G06F17/30563 , G06F17/30958 , G06F17/30961
Abstract: Techniques related to a domain-specific language for dataset transformations are disclosed. A server computer may process a table definition composed in a domain-specific language. The table definition may include a sequence of one or more dataset transformations to be performed on one or more source tables to generate a target table. The sequence may include a customized transformation. A source dataset may be provided as input to an implementation of the customized transformation. An output dataset may be generated as a result of executing the implementation. An intermediate table may be generated based on performing at least one dataset transformation on a particular source table. A supplemental portion for the intermediate table may be generated based on performing the at least one dataset transformation on an appended portion of the particular source table. The target table may be generated based on combining the supplemental portion with the intermediate table.
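Illustrative note: the incremental step in this abstract (transform only the appended portion of a source table, then combine the resulting supplemental portion with the existing intermediate table) can be sketched as below. This is a sketch under stated assumptions, not the patented method. The function names and the sample transformation are hypothetical, and the shortcut is only valid when the transformation acts on each row independently, so that transforming the appended rows alone gives the same rows as transforming the whole table.

```python
from typing import Callable, Dict, List

Row = Dict[str, int]
Table = List[Row]


def transform(rows: Table) -> Table:
    """Hypothetical row-wise transformation: keep positive amounts, add a doubled column."""
    return [{**r, "amount_x2": r["amount"] * 2} for r in rows if r["amount"] > 0]


def incremental_update(intermediate: Table,
                       appended: Table,
                       fn: Callable[[Table], Table]) -> Table:
    """Combine an existing intermediate table with a supplemental portion.

    Only the appended rows are transformed; the earlier result is reused as-is.
    """
    supplemental = fn(appended)
    return intermediate + supplemental


source = [{"id": 1, "amount": 10}, {"id": 2, "amount": -5}]
intermediate = transform(source)        # computed once, up front
appended = [{"id": 3, "amount": 7}]     # rows added to the source later
target = incremental_update(intermediate, appended, transform)

# For a row-wise transformation, the incremental result matches a full recompute.
assert target == transform(source + appended)
```

Transformations that look across rows, such as aggregations or joins, would need a different combining step than simple concatenation.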