- Patent Title: Dynamically performing data processing in a data pipeline system
- Application No.: US16208435
- Application Date: 2018-12-03
- Publication No.: US11314698B2
- Publication Date: 2022-04-26
- Inventor: Hao Dang, Gustav Brodman, Yi Xue, Stacey Milspaw, Yifei Huang, Yanran Lu
- Applicant: Palantir Technologies, Inc.
- Applicant Address: US CA Palo Alto
- Assignee: Palantir Technologies, Inc.
- Current Assignee: Palantir Technologies, Inc.
- Current Assignee Address: US CA Palo Alto
- Agency: Hickman Becker Bingham Ledesma LLP
- Main IPC: G06F16/182
- IPC: G06F16/182; G06F16/2455; G06F16/25; G06F16/23; G06F9/455

Abstract:
Techniques for automatically scheduling builds of derived datasets in a distributed database system that supports pipelined data transformations are described herein. In an embodiment, a data processing method comprises, in association with a distributed database system that implements one or more data transformation pipelines, each of the data transformation pipelines comprising at least a first dataset, a first transformation, a second derived dataset and dataset dependency and timing metadata, detecting an arrival of a new raw dataset or new derived dataset; in response to the detecting, obtaining from the dataset dependency and timing metadata a dataset subset comprising those datasets that depend on at least the new raw dataset or new derived dataset; for each member dataset in the dataset subset, determining if the member dataset has a dependency on any other dataset that is not yet arrived, and in response to determining that the member dataset does not have a dependency on any other dataset that is not yet arrived: initiating a build of a portion of the data transformation pipeline comprising the member dataset and all other datasets on which the member dataset is dependent, without waiting for arrival of other datasets.
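The scheduling logic in the abstract can be illustrated with a minimal Python sketch. This is not the patented implementation: the class `BuildScheduler`, its methods, and the dataset names are hypothetical, the dependency metadata is assumed to be a plain mapping from each derived dataset to its direct upstream datasets, and the timing metadata is not modeled. On each arrival, the sketch looks up the datasets that depend on the new arrival and returns a build for every one whose direct dependencies have all arrived, without waiting for unrelated datasets.

```python
from collections import defaultdict


class BuildScheduler:
    """Hypothetical sketch of arrival-driven build scheduling."""

    def __init__(self, dependencies):
        # Assumed metadata shape: derived dataset -> set of direct upstream datasets.
        self.upstream = {name: set(deps) for name, deps in dependencies.items()}
        # Reverse index: dataset -> datasets that depend directly on it.
        self.downstream = defaultdict(set)
        for name, deps in self.upstream.items():
            for dep in deps:
                self.downstream[dep].add(name)
        self.arrived = set()

    def portion(self, dataset):
        """Return the dataset plus every dataset it depends on, transitively."""
        seen, stack = set(), [dataset]
        while stack:
            current = stack.pop()
            if current not in seen:
                seen.add(current)
                stack.extend(self.upstream.get(current, ()))
        return seen

    def on_arrival(self, dataset):
        """Record an arrival and return the builds that can start now.

        The result maps each buildable member dataset to the pipeline
        portion (the member and all of its upstream datasets) to build.
        """
        self.arrived.add(dataset)
        builds = {}
        for member in sorted(self.downstream.get(dataset, ())):
            # Build only if no direct dependency is still missing.
            if self.upstream[member] <= self.arrived:
                builds[member] = self.portion(member)
        return builds


scheduler = BuildScheduler({
    "joined": {"raw_a", "raw_b"},
    "cleaned_a": {"raw_a"},
    "report": {"joined", "cleaned_a"},
})
first = scheduler.on_arrival("raw_a")   # cleaned_a can build; joined still waits
second = scheduler.on_arrival("raw_b")  # joined can build now
```

In this sketch the caller is expected to call `on_arrival` again with the name of each derived dataset once its build finishes, so that arrivals of derived datasets trigger downstream builds in the same way as arrivals of raw datasets.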
Public/Granted literature
- US20190114289A1 DYNAMICALLY PERFORMING DATA PROCESSING IN A DATA PIPELINE SYSTEM, Public/Granted day: 2019-04-18