Data processing pipeline failure recovery
Abstract:
Techniques are disclosed for re-executing a data processing pipeline following a failure of at least one of its components. The techniques may include a syntax for defining a compute graph associated with the data processing pipeline and receiving such a compute graph in association with a specific data processing pipeline. The techniques may further include executing the data processing pipeline, determining that a component of the data processing pipeline failed, and determining a portion of the data processing pipeline to execute or re-execute based at least in part on dependencies defined by the data processing pipeline in association with the failed component. Re-executing one or more components of that portion may comprise retrieving an output saved in association with a component upon which the failed component depends.
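The recovery flow described in the abstract can be illustrated with a minimal Python sketch. It is not the disclosed implementation: the dictionary-based graph syntax, the function names (topo_order, execute, portion_to_rerun), and the in-memory saved store are all assumptions made for illustration. The sketch assumes an acyclic graph and treats any raised exception as a component failure.

```python
# Hypothetical graph syntax (an assumption, not the disclosed syntax):
# component name -> (list of dependency names, callable).

def topo_order(graph):
    """Return component names so that dependencies come before dependents."""
    order, seen = [], set()

    def visit(name):
        if name in seen:
            return
        seen.add(name)
        for dep in graph[name][0]:
            visit(dep)
        order.append(name)

    for name in graph:
        visit(name)
    return order


def execute(graph, names, saved):
    """Run the named components in order, saving each output.

    Inputs are read from `saved`. Returns the name of the first
    component that raises, or None if every component succeeds.
    """
    for name in names:
        deps, fn = graph[name]
        try:
            saved[name] = fn(*[saved[d] for d in deps])
        except Exception:
            return name
    return None


def portion_to_rerun(graph, failed):
    """Return the failed component plus everything downstream of it."""
    rerun = {failed}
    changed = True
    while changed:
        changed = False
        for name, (deps, _) in graph.items():
            if name not in rerun and any(d in rerun for d in deps):
                rerun.add(name)
                changed = True
    return [n for n in topo_order(graph) if n in rerun]


# Illustrative three-stage pipeline; "clean" fails on its first attempt.
calls = {"load": 0, "clean": 0}

def load():
    calls["load"] += 1
    return [1, 2, 3]

def clean(rows):
    calls["clean"] += 1
    if calls["clean"] == 1:
        raise RuntimeError("simulated transient failure")
    return [r * 2 for r in rows]

def report(rows):
    return sum(rows)

graph = {
    "load": ([], load),
    "clean": (["load"], clean),
    "report": (["clean"], report),
}

saved = {}                                    # outputs saved per component
failed = execute(graph, topo_order(graph), saved)
rerun = portion_to_rerun(graph, failed)       # failed component and dependents
second_failure = execute(graph, rerun, saved) # reuses the saved "load" output
```

In this sketch the second pass runs only "clean" and "report"; "load" is not re-executed because its saved output is retrieved when "clean" is retried. A production system would presumably persist outputs to durable storage rather than a dictionary.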