Systems and methods for dataflow graph optimization

    公开(公告)号:US12032631B2

    公开(公告)日:2024-07-09

    申请号:US15993284

    申请日:2018-05-30

    CPC classification number: G06F16/9024 G06F16/2379 G06F16/2433

    Abstract: At least one non-transitory computer-readable storage medium storing processor-executable instructions that, when executed by at least one computer hardware processor, cause the at least one computer hardware processor to perform: obtaining an automatically generated initial dataflow graph, the initial dataflow graph comprising a first plurality of nodes representing a first plurality of data processing operations and a first plurality of links representing flows of data among nodes in the first plurality of nodes; and generating an updated dataflow graph by iteratively applying dataflow graph optimization rules to update the initial dataflow graph, the updated dataflow graph comprising a second plurality of nodes representing a second plurality of data processing operations and a second plurality of links representing flows of data among nodes in the second plurality of nodes.

    SYSTEMS AND METHODS FOR PERFORMING DATA PROCESSING OPERATIONS USING VARIABLE LEVEL PARALLELISM

    公开(公告)号:US20250036478A1

    公开(公告)日:2025-01-30

    申请号:US18736974

    申请日:2024-06-07

    Abstract: Techniques for determining processing layouts to nodes of a dataflow graph. The techniques include: obtaining information specifying a dataflow graph, the dataflow graph comprising a plurality of nodes and a plurality of edges connecting the plurality nodes, the plurality of edges representing flows of data among nodes in the plurality of nodes, the plurality of nodes comprising: a first set of one or more nodes; and a second set of one or more nodes disjoint from the first set of nodes; obtaining a first set of one or more processing layouts for the first set of nodes; and determining a processing layout for each node in the second set of nodes based on the first set of processing layouts and one or more layout determination rules, the one or more layout determination rules including at least one rule for selecting among processing layouts having different degrees of parallelism, and information indicating that data generated by at least one node in the first and/or third set of nodes is not used by any nodes in the dataflow graph downstream from the at least one node.

    SYSTEMS AND METHODS FOR DATAFLOW GRAPH OPTIMIZATION

    公开(公告)号:US20240311427A1

    公开(公告)日:2024-09-19

    申请号:US18670461

    申请日:2024-05-21

    CPC classification number: G06F16/9024 G06F16/2379 G06F16/2433

    Abstract: At least one non-transitory computer-readable storage medium storing processor-executable instructions that, when executed by at least one computer hardware processor, cause the at least one computer hardware processor to perform: obtaining an automatically generated initial dataflow graph, the initial dataflow graph comprising a first plurality of nodes representing a first plurality of data processing operations and a first plurality of links representing flows of data among nodes in the first plurality of nodes; and generating an updated dataflow graph by iteratively applying dataflow graph optimization rules to update the initial dataflow graph, the updated dataflow graph comprising a second plurality of nodes representing a second plurality of data processing operations and a second plurality of links representing flows of data among nodes in the second plurality of nodes.

    DATAFLOW GRAPH DATASETS
    4.
    发明公开

    公开(公告)号:US20230359668A1

    公开(公告)日:2023-11-09

    申请号:US18114212

    申请日:2023-02-24

    CPC classification number: G06F16/9024

    Abstract: Described herein are techniques, performed by a data processing system, for enabling efficient development of software application programs in a dynamic environment with multiple datasets by generating entries in a dataset catalog to provide a software application program with access to output data dynamically generated by dataflow graphs, the entries associated with respective software application programs developed as dataflow graphs. The techniques include identifying a subgraph, wherein, when the subgraph is executed, the subgraph generates output data by applying one or more data processing operations to data obtained from one or more data sources; creating, in the dataset catalog, a new entry associated with the identified subgraph, the new entry associated with information indicating nodes, links, and configuration parameters of the identified subgraph; and configuring the dataset catalog to enable access to the new entry, in the dataset catalog, associated with the identified subgraph.

    SYSTEMS AND METHODS FOR PERFORMING DATA PROCESSING OPERATIONS USING VARIABLE LEVEL PARALLELISM

    公开(公告)号:US20210182263A1

    公开(公告)日:2021-06-17

    申请号:US17079994

    申请日:2020-10-26

    Abstract: Techniques for determining processing layouts to nodes of a dataflow graph. The techniques include: obtaining information specifying a dataflow graph, the dataflow graph comprising a plurality of nodes and a plurality of edges connecting the plurality nodes, the plurality of edges representing flows of data among nodes in the plurality of nodes, the plurality of nodes comprising: a first set of one or more nodes; and a second set of one or more nodes disjoint from the first set of nodes; obtaining a first set of one or more processing layouts for the first set of nodes; and determining a processing layout for each node in the second set of nodes based on the first set of processing layouts and one or more layout determination rules, the one or more layout determination rules including at least one rule for selecting among processing layouts having different degrees of parallelism.

    Systems and methods for performing data processing operations using variable level parallelism

    公开(公告)号:US10817495B2

    公开(公告)日:2020-10-27

    申请号:US15939820

    申请日:2018-03-29

    Abstract: Techniques for determining processing layouts to nodes of a dataflow graph. The techniques include: obtaining information specifying a dataflow graph, the dataflow graph comprising a plurality of nodes and a plurality of edges connecting the plurality nodes, the plurality of edges representing flows of data among nodes in the plurality of nodes, the plurality of nodes comprising: a first set of one or more nodes; and a second set of one or more nodes disjoint from the first set of nodes; obtaining a first set of one or more processing layouts for the first set of nodes; and determining a processing layout for each node in the second set of nodes based on the first set of processing layouts and one or more layout determination rules, the one or more layout determination rules including at least one rule for selecting among processing layouts having different degrees of parallelism.

    SYSTEMS AND METHODS FOR PERFORMING DATA PROCESSING OPERATIONS USING VARIABLE LEVEL PARALLELISM

    公开(公告)号:US20230093911A1

    公开(公告)日:2023-03-30

    申请号:US17957646

    申请日:2022-09-30

    Abstract: Techniques for determining processing layouts to nodes of a dataflow graph. The techniques include: obtaining information specifying a dataflow graph, the dataflow graph comprising a plurality of nodes and a plurality of edges connecting the plurality nodes, the plurality of edges representing flows of data among nodes in the plurality of nodes, the plurality of nodes comprising: a first set of one or more nodes; and a second set of one or more nodes disjoint from the first set of nodes; obtaining a first set of one or more processing layouts for the first set of nodes; and determining a processing layout for each node in the second set of nodes based on the first set of processing layouts and one or more layout determination rules, the one or more layout determination rules including at least one rule for selecting among processing layouts having different degrees of parallelism, and information indicating that data generated by at least one node in the first and/or third set of nodes is not used by any nodes in the dataflow graph downstream from the at least one node.

    SYSTEMS AND METHODS FOR PERFORMING DATA PROCESSING OPERATIONS USING VARIABLE LEVEL PARALLELISM

    公开(公告)号:US20180285401A1

    公开(公告)日:2018-10-04

    申请号:US15939820

    申请日:2018-03-29

    Abstract: Techniques for determining processing layouts to nodes of a dataflow graph. The techniques include: obtaining information specifying a dataflow graph, the dataflow graph comprising a plurality of nodes and a plurality of edges connecting the plurality nodes, the plurality of edges representing flows of data among nodes in the plurality of nodes, the plurality of nodes comprising: a first set of one or more nodes; and a second set of one or more nodes disjoint from the first set of nodes; obtaining a first set of one or more processing layouts for the first set of nodes; and determining a processing layout for each node in the second set of nodes based on the first set of processing layouts and one or more layout determination rules, the one or more layout determination rules including at least one rule for selecting among processing layouts having different degrees of parallelism.

Patent Agency Ranking