Editor for generating computational graphs

    Publication No.: US12050606B2

    Publication date: 2024-07-30

    Application No.: US18112958

    Filing date: 2023-02-22

    Abstract: Techniques for generating a dataflow graph include generating a first dataflow graph with a plurality of first nodes representing first computer operations in processing data, with at least one of the first computer operations being a declarative operation that specifies one or more characteristics of one or more results of processing of data, and transforming the first dataflow graph into a second dataflow graph for processing data in accordance with the first computer operations, the second dataflow graph including a plurality of second nodes representing second computer operations, with at least one of the second nodes representing one or more imperative operations that implement the logic specified by the declarative operation, where the one or more imperative operations are unrepresented by the first nodes in the first dataflow graph.
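The declarative-to-imperative transformation described in this abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `Node` class, the `declare_sorted_unique` operation name, and the restriction to a linear chain of nodes are hypothetical and are not taken from the patent.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One node of a (linear, for simplicity) dataflow graph."""
    name: str
    op: str
    params: dict = field(default_factory=dict)


def expand(first_graph):
    """Build a second graph in which declarative nodes are replaced by
    imperative nodes that do not appear in the first graph."""
    second_graph = []
    for node in first_graph:
        if node.op == "declare_sorted_unique":  # hypothetical declarative op
            key = node.params["key"]
            second_graph.append(Node(node.name + "_sort", "sort", {"key": key}))
            second_graph.append(Node(node.name + "_dedup", "dedup", {"key": key}))
        else:
            second_graph.append(node)
    return second_graph


def run(graph, records):
    """Execute the imperative graph on a list of dict records."""
    for node in graph:
        key = node.params.get("key")
        if node.op == "filter":
            records = [r for r in records if node.params["predicate"](r)]
        elif node.op == "sort":
            records = sorted(records, key=lambda r: r[key])
        elif node.op == "dedup":
            deduped = []
            for r in records:
                # Input is sorted on key, so duplicates are adjacent.
                if not deduped or deduped[-1][key] != r[key]:
                    deduped.append(r)
            records = deduped
        else:
            raise ValueError("no imperative implementation for " + node.op)
    return records


first = [
    Node("positive", "filter", {"predicate": lambda r: r["v"] > 0}),
    Node("unique_ids", "declare_sorted_unique", {"key": "id"}),
]
second = expand(first)
rows = [{"id": 2, "v": 1}, {"id": 1, "v": 1}, {"id": 2, "v": 5}, {"id": 3, "v": -1}]
result = run(second, rows)
```

The first graph states only the desired property of the result (sorted and unique on `id`); the `sort` and `dedup` nodes exist only in the second graph.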

    Processing queries containing a union-type operation

    Publication No.: US10437819B2

    Publication date: 2019-10-08

    Application No.: US14746188

    Filing date: 2015-06-22

    Abstract: Among other things, a method of generating a computer program based on an SQL query includes receiving a SQL query, including a reference to a first data set stored at a first data source, and including a reference to a second data set stored at a second data source different from the first data source, determining that the SQL query includes two or more commands, the commands including a first union-type operation, and a first aggregation operation, and determining that the SQL query describes that the first union-type operation shall be applied to at least a portion of data from the first data set, and applied to at least a portion of data from the second data set, determining that the SQL query describes that the first aggregation operation shall be applied to data resulting from the first union-type operation, and generating the computer program.
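For readers unfamiliar with the query shape this abstract refers to, the following sketch shows a union-type operation over two data sets held in different data sources, followed by an aggregation over the union. It uses two SQLite in-memory databases as stand-ins for the two sources; the table and column names are assumptions, and the sketch covers only the query itself, not the program-generation step the patent claims.

```python
import sqlite3

# Two separate in-memory databases stand in for two distinct data sources.
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS second_source")

conn.execute("CREATE TABLE main.orders (region TEXT, amount INTEGER)")
conn.execute("CREATE TABLE second_source.orders (region TEXT, amount INTEGER)")
conn.executemany("INSERT INTO main.orders VALUES (?, ?)",
                 [("east", 10), ("west", 5)])
conn.executemany("INSERT INTO second_source.orders VALUES (?, ?)",
                 [("east", 7), ("north", 3)])

# The union-type operation combines rows from both sources; the
# aggregation (SUM ... GROUP BY) is applied to the result of the union.
QUERY = """
SELECT region, SUM(amount) AS total
FROM (
    SELECT region, amount FROM main.orders
    UNION ALL
    SELECT region, amount FROM second_source.orders
)
GROUP BY region
ORDER BY region
"""
totals = conn.execute(QUERY).fetchall()
print(totals)
```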

    Managing data queries
    Patent type: Granted invention patent (in force)

    Publication No.: US09576028B2

    Publication date: 2017-02-21

    Application No.: US14628643

    Filing date: 2015-02-23

    IPC classification: G06F17/30

    Abstract: In one aspect, in general, a method of generating a dataflow graph representing a database query includes receiving a query plan from a plan generator, the query plan representing operations for executing a database query on at least one input representing a source of data, producing a dataflow graph from the query plan, wherein the dataflow graph includes at least one node that represents at least one operation represented by the query plan, and includes at least one link that represents at least one dataflow associated with the query plan, and altering one or more components of the dataflow graph based on at least one characteristic of the at least one input representing the source of data.
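The two steps named in this abstract, producing a graph from a query plan and then altering it based on a characteristic of the input, can be sketched as follows. The plan format, the `sorted_by` characteristic, and both function names are hypothetical; the patent does not prescribe them, and a real plan would generally be a tree rather than a linear list.

```python
def plan_to_graph(plan):
    """Turn a linear query plan into nodes (operations) and links (dataflows)."""
    nodes = [{"id": i, "op": step["op"], "args": step.get("args", {})}
             for i, step in enumerate(plan)]
    links = [(i, i + 1) for i in range(len(nodes) - 1)]
    return {"nodes": nodes, "links": links}


def alter_for_input(graph, input_info):
    """Drop sort nodes that are redundant because the input is already
    sorted on the same key, then reconnect the remaining nodes."""
    sorted_by = input_info.get("sorted_by")
    if sorted_by is None:
        return graph  # nothing known about the input, leave the graph alone
    kept = [n for n in graph["nodes"]
            if not (n["op"] == "sort" and n["args"].get("key") == sorted_by)]
    links = [(a["id"], b["id"]) for a, b in zip(kept, kept[1:])]
    return {"nodes": kept, "links": links}


plan = [
    {"op": "scan", "args": {"table": "customers"}},
    {"op": "sort", "args": {"key": "id"}},
    {"op": "aggregate", "args": {"group_by": "id"}},
]
graph = plan_to_graph(plan)
altered = alter_for_input(graph, {"sorted_by": "id"})
```

Here the altered graph links the scan node directly to the aggregate node, since the input's sort order makes the intermediate sort unnecessary.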


    Editor for generating computational graphs

    Publication No.: US11593380B2

    Publication date: 2023-02-28

    Application No.: US16862821

    Filing date: 2020-04-30

    Abstract: Techniques for generating a dataflow graph include generating a first dataflow graph with a plurality of first nodes representing first computer operations in processing data, with at least one of the first computer operations being a declarative operation that specifies one or more characteristics of one or more results of processing of data, and transforming the first dataflow graph into a second dataflow graph for processing data in accordance with the first computer operations, the second dataflow graph including a plurality of second nodes representing second computer operations, with at least one of the second nodes representing one or more imperative operations that implement the logic specified by the declarative operation, where the one or more imperative operations are unrepresented by the first nodes in the first dataflow graph.

    Managing data queries
    Patent type: Granted invention patent

    Publication No.: US11593369B2

    Publication date: 2023-02-28

    Application No.: US15496891

    Filing date: 2017-04-25

    Abstract: One method includes receiving a database query, receiving information about a database table in data storage populated with data elements, producing a structural representation of the database table that includes a formatted data organization reflective of the database table and is absent the data elements of the database table, and providing the structural representation and the database query to a plan generator capable of producing a query plan representing operations for executing the database query on the database table. Another method includes receiving a query plan from a plan generator, the plan representing operations for executing a database query on a database table, and producing a dataflow graph from the query plan, wherein the dataflow graph includes at least one node that represents at least one operation represented by the query plan, and includes at least one link that represents at least one dataflow associated with the query plan.
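The first method in this abstract hands a plan generator the structure of a table without its data. A rough analogy, offered only as an illustration, is to copy a table's definition into an empty SQLite database and ask SQLite's own planner for a plan. Using SQLite as the plan generator and the helper name `structural_copy` are assumptions of this sketch, not details from the patent.

```python
import sqlite3


def structural_copy(source, table):
    """Create an empty in-memory database holding only the table definition."""
    ddl = source.execute(
        "SELECT sql FROM sqlite_master WHERE type = 'table' AND name = ?",
        (table,),
    ).fetchone()[0]
    shell = sqlite3.connect(":memory:")
    shell.execute(ddl)  # same columns and keys, no rows
    return shell


# A populated table standing in for the real data storage.
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
source.executemany("INSERT INTO customers VALUES (?, ?)",
                   [(1, "Ada"), (2, "Grace")])

shell = structural_copy(source, "customers")
row_count = shell.execute("SELECT COUNT(*) FROM customers").fetchone()[0]

# The planner sees only the structure, yet can still produce a plan.
plan = shell.execute(
    "EXPLAIN QUERY PLAN SELECT name FROM customers WHERE id = 2"
).fetchall()
for step in plan:
    print(step)
```

The exact text of the plan rows depends on the SQLite version, so the sketch prints them rather than asserting a particular wording.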

    GENERATION OF OPTIMIZED LOGIC FROM A SCHEMA

    Publication No.: US20210279043A1

    Publication date: 2021-09-09

    Application No.: US17025751

    Filing date: 2020-09-18

    Abstract: A method includes accessing a schema that specifies relationships among datasets, computations on the datasets, or transformations of the datasets, selecting a dataset from among the datasets, and identifying, from the schema, other datasets that are related to the selected dataset. Attributes of the datasets are identified, and logical data representing the identified attributes and relationships among the attributes is generated. The logical data is provided to a development environment, which provides access to portions of the logical data representing the identified attributes. A specification that specifies at least one of the identified attributes in performing an operation is received from the development environment. Based on the specification and the relationships among the identified attributes represented by the logical data, a computer program is generated to perform the operation by accessing, from storage, at least one dataset having the at least one of the attributes specified in the specification.

    EDITOR FOR GENERATING COMPUTATIONAL GRAPHS

    Publication No.: US20210232579A1

    Publication date: 2021-07-29

    Application No.: US16862821

    Filing date: 2020-04-30

    Abstract: Techniques for generating a dataflow graph include generating a first dataflow graph with a plurality of first nodes representing first computer operations in processing data, with at least one of the first computer operations being a declarative operation that specifies one or more characteristics of one or more results of processing of data, and transforming the first dataflow graph into a second dataflow graph for processing data in accordance with the first computer operations, the second dataflow graph including a plurality of second nodes representing second computer operations, with at least one of the second nodes representing one or more imperative operations that implement the logic specified by the declarative operation, where the one or more imperative operations are unrepresented by the first nodes in the first dataflow graph.

    Processing data from multiple sources

    Publication No.: US09607073B2

    Publication date: 2017-03-28

    Application No.: US14255579

    Filing date: 2014-04-17

    IPC classification: G06F17/30 G06F9/50

    Abstract: In a first aspect, a method includes, at a node of a Hadoop cluster, the node storing a first portion of data in HDFS data storage, executing a first instance of a data processing engine capable of receiving data from a data source external to the Hadoop cluster, receiving a computer-executable program by the data processing engine, executing at least part of the program by the first instance of the data processing engine, receiving, by the data processing engine, a second portion of data from the external data source, storing the second portion of data other than in HDFS storage, and performing, by the data processing engine, a data processing operation identified by the program using at least the first portion of data and the second portion of data.