PARALLELIZING SQL ON DISTRIBUTED FILE SYSTEMS
    2.
    Invention application
    Status: under examination (published)

    Publication number: US20170011090A1

    Publication date: 2017-01-12

    Application number: US15114328

    Filing date: 2014-03-31

    IPC classification: G06F17/30

    Abstract: Example embodiments relate to parallelizing structured query language (SQL) on distributed file systems. In example embodiments, a subquery of a distributed file system is received from a query engine, where the subquery is one of multiple subqueries that are scheduled to execute on a cluster of server nodes. At this stage, a user-defined function (UDF) that comprises local, role-based functionality is executed, where a partitioned magic table triggers parallel execution of the UDF. The execution of the UDF determines a sequence number based on the number of server nodes in the cluster and retrieves nonconsecutive chunks from a file of the distributed file system, where each of the nonconsecutive chunks is offset by the sequence number.
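One possible reading of the chunk retrieval described in the abstract can be sketched in Python. This is an illustrative assumption, not the patented method: the function name `read_strided_chunks`, the fixed `chunk_size`, and the rule that node `seq_no` out of `node_count` reads chunks `seq_no`, `seq_no + node_count`, and so on are all hypothetical, and an in-memory `io.BytesIO` stands in for a file of the distributed file system.

```python
import io


def read_strided_chunks(stream, seq_no, node_count, chunk_size):
    """Yield (index, data) for chunks seq_no, seq_no + node_count, ...

    `stream` is any seekable binary stream; here it stands in for a
    distributed-file-system file (an assumption for illustration).
    """
    if not 0 <= seq_no < node_count:
        raise ValueError("seq_no must be in [0, node_count)")
    index = seq_no
    while True:
        stream.seek(index * chunk_size)
        data = stream.read(chunk_size)
        if not data:  # past the end of the file
            break
        yield index, data
        index += node_count  # skip the chunks owned by the other nodes


# Ten bytes split into five 2-byte chunks, shared by three hypothetical nodes.
payload = bytes(range(10))
node_count = 3
chunks_by_node = {
    seq: list(read_strided_chunks(io.BytesIO(payload), seq, node_count, chunk_size=2))
    for seq in range(node_count)
}
```

Under this assumed scheme the nodes never read the same chunk, and together they cover the whole file without coordinating with each other at read time.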

    Parallelizing SQL on distributed file systems

    Publication number: US10534770B2

    Publication date: 2020-01-14

    Application number: US15114328

    Filing date: 2014-03-31

    Abstract: Example embodiments relate to parallelizing structured query language (SQL) on distributed file systems. In example embodiments, a subquery of a distributed file system is received from a query engine, where the subquery is one of multiple subqueries that are scheduled to execute on a cluster of server nodes. At this stage, a user-defined function (UDF) that comprises local, role-based functionality is executed, where a partitioned magic table triggers parallel execution of the UDF. The execution of the UDF determines a sequence number based on the number of server nodes in the cluster and retrieves nonconsecutive chunks from a file of the distributed file system, where each of the nonconsecutive chunks is offset by the sequence number.

    DATA STREAM PROCESSING BASED ON A BOUNDARY PARAMETER
    4.
    Invention application
    Status: under examination (published)

    Publication number: US20160253219A1

    Publication date: 2016-09-01

    Application number: US15032884

    Filing date: 2013-12-13

    IPC classification: G06F9/52

    CPC classification: G06F9/52 G06F17/18

    Abstract: In one implementation, a system for processing a data stream can comprise a station engine, an execution engine, and a synchronize engine. The station engine can provide a stream operator to receive application logic, punctuate the data stream, and determine a number of input channels for parallel processing. The execution engine can perform a behavior of the application logic during a process operation. The synchronize engine can hold data of the data stream associated with a window until each input channel has reached a data boundary based on a boundary parameter.
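The hold-until-boundary behavior of the synchronize engine can be illustrated with a small Python sketch. Everything here is an assumption made for illustration: the class name `WindowSynchronizer`, the choice of a timestamp window length as the boundary parameter, and the requirement that timestamps do not decrease within a channel. The abstract does not specify any of these.

```python
from collections import defaultdict


class WindowSynchronizer:
    """Hold window data until every input channel has passed the window."""

    def __init__(self, channel_count, boundary):
        # Hypothetical boundary parameter: window length in timestamp units.
        self.boundary = boundary
        # Highest window index seen so far on each input channel.
        self.progress = [0] * channel_count
        # Window index -> values held for that window.
        self.held = defaultdict(list)
        # Oldest window that has not been released yet.
        self.next_window = 0

    def push(self, channel, timestamp, value):
        """Accept one tuple and return the list of windows released by it."""
        window = timestamp // self.boundary
        self.progress[channel] = max(self.progress[channel], window)
        self.held[window].append(value)
        released = []
        # A window is complete once every channel has moved past it.
        while self.next_window < min(self.progress):
            released.append((self.next_window, self.held.pop(self.next_window, [])))
            self.next_window += 1
        return released


sync = WindowSynchronizer(channel_count=2, boundary=10)
outputs = [
    sync.push(0, 1, "a"),   # window 0, channel 0
    sync.push(1, 3, "b"),   # window 0, channel 1
    sync.push(0, 12, "c"),  # channel 0 crosses the boundary, channel 1 has not
    sync.push(1, 11, "d"),  # channel 1 crosses too, so window 0 is released
]
```

In this sketch window 0 is only emitted by the fourth call, because that is the first point at which both channels have produced data beyond the boundary.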

    Parallelizing SQL user defined transformation functions

    Publication number: US10885031B2

    Publication date: 2021-01-05

    Application number: US15114913

    Filing date: 2014-03-10

    Abstract: Example embodiments relate to parallelizing structured query language (SQL) user-defined transformation functions. In example embodiments, a subquery of a query is received from a query engine, where each subquery of the query is associated with a distinct magic number in a magic table. A user-defined transformation function that includes local, role-based functionality may then be executed, where the magic number triggers parallel execution of the user-defined transformation function. At this stage, the results of the user-defined transformation function are sent to the query engine, where the query engine unions the results with other results that are obtained from other database nodes.
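The fan-out and union pattern in this abstract can be sketched as follows. This is a loose illustration under stated assumptions, not the patented mechanism: `MAGIC_TABLE`, `transform`, and `run_query` are hypothetical names, threads stand in for database nodes, and the way a magic number selects a share of the rows is invented for the example.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical magic table: one distinct magic number per subquery.
MAGIC_TABLE = [0, 1, 2, 3]


def transform(magic, rows):
    """Hypothetical transformation function.

    The magic number decides which share of the rows this invocation
    handles (its local role); the transformation itself just squares them.
    """
    share = rows[magic::len(MAGIC_TABLE)]
    return [row * row for row in share]


def run_query(rows):
    """Run one transform per magic number in parallel and union the results."""
    with ThreadPoolExecutor(max_workers=len(MAGIC_TABLE)) as pool:
        partials = pool.map(lambda magic: transform(magic, rows), MAGIC_TABLE)
        # Stand-in for the query engine unioning the per-node results.
        return [value for part in partials for value in part]


result = run_query(list(range(8)))
```

The combined result contains every squared row exactly once, although not in input order, since each invocation returns only its own share.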

    ABSTRACTION LAYER BETWEEN A DATABASE QUERY ENGINE AND A DISTRIBUTED FILE SYSTEM
    6.
    Invention application
    Status: under examination (published)

    Publication number: US20160267132A1

    Publication date: 2016-09-15

    Application number: US15033163

    Filing date: 2013-12-17

    IPC classification: G06F17/30

    Abstract: A system includes a distributed file system to control storage of data across storage nodes, and a database query engine to receive a database query for access to data. The database query engine processes the database query using an index, and uses a buffer pool to cache data retrieved in response to the database query and to store updated data. An abstraction layer is provided between the database query engine and the distributed file system; the abstraction layer reads and writes data of the distributed file system in response to the database query.
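A minimal Python sketch of a buffer pool sitting on top of a storage abstraction may help picture the arrangement. All names (`InMemoryDfs`, `BufferPool`, `read_block`, `write_block`) are hypothetical, the in-memory dictionary merely stands in for a distributed file system client, and the least-recently-used eviction policy is an assumption that the abstract does not mention. The index is left out.

```python
from collections import OrderedDict


class InMemoryDfs:
    """Stand-in for the abstraction layer over a distributed file system."""

    def __init__(self):
        self.blocks = {}

    def read_block(self, block_id):
        return self.blocks.get(block_id, b"")

    def write_block(self, block_id, data):
        self.blocks[block_id] = data


class BufferPool:
    """Cache blocks read for queries and hold updated blocks until written back."""

    def __init__(self, storage, capacity):
        self.storage = storage
        self.capacity = capacity
        self.pages = OrderedDict()  # block_id -> (data, dirty flag)

    def get(self, block_id):
        if block_id in self.pages:
            self.pages.move_to_end(block_id)  # mark as recently used
        else:
            self._admit(block_id, self.storage.read_block(block_id), dirty=False)
        return self.pages[block_id][0]

    def put(self, block_id, data):
        self._admit(block_id, data, dirty=True)

    def _admit(self, block_id, data, dirty):
        self.pages[block_id] = (data, dirty)
        self.pages.move_to_end(block_id)
        while len(self.pages) > self.capacity:
            victim, (old_data, was_dirty) = self.pages.popitem(last=False)
            if was_dirty:  # write updated data back before dropping it
                self.storage.write_block(victim, old_data)

    def flush(self):
        for block_id, (data, dirty) in list(self.pages.items()):
            if dirty:
                self.storage.write_block(block_id, data)
                self.pages[block_id] = (data, False)


dfs = InMemoryDfs()
pool = BufferPool(dfs, capacity=2)
for block_id, data in [("a", b"1"), ("b", b"2"), ("c", b"3")]:
    pool.put(block_id, data)
written_by_eviction = dict(dfs.blocks)  # only "a" has been evicted so far
pool.flush()
```

The query engine side of this sketch only ever calls `get`, `put`, and `flush`, so the storage class could be swapped for a real file system client without changing the pool.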
