ACCESSING AN EXTERNAL TABLE IN PARALLEL TO EXECUTE A QUERY
    1.
    Invention application
    Status: granted

    Publication number: US20150356131A1

    Publication date: 2015-12-10

    Application number: US14685840

    Filing date: 2015-04-14

    CPC classification number: G06F17/30339 G06F17/30424 G06F17/30445

    Abstract: An approach, referred to herein as parallelized-external-table access, generates rows from a single external table in parallel for a given query. Under parallelized-external-table access, an execution plan generated for the query includes multiple work granules that generate rows for a single external table from a data source. Such work granules are referred to herein as external work granules. Each external work granule of the execution plan may be assigned to a slave process, which executes the external work granule in parallel with another slave process executing another external work granule. External tables are accessible on a cluster of data nodes in a distributed data access system (e.g. Hadoop Distributed File System) connected to a DBMS.
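The idea of splitting one external table scan into several external work granules can be sketched roughly as follows. This is only an illustration under stated assumptions, not the method claimed in the application: the in-memory `DATA_SOURCE`, the fixed-size range split in `make_granules`, and the use of threads in place of DBMS slave processes are all hypothetical choices made to keep the example self-contained.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for the external data source: CSV-like records in memory.
DATA_SOURCE = [f"{i},name{i}" for i in range(10)]


def make_granules(num_records, granule_size):
    """Split the source into (start, end) ranges, one per external work granule."""
    return [
        (start, min(start + granule_size, num_records))
        for start in range(0, num_records, granule_size)
    ]


def execute_granule(granule):
    """Generate rows of the external table from one portion of the source."""
    start, end = granule
    rows = []
    for line in DATA_SOURCE[start:end]:
        ident, name = line.split(",")
        rows.append((int(ident), name))
    return rows


def scan_external_table(granule_size=4, workers=3):
    """Run every granule on a worker and merge the generated rows."""
    granules = make_granules(len(DATA_SOURCE), granule_size)
    # Assumption: worker threads play the role of the slave processes.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in granule order, so row order is stable.
        results = pool.map(execute_granule, granules)
        return [row for rows in results for row in rows]
```

Calling `scan_external_table()` returns the rows of all granules combined; changing `granule_size` changes how many granules the single table scan is divided into.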

    Storage-Side Scanning on Non-Natively Formatted Data
    3.
    Invention application
    Status: under examination (published)

    Publication number: US20150356158A1

    Publication date: 2015-12-10

    Application number: US14733691

    Filing date: 2015-06-08

    Abstract: A storage system communicatively coupled to a DBMS performs storage-side scanning of data sources that are not stored in the native database storage format of the DBMS. Data sources for external tables are accessible in a storage system referred to herein as a distributed data access system, e.g. a Hadoop Distributed File System. To execute a query that references an external table, a DBMS first generates an execution plan. The distributed data access system supplies the DBMS with information that specifies each portion of the data source, and specifies which data node to use to access the portion. The DBMS sends a request for each portion to the respective data node, the request requesting that the data node generate rows from data in the portion. The request may specify scanning criteria, specifying one or more columns to project and/or filter on. The request may also specify code modules for the data node to execute to generate rows or records and columns.
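The request flow described in this abstract (a portion-to-node mapping, one request per portion, scanning criteria applied on the data node) might look roughly like the sketch below. All names here (`ScanRequest`, `DataNode`, `PORTION_MAP`, the two-column record layout) are hypothetical and chosen for illustration; the abstract does not specify any concrete interface, and a real system would send requests over a network rather than call methods directly.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

Row = Dict[str, str]


@dataclass
class ScanRequest:
    """Hypothetical request sent to a data node for one portion."""
    portion_id: str
    project: List[str]                                # columns to return
    predicate: Optional[Callable[[Row], bool]] = None  # filter criteria


class DataNode:
    """Hypothetical data node holding raw, non-native records per portion."""

    def __init__(self, portions: Dict[str, List[str]]):
        self.portions = portions

    def scan(self, request: ScanRequest) -> List[Tuple[str, ...]]:
        rows = []
        for line in self.portions[request.portion_id]:
            ident, city = line.split(",")
            row = {"id": ident, "city": city}
            # Filtering and projection happen on the storage side.
            if request.predicate is None or request.predicate(row):
                rows.append(tuple(row[col] for col in request.project))
        return rows


# Assumed sample data: two nodes, one portion each.
NODES = {
    "node-a": DataNode({"p0": ["1,Oslo", "2,Lima"]}),
    "node-b": DataNode({"p1": ["3,Oslo"]}),
}
# Assumed mapping supplied to the DBMS: which node serves which portion.
PORTION_MAP = [("p0", "node-a"), ("p1", "node-b")]


def run_query(portion_map, nodes, project, predicate=None):
    """Send one scan request per portion and collect the returned rows."""
    rows = []
    for portion_id, node_name in portion_map:
        request = ScanRequest(portion_id, project, predicate)
        rows.extend(nodes[node_name].scan(request))
    return rows
```

For example, `run_query(PORTION_MAP, NODES, ["id"], lambda r: r["city"] == "Oslo")` asks each node to return only the `id` column of matching records, so unneeded columns and rows never leave the data node.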

    Accessing an external table in parallel to execute a query

    Publication number: US10019473B2

    Publication date: 2018-07-10

    Application number: US14685840

    Filing date: 2015-04-14

    CPC classification number: G06F16/2282 G06F16/245 G06F16/24532

    Abstract: An approach, referred to herein as parallelized-external-table access, generates rows from a single external table in parallel for a given query. Under parallelized-external-table access, an execution plan generated for the query includes multiple work granules that generate rows for a single external table from a data source. Such work granules are referred to herein as external work granules. Each external work granule of the execution plan may be assigned to a slave process, which executes the external work granule in parallel with another slave process executing another external work granule. External tables are accessible on a cluster of data nodes in a distributed data access system (e.g. Hadoop Distributed File System) connected to a DBMS.
