Using Machine Learning to Estimate Query Resource Consumption in MPPDB

    Publication No.: US20180314735A1

    Publication date: 2018-11-01

    Application No.: US15959442

    Filing date: 2018-04-23

    Abstract: Methods and apparatus are provided for using machine learning to estimate query resource consumption in a massively parallel processing database (MPPDB). In various embodiments, the machine learning may jointly perform query resource consumption estimation and resource extreme event detection, utilize an adaptive kernel configured to learn the most suitable similarity relation metric for the data from each system setting, and utilize multi-level stacking configured to leverage the outputs of diverse base classifier models. Advantages of the disclosed embodiments include faster and more reliable system performance and the avoidance of resource issues such as out-of-memory (OOM) occurrences.

    Memory-aware plan negotiation in query concurrency control

    Publication No.: US10740332B2

    Publication date: 2020-08-11

    Application No.: US15411713

    Filing date: 2017-01-20

    Abstract: Embodiments of the present technology relate to managing database query concurrency. A method of the present technology can include receiving a query, generating a first query plan that can be used to execute the query in system memory without any system memory constraints, and estimating a system memory cost for executing the query in the system memory using the first query plan. The method can also include placing the query in a queue if available system memory does not satisfy the estimated system memory cost. The method can further include conditionally selecting the query from the queue, conditionally generating a second query plan for the query that can be used to execute the query in the system memory in compliance with a system memory constraint, and conditionally executing the query in the system memory.
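The queue-then-replan flow described in this abstract can be illustrated with a minimal Python sketch. Everything named here (`Plan`, `AdmissionController`, the `replan` callback, memory measured in abstract units) is hypothetical and chosen for illustration; the abstract does not say when the second plan is generated, so this sketch assumes it happens when a finishing query releases memory.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Plan:
    name: str
    memory_cost: int  # estimated memory, in arbitrary units (assumption)


class AdmissionController:
    """Hypothetical memory-aware admission control with plan negotiation."""

    def __init__(self, total_memory):
        self.available = total_memory
        self.queue = deque()   # waiting (query_id, first_plan, replan) entries
        self.running = {}      # query_id -> plan currently holding memory

    def submit(self, query_id, first_plan, replan):
        # first_plan was built without memory constraints; replan(limit)
        # returns a plan that fits within `limit`, or None if none exists.
        if first_plan.memory_cost <= self.available:
            self._start(query_id, first_plan)
            return "running"
        self.queue.append((query_id, first_plan, replan))
        return "queued"

    def finish(self, query_id):
        # Release the finished query's memory, then retry queued queries.
        self.available += self.running.pop(query_id).memory_cost
        self._drain()

    def _start(self, query_id, plan):
        self.available -= plan.memory_cost
        self.running[query_id] = plan

    def _drain(self):
        while self.queue:
            query_id, plan, replan = self.queue[0]
            if plan.memory_cost > self.available:
                # Negotiate a second plan under the current memory constraint.
                plan = replan(self.available)
                if plan is None or plan.memory_cost > self.available:
                    break  # head of the queue keeps waiting
            self.queue.popleft()
            self._start(query_id, plan)
```

A caller would submit each query with its unconstrained plan and a `replan` function; a queued query starts either with its first plan, if enough memory has been freed, or with a cheaper constrained plan.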

    Systems and Methods for Parallelizing Hash-based Operators in SMP Databases
    Invention application; status: under examination (published)

    Publication No.: US20160378824A1

    Publication date: 2016-12-29

    Application No.: US14749098

    Filing date: 2015-06-24

    CPC classification number: G06F16/24532 G06F16/2255

    Abstract: A system and method for parallelizing hash-based operators in symmetric multiprocessing (SMP) databases is provided. In an embodiment, a method in a device for performing hash-based database operations includes receiving at the device a database query; creating a plurality of execution workers to process the query; and building by the execution workers a hash table from a database table, the database table comprising one of a plurality of partitions and a plurality of scan units, the hash table shared by the execution workers, each execution worker scanning a corresponding partition and adding entries to the hash table if the database table is partitioned, each execution worker scanning an unprocessed scan unit and adding entries to the hash table according to the scan unit if the database table comprises scan units, and the workers performing the scanning and the adding in a parallel manner.
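As a rough illustration of the scan-unit case in this abstract, the Python sketch below has several worker threads pull unprocessed scan units from a shared queue and add entries to one shared hash table. The function name `build_shared_hash_table`, the `key_of` callback, and the single-lock design are assumptions made for this example, not details from the patent.

```python
import threading
from collections import defaultdict
from queue import Empty, Queue


def build_shared_hash_table(scan_units, key_of, num_workers=4):
    """Build one hash table shared by all workers (hypothetical sketch).

    scan_units: iterable of row lists; key_of: function mapping a row to
    its hash key. Each worker repeatedly claims an unprocessed scan unit.
    """
    table = defaultdict(list)
    lock = threading.Lock()          # guards concurrent inserts
    pending = Queue()
    for unit in scan_units:
        pending.put(unit)

    def worker():
        while True:
            try:
                unit = pending.get_nowait()   # claim an unprocessed unit
            except Empty:
                return                        # nothing left to scan
            for row in unit:
                with lock:
                    table[key_of(row)].append(row)

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return dict(table)
```

The partitioned case can be approximated with the same function by passing one partition per scan unit and setting `num_workers` to the number of partitions. In standard CPython builds the global interpreter lock limits the CPU parallelism of threads, so this shows the coordination pattern rather than a realistic speedup.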

    MEMORY-AWARE PLAN NEGOTIATION IN QUERY CONCURRENCY CONTROL

    Publication No.: US20180210916A1

    Publication date: 2018-07-26

    Application No.: US15411713

    Filing date: 2017-01-20

    Abstract: Embodiments of the present technology relate to managing database query concurrency. A method of the present technology can include receiving a query, generating a first query plan that can be used to execute the query in system memory without any system memory constraints, and estimating a system memory cost for executing the query in the system memory using the first query plan. The method can also include placing the query in a queue if available system memory does not satisfy the estimated system memory cost. The method can further include conditionally selecting the query from the queue, conditionally generating a second query plan for the query that can be used to execute the query in the system memory in compliance with a system memory constraint, and conditionally executing the query in the system memory.

    SYSTEM AND METHOD FOR DATA CACHING IN PROCESSING NODES OF A MASSIVELY PARALLEL PROCESSING (MPP) DATABASE SYSTEM
    Invention application; status: granted

    Publication No.: US20170010968A1

    Publication date: 2017-01-12

    Application No.: US14794750

    Filing date: 2015-07-08

    Abstract: The present technology relates to managing data caching in processing nodes of a massively parallel processing (MPP) database system. A directory is maintained that includes a list and a storage location of the data pages in the MPP database system. Memory usage is monitored in processing nodes by exchanging memory usage information with each other. Each of the processing nodes manages a list and a corresponding amount of available memory in each of the processing nodes based on the memory usage information. Data pages are read from a memory of the processing nodes in response to receiving a request to fetch the data pages, and a remote memory manager is queried for available memory in each of the processing nodes in response to receiving the request. The data pages are distributed to the memory of the processing nodes having sufficient space available for storage during data processing.
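The directory and placement idea in this abstract can be sketched in Python as follows. The class `PageDirectory`, its `place` method, and the "most free memory first" policy are assumptions for illustration only; the abstract states that pages go to nodes with sufficient space but does not specify how a node is chosen among several candidates.

```python
class PageDirectory:
    """Hypothetical directory of cached pages and per-node free memory."""

    def __init__(self, free_memory):
        # node name -> free memory, as learned from exchanged usage reports
        self.free_memory = dict(free_memory)
        self.location = {}  # page id -> node currently caching the page

    def place(self, page_id, size):
        """Return the node caching page_id, placing the page if needed."""
        if page_id in self.location:
            return self.location[page_id]      # already cached somewhere
        candidates = [n for n, free in self.free_memory.items() if free >= size]
        if not candidates:
            return None                        # no node has sufficient space
        # Assumed policy: pick the candidate with the most free memory.
        node = max(candidates, key=self.free_memory.get)
        self.free_memory[node] -= size
        self.location[page_id] = node
        return node
```

For example, with nodes `n1` (100 units free) and `n2` (300 units free), a 150-unit page lands on `n2`, and a later request for the same page returns `n2` without consuming more memory.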

    Methods and Systems for Dynamically Allocating Resources and Tasks Among Database Work Agents in an SMP Environment
    Invention application; status: under examination (published)

    Publication No.: US20150227586A1

    Publication date: 2015-08-13

    Application No.: US14175489

    Filing date: 2014-02-07

    Abstract: Dynamically re-allocating tasks and/or memory quotas amongst work agents in symmetric multiprocessing (SMP) systems can significantly mitigate delays and inefficiencies associated with data skew. For example, unfinished tasks can be reallocated from a busy work agent to an idle work agent upon determining that the idle work agent has finished processing its originally assigned set of tasks. Alternatively, a portion of a memory quota assigned to an idle work agent can be reallocated to a busy work agent for use in processing the remaining tasks. Memory quotas can be re-assigned by releasing the memory quota back into a memory pool once the idle work agent has finished processing its originally assigned tasks, and then reallocating some or all of the memory quota to the busy work agent.
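Both alternatives in this abstract, moving unfinished tasks to an idle agent or moving the idle agent's memory quota to a busy one, can be sketched in a few lines of Python. The names `WorkAgent`, `rebalance_tasks`, and `reassign_quota`, and the choice to move half of the busiest agent's remaining tasks, are hypothetical and serve only to illustrate the idea.

```python
from collections import deque


class WorkAgent:
    """Hypothetical work agent with a task queue and a memory quota."""

    def __init__(self, name, tasks, memory_quota):
        self.name = name
        self.tasks = deque(tasks)
        self.memory_quota = memory_quota


def rebalance_tasks(idle, agents):
    """Move half of the busiest agent's unfinished tasks to an idle agent."""
    if idle.tasks:
        return 0                      # agent is not idle; nothing to do
    busiest = max(agents, key=lambda a: len(a.tasks))
    count = len(busiest.tasks) // 2   # assumed split; the abstract gives none
    for _ in range(count):
        idle.tasks.append(busiest.tasks.pop())
    return count


def reassign_quota(idle, busy):
    """Release the idle agent's quota to a pool, then grant it to busy."""
    pool = idle.memory_quota          # quota returns to the shared pool
    idle.memory_quota = 0
    busy.memory_quota += pool         # pool is handed to the busy agent
    return busy.memory_quota
```

Under data skew, one agent may hold most of the work; calling `rebalance_tasks` when another agent drains its queue shortens the tail, while `reassign_quota` instead lets the busy agent process its remaining tasks with more memory.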
