Database group-by query cardinality estimation

    Publication number: US12045233B2

    Publication date: 2024-07-23

    Application number: US17979643

    Filing date: 2022-11-02

    Applicant: SAP SE

    CPC classification number: G06F16/24537 G06F16/24545

    Abstract: Mechanisms are disclosed for estimating cardinality of group-by queries. A probability of occurrence of values is obtained for columns that satisfy the query occurring in tables from a trained machine learning model. A range selectivity is calculated based on a conditional probability of occurrence of the values. A set of valid generated sample tuples is generated from the trained machine learning model. A group-by selectivity is calculated by keeping the conditional probability of occurrence to obtain probabilities that a result set will have specific group-by column values associated with the tables while proceeding with progressive sampling. A sampling probability is calculated by normalizing the group-by selectivity by dividing the group-by selectivity by the range selectivity. The samples are filtered such that the samples having a sampling probability below a sampling probability threshold are filtered out. A sampling-based estimator is applied to the filtered samples set to estimate the cardinality.
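The normalization and filtering steps in the abstract can be illustrated with a small sketch. This is not the patented implementation: the toy table `ROWS`, the function names, and the threshold values are all hypothetical, exact empirical frequencies stand in for the probabilities a trained machine learning model would supply through progressive sampling, and the final distinct-count step is only a placeholder for the unspecified sampling-based estimator.

```python
from collections import Counter

# Hypothetical toy table of (region, price) rows. In the abstract, the
# probabilities come from a trained model; here they are plain frequencies.
ROWS = [
    ("EU", 10), ("EU", 20), ("EU", 30), ("US", 10),
    ("US", 40), ("US", 50), ("ASIA", 5), ("ASIA", 60),
]


def range_selectivity(rows, lo, hi):
    """Fraction of all rows whose price lies in [lo, hi]."""
    return sum(1 for _, price in rows if lo <= price <= hi) / len(rows)


def group_by_selectivity(rows, lo, hi):
    """Per group value: fraction of all rows that satisfy the range
    predicate and carry that group-by value."""
    counts = Counter(group for group, price in rows if lo <= price <= hi)
    return {group: n / len(rows) for group, n in counts.items()}


def sampling_probabilities(rows, lo, hi):
    """Normalize group-by selectivity by range selectivity."""
    r_sel = range_selectivity(rows, lo, hi)
    if r_sel == 0:
        return {}
    g_sel = group_by_selectivity(rows, lo, hi)
    return {group: s / r_sel for group, s in g_sel.items()}


def estimate_group_count(rows, lo, hi, threshold):
    """Drop groups whose sampling probability is below the threshold, then
    count the survivors (placeholder estimator, an assumption of this sketch)."""
    probs = sampling_probabilities(rows, lo, hi)
    kept = [group for group, p in probs.items() if p >= threshold]
    return len(kept)


if __name__ == "__main__":
    print(sampling_probabilities(ROWS, 10, 40))
    print(estimate_group_count(ROWS, 10, 40, threshold=0.1))
```

For the predicate `10 <= price <= 40`, five of the eight rows qualify, so the range selectivity is 5/8. Dividing each group's selectivity by that value gives probabilities that sum to 1 over the qualifying groups, which is what makes a single threshold comparable across queries with different range predicates.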

    DATABASE GROUP-BY QUERY CARDINALITY ESTIMATION

    Publication number: US20240143586A1

    Publication date: 2024-05-02

    Application number: US17979643

    Filing date: 2022-11-02

    Applicant: SAP SE

    CPC classification number: G06F16/24537 G06F16/24545

    Abstract: Mechanisms are disclosed for estimating cardinality of group-by queries. A probability of occurrence of values is obtained for columns that satisfy the query occurring in tables from a trained machine learning model. A range selectivity is calculated based on a conditional probability of occurrence of the values. A set of valid generated sample tuples is generated from the trained machine learning model. A group-by selectivity is calculated by keeping the conditional probability of occurrence to obtain probabilities that a result set will have specific group-by column values associated with the tables while proceeding with progressive sampling. A sampling probability is calculated by normalizing the group-by selectivity by dividing the group-by selectivity by the range selectivity. The samples are filtered such that the samples having a sampling probability below a sampling probability threshold are filtered out. A sampling-based estimator is applied to the filtered samples set to estimate the cardinality.

    Near-memory acceleration for database operations

    Publication number: US11586630B2

    Publication date: 2023-02-21

    Application number: US16897138

    Filing date: 2020-06-09

    Applicant: SAP SE

    Abstract: Despite the increase of memory capacity and CPU computing power, memory performance remains the bottleneck of in-memory database management systems due to ever-increasing data volumes and application demands. Because the scale of data workloads has out-paced traditional CPU caches and memory bandwidth, one can improve data movement from memory to computing units to improve performance in in-memory database scenarios. A near-memory database accelerator framework offloads data-intensive database operations via or to a near-memory computation engine. The database accelerator's system architecture can include a database accelerator software module/driver and a memory module with a database accelerator engine. An application programming interface (API) can be provided to support database accelerator functionality. Memory of the database accelerator can be directly accessible by the CPU.

    NEAR-MEMORY ACCELERATION FOR DATABASE OPERATIONS

    Publication number: US20210271680A1

    Publication date: 2021-09-02

    Application number: US16897138

    Filing date: 2020-06-09

    Applicant: SAP SE

    Abstract: Despite the increase of memory capacity and CPU computing power, memory performance remains the bottleneck of in-memory database management systems due to ever-increasing data volumes and application demands. Because the scale of data workloads has out-paced traditional CPU caches and memory bandwidth, one can improve data movement from memory to computing units to improve performance in in-memory database scenarios. A near-memory database accelerator framework offloads data-intensive database operations via or to a near-memory computation engine. The database accelerator's system architecture can include a database accelerator software module/driver and a memory module with a database accelerator engine. An application programming interface (API) can be provided to support database accelerator functionality. Memory of the database accelerator can be directly accessible by the CPU.
