VECTORIZED SORTED-SET INTERSECTION USING CONFLICT-DETECTION INSTRUCTIONS OPTIMIZED FOR SMALL UNPADDED ORDERED SETS

    公开(公告)号:US20220222081A1

    公开(公告)日:2022-07-14

    申请号:US17148951

    申请日:2021-01-14

    Abstract: A method includes determining, whether: a first case is applicable, in which a first number of values of a first dataset and a second number of values of a second dataset total less than or equal to a third number of values of a register; a second case is applicable, in which the first and second numbers total more than the third number, and the first or second number is less than or equal to half of the third number; or a third case is applicable, in which the first and second numbers total more than the third number, and each of the first and second numbers is greater than half of the third number. In response to the determining, the method includes selectively loading to the register a first portion of the first dataset and a second portion of the second dataset, and performing conflict-detection for identifying one or more common values in the register loaded with the first portion and the second portion.

    Two-valued logic primitives for SQL query processing

    公开(公告)号:US11269874B2

    公开(公告)日:2022-03-08

    申请号:US16823632

    申请日:2020-03-19

    Abstract: The present invention relates to data manipulation language (DML) acceleration. Herein are database techniques to use value range analysis and range-limited execution operators when a value is excluded. In an embodiment, a computer receives a data access request that specifies an expression that includes operator(s), including a particular operator that has argument(s) that has only three possible values. Before detecting the actual value of a particular argument, the computer detects that a particular value of the three possible values is excluded for the particular argument. Responsively, an implementation of the particular operator that never accepts the particular value for the particular argument is selected. Applying the expression to fulfil the data access request entails invoking the selected implementation of the particular operator.

    VECTORIZED QUEUES FOR SHORTEST-PATH GRAPH SEARCHES

    公开(公告)号:US20210271711A1

    公开(公告)日:2021-09-02

    申请号:US16803832

    申请日:2020-02-27

    Abstract: Techniques are described for a vectorized queue, which implements a vectorized ‘contains’ function that determines whether a value is in the queue. A three-phase vectorized shortest-path graph search splits each expanding and probing iteration into three phases that utilize vectorized instructions: (1) The neighbors of nodes that are in a next queue are fetched and written into a current queue. (2) It is determined whether the destination node is among the fetched neighbor nodes in the current queue. (3) The fetched neighbor nodes that have not yet been visited are put into the next queue. According to an embodiment, a vectorized copy operation performs vector-based data copying using vectorized load and store instructions. Specifically, vectors of data are copied from a source to a destination. Any invalid data copied to the destination is overwritten, either with a vector of additional valid data or with a vector of nonce data.

    Non-disruptive referencing of special purpose operators for database management systems

    公开(公告)号:US10990596B2

    公开(公告)日:2021-04-27

    申请号:US16442015

    申请日:2019-06-14

    Abstract: Approaches herein transparently delegate data access from a relational database management system (RDBMS) onto an offload engine (OE). The RDBMS receives a database statement referencing a user defined function (UDF). In an execution plan, the RDBMS replaces the UDF reference with an invocation of a relational operator in the OE. Execution invokes the relational operator in the OE to obtain a result based on data in the OE. Thus, the UDF is bound to the OE, and almost all of the RDBMS avoids specially handling the UDF. The UDF may be a table function that offloads a relational table for processing. User defined objects such as functions and types provide metadata about the table. Multiple tables can be offloaded and processed together, such that some or all offloaded tables are not materialized in the RDBMS. Offloaded tables may participate in standard relational algebra such as in a database statement.

    INFLUENCING PLAN GENERATION IN THE CONTEXT OF THE TWO PHASE QUERY OPTIMIZATION APPROACH

    公开(公告)号:US20210026855A1

    公开(公告)日:2021-01-28

    申请号:US16519794

    申请日:2019-07-23

    Abstract: Techniques are described herein for influencing plan generation in context of the two phase query optimization approach. Types of pruning criteria including method pruning criteria, total cost pruning criteria, and permutation pruning criteria exist in cost-based plan generators to determine what parts of a query statement should be offloaded to a query offload engine. Method pruning criteria is responsible to determine an optimal joining method. Total cost pruning criteria compares accumulated costs with a lowest plan cost determined so far. Permutation pruning criteria is responsible for selecting the cheapest query execution plan from all considered query execution plans. Each type of pruning criteria is modified to favor offload engine execution upon request.

    Version control based on a dual-range validity model

    公开(公告)号:US09811560B2

    公开(公告)日:2017-11-07

    申请号:US14824920

    申请日:2015-08-12

    CPC classification number: G06F17/30448 G06F17/30345 G06F17/30353

    Abstract: Techniques related to version control based on a dual-range validity model are disclosed. In an embodiment, an online analytical processing (OLAP) server stores a plurality of version records describing versions of a data item. A version record may describe any open transactions for a version of the data item. The version record may specify a commit timestamp for the data item at a database and a valid timestamp at least as great as the commit timestamp. The commit timestamp and the valid timestamp may specify a validity range. The version record may also specify an expiration timestamp, which along with the valid timestamp may specify an unresolved range. The OLAP server may also identify a valid version of the data item for a query timestamp that corresponds to a query for particular data in the data item and that falls within either the validity range or the unresolved range.

    Method for generic vectorized d-heaps

    公开(公告)号:US11379232B2

    公开(公告)日:2022-07-05

    申请号:US16399226

    申请日:2019-04-30

    Abstract: Techniques are provided for obtaining generic vectorized d-heaps for any data type for which horizontal aggregation SIMD instructions are not available, including primitive as well as complex data types. A generic vectorized d-heap comprises a prefix heap and a plurality of suffix heaps. Each suffix heap of the plurality of suffix heaps comprises a d-heap. A plurality of key values stored in the heap are split into key prefix values and key suffix values. Key prefix values are stored in the prefix heap and key suffix values are stored in the plurality of suffix heaps. Each entry in the prefix heap includes a key prefix value of the plurality of key values and a reference to the suffix heap of the plurality of suffix heaps that includes all key suffix values of the plurality of key values that share the respective key prefix value.

    CODE DICTIONARY GENERATION BASED ON NON-BLOCKING OPERATIONS

    公开(公告)号:US20210390089A1

    公开(公告)日:2021-12-16

    申请号:US17459447

    申请日:2021-08-27

    Abstract: Techniques related to code dictionary generation based on non-blocking operations are disclosed. In some embodiments, a column of tokens includes a first token and a second token that are stored in separate rows. The column of tokens is correlated with a set of row identifiers including a first row identifier and a second row identifier that is different from the first row identifier. Correlating the column of tokens with the set of row identifiers involves: storing a correlation between the first token and the first row identifier, storing a correlation between the second token and the second row identifier if the first token and the second token have different values, and storing a correlation between the second token and the first row identifier if the first token and the second token have identical values. After correlating the column of tokens with the set of row identifiers, duplicate correlations are removed.

Patent Agency Ranking