Faster access for compressed time series data: the block index

    Publication No.: US11263196B2

    Publication date: 2022-03-01

    Application No.: US16358598

    Filing date: 2019-03-19

    Applicant: SAP SE

    Abstract: A system and method for faster access for compressed time series data. A set of blocks is generated based on a table stored in a database of the data platform. The table stores data associated with multiple sources of data provided as consecutive values, each block containing index vectors having a range of the consecutive values. A block index is generated for each block having a field start vector representing a starting position of the block relative to the range of consecutive values, and a starting value vector representing a value of the block at the starting position. The field start vector of the block index is accessed to obtain the starting position of a field corresponding to a first block and to the range of the consecutive values of the first block. The starting value vector is then determined from the block index to determine an end and a length of the field of the first block.

    Value-ID-based sorting in column-store databases

    Publication No.: US10762071B2

    Publication date: 2020-09-01

    Application No.: US15363274

    Filing date: 2016-11-29

    Applicant: SAP SE

    IPC classification: G06F7/00 G06F16/22

    Abstract: Innovations in performing sort operations for dictionary-compressed values of columns in a column-store database using value identifiers (“IDs”) are described. For example, a database system includes a data store and an execution engine. The data store stores values at positions of a column. A dictionary maps distinct values to corresponding value IDs. An inverted index stores, for each of the corresponding value IDs, a list of those of the positions that contain the associated distinct value. The execution engine processes a request to sort values at an input set of the positions and identify an output set of the positions for sorted values. In particular, the execution engine iterates through positions stored in the lists of the inverted index. For a given position, the execution engine checks if the given position is one of the input set and, if so, adds the given position to the output set.
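
The sort described in this abstract can be illustrated with a minimal Python sketch. It assumes that value IDs are assigned in the sort order of the distinct values, which the abstract does not state; the function name and the toy column are hypothetical and are not taken from the patent.

```python
def sort_positions_by_value_id(inverted_index, input_positions):
    """Return the requested positions ordered by the values they hold.

    inverted_index[i] lists the row positions that contain value ID i.
    Assumption: value IDs follow the sort order of the distinct values.
    """
    wanted = set(input_positions)
    output = []
    for positions in inverted_index:   # walk value IDs in ascending order
        for pos in positions:          # positions that hold this value
            if pos in wanted:          # keep only positions in the input set
                output.append(pos)
    return output


# Toy column and its dictionary-compressed representation.
column = ["pear", "apple", "fig", "apple", "pear", "fig"]
dictionary = sorted(set(column))                      # value ID = list index
value_id = {value: i for i, value in enumerate(dictionary)}

inverted_index = [[] for _ in dictionary]
for pos, value in enumerate(column):
    inverted_index[value_id[value]].append(pos)

result = sort_positions_by_value_id(inverted_index, [0, 2, 3, 4])
print(result)  # [3, 2, 0, 4]
```

The column values themselves are never compared; the order comes entirely from walking the inverted index by value ID.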

    Processing a query primitive call on a value identifier set

    Publication No.: US10671625B2

    Publication date: 2020-06-02

    Application No.: US15416729

    Filing date: 2017-01-26

    Applicant: SAP SE

    Abstract: In some example embodiments, a system is provided for executing a primitive call that implements a query operation. The system may include a data processor and a memory. The memory may store instructions that result in operations when executed by the data processor. The operations may include: executing, at a data management engine, the primitive call by at least performing a first operation with respect to a value identifier set, the value identifier set including one or more value identifiers, and the primitive call being configured to access a database storing a plurality of value identifiers; and generating, based at least on a result of the first operation, a result for the primitive call. Related methods and articles of manufacture, including computer program products, are also described.

    VALUE IDENTIFIER SETS
    Patent type: Invention application

    Publication No.: US20180210926A1

    Publication date: 2018-07-26

    Application No.: US15416729

    Filing date: 2017-01-26

    Applicant: SAP SE

    IPC classification: G06F17/30

    Abstract: In some example embodiments, a system is provided for executing a primitive call that implements a query operation. The system may include a data processor and a memory. The memory may store instructions that result in operations when executed by the data processor. The operations may include: executing, at a data management engine, the primitive call by at least performing a first operation with respect to a value identifier set, the value identifier set including one or more value identifiers, and the primitive call being configured to access a database storing a plurality of value identifiers; and generating, based at least on a result of the first operation, a result for the primitive call. Related methods and articles of manufacture, including computer program products, are also described.

    Framework for workload prediction and physical database design

    Publication No.: US11789920B1

    Publication date: 2023-10-17

    Application No.: US17705728

    Filing date: 2022-03-28

    Applicant: SAP SE

    Abstract: According to some embodiments, methods and systems may be associated with a cloud computing environment. A workload prediction framework may receive observed workload information associated with a database in the cloud computing environment (e.g., a DataBase as a Service (“DBaaS”)). Based on the observed workload information, a Statement Arrival Rate (“SAR”) prediction may be generated. In addition, a host variable assignment prediction may be generated based on the observed workload information. The workload prediction framework may then use the SAR prediction and the host variable assignment prediction to automatically create a workload prediction for the database. A physical database design advisor (e.g., a table partitioning advisor) may receive the workload prediction and, responsive to the workload prediction, automatically generate a recommended physical layout for the database (e.g., using a cost model, the current physical layout, and an objective function).

    DESIGN AND IMPLEMENTATION OF DATA ACCESS METRICS FOR AUTOMATED PHYSICAL DATABASE DESIGN

    Publication No.: US20220269653A1

    Publication date: 2022-08-25

    Application No.: US17324914

    Filing date: 2021-05-19

    Applicant: SAP SE

    IPC classification: G06F16/21 G06F16/22 G06F11/34

    Abstract: The present disclosure involves systems, software, and computer implemented methods for improved design and implementation of data access metrics for automated physical database design. An example method includes identifying a database workload for which index advisor access counters are to be tracked. Each SQL statement in the database workload is executed. For each SQL statement, attribute sets are determined for which a selection predicate filters a result for an SQL statement. An output cardinality of each selection predicate is determined. A logarithmic counter for an attribute set corresponding to the selection predicate is determined based on the output cardinality of the selection predicate. The determined logarithmic counter is incremented. Respective values for logarithmic counters of the determined attributes are provided to an index advisor. The index advisor determines attribute sets for which to propose an index based on the logarithmic counters of the respective attribute sets.
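
One way to picture the logarithmic counters in this abstract is the Python sketch below. The abstract does not specify the bucket boundaries or the storage layout, so the base-10 buckets, the nested dictionary, and the names `log_bucket` and `record_predicate` are all assumptions made for illustration.

```python
from collections import defaultdict


def log_bucket(cardinality, base=10):
    """Map an output cardinality to a logarithmic bucket (assumed base 10).

    Integer division avoids floating-point rounding at exact powers of the base.
    """
    bucket = 0
    while cardinality >= base:
        cardinality //= base
        bucket += 1
    return bucket


# attribute set -> bucket index -> number of predicates observed (hypothetical layout)
counters = defaultdict(lambda: defaultdict(int))


def record_predicate(attribute_set, output_cardinality):
    """Increment the counter selected by the predicate's output cardinality."""
    counters[frozenset(attribute_set)][log_bucket(output_cardinality)] += 1


# Hypothetical workload: three predicates on customer_id, one on (region, status).
record_predicate({"customer_id"}, 3)
record_predicate({"customer_id"}, 7)
record_predicate({"customer_id"}, 1500)
record_predicate({"region", "status"}, 42)

for attrs, buckets in counters.items():
    print(sorted(attrs), dict(buckets))
```

Under these assumptions, an index advisor could favor attribute sets whose counts concentrate in low buckets, since those predicates are the most selective.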

    Compressing time stamp columns
    Patent type: Granted invention patent

    Publication No.: US11386104B2

    Publication date: 2022-07-12

    Application No.: US16661993

    Filing date: 2019-10-23

    Applicant: SAP SE

    IPC classification: G06F16/2458 G06F16/22

    Abstract: Disclosed is a system and method for improving database memory consumption and performance using compression of time stamp columns. A number of time stamps of a time series is received. The time stamps have a start time, and are separated by an equal increment of time that defines an interval. The start time and interval are stored in a dictionary of a column store of a database. An index is generated in the column store of the database, the index having a number of index vectors. Using the index vectors, each time stamp of the number of time stamps can be calculated from the start time stored in the dictionary and the position in the time series based on the interval stored in the dictionary.
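
The reconstruction step in this abstract is simple arithmetic: time stamp = start time + position × interval. The Python sketch below shows only that calculation; it does not model the index vectors, and the function name and sample values are hypothetical.

```python
from datetime import datetime, timedelta


def timestamp_at(start, interval, position):
    """Recompute the time stamp at a 0-based position of an equidistant series."""
    return start + position * interval


# Only these two values need to be stored for the whole series.
start = datetime(2019, 10, 23, 0, 0, 0)
interval = timedelta(minutes=15)

sixth = timestamp_at(start, interval, 5)
print(sixth)  # 2019-10-23 01:15:00
```

Because every time stamp is derived on demand, the stored size does not grow with the length of the series.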

    Data access and recommendation system

    Publication No.: US11308047B2

    Publication date: 2022-04-19

    Application No.: US16816511

    Filing date: 2020-03-12

    Applicant: SAP SE

    Abstract: System, method, and various embodiments for providing a data access and recommendation system are described herein. An embodiment operates by identifying a column access of one or more data values of a first column of a plurality of columns of a table of a database during a sampling period. A count of how many of the one or more data values are accessed during the column access is recorded. A first counter, corresponding to the first column and stored in a distributed hash table, is incremented by the count. The sampling period is determined to have expired. A load recommendation on how to load data values into the first column based on the first counter is computed. The load recommendation for implementation into the database for one or more subsequent column accesses is provided.
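
The counting step in this abstract might look like the rough Python sketch below. A plain in-process `Counter` stands in for the distributed hash table, and the threshold rule and the labels `load_fully` and `load_on_demand` are invented for illustration; the abstract does not say how the recommendation is derived from the counter.

```python
from collections import Counter


class ColumnAccessSampler:
    """Accumulates per-column access counts during one sampling period."""

    def __init__(self, threshold):
        self.counters = Counter()   # stand-in for the distributed hash table
        self.threshold = threshold  # hypothetical cut-off, not from the patent

    def record_access(self, column, values_accessed):
        """Increment the column's counter by the number of values accessed."""
        self.counters[column] += values_accessed

    def recommend(self):
        """After the period expires, map each column to a load recommendation."""
        return {
            column: "load_fully" if count >= self.threshold else "load_on_demand"
            for column, count in self.counters.items()
        }


sampler = ColumnAccessSampler(threshold=100)
sampler.record_access("price", 80)
sampler.record_access("price", 40)
sampler.record_access("comment", 3)

recommendations = sampler.recommend()
print(recommendations)
```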