Adaptive compression optimization for effective pruning

    Publication No.: US12204517B2

    Publication date: 2025-01-21

    Application No.: US17933903

    Filing date: 2022-09-21

    Applicant: SAP SE

    Abstract: A database management system is described that can encode data to generate a plurality of data vectors. The database management system can perform the encoding by using a dictionary. The database management system can adaptively reorder the plurality of data vectors to prepare for compression of the plurality of data vectors. During a forward pass of the adaptive reordering, most frequent values of a data vector of the plurality of data vectors can be moved up in the data vector. During a backward pass of the adaptive reordering, content within a rest range of a plurality of rest ranges can be rearranged within the plurality of data vectors according to frequencies of the content. The reordering according to frequency can further sort the rest range by value. Related apparatuses, systems, methods, techniques, computer programmable products, computer readable media, and articles are also described.
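
The dictionary encoding and the forward pass described in this abstract can be illustrated with a minimal sketch. It is not the patented method: the function names, the single-column input, and the choice to leave the remaining rows (the "rest range") in their original order are all assumptions made for illustration.

```python
from collections import Counter

def dictionary_encode(values):
    """Replace each value by its integer ID in a sorted dictionary."""
    dictionary = sorted(set(values))
    value_to_id = {value: i for i, value in enumerate(dictionary)}
    return dictionary, [value_to_id[value] for value in values]

def move_up_most_frequent(data_vector):
    """Return a row order that puts rows holding the most frequent value ID first.

    Assumption: the remaining rows keep their relative order; the backward
    pass from the abstract is not modeled here.
    """
    most_frequent_id, _ = Counter(data_vector).most_common(1)[0]
    front = [row for row, vid in enumerate(data_vector) if vid == most_frequent_id]
    rest = [row for row, vid in enumerate(data_vector) if vid != most_frequent_id]
    return front + rest

def apply_order(data_vector, order):
    """Apply one row order to a data vector (the same order would be applied to every column)."""
    return [data_vector[row] for row in order]

city = ["Berlin", "Paris", "Berlin", "Rome", "Berlin", "Paris"]
dictionary, vector = dictionary_encode(city)
order = move_up_most_frequent(vector)
reordered = apply_order(vector, order)
```

After reordering, the most frequent value ID forms one long run at the top of the vector, which is the kind of layout that run-length style compression benefits from.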

    Paged Inverted Index
    Type: Patent application

    Publication No.: US20170154061A1

    Publication date: 2017-06-01

    Application No.: US14954736

    Filing date: 2015-11-30

    Applicant: SAP SE

    Abstract: Disclosed herein are system and method embodiments for generating a paged inverted index. An embodiment is generated by storing a first data structure and a second data structure in a plurality of pages, where the plurality of pages are stored in one or more memories. The first data structure is stored in the plurality of pages and includes a plurality of value identifiers, where a value identifier corresponds to an offset. The second data structure stored in the plurality of pages includes a plurality of row positions, wherein a row position is at a location that corresponds to the offset in the first data structure and identifies a position of a row in a table that stores data associated with the value ID.
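
The two structures named in this abstract can be sketched as an offsets array indexed by value ID and a row-positions array, each split into fixed-size pages. This is an illustrative sketch only: the page size, the function names, and the use of Python lists as "pages" are assumptions, not details taken from the patent.

```python
def paginate(items, page_size):
    """Split a flat list into fixed-size pages (the last page may be shorter)."""
    return [items[i:i + page_size] for i in range(0, len(items), page_size)]

def page_get(pages, index, page_size):
    """Read logical element `index` from a paged list."""
    return pages[index // page_size][index % page_size]

def build_paged_inverted_index(data_vector, num_values, page_size):
    """Build paged offsets (per value ID) and paged row positions."""
    counts = [0] * num_values
    for value_id in data_vector:
        counts[value_id] += 1
    offsets = [0]                      # offsets[v] = start of value v's row positions
    for count in counts:
        offsets.append(offsets[-1] + count)
    positions = [0] * len(data_vector)
    cursor = offsets[:-1]              # slice copy: next free slot per value ID
    for row, value_id in enumerate(data_vector):
        positions[cursor[value_id]] = row
        cursor[value_id] += 1
    return paginate(offsets, page_size), paginate(positions, page_size)

def rows_for_value(value_id, offset_pages, position_pages, page_size):
    """Look up all row positions that hold the given value ID."""
    start = page_get(offset_pages, value_id, page_size)
    end = page_get(offset_pages, value_id + 1, page_size)
    return [page_get(position_pages, i, page_size) for i in range(start, end)]

PAGE_SIZE = 4  # hypothetical page capacity in elements
offset_pages, position_pages = build_paged_inverted_index([0, 1, 0, 2, 0, 1], 3, PAGE_SIZE)
rows_of_value_1 = rows_for_value(1, offset_pages, position_pages, PAGE_SIZE)
```

A lookup touches only the pages that contain the requested offsets and row positions, so the whole index does not need to be resident in memory at once.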

    Consistency checks for compressed data

    Publication No.: US12197419B2

    Publication date: 2025-01-14

    Application No.: US17974209

    Filing date: 2022-10-26

    Applicant: SAP SE

    Abstract: Systems and methods include reception of an instruction to perform a consistency check on compressed column data. In response to the instruction, a compression algorithm applied to uncompressed column data to generate the compressed column data is determined, one or more consistency checks associated with the compression algorithm are determined, wherein a first one or more consistency checks associated with a first compression algorithm are different from a second one or more consistency checks associated with a second compression algorithm, the one or more consistency checks are executed on the compressed column data, and, if the one or more consistency checks are not satisfied, a notification is transmitted to a user.
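
The idea of selecting consistency checks by compression algorithm can be sketched as a lookup table from algorithm name to a list of check functions. Everything concrete below is an assumption for illustration: the algorithm names `rle` and `sparse`, the column layout as a dictionary, the individual checks, and the `notify` callback.

```python
def check_rle_run_lengths(column):
    """Every run must cover at least one row."""
    return all(length > 0 for length in column["run_lengths"])

def check_rle_row_count(column):
    """Run lengths must add up to the number of rows."""
    return sum(column["run_lengths"]) == column["row_count"]

def check_sparse_positions(column):
    """Exception positions must be in range and strictly increasing."""
    positions = column["exception_positions"]
    in_range = all(0 <= p < column["row_count"] for p in positions)
    increasing = all(a < b for a, b in zip(positions, positions[1:]))
    return in_range and increasing

# Hypothetical registry: each compression algorithm has its own set of checks.
CHECKS_BY_ALGORITHM = {
    "rle": [check_rle_run_lengths, check_rle_row_count],
    "sparse": [check_sparse_positions],
}

def run_consistency_checks(column, notify):
    """Run the checks registered for the column's algorithm; notify on failure."""
    checks = CHECKS_BY_ALGORITHM[column["algorithm"]]
    failed = [check.__name__ for check in checks if not check(column)]
    if failed:
        notify(f"column {column['name']}: failed {', '.join(failed)}")
    return failed

notifications = []
good = {"name": "c1", "algorithm": "rle", "row_count": 6, "run_lengths": [3, 2, 1]}
bad = {"name": "c2", "algorithm": "rle", "row_count": 6, "run_lengths": [3, 2]}
good_result = run_consistency_checks(good, notifications.append)
bad_result = run_consistency_checks(bad, notifications.append)
```

Keeping the checks in a registry means a new compression algorithm only needs a new entry, and checks that make no sense for a given encoding are never executed against it.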

    Storage of run-length encoded database column data in non-volatile memory

    Publication No.: US10719450B2

    Publication date: 2020-07-21

    Application No.: US16229313

    Filing date: 2018-12-21

    Applicant: SAP SE

    Abstract: A system in which a volatile random access memory stores first header data, second header data, a first logical array in a first contiguous memory block and a second logical array in a second contiguous memory block. Each array position of the first logical array stores a database column value, and each array position of the second logical array stores an indication of a number of consecutive occurrences of a database column value. The first header data includes a first pointer to the first memory block, and the second header data includes a second pointer to the second memory block. A memory size is determined associated with the first header data, the second header data, the first memory block, and the second memory block, a first memory block of the non-volatile random access memory is allocated based on the determined memory size, an address of the random access memory associated with the allocated first memory block is determined, and a portion of the first header data, a portion of the second header data, a binary copy of the first memory block and a binary copy of the second memory block are written at the address of the random access memory.
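
The layout in this abstract, with one array of column values and one array of run lengths copied into a single block whose size is computed up front, can be sketched as follows. A `bytearray` stands in for the allocated non-volatile memory block, and the 16-byte header holding the two element counts is an assumed layout, not the one claimed in the patent.

```python
import struct
from array import array

# Assumed header layout: element counts of the value array and the run-length array.
HEADER = struct.Struct("<QQ")

def rle_encode(column):
    """Run-length encode integers into a values array and a run-lengths array."""
    values, counts = array("q"), array("q")
    for value in column:
        if len(values) > 0 and values[-1] == value:
            counts[-1] += 1
        else:
            values.append(value)
            counts.append(1)
    return values, counts

def write_block(values, counts):
    """Compute the total size, allocate one block, and copy header and arrays into it."""
    value_bytes, count_bytes = values.tobytes(), counts.tobytes()
    size = HEADER.size + len(value_bytes) + len(count_bytes)
    block = bytearray(size)            # stand-in for an allocated NVM block
    HEADER.pack_into(block, 0, len(values), len(counts))
    mid = HEADER.size + len(value_bytes)
    block[HEADER.size:mid] = value_bytes   # binary copy of the first array
    block[mid:] = count_bytes              # binary copy of the second array
    return block

def read_block(block):
    """Rebuild both arrays from the block written by write_block."""
    n_values, n_counts = HEADER.unpack_from(block, 0)
    values, counts = array("q"), array("q")
    mid = HEADER.size + n_values * values.itemsize
    values.frombytes(bytes(block[HEADER.size:mid]))
    counts.frombytes(bytes(block[mid:mid + n_counts * counts.itemsize]))
    return values, counts

values, counts = rle_encode([7, 7, 7, 3, 3, 9])
block = write_block(values, counts)
restored_values, restored_counts = read_block(block)
```

Because the arrays are copied byte for byte, reading them back needs no per-element decoding; only the header has to be interpreted to find where each array starts.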

    Paged inverted index
    Type: Granted patent

    Publication No.: US10140326B2

    Publication date: 2018-11-27

    Application No.: US14954736

    Filing date: 2015-11-30

    Applicant: SAP SE

    Abstract: Disclosed herein are system and method embodiments for generating a paged inverted index. An embodiment is generated by storing a first data structure and a second data structure in a plurality of pages, where the plurality of pages are stored in one or more memories. The first data structure is stored in the plurality of pages and includes a plurality of value identifiers, where a value identifier corresponds to an offset. The second data structure stored in the plurality of pages includes a plurality of row positions, wherein a row position is at a location that corresponds to the offset in the first data structure and identifies a position of a row in a table that stores data associated with the value ID.

    Compression determination for column store

    Publication No.: US11681676B2

    Publication date: 2023-06-20

    Application No.: US17357097

    Filing date: 2021-06-24

    Applicant: SAP SE

    CPC classification number: G06F16/221 G06F16/2282 H03M7/6064

    Abstract: A system includes application of respective compression types to first data associated with each of a plurality of columns to generate compressed column data, determination of a first compression ratio for each of the plurality of columns based on the compressed column data, storage of the determined first compression ratios, application, for each of the plurality of columns, of the determined compression type to second data associated with the column to generate second compressed column data, determination of a second compression ratio for each of the plurality of columns based on the second compressed column data, determination of a value for each column based on the stored first compression ratio and the second compression ratio determined for the column, determination of a representative value of the determined values, and determination, based on the representative value, whether to re-determine a compression type for each of the plurality of columns.
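
The decision flow in this abstract compares stored compression ratios with newly measured ones, reduces the per-column results to one representative value, and uses that value to decide whether compression types should be re-determined. The sketch below fills in the unspecified parts with assumptions: the per-column value is the quotient of the second and first ratio, the representative value is the median, and the threshold of 0.8 is arbitrary.

```python
from statistics import median

def compression_ratio(uncompressed_size, compressed_size):
    """Ratio of uncompressed to compressed size; larger means better compression."""
    return uncompressed_size / compressed_size

def should_redetermine(first_ratios, second_ratios, threshold=0.8):
    """Decide whether compression types should be re-determined for all columns.

    Assumptions: per-column value = second ratio / first ratio,
    representative value = median, and re-determination is triggered
    when the representative value falls below `threshold`.
    """
    values = [second_ratios[col] / first_ratios[col] for col in first_ratios]
    representative = median(values)
    return representative < threshold, representative

# Ratios stored after the initial compression of each column.
first = {"a": 4.0, "b": 2.0, "c": 5.0}
# Ratios measured after applying the same compression types to newer data.
second = {"a": 2.0, "b": 2.0, "c": 2.5}
decision, representative = should_redetermine(first, second)
```

Using one representative value for all columns avoids re-running the costly compression-type selection when only a single column has drifted.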

    Reordering of enriched inverted indices

    Publication No.: US10452693B2

    Publication date: 2019-10-22

    Application No.: US15482518

    Filing date: 2017-04-07

    Applicant: SAP SE

    Abstract: A method can include: reordering an enriched inverted index associated with a database, the enriched inverted index including a first inverted list having a first plurality of current document identifiers of records that contain a first data value, the enriched inverted index further including a first data structure storing enrichment data, the reordering of the enriched inverted index comprising: generating an ordinal sequence corresponding to an order of a first plurality of current document identifiers that include a change of at least one of the first plurality of current document identifiers to a new document identifier; determining a reordered ordinal sequence corresponding to a sorted order of the second plurality of document identifiers; separately reordering, based at least on the reordered ordinal sequence, the first plurality of current document identifiers in the first inverted list and the enrichment data in the first data structure.
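
The reordering described in this abstract can be sketched as computing one sorted ordinal sequence and applying it separately to the inverted list and to its enrichment data. The function name, the `id_changes` mapping, and the use of per-document term frequencies as enrichment data are assumptions for illustration.

```python
def reorder_inverted_list(doc_ids, enrichment, id_changes):
    """Apply document ID changes, then re-sort the list and its enrichment data.

    doc_ids:    current document IDs of one inverted list, in sorted order
    enrichment: one enrichment entry per document ID (same order as doc_ids)
    id_changes: mapping from current document ID to new document ID
    """
    new_ids = [id_changes.get(doc_id, doc_id) for doc_id in doc_ids]
    # Ordinal sequence 0..n-1, reordered so that the new IDs come out sorted.
    reordered_ordinals = sorted(range(len(new_ids)), key=lambda i: new_ids[i])
    # The same ordinal sequence drives both structures, but each is reordered separately.
    reordered_ids = [new_ids[i] for i in reordered_ordinals]
    reordered_enrichment = [enrichment[i] for i in reordered_ordinals]
    return reordered_ids, reordered_enrichment

doc_ids = [2, 5, 9]
term_frequencies = [3, 1, 4]   # hypothetical enrichment data, one entry per document
new_doc_ids, new_frequencies = reorder_inverted_list(doc_ids, term_frequencies, {5: 12})
```

Computing the permutation once and reusing it keeps the enrichment data aligned with the document identifiers without storing them together.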

Adaptive compression optimization for effective pruning

    Publication No.: US20230025952A1

    Publication date: 2023-01-26

    Application No.: US17933903

    Filing date: 2022-09-21

    Applicant: SAP SE

    Abstract: A database management system is described that can encode data to generate a plurality of data vectors. The database management system can perform the encoding by using a dictionary. The database management system can adaptively reorder the plurality of data vectors to prepare for compression of the plurality of data vectors. During a forward pass of the adaptive reordering, most frequent values of a data vector of the plurality of data vectors can be moved up in the data vector. During a backward pass of the adaptive reordering, content within a rest range of a plurality of rest ranges can be rearranged within the plurality of data vectors according to frequencies of the content. The reordering according to frequency can further sort the rest range by value. Related apparatuses, systems, methods, techniques, computer programmable products, computer readable media, and articles are also described.
