Storing compression units in relational tables

    Publication No.: US11520743B2

    Publication date: 2022-12-06

    Application No.: US14079507

    Filing date: 2013-11-13

    Abstract: A database server stores compressed units in data blocks of a database. A table (or data from a plurality of rows thereof) is first compressed into a “compression unit” using any of a wide variety of compression techniques. The compression unit is then stored in one or more data block rows across one or more data blocks. As a result, a single data block row may comprise compressed data for a plurality of table rows, as encoded within the compression unit. Storage of compression units in data blocks maintains compatibility with existing data block-based databases, thus allowing the use of compression units in preexisting databases without modification to the underlying format of the database. The compression units may, for example, co-exist with uncompressed tables. Various techniques allow a database server to optimize access to data in the compression unit, so that the compression is virtually transparent to the user.
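
The storage idea in this abstract can be illustrated with a minimal sketch. It is not the patented implementation: the JSON serialization, the use of zlib, the 64-byte row capacity, and every function name below are assumptions chosen for illustration. The sketch compresses many table rows into one compression unit, then splits that unit across fixed-capacity data block rows, so that a single data block row holds compressed data for several table rows.

```python
import json
import zlib

# Hypothetical capacity of one data block row, in bytes (an assumption).
ROW_PAYLOAD_BYTES = 64


def build_compression_unit(table_rows):
    """Compress a batch of table rows into one opaque byte string."""
    raw = json.dumps(table_rows, separators=(",", ":")).encode("utf-8")
    return zlib.compress(raw)


def store_unit(unit, payload_bytes=ROW_PAYLOAD_BYTES):
    """Split a compression unit into pieces that each fit one data block row."""
    return [unit[i:i + payload_bytes] for i in range(0, len(unit), payload_bytes)]


def load_unit(block_rows):
    """Reassemble the pieces, decompress, and return the original table rows."""
    raw = zlib.decompress(b"".join(block_rows))
    return json.loads(raw)


# 200 logical table rows with repetitive values, which compress well.
table_rows = [[i, f"customer-{i % 5}", "ACTIVE"] for i in range(200)]

unit = build_compression_unit(table_rows)
block_rows = store_unit(unit)
restored = load_unit(block_rows)

print(len(table_rows), "table rows stored in", len(block_rows), "data block rows")
```

Because the pieces are ordinary byte payloads, the block format itself does not need to change, which is the compatibility point the abstract makes.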

    Techniques for maintaining column vectors of relational data within volatile memory
    2.
    Type: Granted patent
    Legal status: In force

    Publication No.: US09201944B2

    Publication date: 2015-12-01

    Application No.: US13916284

    Filing date: 2013-06-12

    CPC classification number: G06F17/30315 G06F9/3887 G06F17/30339 G06F17/30595

    Abstract: Techniques are provided for more efficiently using the bandwidth of the I/O path between a CPU and volatile memory during the performance of database operation. Relational data from a relational table is stored in volatile memory as column vectors, where each column vector contains values for a particular column of the table. A binary-comparable format may be used to represent each value within a column vector, regardless of the data type associated with the column. The column vectors may be compressed and/or encoded while in volatile memory, and decompressed/decoded on-the-fly within the CPU. Alternatively, the CPU may be designed to perform operations directly on the compressed and/or encoded column vector data. In addition, techniques are described that enable the CPU to perform vector processing operations on the column vector values.
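
A binary-comparable column vector can be sketched as follows. This is an illustration under stated assumptions, not the encoding claimed in the patent: the offset-by-2^63 big-endian integer encoding, the UTF-8 text encoding, and all names are hypothetical. The point shown is that once every value is encoded so that bytewise order matches value order, one comparison routine serves columns of any data type.

```python
import struct


def encode_int64(value):
    """Encode a signed 64-bit integer so that bytewise order equals numeric order.

    Adding 2**63 maps the signed range onto the unsigned range; big-endian
    packing then puts the most significant byte first.
    """
    return struct.pack(">Q", value + (1 << 63))


def encode_text(value):
    """UTF-8 bytes compare in the same order as the code points they encode."""
    return value.encode("utf-8")


# A small relational table, given row by row.
rows = [(3, "pear"), (-7, "apple"), (42, "fig"), (0, "kiwi")]

# Store it as one vector per column instead of one record per row.
qty_vector = [encode_int64(qty) for qty, _ in rows]
name_vector = [encode_text(name) for _, name in rows]


def scan_less_than(vector, encoded_bound):
    """Type-agnostic predicate: return positions whose value sorts below the bound."""
    return [i for i, value in enumerate(vector) if value < encoded_bound]


matches_qty = scan_less_than(qty_vector, encode_int64(3))
matches_name = scan_less_than(name_vector, encode_text("g"))

print(matches_qty)   # positions of rows with quantity below 3
print(matches_name)  # positions of rows with name sorting before "g"
```

The scan is written as a plain loop for clarity; the abstract describes the CPU performing such comparisons as vector processing operations over the column vector.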

    Compression analyzer
    4.
    Type: Granted patent
    Legal status: In force

    Publication No.: US09559720B2

    Publication date: 2017-01-31

    Application No.: US13631575

    Filing date: 2012-09-28

    CPC classification number: H03M7/30 G06F17/30595

    Abstract: Techniques are described herein for automatically selecting the compression techniques to be used on tabular data. A compression analyzer gives users high-level control over the selection process without requiring the user to know details about the specific compression techniques that are available to the compression analyzer. Users are able to specify, for a given set of data, a “balance point” along the spectrum between “maximum performance” and “maximum compression”. The point thus selected is used by the compression analyzer in a variety of ways. For example, in one embodiment, the compression analyzer uses the user-specified balance point to determine which of the available compression techniques qualify as “candidate techniques” for the given set of data. The compression analyzer selects the compression technique to use on a set of data by actually testing the candidate compression techniques against samples from the set of data. After testing the candidate compression techniques against the samples, the resulting compression ratios are compared. The compression technique to use on the set of data is then selected based, in part, on the compression ratios achieved during the compression tests performed on the sample data.
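
The selection process described in this abstract can be sketched roughly as below. Everything specific here is an assumption made for illustration: the four standard-library codecs, the hand-assigned cost scores, the 0.0 to 1.0 scale for the balance point, and the function names. The sketch filters the available techniques by the user's balance point, compresses sample data with each remaining candidate, and picks the candidate with the best compression ratio.

```python
import bz2
import lzma
import zlib

# (name, compress function, hypothetical cost score: 0 = fastest, 1 = slowest)
TECHNIQUES = [
    ("zlib-1", lambda data: zlib.compress(data, 1), 0.1),
    ("zlib-9", lambda data: zlib.compress(data, 9), 0.4),
    ("bz2", bz2.compress, 0.7),
    ("lzma", lzma.compress, 1.0),
]


def candidate_techniques(balance_point):
    """Keep techniques whose cost fits the balance point.

    balance_point: 0.0 means maximum performance, 1.0 means maximum compression.
    The cheapest technique always qualifies, so the result is never empty.
    """
    cheapest = min(cost for _, _, cost in TECHNIQUES)
    limit = max(balance_point, cheapest)
    return [t for t in TECHNIQUES if t[2] <= limit]


def choose_technique(samples, balance_point):
    """Test each candidate on the samples and return (name, compression ratio)."""
    raw_size = sum(len(s) for s in samples)
    best_name, best_ratio = None, 0.0
    for name, compress, _cost in candidate_techniques(balance_point):
        packed_size = sum(len(compress(s)) for s in samples)
        ratio = raw_size / packed_size
        if ratio > best_ratio:
            best_name, best_ratio = name, ratio
    return best_name, best_ratio


# Samples drawn from the data set to be compressed (synthetic here).
samples = [(f"row-{i},ACTIVE,2013-11-13\n" * 50).encode("utf-8") for i in range(4)]

fast_choice = choose_technique(samples, balance_point=0.0)
dense_choice = choose_technique(samples, balance_point=1.0)

print("maximum performance:", fast_choice)
print("maximum compression:", dense_choice)
```

The abstract says the final choice is based only in part on the measured ratios; this sketch uses the ratio alone, which is a simplification.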

    Techniques for more efficient usage of memory-to-CPU bandwidth
    5.
    Type: Granted patent
    Legal status: In force

    Publication No.: US08572131B2

    Publication date: 2013-10-29

    Application No.: US13708054

    Filing date: 2012-12-07

    CPC classification number: G06F17/30315 G06F9/3887 G06F17/30339 G06F17/30595

    Abstract: Techniques are provided for more efficiently using the bandwidth of the I/O path between a CPU and volatile memory during the performance of database operation. Relational data from a relational table is stored in volatile memory as column vectors, where each column vector contains values for a particular column of the table. A binary-comparable format may be used to represent each value within a column vector, regardless of the data type associated with the column. The column vectors may be compressed and/or encoded while in volatile memory, and decompressed/decoded on-the-fly within the CPU. Alternatively, the CPU may be designed to perform operations directly on the compressed and/or encoded column vector data. In addition, techniques are described that enable the CPU to perform vector processing operations on the column vector values.

    TECHNIQUES FOR MORE EFFICIENT USAGE OF MEMORY-TO-CPU BANDWIDTH
    6.
    Type: Patent application
    Legal status: In force

    Publication No.: US20130151567A1

    Publication date: 2013-06-13

    Application No.: US13708054

    Filing date: 2012-12-07

    CPC classification number: G06F17/30315 G06F9/3887 G06F17/30339 G06F17/30595

    Abstract: Techniques are provided for more efficiently using the bandwidth of the I/O path between a CPU and volatile memory during the performance of database operation. Relational data from a relational table is stored in volatile memory as column vectors, where each column vector contains values for a particular column of the table. A binary-comparable format may be used to represent each value within a column vector, regardless of the data type associated with the column. The column vectors may be compressed and/or encoded while in volatile memory, and decompressed/decoded on-the-fly within the CPU. Alternatively, the CPU may be designed to perform operations directly on the compressed and/or encoded column vector data. In addition, techniques are described that enable the CPU to perform vector processing operations on the column vector values.

    Compression Analyzer
    7.
    Type: Patent application
    Legal status: In force

    Publication No.: US20130036101A1

    Publication date: 2013-02-07

    Application No.: US13631575

    Filing date: 2012-09-28

    CPC classification number: H03M7/30 G06F17/30595

    Abstract: Techniques are described herein for automatically selecting the compression techniques to be used on tabular data. A compression analyzer gives users high-level control over the selection process without requiring the user to know details about the specific compression techniques that are available to the compression analyzer. Users are able to specify, for a given set of data, a “balance point” along the spectrum between “maximum performance” and “maximum compression”. The point thus selected is used by the compression analyzer in a variety of ways. For example, in one embodiment, the compression analyzer uses the user-specified balance point to determine which of the available compression techniques qualify as “candidate techniques” for the given set of data. The compression analyzer selects the compression technique to use on a set of data by actually testing the candidate compression techniques against samples from the set of data. After testing the candidate compression techniques against the samples, the resulting compression ratios are compared. The compression technique to use on the set of data is then selected based, in part, on the compression ratios achieved during the compression tests performed on the sample data.

    STORING COMPRESSION UNITS IN RELATIONAL TABLES
    9.
    Type: Patent application
    Legal status: Pending (published)

    Publication No.: US20140074805A1

    Publication date: 2014-03-13

    Application No.: US14079507

    Filing date: 2013-11-13

    CPC classification number: G06F16/1744 G06F16/24561 G06F16/902

    Abstract: A database server stores compressed units in data blocks of a database. A table (or data from a plurality of rows thereof) is first compressed into a “compression unit” using any of a wide variety of compression techniques. The compression unit is then stored in one or more data block rows across one or more data blocks. As a result, a single data block row may comprise compressed data for a plurality of table rows, as encoded within the compression unit. Storage of compression units in data blocks maintains compatibility with existing data block-based databases, thus allowing the use of compression units in preexisting databases without modification to the underlying format of the database. The compression units may, for example, co-exist with uncompressed tables. Various techniques allow a database server to optimize access to data in the compression unit, so that the compression is virtually transparent to the user.

    TECHNIQUES FOR MAINTAINING COLUMN VECTORS OF RELATIONAL DATA WITHIN VOLATILE MEMORY
    10.
    Type: Patent application
    Legal status: Pending (published)

    Publication No.: US20130275473A1

    Publication date: 2013-10-17

    Application No.: US13916284

    Filing date: 2013-06-12

    CPC classification number: G06F17/30315 G06F9/3887 G06F17/30339 G06F17/30595

    Abstract: Techniques are provided for more efficiently using the bandwidth of the I/O path between a CPU and volatile memory during the performance of database operation. Relational data from a relational table is stored in volatile memory as column vectors, where each column vector contains values for a particular column of the table. A binary-comparable format may be used to represent each value within a column vector, regardless of the data type associated with the column. The column vectors may be compressed and/or encoded while in volatile memory, and decompressed/decoded on-the-fly within the CPU. Alternatively, the CPU may be designed to perform operations directly on the compressed and/or encoded column vector data. In addition, techniques are described that enable the CPU to perform vector processing operations on the column vector values.
