Systems, methods and computer program products for reducing hash table working-set size for improved latency and scalability in a processing system
    1.
    Granted invention patent
    Legal status: in force

    Publication No.: US09069810B2

    Publication date: 2015-06-30

    Application No.: US13558178

    Filing date: 2012-07-25

    IPC classification: G06F17/30 G06F12/08

    Abstract: System, method and computer program products for storing data by computing a plurality of hash functions of data values in a data item, and determining a corresponding memory location for one of the plurality of hash functions of data values in the data item. Each memory location is of a cacheline size wherein a data item is stored in a memory location. Each memory location can store a plurality of data items. A key portion of all data items is contiguously stored within the memory location, and a payload portion is contiguously stored within the memory location. Payload portions are packed as bit-aligned in a fixed-sized memory location, comprising a bucket in a bucketized hash table, each bucket sized to store multiple key portions and payload portions that are packed as bit-aligned in a fixed-sized bucket. Corresponding key portions are stored as compressed keys in said fixed-sized bucket.
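
A minimal sketch of the bucket layout the abstract describes, under stated assumptions: `SLOTS_PER_BUCKET`, the class names, and the two salted hash functions are hypothetical, and bit-aligned packing and key compression are left out. It shows only that each bucket holds several items, with key portions kept together and payload portions kept together, and that several hash functions select the candidate buckets.

```python
SLOTS_PER_BUCKET = 4  # hypothetical: items that fit in one cacheline-sized bucket


class Bucket:
    def __init__(self):
        self.keys = []      # key portions, stored together
        self.payloads = []  # payload portions, stored together in the same order


class BucketizedTable:
    def __init__(self, num_buckets):
        self.buckets = [Bucket() for _ in range(num_buckets)]

    def _candidates(self, key):
        # Two illustrative hash functions (salting Python's hash is an assumption).
        n = len(self.buckets)
        return [hash((0, key)) % n, hash((1, key)) % n]

    def put(self, key, payload):
        candidates = [self.buckets[i] for i in self._candidates(key)]
        for b in candidates:                 # overwrite an existing key first
            if key in b.keys:
                b.payloads[b.keys.index(key)] = payload
                return True
        for b in candidates:                 # otherwise use the first bucket with room
            if len(b.keys) < SLOTS_PER_BUCKET:
                b.keys.append(key)
                b.payloads.append(payload)
                return True
        return False  # all candidates full; a real table would relocate or resize

    def get(self, key):
        for i in self._candidates(key):
            b = self.buckets[i]
            if key in b.keys:
                return b.payloads[b.keys.index(key)]
        return None
```

A lookup touches at most as many buckets as there are hash functions, which is what keeps the working set small in this kind of design.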


    Predicate pushdown with late materialization in database query processing
    3.
    Granted invention patent
    Legal status: in force

    Publication No.: US08856103B2

    Publication date: 2014-10-07

    Application No.: US13587377

    Filing date: 2012-08-16

    IPC classification: G06F7/00

    CPC classification: G06F17/30315 G06F17/30463

    Abstract: Embodiments of the present invention provide query processing for column stores by accumulating table record attributes during application of query plan operators on a table. The attributes and associated attribute values are compacted when said attribute values are to be consumed for an operation in the query plan, during the execution of the query plan. Table column record values are materialized late in query plan execution.
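
The general idea of late materialization can be sketched as follows. This is an illustration of the technique in its textbook form, not the patented method: the function name `scan_late` and the dict-of-lists column store are assumptions. Predicates are applied column by column and only narrow a list of row ids; column values are fetched into output tuples at the very end.

```python
def scan_late(columns, predicates, project):
    """columns: {name: list of values}; predicates: [(column, test)]; project: output columns."""
    num_rows = len(next(iter(columns.values())))
    row_ids = list(range(num_rows))
    for col, test in predicates:
        values = columns[col]
        # Only the predicate column is read; no tuples are built yet.
        row_ids = [r for r in row_ids if test(values[r])]
    # Materialize the projected columns for the surviving rows only.
    return [tuple(columns[c][r] for c in project) for r in row_ids]
```

Columns that appear in neither a predicate nor the projection are never touched.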


    Adaptive lazy merging
    4.
    Granted invention patent
    Legal status: lapsed

    Publication No.: US08676865B2

    Publication date: 2014-03-18

    Application No.: US12123598

    Filing date: 2008-05-20

    IPC classification: G06F7/00 G06F17/30

    CPC classification: G06F17/30958

    Abstract: A query processing method intersects two or more unsorted lists based on a conjunction of predicates. Each list comprises a union of multiple sorted segments. The method performs lazy segment merging and an adaptive n-ary intersecting process. The lazy segment merging comprises starting with each list being a union of completely unmerged segments, such that lookups into a given list involve separate lookups into each segment of the given list. The method intersects the lists according to the predicates while performing the lazy segment merging, such that the lazy segment merging reads in only those portions of each segment that are needed for the intersecting. As the intersecting proceeds and the lookups are performed, the intersecting selectively merges the segments together, based on a cost-benefit analysis of the cost of merging compared to the benefit produced by reducing a number of lookups.
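
A rough sketch of the cost-benefit idea, with a deliberately simple cost model that is an assumption and not taken from the patent: every lookup into a list of k segments costs k - 1 probes more than a lookup into one merged segment, and the segments are merged once those extra probes add up to the number of elements a merge would have to copy. The names `LazyList` and `intersect` are hypothetical, and the merge here is all-or-nothing rather than selective.

```python
import bisect
import heapq


class LazyList:
    def __init__(self, segments):
        self.segments = [sorted(s) for s in segments]  # a union of sorted segments
        self.extra_probes = 0  # probes a fully merged list would have avoided

    def _maybe_merge(self):
        merge_cost = sum(len(s) for s in self.segments)
        if len(self.segments) > 1 and self.extra_probes >= merge_cost:
            self.segments = [list(heapq.merge(*self.segments))]
            self.extra_probes = 0

    def contains(self, value):
        self._maybe_merge()
        found = False
        for seg in self.segments:  # one binary search per unmerged segment
            i = bisect.bisect_left(seg, value)
            if i < len(seg) and seg[i] == value:
                found = True
        self.extra_probes += len(self.segments) - 1
        return found


def intersect(driver, others):
    """Keep the values of `driver` that occur in every LazyList in `others`."""
    return [v for v in driver if all(o.contains(v) for o in others)]
```

A list that is probed only a few times is never merged, so the merge cost is paid only where the lookups justify it.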


    Avoiding three-valued logic in predicates on dictionary-encoded data
    5.
    Granted invention patent
    Legal status: in force

    Publication No.: US08533179B2

    Publication date: 2013-09-10

    Application No.: US13544583

    Filing date: 2012-07-09

    IPC classification: G06F17/30

    CPC classification: G06F17/30312 H03M7/3088

    Abstract: According to one embodiment of the present invention, a method for dictionary encoding data without using three-valued logic is provided. According to one embodiment of the invention, a method includes encoding data in a database table using a dictionary, wherein the data includes values representing NULLs. A query having a predicate is received and the predicate is evaluated on the encoded data, whereby the predicate is evaluated on both the encoded data and on the encoded NULLs.
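
One way to picture this, as a sketch under assumptions the abstract does not state: NULL is given its own dictionary code (here the highest code, a hypothetical choice), and a range predicate is rewritten into a comparison on codes. The result for every row, including NULL rows, is then a plain true or false, with no third "unknown" value. The names `encode` and `less_than` are illustrative.

```python
import bisect


def encode(values):
    """Order-preserving dictionary encoding; None (NULL) gets a code of its own."""
    distinct = sorted({v for v in values if v is not None})
    null_code = len(distinct)  # assumption: NULL takes the highest code
    code_of = {v: i for i, v in enumerate(distinct)}
    codes = [null_code if v is None else code_of[v] for v in values]
    return distinct, null_code, codes


def less_than(distinct, codes, bound):
    """Evaluate `value < bound` on codes alone, yielding two-valued results."""
    limit = bisect.bisect_left(distinct, bound)  # codes below limit satisfy the predicate
    # The NULL code is len(distinct) >= limit, so NULL rows come out False.
    return [c < limit for c in codes]
```

Treating NULL rows as false matches what a SQL WHERE clause does with an unknown result, which is why the two-valued evaluation gives the same filtered rows in this sketch.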


    Adaptive cell-specific dictionaries for frequency-partitioned multi-dimensional data
    6.
    Granted invention patent
    Legal status: lapsed

    Publication No.: US08442988B2

    Publication date: 2013-05-14

    Application No.: US12939605

    Filing date: 2010-11-04

    IPC classification: G06F17/30

    CPC classification: G06F17/30592

    Abstract: A cell-specific dictionary is applied adaptively to adequate cells, where the cell-specific dictionary subsequently optimizes the handling of frequency-partitioned multi-dimensional data. This includes improved data partitioning with super cells or adjusting resulting cells by sub-dividing very large cells and merging multiple small cells, both of which avoid the highly skewed data distribution in cells and improve the query processing. In addition, more efficient encoding is taught within a cell in case the distinct values that actually appear in that cell are much smaller than the size of the column dictionary.
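
The last sentence of the abstract can be illustrated with a small sketch. The function names and the rule "use a local dictionary only when it saves at least one bit per value" are assumptions; the abstract does not give the actual criterion.

```python
def bits_needed(n):
    """Bits required to distinguish n codes (at least 1)."""
    return max(1, (n - 1).bit_length())


def build_cell_dictionary(cell_codes, column_dict_size):
    """Re-encode a cell's column codes against a smaller, cell-local dictionary.

    Returns (local_dictionary, codes). The local dictionary is None when it
    would not save any bits, in which case the column codes are kept.
    """
    local = sorted(set(cell_codes))
    if bits_needed(len(local)) >= bits_needed(column_dict_size):
        return None, list(cell_codes)
    position = {code: i for i, code in enumerate(local)}
    return local, [position[c] for c in cell_codes]
```

For a column dictionary of 1000 entries (10 bits per code), a cell that contains only three distinct values can be stored with 2 bits per code, and `local[c]` maps each cell code back to its column code.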


    System, method, and apparatus for scan-sharing for business intelligence queries in an in-memory database
    7.
    Granted invention patent
    Legal status: lapsed

    Publication No.: US08352945B2

    Publication date: 2013-01-08

    Application No.: US12539471

    Filing date: 2009-08-11

    IPC classification: G06F7/00 G06F17/30

    CPC classification: G06F17/30445

    Abstract: A computer-implemented method for scan sharing across multiple cores in a business intelligence (BI) query. The method includes receiving a plurality of BI queries, storing a block of data in a first cache, scanning the block of data in the first cache against a first batch of queries on a first processor core, and scanning the block of data against a second batch of queries on a second processor core. The first cache is associated with a first processor core. The block of data includes a subset of data stored in an in-memory database (IMDB). The first batch of queries includes two or more of the BI queries. The second batch of queries includes one or more of the BI queries that are not included in the first batch of queries.
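
A simplified sketch of the batching structure. Python threads stand in for processor cores here, which is an assumption made for illustration only: Python gives no control over cache placement or core affinity. The names `scan_block` and `shared_scan`, and the use of counting queries, are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def scan_block(block, batch):
    """Scan one block once and evaluate every query of the batch on each row."""
    counts = {name: 0 for name in batch}
    for row in block:                      # a single pass over the block ...
        for name, matches in batch.items():  # ... serves all queries in the batch
            if matches(row):
                counts[name] += 1
    return counts


def shared_scan(blocks, batches):
    """Run each batch of queries on its own worker, one block at a time."""
    totals = {name: 0 for batch in batches for name in batch}
    with ThreadPoolExecutor(max_workers=len(batches)) as pool:
        for block in blocks:
            # Every worker scans the same block against a different batch.
            for partial in pool.map(lambda b: scan_block(block, b), batches):
                for name, n in partial.items():
                    totals[name] += n
    return totals
```

The point of the structure is that a block is brought in once and reused by all queries of a batch, instead of being re-read once per query.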


    COMPACT AGGREGATION WORKING AREAS FOR EFFICIENT GROUPING AND AGGREGATION USING MULTI-CORE CPUS
    8.
    Invention patent application
    Legal status: lapsed

    Publication No.: US20120078980A1

    Publication date: 2012-03-29

    Application No.: US12889789

    Filing date: 2010-09-24

    IPC classification: G06F17/30 G06F12/08

    CPC classification: G06F17/30501 G06F17/30489

    Abstract: A system is described for creating compact aggregation working areas for efficient grouping and aggregation using multi-core CPUs. The system implements operations including computing a running aggregate for a group within a business intelligence (BI) query, and identifying a location to store running aggregate information within an aggregation working area of a cache. The aggregation working area includes first and second data structures. The first data structure stores running aggregate information that is associated with a group that is accessed frequently relative to a threshold. The second data structure stores running aggregate information that is associated with a group that is accessed infrequently relative to the threshold. The operations also include storing the running aggregate information in either the first or second data structure of the aggregation working area based on a characterization of the group as a frequently or infrequently accessed group.
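
A minimal sketch of the two-structure working area. How groups are characterized as frequent is an assumption: this example counts groups in a leading sample of the input and compares the count with `threshold`. The function name and parameters are hypothetical, and SUM is used as the running aggregate.

```python
from collections import Counter


def aggregate(rows, sample_size, threshold):
    """rows: list of (group, value). Returns (hot_sums, cold_sums)."""
    sample = Counter(group for group, _ in rows[:sample_size])
    hot_groups = [g for g, n in sample.items() if n >= threshold]
    slot = {g: i for i, g in enumerate(hot_groups)}

    hot = [0] * len(hot_groups)  # first structure: small dense array for frequent groups
    cold = {}                    # second structure: general map for infrequent groups
    for group, value in rows:
        i = slot.get(group)
        if i is not None:
            hot[i] += value
        else:
            cold[group] = cold.get(group, 0) + value
    return dict(zip(hot_groups, hot)), cold
```

Keeping the frequently updated groups in a small, dense structure is what lets that part of the working area stay cache-resident.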


    Applying various hash methods used in conjunction with a query with a group by clause
    9.
    Granted invention patent
    Legal status: lapsed

    Publication No.: US08108401B2

    Publication date: 2012-01-31

    Application No.: US12057979

    Filing date: 2008-03-28

    IPC classification: G06F17/30 G06F7/00

    CPC classification: G06F17/30489

    Abstract: A novel method is described for applying various hash methods used in conjunction with a query with a Group By clause. A plurality of drawers are identified, wherein each of the drawers is made up of a collection of cells from a single partition of a Group By column and each of the drawers being defined for a specific query. A separate hash table is independently computed for each of the drawers and a hashing scheme (picked from among a plurality of hashing schemes) is independently applied for each of the drawers.
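
The per-drawer choice of hashing scheme might look like the sketch below. The two schemes (a direct-addressed array for dense integer keys, a general hash map otherwise) and the density rule are assumptions chosen for illustration; the abstract does not name the schemes or the selection criterion. Each drawer is assumed to be a non-empty list of `(group_key, value)` pairs with integer keys.

```python
def aggregate_drawer(rows):
    """Sum values per group key for one drawer, picking a scheme for that drawer alone."""
    keys = [k for k, _ in rows]
    lo, hi = min(keys), max(keys)
    span = hi - lo + 1
    if span <= 2 * len(set(keys)):
        # Dense key range: index an array directly by (key - lo).
        table = [None] * span
        for k, v in rows:
            table[k - lo] = (table[k - lo] or 0) + v
        return "direct", {lo + i: t for i, t in enumerate(table) if t is not None}
    # Sparse key range: fall back to a general-purpose hash map.
    table = {}
    for k, v in rows:
        table[k] = table.get(k, 0) + v
    return "general", table


def aggregate_all(drawers):
    """Build a separate table per drawer; schemes are chosen independently."""
    return {name: aggregate_drawer(rows) for name, rows in drawers.items()}
```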


    Method for laying out fields in a database in a hybrid of row-wise and column-wise ordering
    10.
    Granted invention patent
    Legal status: in force

    Publication No.: US08099440B2

    Publication date: 2012-01-17

    Application No.: US12192504

    Filing date: 2008-08-15

    IPC classification: G06F7/00

    CPC classification: G06F17/30315 G06F17/30519

    Abstract: A method, system, and article are provided for employment of a hybrid layout of representation of data objects in computer memory. Columns of the database are separated based upon a classification of the columns. A vertical partition in the form of a bank is provided to receive an assignment of one or more data objects identified in the columns. Each bank is sized to be a divisor of a size of an associated hardware register. Assignment of data objects to banks organizes the data in a manner that supports efficient query processing that mitigates the quantity of banks required to respond to the query.
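
A sketch of assigning columns to banks, assuming a 64-bit register, a 32-bit bank width, and a first-fit-decreasing packing of column bit widths. All three are illustrative assumptions: the abstract states only that a bank's size divides the register size and that columns are separated by classification, which this example does not model.

```python
REGISTER_BITS = 64  # hypothetical hardware register width


def assign_banks(column_bits, bank_bits=32):
    """column_bits: {column: width in bits}. Returns a list of banks (lists of columns)."""
    if REGISTER_BITS % bank_bits != 0:
        raise ValueError("bank width must divide the register width")
    banks = []  # each bank tracks its remaining free bits and its columns
    # Place the widest columns first (first-fit decreasing).
    for name, bits in sorted(column_bits.items(), key=lambda kv: -kv[1]):
        if bits > bank_bits:
            raise ValueError(f"column {name!r} does not fit in one bank")
        for bank in banks:
            if bank["free"] >= bits:
                bank["free"] -= bits
                bank["columns"].append(name)
                break
        else:
            banks.append({"free": bank_bits - bits, "columns": [name]})
    return [bank["columns"] for bank in banks]
```

A query that reads only `price` and `qty` in the test data below would touch a single bank, which is the kind of saving the abstract refers to.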
