Method and system for creating an in-memory physical dictionary for data compression
    42.
    Granted patent
    Status: Active

    Publication No.: US07973680B2

    Publication date: 2011-07-05

    Application No.: US12172557

    Application date: 2008-07-14

    IPC classification: H03M7/34

    CPC classification: H03M7/3088

    Abstract: A system and computer readable storage medium for creating an in-memory physical dictionary for data compression are provided. A new heuristic is defined for converting each of a plurality of logical nodes into a corresponding physical node, forming a plurality of physical nodes. Each of the physical nodes is placed into the physical dictionary while traversing the dictionary tree in descending visit count order. Each physical node is placed in the cache-line of its nearest ascendant that has sufficient space. If there is no space in any of the ascendants' cache-lines, then the physical node is placed into a new cache-line, unless a pre-defined packing threshold has been reached, in which case the physical node is placed in the first available cache-line.
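
The placement heuristic in this abstract can be sketched roughly as follows. This is an illustrative reading, not the patented implementation: the names `Node` and `build_physical_dictionary`, the fixed `slots_per_line` capacity, and the interpretation of the packing threshold as a cap on the number of cache-lines (`max_new_lines`) are all assumptions. The sketch also assumes that "descending visit count order" means a best-first traversal in which a node becomes eligible only after its parent has been placed.

```python
import heapq
from dataclasses import dataclass, field
from itertools import count
from typing import List, Optional


@dataclass
class Node:
    """Hypothetical logical dictionary node with a visit count."""
    name: str
    visits: int
    children: List["Node"] = field(default_factory=list)
    parent: Optional["Node"] = None
    line: Optional[int] = None  # cache-line index, set once the node is placed

    def add(self, child: "Node") -> "Node":
        child.parent = self
        self.children.append(child)
        return child


def build_physical_dictionary(root, slots_per_line=4, max_new_lines=8):
    """Place nodes into cache-lines; returns a list of lines of node names."""
    lines = []                      # each cache-line is a list of node names
    tie = count()                   # tie-breaker so the heap never compares Nodes
    heap = [(-root.visits, next(tie), root)]
    while heap:
        _, _, node = heapq.heappop(heap)   # most-visited eligible node first
        target = None
        # 1. Nearest ascendant whose cache-line still has room.
        anc = node.parent
        while anc is not None:
            if len(lines[anc.line]) < slots_per_line:
                target = anc.line
                break
            anc = anc.parent
        # 2. Threshold reached (assumed: line budget used up): first-fit.
        if target is None and len(lines) >= max_new_lines:
            target = next(
                (i for i, l in enumerate(lines) if len(l) < slots_per_line),
                None,
            )
        # 3. Otherwise (or if every line is full) open a new cache-line.
        if target is None:
            lines.append([])
            target = len(lines) - 1
        lines[target].append(node.name)
        node.line = target
        for child in node.children:
            heapq.heappush(heap, (-child.visits, next(tie), child))
    return lines
```

Keeping a frequently visited child in the same cache-line as its ascendant means a lookup that walks from the root toward that child tends to touch fewer lines, which is presumably the point of ordering placement by visit count.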

    METHOD AND SYSTEM FOR CREATING AN IN-MEMORY PHYSICAL DICTIONARY FOR DATA COMPRESSION
    43.
    Patent application
    Status: Active

    Publication No.: US20080275897A1

    Publication date: 2008-11-06

    Application No.: US12172557

    Application date: 2008-07-14

    IPC classification: G06F17/30

    CPC classification: H03M7/3088

    Abstract: Some aspects of the invention provide methods, systems, and computer program products for creating an in-memory physical dictionary for data compression. To that end, in accordance with aspects of the present invention, a new heuristic is defined for converting each of a plurality of logical nodes into a corresponding physical node, forming a plurality of physical nodes; each of the physical nodes is then placed into the physical dictionary while traversing the dictionary tree in descending visit count order. Each physical node is placed in the cache-line of its nearest ascendant that has sufficient space. If there is no space in any of the ascendants' cache-lines, then the physical node is placed into a new cache-line, unless a pre-defined packing threshold has been reached, in which case the physical node is placed in the first available cache-line.

    Method of compressing data with an alphabet
    44.
    Granted patent
    Status: Expired

    Publication No.: US06262675B1

    Publication date: 2001-07-17

    Application No.: US09471102

    Application date: 1999-12-21

    IPC classification: H03M7/34

    CPC classification: H03M7/3086

    Abstract: An improved LZ77 data compression and decompression method, known as Le′Z99, uses an embedded alphabet to optimize code space and speed in the compressed data.

    Dimension reduction using association rules for data mining application
    45.
    Granted patent
    Status: Expired

    Publication No.: US6134555A

    Publication date: 2000-10-17

    Application No.: US20438

    Application date: 1998-02-09

    IPC classification: G06F17/30

    Abstract: A method, apparatus, and article of manufacture for a computer-implemented random reliability engine for computer-implemented association rule reduction using association rules for data mining application. The data mining is performed by the computer to retrieve data from a data store stored on a data storage device coupled to the computer. The data store has records that have multiple attributes. Attribute value associations are determined between attributes and their values. Attribute associations are determined from the determined attribute value associations. Attributes are selected based on the determined attribute associations for performing data mining.

Enumerating projection in SQL queries containing outer and full outer joins in the presence of inner joins
    46.
    Granted patent
    Status: Expired

    Publication No.: US6088691A

    Publication date: 2000-07-11

    Application No.: US198643

    Application date: 1998-11-24

    IPC classification: G06F17/30

    Abstract: The present invention discloses a method and apparatus for the enumeration of projections (i.e., "SELECT DISTINCT" operations) in SQL queries containing outer and full outer joins in the presence of inner joins without encountering any regression in performance. The present invention removes projections from a given user query by moving the projections to the top of an expression tree representation of the query, wherein the projection removal is performed using algebraic identities rather than rule-based transformations. The present invention also discloses several methods of enumerating different plans or schedules for projection operations and binary operations in the given user query. The present invention can significantly reduce the execution time of a query by selecting the optimal schedule for binary operations and projections between the binary operations. However, the present invention ensures that there is no regression in performance by comparing the cost of the query with the cost of enumerated plans or schedules, thereby ensuring that the optimizations or transformations do not introduce performance penalties.

Reordering of complex SQL queries involving groupbys, joins, outer joins and full outer joins

    Publication No.: US5864847A

    Publication date: 1999-01-26

    Application No.: US902975

    Application date: 1997-07-30

    IPC classification: G06F17/30

    Abstract: A method, apparatus, and article of manufacture for query simplification by applying generalized inference propagation and generalized transitive closure in SQL queries having selection, projection, join, outer join, and intersection operations. The disclosed transformations and enumeration method unify and solve the problems of 1) unnesting join aggregate queries, and 2) complete enumeration of queries containing outer joins, when the outer join predicate references an aggregated value, or the predicate references more than two base relations in a query subtree. The system first eliminates redundant sub-expressions and modifies expensive binary operations to inexpensive binary operations, then converts complex predicates to simple predicates by application of a generalized selection (GS) operator.

Method and apparatus for generating dynamic and hybrid sparse indices for workfiles used in SQL queries
    48.
    Granted patent
    Status: Expired

    Publication No.: US5758145A

    Publication date: 1998-05-26

    Application No.: US393803

    Application date: 1995-02-24

    IPC classification: G06F17/30

    Abstract: A method, apparatus and article of manufacture for generating static, dynamic and hybrid sparse indices for use with workfiles used by SQL queries in a relational database management system. A workfile and a sparse index structure are temporarily created in the computer during execution of the query by the computer. The workfile stores intermediate relations resulting from execution of a portion of the SQL query, wherein the intermediate relations comprise sorted rows for an inner table referenced in the SQL query. The sparse index structure contains one or more entries indicating at least an approximate location for at least some of the rows in the workfile. As one or more rows from an outer table referenced in the SQL query are retrieved, the entries of the sparse index structure are searched for a closest matching entry for each retrieved row. The sorted workfile is then scanned for a row matching each retrieved row using the closest matching entry from the sparse index structure as a starting position. The sparse index structure is then updated with an entry corresponding to the row from the sorted workfile matching the retrieved row from the outer table.
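
The probe-scan-update cycle described in this abstract might look something like the sketch below. It is only an illustration under stated assumptions: the workfile is modeled as an in-memory sorted list of join keys, the class name `SparseIndexedWorkfile`, the `stride` parameter, and the `(position, rows_scanned)` return value are hypothetical, and the "hybrid" index is taken to mean a static part (every `stride`-th row) plus dynamic entries added after each successful probe.

```python
import bisect


class SparseIndexedWorkfile:
    """Sorted workfile of inner-table keys with a hybrid sparse index."""

    def __init__(self, sorted_keys, stride=4):
        self.rows = list(sorted_keys)  # assumed already sorted
        # Static part of the index: one entry for every stride-th row.
        self.index_pos = list(range(0, len(self.rows), stride))
        self.index_keys = [self.rows[p] for p in self.index_pos]

    def probe(self, key):
        """Look up an outer-table key.

        Returns (position, rows_scanned); position is None when no
        workfile row matches the key.
        """
        # Closest index entry whose key is <= the probe key.
        i = bisect.bisect_right(self.index_keys, key) - 1
        pos = self.index_pos[i] if i >= 0 else 0
        scanned = 0
        # Scan forward through the sorted workfile from that starting point.
        while pos < len(self.rows) and self.rows[pos] < key:
            pos += 1
            scanned += 1
        if pos < len(self.rows) and self.rows[pos] == key:
            self._remember(key, pos)   # dynamic part of the index
            return pos, scanned
        return None, scanned

    def _remember(self, key, pos):
        """Add an index entry for a matched row unless one already exists."""
        j = bisect.bisect_left(self.index_keys, key)
        if j == len(self.index_keys) or self.index_keys[j] != key:
            self.index_keys.insert(j, key)
            self.index_pos.insert(j, pos)
```

With rows `0, 2, ..., 98` and `stride=10`, a first probe for key 36 starts at the static entry for key 20 and scans 8 rows; a repeated probe for 36 starts at the newly added dynamic entry and scans none.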

Reordering of complex SQL queries involving GROUPBYs, joins, outer joins and full outer joins

    Publication No.: US5713015A

    Publication date: 1998-01-27

    Application No.: US655300

    Application date: 1996-05-30

    IPC classification: G06F17/30

    Abstract: A method, apparatus, and article of manufacture for query simplification by applying generalized inference propagation and generalized transitive closure in SQL queries having selection, projection, join, outer join, and intersection operations. The disclosed transformations and enumeration method unify and solve the problems of 1) unnesting join aggregate queries, and 2) complete enumeration of queries containing outer joins, when the outer join predicate references an aggregated value, or the predicate references more than two base relations in a query subtree. The system first eliminates redundant sub-expressions and modifies expensive binary operations to inexpensive binary operations, then converts complex predicates to simple predicates by application of a generalized selection (GS) operator.