Efficient data compression and analysis as a service
    61.
    Granted patent
    Efficient data compression and analysis as a service (status: in force)

    Publication No.: US09384204B2

    Publication date: 2016-07-05

    Application No.: US13900350

    Filing date: 2013-05-22

    IPC classification: G06F17/30 H03M7/30

    Abstract: Data may be efficiently analyzed and compressed as part of a data compression service. A data compression request may be received from a client indicating data to be compressed. An analysis of the data or metadata associated with the data may be performed. In at least some embodiments, this analysis may be a rules-based analysis. Some embodiments may apply one or more machine learning techniques to historical compression data to update the rules-based analysis. One or more compression techniques may be selected out of a plurality of compression techniques to be applied to the data. Data compression candidates may then be generated according to the selected compression techniques. In some embodiments, a compression service restriction may be enforced. One of the data compression candidates may be selected and sent in a response.
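    Illustrative sketch (not taken from the patent): the request flow in this abstract can be approximated as "analyze, pick techniques, generate candidates, return the best one". The codec registry, the rules in `select_techniques`, the `latency_sensitive` metadata key, and the `max_candidates` restriction are all assumptions made for the example; the abstract names none of them.

```python
import bz2
import lzma
import zlib

# Hypothetical registry: name -> (compress, decompress). The abstract does not
# name specific codecs; these are standard-library stand-ins.
TECHNIQUES = {
    "zlib": (zlib.compress, zlib.decompress),
    "bz2": (bz2.compress, bz2.decompress),
    "lzma": (lzma.compress, lzma.decompress),
}


def select_techniques(data, metadata):
    """Toy rules-based analysis that decides which techniques to try."""
    if metadata.get("latency_sensitive"):
        return ["zlib"]              # assumed rule: prefer the fastest codec
    if len(data) < 1024:
        return ["zlib", "bz2"]       # assumed rule: skip lzma for small inputs
    return list(TECHNIQUES)


def compress_request(data, metadata, max_candidates=3):
    """Generate one candidate per selected technique and return the smallest."""
    # Assumed service restriction: cap how many candidates are generated.
    names = select_techniques(data, metadata)[:max_candidates]
    candidates = {name: TECHNIQUES[name][0](data) for name in names}
    best = min(candidates, key=lambda name: len(candidates[name]))
    return best, candidates[best]


if __name__ == "__main__":
    payload = b"hello world " * 200
    name, compressed = compress_request(payload, {})
    print(name, len(payload), "->", len(compressed))
```

    A learned component, as the abstract suggests, could replace the hard-coded rules in `select_techniques` with rules derived from past compression results; that part is not sketched here.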

    Variable Bit-Length Reiterative Lossless Compression System and Method
    62.
    Patent application
    Variable Bit-Length Reiterative Lossless Compression System and Method (status: under examination, published)

    Publication No.: US20150280739A1

    Publication date: 2015-10-01

    Application No.: US14502443

    Filing date: 2014-09-30

    Inventor: Sidney Dunayer

    IPC classification: H03M7/40

    Abstract: A computer-implemented method of performing lossless compression of a digital data set uses an iterative compression process in which the number of symbols N and bit length per symbol n may vary on successive iterations. The process includes analyzing at least a part of the data set to establish a partition thereof into N symbols of symbol length n, and to determine whether the N symbols can be further compressed, and, if so, a model to be used in encoding the N symbols.
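    Illustrative sketch (not the patented method): one way to picture the analysis step is to try several symbol lengths n, partition the bit string into N symbols of that length, and use the empirical entropy as a rough estimate of whether the symbols can be compressed further. The candidate lengths, the entropy estimate, and the decision to ignore the cost of storing a model are assumptions of this example. Only a single iteration is shown.

```python
import math
from collections import Counter


def partition(bits, n):
    """Split a bit string into symbols of n bits, dropping any short tail."""
    usable = len(bits) - len(bits) % n
    return [bits[i:i + n] for i in range(0, usable, n)]


def entropy_bits(symbols):
    """Total empirical entropy of the symbol sequence, in bits."""
    counts = Counter(symbols)
    total = len(symbols)
    return -sum(c * math.log2(c / total) for c in counts.values())


def best_symbol_length(bits, candidates=(1, 2, 4, 8)):
    """Return (n, estimated_bits, compressible) for the best candidate n."""
    best = None
    for n in candidates:
        symbols = partition(bits, n)
        if not symbols:
            continue
        # Leftover tail bits are assumed to be stored uncompressed.
        estimate = entropy_bits(symbols) + len(bits) % n
        if best is None or estimate < best[1]:
            best = (n, estimate)
    if best is None:
        raise ValueError("input is shorter than every candidate symbol length")
    n, estimate = best
    return n, estimate, estimate < len(bits)


if __name__ == "__main__":
    n, estimate, compressible = best_symbol_length("01" * 64)
    print(n, estimate, compressible)
```

    In a reiterative scheme, the encoded output of one pass would be fed back into the same analysis, possibly with a different n, until no candidate is estimated to help.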

    Variable length coding and decoding using counters
    63.
    Granted patent
    Variable length coding and decoding using counters (status: in force)

    Publication No.: US09088296B2

    Publication date: 2015-07-21

    Application No.: US13339913

    Filing date: 2011-12-29

    Applicants: Bin Li, Jizheng Xu

    Inventors: Bin Li, Jizheng Xu

    Abstract: Disclosed herein are representative embodiments for performing entropy coding or decoding using a counter-based scheme. In one exemplary embodiment disclosed herein, a first codeword is received from compressed digital media data. The first codeword is decoded into a first digital media data value by referencing a codeword table that associates the first codeword with the first digital media data value and a second codeword with a second digital media data value. A first counter for counting occurrences of the first digital media data value is incremented. The value of the first counter is compared with the value of a second counter that counts occurrences of the second digital media data value. If the value of the first counter and the value of the second counter are equal (or greater than or equal), the codeword table is updated to swap codewords between the first and second digital media data values.
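    Illustrative sketch (assumptions, not the patent's implementation): the counter-and-swap idea can be shown with a table that ranks symbols, where rank i maps to a unary codeword of i ones followed by a zero. After each symbol, its counter is incremented and compared with the counter of the symbol holding the next-shorter codeword; the two swap when the first counter is greater than or equal to the second. The unary code, the class name `AdaptiveTable`, and the choice of the "greater than or equal" variant are assumptions for the example.

```python
class AdaptiveTable:
    """Codeword table whose ranks adapt to symbol counts."""

    def __init__(self, alphabet):
        self.order = list(alphabet)               # rank -> symbol
        self.count = {s: 0 for s in self.order}   # one counter per symbol

    def update(self, symbol):
        """Increment the symbol's counter and swap with its neighbor if due."""
        self.count[symbol] += 1
        rank = self.order.index(symbol)
        if rank > 0:
            neighbor = self.order[rank - 1]
            if self.count[symbol] >= self.count[neighbor]:
                self.order[rank - 1], self.order[rank] = symbol, neighbor


def codeword(rank):
    """Unary prefix code: shorter codewords for lower ranks."""
    return "1" * rank + "0"


def encode(symbols, alphabet):
    table = AdaptiveTable(alphabet)
    bits = []
    for s in symbols:
        bits.append(codeword(table.order.index(s)))
        table.update(s)   # the encoder mirrors the decoder's table updates
    return "".join(bits)


def decode(bits, alphabet):
    table = AdaptiveTable(alphabet)
    out, rank = [], 0
    for b in bits:
        if b == "1":
            rank += 1
        else:
            s = table.order[rank]
            out.append(s)
            table.update(s)
            rank = 0
    return out


if __name__ == "__main__":
    message = list("ccccabcc")
    bits = encode(message, "abc")
    print(bits, decode(bits, "abc") == message)
```

    Because encoder and decoder apply the same update after every symbol, no table needs to be transmitted; frequent symbols migrate toward the short codewords on their own.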

    Data record compression with progressive and/or selective decomposition
    64.
    Granted patent
    Data record compression with progressive and/or selective decomposition (status: in force)

    Publication No.: US09025892B1

    Publication date: 2015-05-05

    Application No.: US14557900

    Filing date: 2014-12-02

    Applicant: QBASE, LLC

    Abstract: Disclosed herein are systems and methods for compressing structured or semi-structured data in a horizontal manner achieving compression ratios similar to vertical compression. Collections of structured or semi-structured data include a number of fields and are described using a schema. Fields include information having semantic similarity and are compressed using methods suitable for compressing the type of data. Data of a collection is compressed after fragmentation or may be normalized prior to compression. Data with semantic similarity is compressed using token tables and/or n-gram tables, where values with a higher weight, the weight being the product of frequency and length, may be stored in the lower numbered indices of the data table. Records include record descriptor bytes, field descriptor bytes, zero or more array descriptor bytes, zero or more object descriptor bytes, or bytes representing the data associated with the record. Data is indexed or compressed by a suitable module.
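    Illustrative sketch (not the patent's implementation): the weighting rule in this abstract, frequency times length, can be shown with a small token table in which heavier values receive lower indices. The function names and the alphabetical tie-break are assumptions for the example.

```python
from collections import Counter


def build_token_table(values):
    """Order distinct values by weight = frequency * length, heaviest first."""
    freq = Counter(values)
    # Ties are broken alphabetically so the table is deterministic (assumption).
    return sorted(freq, key=lambda v: (-freq[v] * len(v), v))


def compress_field(values, table):
    """Replace each field value with its index in the token table."""
    index = {token: i for i, token in enumerate(table)}
    return [index[v] for v in values]


def decompress_field(indices, table):
    """Look each index up in the token table to restore the values."""
    return [table[i] for i in indices]


if __name__ == "__main__":
    cities = ["Springfield", "NY", "NY", "NY", "Springfield", "LA"]
    table = build_token_table(cities)
    print(table, compress_field(cities, table))
```

    Low indices matter when the indices themselves are later stored with a variable-length integer encoding, since the values that save the most bytes then get the shortest index representation.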

    Real-time multi-block lossless recompression
    65.
    Granted patent
    Real-time multi-block lossless recompression (status: in force)

    Publication No.: US08898337B2

    Publication date: 2014-11-25

    Application No.: US13282991

    Filing date: 2011-10-27

    Abstract: Exemplary methods, computer systems, and computer program products for processing a previously compressed data stream in a computer environment are provided. In one embodiment, the computer environment is configured for separating a previously compressed data stream into an input data block including a header input block having a previously compressed header. Sequences of bits are included with the input data block. Compression scheme information is derived from the previously compressed header. The input data block is accessed and recompressed following the header input block in the previously compressed data stream one at a time using block-image synchronization information. Access to the block-image synchronization information is initialized by the compression scheme information to generate an output data block. The block-image synchronization information is used to provide decompression information to facilitate decompression of the results of the output data block.

    Variable bit-length reiterative lossless compression system and method
    66.
    Granted patent
    Variable bit-length reiterative lossless compression system and method (status: in force)

    Publication No.: US08878705B1

    Publication date: 2014-11-04

    Application No.: US14229515

    Filing date: 2014-03-28

    Inventor: Sidney Dunayer

    IPC classification: H03M7/40

    Abstract: A computer-implemented method of performing lossless compression of a digital data set uses an iterative compression process in which the number of symbols N and bit length per symbol n may vary on successive iterations. The process includes analyzing at least a part of the data set to establish a partition thereof into N symbols of symbol length n, and to determine whether the N symbols can be further compressed, and, if so, a model to be used in encoding the N symbols.

    Data processing apparatus and method
    67.
    Granted patent
    Data processing apparatus and method (status: in force)

    Publication No.: US08854239B2

    Publication date: 2014-10-07

    Application No.: US13769340

    Filing date: 2013-02-17

    IPC classification: H03M7/30

    CPC classification: H03M7/30 H03M7/6088

    Abstract: A data processing apparatus and a data processing method thereof are provided. The data processing apparatus includes a register and a processor electrically connected to the register. The register stores a plurality of data. Each of the plurality of data includes a first sub-datum and a second sub-datum. The plurality of first sub-data corresponds to a first column and the plurality of second sub-data corresponds to a second column. The processor compresses the first sub-data by a first compression algorithm according to a first characteristic of the plurality of first sub-data and compresses the second sub-data by a second compression algorithm according to a second characteristic of the plurality of second sub-data.
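    Illustrative sketch (not the patented apparatus): the per-column idea can be shown by splitting records into columns and choosing an algorithm from a simple characteristic of each column. The two algorithms (run-length and delta encoding), the run-count heuristic in `pick_algorithm`, and all function names are assumptions for the example; the abstract does not specify the characteristics or algorithms. Columns are assumed to be non-empty.

```python
def rle_encode(column):
    """Run-length encode a column as (value, run_length) pairs."""
    runs = []
    for value in column:
        if runs and runs[-1][0] == value:
            runs[-1][1] += 1
        else:
            runs.append([value, 1])
    return [tuple(run) for run in runs]


def delta_encode(column):
    """Store the first value, then the difference to each previous value."""
    return [column[0]] + [b - a for a, b in zip(column, column[1:])]


def pick_algorithm(column):
    """Choose an algorithm from a simple characteristic of the column."""
    runs = 1 + sum(a != b for a, b in zip(column, column[1:]))
    if runs <= len(column) // 2:
        return "rle"      # many repeated neighbors
    if all(isinstance(v, int) for v in column):
        return "delta"    # numeric column with few repeats
    return "raw"


def compress_columns(records):
    """Split records into columns and compress each with its own algorithm."""
    encoders = {"rle": rle_encode, "delta": delta_encode, "raw": list}
    result = []
    for column in zip(*records):
        column = list(column)
        algorithm = pick_algorithm(column)
        result.append((algorithm, encoders[algorithm](column)))
    return result


if __name__ == "__main__":
    rows = [("A", 100), ("A", 101), ("A", 103), ("B", 104)]
    print(compress_columns(rows))
```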

    DATA PROCESSING APPARATUS AND METHOD
    68.
    Patent application
    DATA PROCESSING APPARATUS AND METHOD (status: in force)

    Publication No.: US20140145866A1

    Publication date: 2014-05-29

    Application No.: US13769340

    Filing date: 2013-02-17

    IPC classification: H03M7/30

    CPC classification: H03M7/30 H03M7/6088

    Abstract: A data processing apparatus and a data processing method thereof are provided. The data processing apparatus includes a register and a processor electrically connected to the register. The register stores a plurality of data. Each of the plurality of data includes a first sub-datum and a second sub-datum. The plurality of first sub-data corresponds to a first column and the plurality of second sub-data corresponds to a second column. The processor compresses the first sub-data by a first compression algorithm according to a first characteristic of the plurality of first sub-data and compresses the second sub-data by a second compression algorithm according to a second characteristic of the plurality of second sub-data.

    Method and apparatus for compressing and decompressing block unit data
    69.
    Granted patent
    Method and apparatus for compressing and decompressing block unit data (status: in force)

    Publication No.: US08593312B2

    Publication date: 2013-11-26

    Application No.: US13380751

    Filing date: 2010-08-25

    Applicant: Yun-Sik Oh

    Inventor: Yun-Sik Oh

    IPC classification: H03M7/30

    CPC classification: H03M7/3091 H03M7/6088

    Abstract: An apparatus for compressing and decompressing data is disclosed. The apparatus for compressing data includes a block setting unit that divides data of at least one original file into two or more blocks, a compression unit that generates block compression data by applying a compression algorithm to data corresponding to at least one block among blocks divided by the block setting unit, and a compression file generation unit that generates a block header and the block body of the block for each block divided by the block setting unit, in which the block body includes the block compression data if the block is compressed by the compression unit or includes the original data of the block if the block is not compressed by the compression unit.
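    Illustrative sketch (not the patent's file format): the header-plus-body layout can be shown with a container in which each block carries a small header saying whether its body is compressed or holds the original bytes. The 5-byte header layout (a flag byte and a 4-byte length), the use of zlib, and the "keep raw if compression does not shrink the block" rule are assumptions for the example.

```python
import struct
import zlib

# Assumed block header: flag (1 = compressed, 0 = raw) and body length.
HEADER = struct.Struct(">BI")


def pack_blocks(data, block_size=4096):
    """Split data into blocks and write a header plus body for each block."""
    out = bytearray()
    for start in range(0, len(data), block_size):
        block = data[start:start + block_size]
        packed = zlib.compress(block)
        if len(packed) < len(block):
            out += HEADER.pack(1, len(packed)) + packed
        else:
            # Compression did not help, so the body keeps the original data.
            out += HEADER.pack(0, len(block)) + block
    return bytes(out)


def unpack_blocks(blob):
    """Read each header, then decompress or copy the body it describes."""
    out = bytearray()
    pos = 0
    while pos < len(blob):
        flag, length = HEADER.unpack_from(blob, pos)
        pos += HEADER.size
        body = blob[pos:pos + length]
        pos += length
        out += zlib.decompress(body) if flag else body
    return bytes(out)


if __name__ == "__main__":
    original = b"a" * 256 + bytes(range(256))
    blob = pack_blocks(original, block_size=256)
    print(len(original), "->", len(blob), unpack_blocks(blob) == original)
```

    Storing the flag per block means a reader can seek to and decode individual blocks, and already-compressed or high-entropy regions never grow by more than the header size.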
