SYSTEM AND METHOD FOR DATA COMPACTION UTILIZING MISMATCH PROBABILITY ESTIMATION

    Publication No.: US20220391099A1

    Publication Date: 2022-12-08

    Application No.: US17884470

    Application Date: 2022-08-09

    Abstract: A system and method for encoding data utilizing mismatch probability estimates. A training data set can be statistically analyzed to calculate a mismatch probability estimate, which is the estimated frequency at which a data packet received during system runtime is not part of the training data set (i.e., is a mismatch). A plurality of tokens may be created, based on the mismatch probability estimate, to represent potential mismatched data that may be encountered during runtime, and an entropy encoder may generate codewords for the tokens using the mismatch probability estimate. An opcode, indicating a mismatch, may be generated and appended to the generated codewords to form a mismatch codeword. During runtime when a mismatch occurs, the system can retrieve a mismatch codeword and assign it to the mismatched data, making the encoder system robust against previously unencountered data.
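The idea in this abstract can be illustrated with a minimal Python sketch. It is not the patented method: the abstract does not say how the estimate is computed, so the sketch assumes a Good-Turing style estimate (the share of training blocks seen exactly once), assumes a Huffman code as the entropy encoder, and collapses the "plurality of tokens" into a single hypothetical `MISMATCH` token whose codeword is followed by the raw 8-bit block. All function names are illustrative.

```python
import heapq
from collections import Counter
from itertools import count

MISMATCH = object()  # hypothetical sentinel standing in for the mismatch token


def estimate_mismatch_probability(blocks):
    """Assumed estimator: fraction of training blocks that occur exactly once."""
    counts = Counter(blocks)
    singletons = sum(1 for c in counts.values() if c == 1)
    return counts, singletons / len(blocks)


def build_codebook(counts, p_mismatch):
    """Huffman codebook over the known blocks plus one mismatch token."""
    total = sum(counts.values())
    weights = {s: (1 - p_mismatch) * c / total for s, c in counts.items()}
    weights[MISMATCH] = max(p_mismatch, 1e-9)  # keep the token encodable
    tie = count()  # unique tie-breaker so the dicts are never compared
    heap = [(w, next(tie), {s: ""}) for s, w in weights.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        w1, _, c1 = heapq.heappop(heap)
        w2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + code for s, code in c1.items()}
        merged.update({s: "1" + code for s, code in c2.items()})
        heapq.heappush(heap, (w1 + w2, next(tie), merged))
    return heap[0][2]


def encode(blocks, codebook):
    """Known blocks get their codeword; unknown ones get MISMATCH + raw byte."""
    out = []
    for b in blocks:
        if b in codebook:
            out.append(codebook[b])
        else:
            out.append(codebook[MISMATCH] + format(b, "08b"))
    return "".join(out)


def decode(bits, codebook):
    """Invert encode(); relies on the codebook being a prefix code."""
    inverse = {code: sym for sym, code in codebook.items()}
    out, current, i = [], "", 0
    while i < len(bits):
        current += bits[i]
        i += 1
        if current in inverse:
            sym = inverse[current]
            if sym is MISMATCH:
                out.append(int(bits[i:i + 8], 2))  # raw byte follows the token
                i += 8
            else:
                out.append(sym)
            current = ""
    return out


training = list(b"abracadabra abracadabra")
counts, p = estimate_mismatch_probability(training)
codebook = build_codebook(counts, p)
runtime = list(b"abxz")  # 'x' and 'z' never appeared in training
assert decode(encode(runtime, codebook), codebook) == runtime
```

Because the mismatch token always has a codeword, blocks that never appeared in the training data still encode and decode without error, at the cost of one extra codeword per unseen block.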

    SYSTEM AND METHOD FOR DATA COMPACTION AND SECURITY USING MULTIPLE ENCODING ALGORITHMS

    Publication No.: US20220382458A1

    Publication Date: 2022-12-01

    Application No.: US17727913

    Application Date: 2022-04-25

    Abstract: A system and method for encoding data using a plurality of encoding libraries. Portions of the data are encoded by different encoding libraries, depending on which library provides the greatest compaction for a given portion of the data. This methodology not only provides substantial improvements in data compaction over use of a single data compaction algorithm with the highest average compaction, but provides substantial additional security in that multiple decoding libraries must be used to decode the data. In some embodiments, each portion of data may further be encoded using different sourceblock sizes, providing further security enhancements as decoding requires multiple decoding libraries and knowledge of the sourceblock size used for each portion of the data. In some embodiments, encoding libraries may be randomly or pseudo-randomly rotated to provide additional security.
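A rough Python sketch of per-portion library selection follows. It swaps in the standard-library compressors `zlib`, `bz2`, and `lzma` as stand-ins for the patent's encoding libraries, which is an assumption made purely for illustration; the portion size, the integer library IDs, and the function names are likewise hypothetical.

```python
import bz2
import lzma
import zlib

# Stand-in "encoding libraries": id -> (encode, decode). These general-purpose
# compressors are an assumption, not the libraries described in the abstract.
CODECS = {
    0: (zlib.compress, zlib.decompress),
    1: (bz2.compress, bz2.decompress),
    2: (lzma.compress, lzma.decompress),
}


def encode_portions(data: bytes, portion_size: int = 4096):
    """Encode each portion with whichever codec yields the smallest output."""
    encoded = []
    for start in range(0, len(data), portion_size):
        portion = data[start:start + portion_size]
        candidates = ((cid, enc(portion)) for cid, (enc, _) in CODECS.items())
        best_id, best_payload = min(candidates, key=lambda pair: len(pair[1]))
        encoded.append((best_id, best_payload))
    return encoded


def decode_portions(encoded):
    """Decoding needs every codec that was chosen during encoding."""
    return b"".join(CODECS[cid][1](payload) for cid, payload in encoded)


data = b"ACGT" * 5000 + bytes(range(256)) * 20
assert decode_portions(encode_portions(data)) == data
```

The decoder must hold every library that the encoder selected, which is the property the abstract relies on for its security claim. Varying `portion_size` per portion would add the second unknown the abstract mentions.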

    SYSTEM AND METHOD FOR RANDOM-ACCESS MANIPULATION OF COMPACTED DATA FILES

    Publication No.: US20220335014A1

    Publication Date: 2022-10-20

    Application No.: US17734052

    Application Date: 2022-04-30

    Abstract: A system and method for random-access manipulation of compacted data files, utilizing a reference codebook, a random-access engine, and a data deconstruction engine. The system may receive a data query pertaining to a data read or data write request, wherein the data file to be read from or written to is a compacted data file. A random-access engine may facilitate data manipulation processes by accessing a reference codebook associated with the compacted data file, a frequency table used to construct the reference codebook, and data query details. A data read request is supported by random-access search capabilities that may enable the locating and decoding of the bits corresponding to data query details. The random-access engine also facilitates data write processes: it may encode the data to be written, insert the encoded data into a compacted data file, and update the codebook as needed.
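The read and write paths can be sketched in Python under a strong simplifying assumption: every sourceblock has a fixed size and is stored as an index into the codebook, so the position of any block is found by arithmetic rather than by decoding from the start. The abstract does not specify this layout, and the class `CompactedFile` and its methods are hypothetical names used only for illustration.

```python
class CompactedFile:
    """Toy compacted file: fixed-size sourceblocks stored as codebook indices."""

    def __init__(self, data: bytes, block_size: int = 4):
        self.block_size = block_size
        self.codebook = []  # index -> sourceblock
        self.lookup = {}    # sourceblock -> index
        self.codes = [
            self._code_for(data[i:i + block_size])
            for i in range(0, len(data), block_size)
        ]

    def _code_for(self, block: bytes) -> int:
        """Return the block's reference code, extending the codebook if needed."""
        if block not in self.lookup:
            self.lookup[block] = len(self.codebook)
            self.codebook.append(block)
        return self.lookup[block]

    def read(self, offset: int, length: int) -> bytes:
        """Decode only the blocks that overlap the requested byte range."""
        first = offset // self.block_size
        last = (offset + length - 1) // self.block_size
        chunk = b"".join(self.codebook[c] for c in self.codes[first:last + 1])
        skip = offset - first * self.block_size
        return chunk[skip:skip + length]

    def write_block(self, block_index: int, block: bytes) -> None:
        """Encode one block in place; the codebook grows only for new blocks."""
        self.codes[block_index] = self._code_for(block)


f = CompactedFile(b"AAAABBBBAAAACCCC")
assert f.read(6, 6) == b"BBAAAA"   # touches blocks 1 and 2 only
f.write_block(1, b"DDDD")          # unseen block, so the codebook is updated
assert f.read(0, 16) == b"AAAADDDDAAAACCCC"
```

With variable-length codewords, as an entropy-coded codebook would produce, the offset arithmetic in `read` would have to be replaced by an index or search step.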

    System and method for data compaction and security using multiple encoding algorithms

    Publication No.: US11385794B2

    Publication Date: 2022-07-12

    Application No.: US17404699

    Application Date: 2021-08-17

    Abstract: A system and method for encoding data using a plurality of encoding libraries. Portions of the data are encoded by different encoding libraries, depending on which library provides the greatest compaction for a given portion of the data. This methodology not only provides substantial improvements in data compaction over use of a single data compaction algorithm with the highest average compaction, but provides substantial additional security in that multiple decoding libraries must be used to decode the data. In some embodiments, each portion of data may further be encoded using different sourceblock sizes, providing further security enhancements as decoding requires multiple decoding libraries and knowledge of the sourceblock size used for each portion of the data. In some embodiments, encoding libraries may be randomly or pseudo-randomly rotated to provide additional security.

    SYSTEM AND METHODS FOR BANDWIDTH-EFFICIENT ENCODING OF GENOMIC DATA

    Publication No.: US20220129421A1

    Publication Date: 2022-04-28

    Application No.: US17569500

    Application Date: 2022-01-05

    Abstract: A system and methods for bandwidth-efficient encoding of genome and bioinformatic sequence datasets comprising a sequence analyzer configured to: analyze a received sequence dataset to determine a sequence dataset file type, scan the sequence dataset to maintain a count of unique characters contained therein, identify positions where the unique character count increases by a power of two, deconstruct the sequence dataset into a plurality of sourceblocks at the identified positions, and encode the plurality of sourceblocks using a data deconstruction engine and library management module to assign each sourceblock a reference code.
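The deconstruction step can be sketched in Python. The phrase "increases by a power of two" is ambiguous, so the sketch assumes one reading: a cut is made wherever the running count of distinct characters reaches 2, 4, 8, and so on. The function name is hypothetical, and the later encoding of sourceblocks into reference codes is left out.

```python
def split_at_power_of_two_alphabet(seq: str):
    """Split seq where the number of distinct characters seen hits 2, 4, 8, ...

    Assumed interpretation of the abstract; other readings are possible.
    """
    seen = set()
    cut_points = []
    for pos, ch in enumerate(seq):
        if ch not in seen:
            seen.add(ch)
            n = len(seen)
            # n is a power of two exactly when it has a single set bit
            if n > 1 and (n & (n - 1)) == 0:
                cut_points.append(pos)
    bounds = [0] + cut_points + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:])]


blocks = split_at_power_of_two_alphabet("AAACAAGTTNA")
assert blocks == ["AAA", "CAAG", "TTNA"]
assert "".join(blocks) == "AAACAAGTTNA"
```

Under this reading, every sourceblock before a cut draws on an alphabet no larger than the previous power of two, so its characters fit in a fixed number of bits that grows by one at each cut.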

    SYSTEM AND METHOD FOR DATA COMPACTION AND SECURITY USING MULTIPLE ENCODING ALGORITHMS

    Publication No.: US20210373776A1

    Publication Date: 2021-12-02

    Application No.: US17404699

    Application Date: 2021-08-17

    Abstract: A system and method for encoding data using a plurality of encoding libraries. Portions of the data are encoded by different encoding libraries, depending on which library provides the greatest compaction for a given portion of the data. This methodology not only provides substantial improvements in data compaction over use of a single data compaction algorithm with the highest average compaction, but provides substantial additional security in that multiple decoding libraries must be used to decode the data. In some embodiments, each portion of data may further be encoded using different sourceblock sizes, providing further security enhancements as decoding requires multiple decoding libraries and knowledge of the sourceblock size used for each portion of the data. In some embodiments, encoding libraries may be randomly or pseudo-randomly rotated to provide additional security.

    System and method for data compaction utilizing distributed codebook encoding

    Publication No.: US12236089B2

    Publication Date: 2025-02-25

    Application No.: US18490417

    Application Date: 2023-10-19

    Abstract: A system and method for data compaction utilizing distributed codebook encoding to improve entropy encoding methods to account for, and efficiently handle, previously-unseen data in data to be compacted, allow for distributed encoding and decoding capabilities, and allow for parametrized codebook encoding methods. Training data sets are analyzed to determine the frequency of occurrence of each sourceblock in the training data sets. A mismatch probability estimate is calculated comprising an estimated frequency at which any given data sourceblock received during encoding will not have a codeword in the codebook. Further, a codebook and a behavior codebook may both be maintained or altered in a distributed fashion across multiple devices or services, for widespread, or permission-based, or parametrized codebook encoding.
