System and method for random-access manipulation of compacted data files

    Publication number: US20240362189A1

    Publication date: 2024-10-31

    Application number: US18768606

    Filing date: 2024-07-10

    CPC classification number: G06F16/1752 G06F3/0608 G06F3/0641 G06F3/067

    Abstract: A system and method for random-access manipulation of compacted data files, utilizing a reference codebook, a random-access engine, a data deconstruction engine, and a data reconstruction engine. The system may receive a data query pertaining to a data read or data write request, wherein the data file to be read from or written to is a compacted data file. A random-access engine may facilitate data manipulation processes by transforming the codebook into a hierarchical representation and then traversing the representation, scanning for specific codewords associated with a data query request. In an embodiment, an estimator module is present and configured to utilize cardinality estimation to determine a starting codeword from which to begin searching the compacted data file for the data associated with the data query. The random-access engine may encode the data to be written, insert the encoded data into a compacted data file, and update the codebook as needed.

Adaptive deduplication of data chunks

    Publication number: US20240311342A1

    Publication date: 2024-09-19

    Application number: US18183659

    Filing date: 2023-03-14

    Applicant: Cohesity, Inc.

    CPC classification number: G06F16/1752 G06F11/1453 G06F2201/84

    Abstract: Techniques are described for selectively extending a WORM lock expiration time for a chunkfile. An example method comprises identifying, by a data platform implemented by a computing system, a chunkfile that includes a chunk that matches data for an object of a file system; determining, by the data platform after identifying the chunkfile, whether to deduplicate the data for the object of the file system by adding a reference to the matching chunk, wherein determining whether to deduplicate the data comprises applying a policy to at least one of a property of the chunkfile or properties of one or more of a plurality of chunks included in the chunkfile; and in response to determining to not deduplicate the data for the object of the file system, causing a new chunk for the data for the object of the file system to be stored in a different, second chunkfile.

    Apparatus and method for detecting target file based on network packet analysis

    Publication number: US12007949B2

    Publication date: 2024-06-11

    Application number: US17623081

    Filing date: 2021-07-22

    CPC classification number: G06F16/1752 G06F16/13 G06F16/148

    Abstract: An apparatus for detecting a target file includes an inverse indexing database unit configured to generate at least one file chunk by performing a chunking operation on a target file, and inversely index each of the at least one file chunk as a target file code, a network packet receiving unit configured to receive a network packet, a packet chunk processing unit configured to generate at least one packet chunk by performing a chunking operation on a network packet, a chunk query unit configured to generate a packet chunk query word for each of the at least one packet chunk and provide the packet chunk query word to the inverse indexing database unit to receive the detection target file code, and a file code determining unit configured to determine a most likely detection target file code in the network packet based on the received detection target file code.

    Managing objects stored at a remote storage

    Publication number: US12001391B2

    Publication date: 2024-06-04

    Application number: US17476876

    Filing date: 2021-09-16

    Applicant: Cohesity, Inc.

    CPC classification number: G06F16/125 G06F11/1451 G06F16/1752

    Abstract: An indication to store to a remote storage a new archive of a snapshot of a source storage is received. At least one shared data chunk of the new archive is determined to be already stored in an existing chunk object of the remote storage storing data chunks of a previous archive. One or more evaluation metrics for the existing chunk object are determined based at least in part on a retention period associated with one or more individual chunks stored in the chunk object and a data lock period associated with the entire existing chunk object. It is determined based on the one or more evaluation metrics whether to reference the at least one shared data chunk of the new archive from the existing chunk object or store the at least one shared data chunk in a new chunk object of the remote storage.
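The reference-or-copy decision described here could look something like the sketch below. The abstract only says that evaluation metrics are based on per-chunk retention periods and the object-wide data lock period; the two specific metrics (days of lock extension, fraction of chunks still retained), the 0.5 threshold, and every name in the code are assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class ChunkObject:
    """Hypothetical model of an existing chunk object in remote storage."""
    lock_expires: date                # data lock period for the entire object
    chunk_retention: dict[str, date]  # chunk id -> retention expiry of that chunk


def evaluation_metrics(obj: ChunkObject, needed_until: date) -> tuple[int, float]:
    """Return (lock extension in days, fraction of chunks still retained then).

    Both metrics are illustrative assumptions, not the patent's definitions.
    """
    extension_days = max(0, (needed_until - obj.lock_expires).days)
    live = sum(1 for expiry in obj.chunk_retention.values() if expiry >= needed_until)
    live_fraction = live / len(obj.chunk_retention) if obj.chunk_retention else 0.0
    return extension_days, live_fraction


def should_reference(obj: ChunkObject, chunk_id: str, needed_until: date,
                     min_live_fraction: float = 0.5) -> bool:
    """Decide whether a new archive should reference a shared chunk in `obj`.

    Reference when no lock extension is needed, or when enough of the object
    would still be in use at `needed_until`; otherwise store a new copy.
    """
    if chunk_id not in obj.chunk_retention:
        return False
    extension_days, live_fraction = evaluation_metrics(obj, needed_until)
    return extension_days == 0 or live_fraction >= min_live_fraction


obj = ChunkObject(
    lock_expires=date(2024, 6, 1),
    chunk_retention={"a": date(2024, 12, 1), "b": date(2024, 3, 1), "c": date(2024, 2, 1)},
)
print(should_reference(obj, "a", date(2024, 9, 1)))
```

The intuition behind this particular policy is that extending the lock on an object whose other chunks have already expired keeps dead data pinned in storage, so writing the shared chunk to a new object can be cheaper overall.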

    Container index persistent item tags

    Publication number: US11940956B2

    Publication date: 2024-03-26

    Application number: US16372675

    Filing date: 2019-04-02

    Applicant: John Butt

    Inventor: John Butt

    Abstract: Examples may include container index persistent item tags. Examples may store chunk signatures in at least one container index and, for each chunk signature, store at least one persistent item tag identifying a respective backup item that references or formerly referenced the chunk signature. Examples may determine that all chunks formerly referenced by a backup item have been erased based on the persistent item tags in the at least one container index and output an indication that the backup item has been erased.
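A small sketch may help show why the tags need to be persistent. In the hypothetical `ContainerIndex` below, each chunk signature keeps a tag for every backup item that references or formerly referenced it, and the tag survives until the chunk itself is erased. The class, its method names, and the rule that a chunk is erased as soon as its last reference is removed are assumptions, not details from the abstract.

```python
class ContainerIndex:
    """Hypothetical container index with persistent item tags."""

    def __init__(self) -> None:
        self.refs: dict[str, set[str]] = {}  # signature -> items currently referencing it
        self.tags: dict[str, set[str]] = {}  # signature -> items that ever referenced it

    def add_reference(self, signature: str, item_id: str) -> None:
        self.refs.setdefault(signature, set()).add(item_id)
        self.tags.setdefault(signature, set()).add(item_id)

    def remove_reference(self, signature: str, item_id: str) -> None:
        refs = self.refs.get(signature)
        if refs is None:
            return
        refs.discard(item_id)  # the persistent tag is deliberately left in place
        if not refs:
            # No item references the chunk any more: erase the chunk's entry,
            # which also removes every persistent tag attached to it.
            del self.refs[signature]
            del self.tags[signature]

    def item_erased(self, item_id: str) -> bool:
        """True once no remaining chunk signature carries the item's tag."""
        return all(item_id not in tagged for tagged in self.tags.values())


index = ContainerIndex()
index.add_reference("s1", "A")
index.add_reference("s2", "A")
index.add_reference("s2", "B")

# Deleting backup item A drops its references, but chunk s2 is still held by B.
index.remove_reference("s1", "A")
index.remove_reference("s2", "A")
print(index.item_erased("A"))
```

Without the persistent tag on `s2`, the index would have no record that item A's data still physically exists inside a chunk shared with item B.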

    System and method for random-access manipulation of compacted data files

    Publication number: US11899624B2

    Publication date: 2024-02-13

    Application number: US18078909

    Filing date: 2022-12-09

    CPC classification number: G06F16/1752 G06F3/067 G06F3/0608 G06F3/0641

    Abstract: A system and method for random-access manipulation of compacted data files, utilizing a reference codebook, a random-access engine, a data deconstruction engine, and a data reconstruction engine. The system may receive a data query pertaining to a data read or data write request, wherein the data file to be read from or written to is a compacted data file. A random-access engine may facilitate data manipulation processes by accessing a reference codebook associated with the compacted data file, a frequency table used to construct the reference codebook, and data query details. A data read request is supported by random-access search capabilities that may enable the locating and decoding of the bits corresponding to data query details. A random-access engine facilitates data write processes. The random-access engine may encode the data to be written, insert the encoded data into a compacted data file, and update the codebook as needed.
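The read and write paths described above can be illustrated with a deliberately simplified model. The sketch assumes fixed-size source blocks and one integer codeword per block, so the position of the i-th block is known without scanning; the patent's codebook is built from a frequency table, which suggests variable-length codewords that this sketch does not attempt. `CompactedFile` and all of its methods are hypothetical names.

```python
class CompactedFile:
    """Hypothetical compacted file: a codebook plus one codeword per block."""

    def __init__(self, block_size: int = 4) -> None:
        self.block_size = block_size
        self.codebook: dict[bytes, int] = {}  # source block -> codeword
        self.blocks: list[bytes] = []         # codeword -> source block
        self.codewords: list[int] = []        # the compacted data itself

    @classmethod
    def from_bytes(cls, data: bytes, block_size: int = 4) -> "CompactedFile":
        f = cls(block_size)
        for i in range(0, len(data), block_size):
            f.codewords.append(f._encode(data[i:i + block_size]))
        return f

    def _encode(self, block: bytes) -> int:
        """Return the block's codeword, updating the codebook if it is new."""
        code = self.codebook.get(block)
        if code is None:
            code = len(self.blocks)
            self.codebook[block] = code
            self.blocks.append(block)
        return code

    def read(self, index: int) -> bytes:
        """Random-access read: decode only the codeword at `index`."""
        return self.blocks[self.codewords[index]]

    def write(self, index: int, block: bytes) -> None:
        """Random-access write: encode the block and overwrite in place."""
        self.codewords[index] = self._encode(block)

    def insert(self, index: int, block: bytes) -> None:
        """Encode the block and insert its codeword before position `index`."""
        self.codewords.insert(index, self._encode(block))

    def to_bytes(self) -> bytes:
        return b"".join(self.blocks[c] for c in self.codewords)


f = CompactedFile.from_bytes(b"ABCDABCDWXYZ")
f.write(1, b"QRST")
print(f.read(1), len(f.codebook), f.to_bytes())
```

The write path mirrors the last sentence of the abstract: the new data is encoded, placed into the compacted representation, and the codebook grows only when the block has not been seen before.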
