-
Publication No.: US20240362189A1
Publication date: 2024-10-31
Application No.: US18768606
Application date: 2024-07-10
Applicant: AtomBeam Technologies Inc.
Inventor: Joshua Cooper , Charles Yeomans
IPC: G06F16/174 , G06F3/06
CPC classification number: G06F16/1752 , G06F3/0608 , G06F3/0641 , G06F3/067
Abstract: A system and method for random-access manipulation of compacted data files, utilizing a reference codebook, a random-access engine, and a data deconstruction engine. The system may receive a data query pertaining to a data read or data write request, wherein the data file to be read from or written to is a compacted data file. A random-access engine may facilitate data manipulation processes by transforming the codebook into a hierarchical representation and then traversing the representation, scanning for specific codewords associated with a data query request. In an embodiment, an estimator module is present and configured to utilize cardinality estimation to determine a starting codeword from which to begin searching the compacted data file for the data associated with the data query. The random-access engine may encode the data to be written, insert the encoded data into a compacted data file, and update the codebook as needed.
-
Publication No.: US20240311342A1
Publication date: 2024-09-19
Application No.: US18183659
Application date: 2023-03-14
Applicant: Cohesity, Inc.
Inventor: Aiswarya Bhavani Shankar , Dane Van Dyck , Venkata Ranga Radhanikanth Guturi , Leo Prasath Arulraj
IPC: G06F16/174 , G06F11/14
CPC classification number: G06F16/1752 , G06F11/1453 , G06F2201/84
Abstract: Techniques are described for selectively extending a WORM lock expiration time for a chunkfile. An example method comprises identifying, by a data platform implemented by a computing system, a chunkfile that includes a chunk that matches data for an object of a file system; determining, by the data platform after identifying the chunkfile, whether to deduplicate the data for the object of the file system by adding a reference to the matching chunk, wherein determining whether to deduplicate the data comprises applying a policy to at least one of a property of the chunkfile or properties of one or more of a plurality of chunks included in the chunkfile; and in response to determining to not deduplicate the data for the object of the file system, causing a new chunk for the data for the object of the file system to be stored in a different, second chunkfile.
-
Publication No.: US12079168B2
Publication date: 2024-09-03
Application No.: US18460676
Application date: 2023-09-04
Applicant: AtomBeam Technologies Inc.
Inventor: Joshua Cooper , Aliasghar Riahi , Mojgan Haddad , Ryan Kourosh Riahi , Razmin Riahi , Charles Yeomans
IPC: G06F16/174 , G06F3/06
CPC classification number: G06F16/1752 , G06F3/0608 , G06F3/0641 , G06F3/067
Abstract: A system and method for error-resilient data reduction, utilizing a phase detector, a data requestor, a multi-phase trainer, a reconstruction engine, a deconstruction engine, and one or more reference codebooks. A multi-phase trainer may be used to train the reconstruction and deconstruction engines on various phase sourceblocks in order to recover quickly from corrupted data files that cause the sourceblocks to fall out of phase alignment. A phase detector may determine when the sourceblocks get out of phase and when they return to in-phase by checking whether a predetermined threshold probability of correct encoding is met. A data requestor may request retransmission of only the data that was received out of phase.
-
Publication No.: US12007949B2
Publication date: 2024-06-11
Application No.: US17623081
Application date: 2021-07-22
Inventor: Myung Keun Yoon , Jun Nyung Hur , Hyeon Gy Shon
IPC: G06F16/174 , G06F16/13 , G06F16/14 , H04L47/43
CPC classification number: G06F16/1752 , G06F16/13 , G06F16/148
Abstract: An apparatus for detecting a target file includes an inverse indexing database unit configured to generate at least one file chunk by performing a chunking operation on a target file, and inversely index each of the at least one file chunk as a target file code, a network packet receiving unit configured to receive a network packet, a packet chunk processing unit configured to generate at least one packet chunk by performing a chunking operation on a network packet, a chunk query unit configured to generate a packet chunk query word for each of the at least one packet chunk and provide the packet chunk query word to the inverse indexing database unit to receive the detection target file code, and a file code determining unit configured to determine a most likely detection target file code in the network packet based on the received detection target file code.
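Illustrative sketch (not taken from the patent): the abstract above describes chunking target files into an inverse index and then looking up packet chunks to find the most likely file. The Python below shows one way that flow could look. The fixed chunk size, the SHA-256 digest used as the query word, the sliding window over the packet, and all function names are assumptions made for illustration only.

```python
import hashlib
from collections import Counter, defaultdict

CHUNK_SIZE = 8  # hypothetical fixed chunk size in bytes


def chunk_digest(chunk: bytes) -> str:
    """Query word for a chunk (assumed here to be a SHA-256 hex digest)."""
    return hashlib.sha256(chunk).hexdigest()


def build_inverse_index(files: dict[str, bytes]) -> dict[str, set[str]]:
    """Map each file-chunk digest to the file codes that contain that chunk."""
    index: dict[str, set[str]] = defaultdict(set)
    for file_code, data in files.items():
        # Non-overlapping fixed-size chunks of the target file.
        for start in range(0, len(data) - CHUNK_SIZE + 1, CHUNK_SIZE):
            index[chunk_digest(data[start:start + CHUNK_SIZE])].add(file_code)
    return index


def most_likely_file(index: dict[str, set[str]], packet: bytes):
    """Return the file code with the most chunk hits in the packet, or None."""
    votes: Counter = Counter()
    # Slide one byte at a time so packet chunks need not align with file chunks.
    for start in range(0, len(packet) - CHUNK_SIZE + 1):
        digest = chunk_digest(packet[start:start + CHUNK_SIZE])
        for file_code in index.get(digest, ()):
            votes[file_code] += 1
    if not votes:
        return None
    return votes.most_common(1)[0][0]
```

A real detector would more likely use content-defined chunking and a compact index; the vote count here simply stands in for the "most likely" determination.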
-
Publication No.: US12001391B2
Publication date: 2024-06-04
Application No.: US17476876
Application date: 2021-09-16
Applicant: Cohesity, Inc.
Inventor: Praveen Kumar Yarlagadda , Aiswarya Bhavani Shankar , Venkata Ranga Radhanikanth Guturi , Anubhav Gupta
IPC: G06F16/10 , G06F11/14 , G06F16/11 , G06F16/174
CPC classification number: G06F16/125 , G06F11/1451 , G06F16/1752
Abstract: An indication to store to a remote storage a new archive of a snapshot of a source storage is received. At least one shared data chunk of the new archive is determined to be already stored in an existing chunk object of the remote storage storing data chunks of a previous archive. One or more evaluation metrics for the existing chunk object are determined based at least in part on a retention period associated with one or more individual chunks stored in the chunk object and a data lock period associated with the entire existing chunk object. It is determined based on the one or more evaluation metrics whether to reference the at least one shared data chunk of the new archive from the existing chunk object or store the at least one shared data chunk in a new chunk object of the remote storage.
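Illustrative sketch (not taken from the patent): the abstract above describes weighing per-chunk retention periods against the object-wide data lock when deciding whether to reference a shared chunk or store a new copy. The Python below shows one hypothetical evaluation metric, the fraction of chunks that would be held past their own retention if the lock were extended. The metric, the threshold, and all names are assumptions.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class ChunkObject:
    lock_expires: date                 # data lock on the whole chunk object
    chunk_retention_ends: list[date]   # retention end of each chunk in it


def should_reference(obj: ChunkObject, new_retention_end: date,
                     max_overheld_fraction: float = 0.5) -> bool:
    """Decide whether a new archive should reference a chunk in `obj`.

    Hypothetical policy: referencing is free if the existing lock already
    covers the new archive; otherwise the lock must be extended, and every
    chunk whose retention ends earlier is kept longer than needed.
    """
    if new_retention_end <= obj.lock_expires:
        return True
    if not obj.chunk_retention_ends:
        return False
    overheld = sum(1 for end in obj.chunk_retention_ends
                   if end < new_retention_end)
    return overheld / len(obj.chunk_retention_ends) <= max_overheld_fraction
```

When the function returns False, the shared chunk would be written to a new chunk object instead, trading duplicate storage for a shorter lock on the old object.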
-
Publication No.: US11940956B2
Publication date: 2024-03-26
Application No.: US16372675
Application date: 2019-04-02
Applicant: John Butt
Inventor: John Butt
IPC: G06F16/17 , G06F3/06 , G06F16/13 , G06F16/174
CPC classification number: G06F16/1752 , G06F3/0608 , G06F3/0622 , G06F3/0641 , G06F3/0683 , G06F16/13
Abstract: Examples may include container index persistent item tags. Examples may store chunk signatures in at least one container index and, for each chunk signature, store at least one persistent item tag identifying a respective backup item that references or formerly referenced the chunk signature. Examples may determine that all chunks formerly referenced by a backup item have been erased based on the persistent item tags in the at least one container index and output an indication that the backup item has been erased.
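Illustrative sketch (not taken from the patent): the abstract above describes persistent item tags that stay with a chunk signature even after a backup item stops referencing it. The Python below models that idea in a minimal in-memory form. The class layout, the `erased` flag, and the method names are assumptions for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    refs: set[str] = field(default_factory=set)  # items referencing the chunk now
    tags: set[str] = field(default_factory=set)  # items that reference or once referenced it
    erased: bool = False                          # True once the chunk data is erased


class ContainerIndex:
    def __init__(self) -> None:
        self.entries: dict[str, Entry] = {}

    def add_reference(self, signature: str, item: str) -> None:
        entry = self.entries.setdefault(signature, Entry())
        entry.refs.add(item)
        entry.tags.add(item)   # the tag persists after the reference is dropped
        entry.erased = False

    def erase_item(self, item: str) -> None:
        for entry in self.entries.values():
            entry.refs.discard(item)
            if not entry.refs:
                entry.erased = True  # no remaining references: chunk is erased

    def item_erased(self, item: str) -> bool:
        """True only if every chunk ever tagged with `item` has been erased."""
        tagged = [e for e in self.entries.values() if item in e.tags]
        return bool(tagged) and all(e.erased for e in tagged)
```

In this sketch, an item whose chunks are still shared with another live item is not reported as erased until those shared chunks are erased as well.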
-
Publication No.: US11899624B2
Publication date: 2024-02-13
Application No.: US18078909
Application date: 2022-12-09
Applicant: AtomBeam Technologies Inc.
Inventor: Aliasghar Riahi , Joshua Cooper , Mojgan Haddad , Charles Yeomans
IPC: G06F16/174 , G06F3/06
CPC classification number: G06F16/1752 , G06F3/067 , G06F3/0608 , G06F3/0641
Abstract: A system and method for random-access manipulation of compacted data files, utilizing a reference codebook, a random-access engine, and a data deconstruction engine. The system may receive a data query pertaining to a data read or data write request, wherein the data file to be read from or written to is a compacted data file. A random-access engine may facilitate data manipulation processes by accessing a reference codebook associated with the compacted data file, a frequency table used to construct the reference codebook, and data query details. A data read request is supported by random-access search capabilities that may enable the locating and decoding of the bits corresponding to data query details. The random-access engine also facilitates data write processes: it may encode the data to be written, insert the encoded data into a compacted data file, and update the codebook as needed.
-
Publication No.: US11853262B2
Publication date: 2023-12-26
Application No.: US17994359
Application date: 2022-11-27
Applicant: AtomBeam Technologies Inc.
Inventor: Joshua Cooper , Aliasghar Riahi , Mojgan Haddad , Ryan Kourosh Riahi , Razmin Riahi , Charles Yeomans
IPC: G06F16/174 , G06F3/06
CPC classification number: G06F16/1752 , G06F3/067 , G06F3/0608 , G06F3/0641
Abstract: A system and method for file type identification involving extraction of a file-print of a file, the file-print being a unique or practically-unique representation of statistical characteristics associated with the distribution of bits in the binary contents of the file, similar to a fingerprint. The file-print is then passed to a machine learning algorithm that has been trained to recognize file types from their file-prints. The machine learning algorithm returns a predicted file type and, in some cases, a probability of correctness of the prediction. The file may then be encoded using an encoding algorithm chosen based on the predicted file type.
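Illustrative sketch (not taken from the patent): the abstract above describes a file-print built from statistical characteristics of a file's binary contents and passed to a trained classifier. The Python below uses a normalized byte-frequency histogram as a stand-in file-print and a nearest-centroid lookup as a stand-in for the machine learning algorithm. Both choices, and all names, are assumptions.

```python
import math
from collections import Counter


def file_print(data: bytes) -> list[float]:
    """256-bin normalized byte histogram (an assumed, simplified file-print)."""
    counts = Counter(data)
    total = len(data) or 1  # avoid division by zero for empty input
    return [counts.get(b, 0) / total for b in range(256)]


def predict_type(fp: list[float], centroids: dict[str, list[float]]) -> str:
    """Return the label whose reference file-print is closest (Euclidean)."""
    return min(centroids, key=lambda name: math.dist(fp, centroids[name]))
```

The predicted label could then select the encoding algorithm; a trained model that also returns a probability would replace `predict_type` in practice.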
-
Publication No.: US11829331B2
Publication date: 2023-11-28
Application No.: US17695652
Application date: 2022-03-15
Applicant: Commvault Systems, Inc.
Inventor: Jun H. Ahn , Prasad Nara , Sri Karthik Bhagi
IPC: G06F16/17 , G06F3/06 , G06F16/174 , G06F16/172
CPC classification number: G06F16/1734 , G06F3/065 , G06F3/0607 , G06F3/0638 , G06F3/0649 , G06F3/0683 , G06F16/172 , G06F16/1752
Abstract: A system for performing continuous transaction log backups with minimal resource usage of the client computing devices that are processing the transactions is disclosed. The system detects at least one input/output (I/O) activity at a client computing device. The I/O activity can be associated with at least one database operation performed via the client computing device. The system then executes one or more native commands to back up transaction log data associated with the detected I/O activity to a virtualized location. Backing up the transaction log data comprises dynamically identifying a mount path location corresponding to the virtualized location, and transferring the transaction log data to the dynamically identified mount path using the one or more native commands. The system can then perform data processing operations (for example, data chunking and deduplicating) on the transaction log data after it is received at the dynamically identified mount path location.
-
Publication No.: US11803515B2
Publication date: 2023-10-31
Application No.: US17487615
Application date: 2021-09-28
Applicant: International Business Machines Corporation
Inventor: Roderick Guy Charles Moore , Denis Alexander Frank , Lee Jason Sanders
IPC: G06F16/174 , G06F16/17
CPC classification number: G06F16/1724 , G06F16/1734 , G06F16/1752
Abstract: Disclosed are techniques for defragmentation in deduplication storage systems. Machine language determines using deduplication metadata that at least some of an incoming input/output stream is a duplicate of at least part of a source volume whose physical locations of its stored data are fragmented in backend storage. Subsequently, defragmentation is carried out on the stored data by using the incoming input/output stream to write the data into sequential chunks at new physical locations in the backend storage and updating the source volume location mappings to the new physical locations.