-
Publication No.: US20210311919A1
Publication date: 2021-10-07
Application No.: US16842657
Filing date: 2020-04-07
Applicant: VMware, Inc.
Inventor: Wenguang Wang, Vamsi Gunturu, Eric Knauft
IPC: G06F16/215, H04L29/08, G06F16/23, G06F12/0804
Abstract: Techniques for reducing data log recovery time and metadata write amplification when checkpointing a data log of a storage object in a distributed storage system are provided. In one set of embodiments, a node of the system can determine whether the data log has reached a first threshold size, where the data log comprises a plurality of data log records, and where each data log record includes data and metadata for a write request directed to the storage object. If the data log has reached the first threshold size, the node can copy, from each of the plurality of data log records, the metadata for the write request to a corresponding metadata log entry in a metadata log of the storage object. The node can then truncate the data log by removing the plurality of data log records.
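The checkpointing flow in this abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: the record fields (`lba`, `length`), the byte-based size measure, and the function name `checkpoint_if_needed` are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class DataLogRecord:
    lba: int      # hypothetical metadata: logical block address of the write
    length: int   # hypothetical metadata: number of bytes written
    data: bytes   # payload of the write request


@dataclass
class StorageObjectLogs:
    data_log: list = field(default_factory=list)
    metadata_log: list = field(default_factory=list)

    def data_log_size(self) -> int:
        # Assumption: log size is measured as the total payload bytes.
        return sum(len(record.data) for record in self.data_log)


def checkpoint_if_needed(logs: StorageObjectLogs, threshold_bytes: int) -> bool:
    """Copy metadata to the metadata log and truncate the data log
    once the data log has reached the threshold size."""
    if logs.data_log_size() < threshold_bytes:
        return False
    for record in logs.data_log:
        # Only the metadata is carried over; the payload is dropped.
        logs.metadata_log.append((record.lba, record.length))
    logs.data_log.clear()
    return True
```

Because only metadata is copied at checkpoint time, the metadata log grows by small entries while the data log is emptied in one step.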
-
Publication No.: US11093464B1
Publication date: 2021-08-17
Application No.: US16857574
Filing date: 2020-04-24
Applicant: VMware, Inc.
Inventor: Wenguang Wang, Vamsi Gunturu
IPC: G06F16/00, G06F16/215, G06F16/22
Abstract: Solutions are disclosed for blocks in a multi-writer log-structured file system. Solutions include selecting candidate segments in a storage medium; reading blocks of the candidate segments; determining whether any blocks are duplicates; updating a reference count for the duplicate blocks; identifying unique blocks; writing at least a portion of the unique blocks to a log; determining whether the log has accumulated a full segment of data; based at least on determining that the log has accumulated a full segment of data, writing the full segment to the storage medium; updating a segment usage table (SUT) to mark the candidate segments as free; and updating the SUT to mark a segment of the storage medium as no longer free. Some examples identify a window start time and stop time, because older segments have been deduped and younger segments may be volatile. Some examples adjust the window to improve performance.
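A rough sketch of the sequence of steps listed in this abstract is given below. The in-memory dictionaries, the SHA-256 block digest, the tiny `SEGMENT_BLOCKS` value, and the function name `dedup_and_clean` are assumptions for illustration; the time-window selection of candidate segments is not modeled.

```python
import hashlib

SEGMENT_BLOCKS = 2  # hypothetical segment size, counted in blocks


def dedup_and_clean(storage, sut, candidates, refcounts, log):
    """storage: segment id -> list of blocks (bytes)
    sut: segment id -> True if the segment is free
    refcounts: block digest -> reference count
    log: list of unique blocks staged for the next full segment"""
    for seg_id in candidates:
        for block in storage[seg_id]:
            digest = hashlib.sha256(block).hexdigest()
            if digest in refcounts:
                refcounts[digest] += 1  # duplicate: only bump the reference count
                continue
            refcounts[digest] = 1
            log.append(block)  # unique: stage the block in the log
            if len(log) == SEGMENT_BLOCKS:
                # A full segment has accumulated: write it to a free segment.
                # min() raises ValueError if no free segment exists.
                target = min(s for s, free in sut.items() if free)
                storage[target] = list(log)
                sut[target] = False  # mark the target segment as no longer free
                log.clear()
    for seg_id in candidates:
        storage[seg_id] = []
        sut[seg_id] = True  # mark the cleaned candidate segments as free
```

Blocks left in `log` at the end stay staged until later cleaning passes fill a whole segment.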
-
Publication No.: US11741005B2
Publication date: 2023-08-29
Application No.: US17951018
Filing date: 2022-09-22
Applicant: VMware, Inc.
Inventor: Wenguang Wang, Vamsi Gunturu, Junlong Gao
CPC classification number: G06F12/0253, G06F11/1048, G06F11/2056, G06F16/27
Abstract: Techniques for using data mirroring across regions to reduce the likelihood of losing objects in a cloud object storage platform are provided. In one set of embodiments, a computer system can upload first and second copies of a data object to first and second regions of the cloud object storage platform respectively, where the first and second copies are identical. The computer system can then attempt to read the first copy of the data object from the first region. If the read attempt fails, the computer system can retrieve the second copy of the data object from the second region.
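The mirrored upload and fallback read described here might look like the sketch below. `Region` is a hypothetical in-memory stand-in for one region of an object store, and a failed read is modeled as a `KeyError`; neither comes from the source.

```python
class Region:
    """Hypothetical in-memory stand-in for one region of a cloud object store."""

    def __init__(self):
        self.objects = {}

    def put(self, key, data):
        self.objects[key] = data

    def get(self, key):
        # A missing object models a failed read attempt.
        if key not in self.objects:
            raise KeyError(key)
        return self.objects[key]


def upload_mirrored(key, data, first, second):
    """Upload identical copies of the object to both regions."""
    first.put(key, data)
    second.put(key, data)


def read_with_fallback(key, first, second):
    """Try the first region; if that read fails, retrieve the second copy."""
    try:
        return first.get(key)
    except KeyError:
        return second.get(key)
```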
-
Publication No.: US11675745B2
Publication date: 2023-06-13
Application No.: US17097473
Filing date: 2020-11-13
Applicant: VMware, Inc.
Inventor: Wenguang Wang, Junlong Gao, Vamsi Gunturu
IPC: G06F16/18, G06F16/188, G06F16/182, G06F16/901, G06F9/455, G06F11/14, G06F16/16
CPC classification number: G06F16/1805, G06F9/45558, G06F11/1484, G06F16/164, G06F16/188, G06F16/1824, G06F16/9027, G06F2009/45591
Abstract: A method for managing data associated with objects stored in a cloud storage is provided. The method receives, at a first compute node, first data associated with an object stored in the cloud storage, the first compute node being one of a plurality of compute nodes that store data associated with different objects as storage objects in a log-structured merging (LSM) tree data structure. The method then assigns a first unique name to a first storage object associated with the first data, the first unique name comprising a combination of at least an identifier identifying the first compute node and a first incremental local value. The method stores the first storage object in a first level (L0) of the LSM tree data structure.
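One way to picture the naming scheme is the sketch below, in which each compute node combines its own identifier with a locally incremented counter, so no coordination between nodes is needed to avoid name collisions. The name format, the class `ComputeNode`, and the representation of LSM levels as a list of dictionaries are assumptions for illustration.

```python
import itertools


class ComputeNode:
    def __init__(self, node_id: str):
        self.node_id = node_id
        self._counter = itertools.count(1)  # incremental local value

    def next_name(self) -> str:
        # Hypothetical format: "<node id>-<zero-padded local counter>".
        return f"{self.node_id}-{next(self._counter):08d}"


def store_in_l0(lsm_levels, node, data):
    """Assign a unique name to the storage object and place it in level L0."""
    name = node.next_name()
    lsm_levels[0][name] = data
    return name
```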
-
Publication No.: US11573711B2
Publication date: 2023-02-07
Application No.: US16827692
Filing date: 2020-03-23
Applicant: VMware, Inc.
Inventor: Wenguang Wang, Eric Knauft, Vamsi Gunturu, Pascal Renauld
Abstract: A method for encrypting data in one or more data blocks is provided. The method receives a first data block to be written to a physical storage that includes one or more physical disks. The method applies a first random tweak to data indicative of the first data block to generate a first encrypted data block, and writes the first encrypted data block and the first random tweak to a first physical block of the physical storage. The method receives a second data block to be written to the physical storage. The method then applies a second random tweak, different than the first random tweak, to data indicative of the second data block to generate a second encrypted data block, and writes the second encrypted data block and the second random tweak to a second physical block of the physical storage.
-
Publication No.: US11556423B2
Publication date: 2023-01-17
Application No.: US16882246
Filing date: 2020-05-22
Applicant: VMware, Inc.
Inventor: Wenguang Wang, Vamsi Gunturu, Junlong Gao
Abstract: Techniques for using erasure coding in a single region to reduce the likelihood of losing objects in a cloud object storage platform are provided. In one set of embodiments, a computer system can upload a plurality of data objects to a region of a cloud object storage platform, where the plurality of data objects include modifications to a data set. The computer system can further compute a parity object based on the plurality of data objects, where the parity object encodes parity information for the plurality of data objects. The computer system can then upload the parity object to the same region where the plurality of data objects was uploaded.
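The abstract does not name a specific code, so the sketch below assumes the simplest case: a single XOR parity object, which can rebuild any one lost data object. The function names and the requirement that the lost object's length be known from elsewhere are assumptions of this example.

```python
def compute_parity(objects):
    """XOR all objects together, treating shorter ones as zero-padded."""
    size = max(len(obj) for obj in objects)
    parity = bytearray(size)
    for obj in objects:
        for i, byte in enumerate(obj):
            parity[i] ^= byte
    return bytes(parity)


def reconstruct(surviving, parity, lost_length):
    """Rebuild the single missing object from the survivors and the parity.

    XOR-ing the parity with every surviving object cancels their
    contributions and leaves the lost object, zero-padded to the parity size.
    """
    return compute_parity(list(surviving) + [parity])[:lost_length]
```

In this sketch the parity object would be uploaded to the same region as the data objects, and a failed read of one data object would trigger `reconstruct` over the remaining objects.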
-
Publication No.: US11500819B2
Publication date: 2022-11-15
Application No.: US17028405
Filing date: 2020-09-22
Applicant: VMware, Inc.
Inventor: Wenguang Wang, Vamsi Gunturu, Junlong Gao, Maxime Austruy, Petr Vandrovec, Ilya Languev, Ilia Sokolinski, Satish Pudi
IPC: G06F7/00, G06F16/174, G06F16/14, G06F16/13
Abstract: The present disclosure is related to methods, systems, and machine-readable media for supporting deduplication in file storage using file chunk hashes. A hash of a chunk of a log segment can be received from a software defined data center. A chunk identifier can be associated with the hash in a hash map that stores associations between sequentially-allocated chunk identifiers and hashes. The chunk identifier can be associated with a logical address corresponding to the chunk of the log segment in a logical map that stores associations between the sequentially-allocated chunk identifiers and logical addresses. A search of the hash map can be performed to determine if the chunk is a duplicate, and the chunk can be deduplicated responsive to a determination that the chunk is a duplicate.
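The interplay of the hash map and the logical map can be sketched as below. The lookup direction of each map (hash to chunk identifier, logical address to chunk identifier), the class name `DedupIndex`, and the choice to point a duplicate's logical address at the existing chunk identifier are assumptions made for this example.

```python
import itertools


class DedupIndex:
    def __init__(self):
        self.hash_map = {}      # chunk hash -> chunk identifier
        self.logical_map = {}   # logical address -> chunk identifier
        self._next_id = itertools.count(0)  # sequentially allocated identifiers

    def ingest(self, logical_address: int, chunk_hash: bytes) -> bool:
        """Record a chunk hash; return True if the chunk was deduplicated."""
        existing = self.hash_map.get(chunk_hash)
        if existing is not None:
            # Duplicate: reuse the existing chunk identifier, store no new chunk.
            self.logical_map[logical_address] = existing
            return True
        chunk_id = next(self._next_id)
        self.hash_map[chunk_hash] = chunk_id
        self.logical_map[logical_address] = chunk_id
        return False
```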
-
Publication No.: US11429498B2
Publication date: 2022-08-30
Application No.: US16870861
Filing date: 2020-05-08
Applicant: VMware, Inc.
Inventor: Wenguang Wang, Vamsi Gunturu, Enning Xiang, Eric Knauft
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for resynchronizing data in a storage system. One of the methods includes determining that a particular disk of a capacity object of a storage system was offline for an interval of time, wherein the capacity object comprises a plurality of segments, and wherein the storage system comprises a segment usage table identifying a linked list of particular segments of the capacity object that are in use; determining a time point at which the particular disk went offline; determining one or more first segments of the capacity object that were modified after the time point, wherein determining one or more first segments comprises determining each segment of the segment usage table having a transaction ID that is larger than the time point; and resynchronizing, for each first segment, a portion of the particular disk corresponding to the first segment.
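The segment selection step can be illustrated by walking the linked list of in-use segments and keeping those whose transaction ID is larger than the offline time point. The entry layout (`transaction_id`, `next_in_use`), the dictionary-based table, and the function name are assumptions for this sketch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SutEntry:
    transaction_id: int          # ID of the last transaction that modified the segment
    next_in_use: Optional[int]   # next in-use segment in the linked list, or None


def segments_to_resync(sut, head, offline_txn_id):
    """Return in-use segments modified after the disk went offline."""
    stale = []
    current = head
    while current is not None:
        entry = sut[current]
        if entry.transaction_id > offline_txn_id:
            stale.append(current)
        current = entry.next_in_use
    return stale
```

Only the returned segments would need their portion of the returning disk rewritten; segments with older transaction IDs are already up to date.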
-
Publication No.: US20220091765A1
Publication date: 2022-03-24
Application No.: US17028312
Filing date: 2020-09-22
Applicant: VMware, Inc.
Inventor: Wenguang Wang, Vamsi Gunturu, Junlong Gao, Ilya Languev, Petr Vandrovec, Maxime Austruy, Ilia Sokolinski, Satish Pudi
IPC: G06F3/06, G06F12/1018
Abstract: The present disclosure is related to methods, systems, and machine-readable media for supporting deduplication in object storage using subset hashes. A plurality of hashes of a plurality of blocks of a plurality of log segments can be received from a software defined data center, wherein each block corresponds to a respective logical address. Each of the plurality of logical addresses can be associated with a respective sequentially-allocated chunk identifier in a logical map. A subset hash comprising a hash of a subset of the plurality of blocks can be determined that corresponds to a contiguous range of the plurality of logical addresses. A search of a hash map for the subset hash can be performed to determine if the subset hash is a duplicate. The subset of the plurality of blocks can be deduplicated responsive to a determination that the subset hash is a duplicate.
-
Publication No.: US20220066883A1
Publication date: 2022-03-03
Application No.: US17002669
Filing date: 2020-08-25
Applicant: VMware, Inc.
Inventor: Wenguang Wang, Vamsi Gunturu, Junlong Gao, Petr Vandrovec, Ilya Languev, Maxime Austruy, Ilia Sokolinski, Satish Pudi
Abstract: Techniques for recovering metadata associated with data backed up in cloud object storage are provided. In one set of embodiments, a computer system can create a snapshot of a data set, where the snapshot includes a plurality of data blocks of the data set that have been modified since the creation of a prior snapshot of the data set. The computer system can further upload the snapshot to a cloud object storage platform of a cloud infrastructure, where the snapshot is uploaded as a plurality of log segments conforming to an object format of the cloud object storage platform, and where each log segment includes one or more data blocks in the plurality of data blocks, and a set of metadata comprising, for each of the one or more data blocks, an identifier of the data set, an identifier of the snapshot, and a logical block address (LBA) of the data block. The computer system can then communicate the set of metadata to a server component running in a cloud compute and block storage platform of the cloud infrastructure.
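A sketch of the self-describing log segment layout follows. Because each segment carries the data set identifier, snapshot identifier, and LBA of every block it holds, the metadata can be rebuilt by scanning the uploaded segments alone. The class names, the sorting by LBA, and the `blocks_per_segment` parameter are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BlockMeta:
    dataset_id: str
    snapshot_id: int
    lba: int  # logical block address of the data block


@dataclass
class LogSegment:
    blocks: list    # data blocks carried by this segment
    metadata: list  # one BlockMeta per block, in the same order


def build_log_segments(dataset_id, snapshot_id, modified_blocks, blocks_per_segment):
    """Pack the blocks modified since the prior snapshot into log segments.

    modified_blocks maps LBA -> block data.
    """
    items = sorted(modified_blocks.items())
    segments = []
    for start in range(0, len(items), blocks_per_segment):
        batch = items[start:start + blocks_per_segment]
        segments.append(LogSegment(
            blocks=[data for _, data in batch],
            metadata=[BlockMeta(dataset_id, snapshot_id, lba) for lba, _ in batch],
        ))
    return segments


def recover_metadata(segments):
    """Rebuild the full metadata set by scanning the segments."""
    return [meta for segment in segments for meta in segment.metadata]
```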
-