Data encryption in a two-tier storage system

    公开(公告)号:US11379383B2

    公开(公告)日:2022-07-05

    申请号:US17002649

    申请日:2020-08-25

    Applicant: VMware, Inc.

    Abstract: A method for encrypting data blocks is provided. The method receives a plurality of data blocks and encrypts each data block using an LBA of the data block as a tweak. The method writes the plurality of encrypted data blocks to physical blocks of the plurality of physical disks. The method then performs deduplication on the physical disks by determining that first and second physical blocks in the physical disks are duplicates, decrypting encrypted data in the first physical block using a first LBA associated with the first physical block as the tweak, and re-encrypting decrypted data in the first physical block using a PBA associated with the first physical block as the tweak. When reading the data back, either the LBA or PBA is used as the tweak, depending on whether the data was encrypted using LBA or re-encrypted using PBA during the deduplication process.

    Tiering Data to a Cold Storage Tier of Cloud Object Storage

    公开(公告)号:US20220066882A1

    公开(公告)日:2022-03-03

    申请号:US17002577

    申请日:2020-08-25

    Applicant: VMware, Inc.

    Abstract: Techniques for tiering data to a cold storage tier of a cloud object storage platform are provided. In one set of embodiments, a computer system can identify one or more old snapshots of a data set that reside in a first storage tier of the cloud object storage platform, where the one or more old snapshots are snapshots that are unlikely to be deleted from the cloud object storage platform within a period of N days. The computer system can further, for each snapshot in the one or more old snapshots: identify one or more data blocks in the snapshot that are superseded by a more recent snapshot in the one or more old snapshots; write the one or more data blocks to a second (i.e., cold) storage tier of the cloud object storage platform that has a lower storage cost than the first storage tier; and cause the one or more data blocks to be deleted from the first storage tier.

    ENHANCING EFFICIENCY OF SEGMENT CLEANING FOR A LOG-STRUCTURED FILE SYSTEM

    公开(公告)号:US20220058161A1

    公开(公告)日:2022-02-24

    申请号:US16999994

    申请日:2020-08-21

    Applicant: VMware, Inc.

    Abstract: The efficiency of segment cleaning for a log-structured file system (LFS) is enhanced at least by storing additional information in a segment usage table (SUT). Live blocks (representing portions of stored objects) in an LFS are determined based at least on the SUT. Chunk identifiers associated with the live blocks are read. The live blocks are coalesced at least by writing at least a portion of the live blocks into at least one new segment. A blind update of at least a portion of the chunk identifiers in a chunk map is performed to indicate the new segment. The blind update includes writing to the chunk map without reading from the chunk map. In some examples, the objects comprise virtual machine disks (VMDKs) and the SUT changes between a list format and a bitmap format, to minimize size.

    Managing metadata for a backup data storage

    公开(公告)号:US11221944B1

    公开(公告)日:2022-01-11

    申请号:US17002667

    申请日:2020-08-25

    Applicant: VMware, Inc.

    Abstract: A method for managing metadata for data stored in a cloud storage is provided. The method receives, at a first of a plurality of metadata servers, information associated with an object stored in the cloud storage, the information comprising a plurality of LBAs for where the object is stored. Each metadata server allocates contiguous chunk IDs for a group of objects. The method generates a new chunk ID for the object, which is a combination of a unique fixed value and a monotonically incrementing local value associated with each LBA, such that a first LBA is mapped to a first chunk ID having a first local value and a next LBA is mapped to a second chunk ID having the first local value incremented as a second local value. The method stores the new chunk ID and other metadata in one or more tables stored in a metadata storage.

    Enhanced hash calculation in distributed datastores

    公开(公告)号:US11204706B2

    公开(公告)日:2021-12-21

    申请号:US16827648

    申请日:2020-03-23

    Applicant: VMware, Inc.

    Abstract: A method for generating one or more hashes for one or more data blocks is provided. The method receives a data block to write on at least one physical disk of a set of physical disks associated with a set of host machines. The method then calculates a hash for the received data block and writes a first entry to a data log in a cache disk, the first entry comprising a first header and data indicative of the received block, the first header comprising the hash. The method further writes the data to the at least one physical disk as part of data blocks of a stripe, and stores the hash in a summary block on the at least one physical disk. The summary block is associated with the data blocks of the stripe stored on the at least one physical disk.

    Systems and methods of resyncing data in erasure-coded objects with multiple failures

    公开(公告)号:US11182250B1

    公开(公告)日:2021-11-23

    申请号:US16920005

    申请日:2020-07-02

    Applicant: VMware, Inc.

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for resynchronizing data in a storage system. One of the methods includes determining that a particular disk of a capacity object of a storage system is out-of-sync and that a primary disk is unavailable; and for each segment of one or more segments of the capacity object: generating a first version of the column of the segment corresponding to the unavailable primary disk; determining whether the data integrity token in the column summary of the generated first version is valid; and in response to determining that the data integrity token is valid, resynchronizing the column of the segment corresponding to the particular disk using i) the primary columns of the segment corresponding to each available primary disk and ii) the first version of the column of the segment corresponding to the unavailable primary disk.

    SUPPORTING DISTRIBUTED AND LOCAL OBJECTS USING A MULTI-WRITER LOG-STRUCTURED FILE SYSTEM

    公开(公告)号:US20210334236A1

    公开(公告)日:2021-10-28

    申请号:US16857517

    申请日:2020-04-24

    Applicant: VMware, Inc.

    Abstract: Supporting distributed and local objects using a multi-writer log-structured file system (LFS) includes, on a node, receiving incoming data from each of a plurality of local objects; coalescing the received data; determining whether the coalesced data comprises a full segment of data; based at least on the coalesced incoming data comprises a full segment, writing at least a first portion of the coalesced data to a first storage of the LFS, wherein the coalesced data comprises the first portion and a remainder portion; writing the remainder portion to a second storage of the LFS; acknowledging the writing to the objects; determining whether at least a full segment of data has accumulated in the second storage; based at least on determining that at least a full segment has accumulated, writing at least a portion of the accumulated data as one or more full segments of data to the first storage.

Patent Agency Ranking