Systems and methods of maintaining fault tolerance for new writes in degraded erasure coded distributed storage

    Publication No.: US11494090B2

    Publication date: 2022-11-08

    Application No.: US17033610

    Filing date: 2020-09-25

    Applicant: VMware, Inc.

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for maintaining fault tolerance for new writes in a storage system when one or more components of the storage system are unavailable. One of the methods includes determining that one or more first disks of a capacity object of a storage system are unavailable, wherein the storage system comprises a segment usage table identifying the plurality of segments of the capacity object; in response: identifying a plurality of available second disks, adding a plurality of new segments corresponding to the second disks to the capacity object, and adding data identifying the plurality of new segments to the segment usage table; and for each of one or more new write requests to the capacity object: identifying an available segment from the plurality of new segments, and writing data associated with the new write request to the identified available segment.
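
The flow in this abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: the `Segment` and `CapacityObject` classes, the `handle_disk_failure` helper, and the `width` and `count` parameters are all hypothetical names introduced here, and the segment usage table is modeled as a plain dictionary.

```python
from dataclasses import dataclass, field


@dataclass
class Segment:
    segment_id: int
    disks: tuple          # disks this segment is striped across
    used: bool = False    # True once data has been written to it


@dataclass
class CapacityObject:
    # Segment usage table (SUT), modeled as segment_id -> Segment (assumption).
    sut: dict = field(default_factory=dict)
    data: dict = field(default_factory=dict)

    def add_segments(self, disks, count):
        """Add `count` new segments on `disks` and record them in the SUT."""
        start = max(self.sut, default=-1) + 1
        new_ids = list(range(start, start + count))
        for sid in new_ids:
            self.sut[sid] = Segment(sid, tuple(disks))
        return new_ids

    def write(self, payload, unavailable):
        """Write to the first unused segment whose disks are all available."""
        for seg in self.sut.values():
            if not seg.used and not (set(seg.disks) & set(unavailable)):
                seg.used = True
                self.data[seg.segment_id] = payload
                return seg.segment_id
        raise RuntimeError("no available segment")


def handle_disk_failure(obj, all_disks, unavailable, width, count):
    """Pick `width` healthy disks and add `count` new segments on them."""
    healthy = [d for d in all_disks if d not in unavailable]
    if len(healthy) < width:
        raise RuntimeError("not enough healthy disks")
    return obj.add_segments(healthy[:width], count)
```

In this sketch, new writes skip any segment that touches an unavailable disk, so they land only on the newly added segments and keep their full stripe width.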

    System and method for speed up data rebuild in a distributed storage system with local deduplication

    Publication No.: US11474724B2

    Publication date: 2022-10-18

    Application No.: US15880391

    Filing date: 2018-01-25

    Applicant: VMware, Inc.

    Abstract: A method includes obtaining a plurality of representations corresponding respectively to a plurality of blocks of data stored on a source node. A plurality of data pairs are sent to a destination node, where each data pair includes a logical address associated with a block of data from the plurality of blocks of data and the corresponding representation of the block of data. A determination is made whether the blocks of data associated with the respective logical addresses are duplicates of data stored on the destination node. In accordance with an affirmative determination, a reference to a physical address of the block of data stored on the destination node is stored. In accordance with a negative determination, an indication that the data corresponding to the respective logical address is not a duplicate is stored. The data indicated as not being a duplicate is written to the destination node.
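
A rough sketch of the exchange described in this abstract is shown below. It assumes that the "representation" of a block is a SHA-256 digest and that the destination keeps an in-memory index from digest to physical address; the `DestinationNode` class and the `rebuild` function are hypothetical names, not taken from the patent.

```python
import hashlib


def fingerprint(block: bytes) -> str:
    # Assumption: the block "representation" is a SHA-256 digest.
    return hashlib.sha256(block).hexdigest()


class DestinationNode:
    def __init__(self):
        self.physical = []      # physical address (list index) -> block
        self.by_hash = {}       # fingerprint -> physical address
        self.logical_map = {}   # logical address -> physical address

    def store(self, block):
        """Append a block and index it by fingerprint."""
        self.physical.append(block)
        pba = len(self.physical) - 1
        self.by_hash[fingerprint(block)] = pba
        return pba

    def check_pairs(self, pairs):
        """Take (logical address, fingerprint) pairs; reference duplicates,
        and return the logical addresses whose data is not yet present."""
        missing = []
        for lba, fp in pairs:
            pba = self.by_hash.get(fp)
            if pba is not None:
                self.logical_map[lba] = pba   # duplicate: store a reference only
            else:
                missing.append(lba)           # not a duplicate: data is needed
        return missing


def rebuild(source_blocks, dest):
    """Send fingerprints first, then transfer only the non-duplicate blocks."""
    pairs = [(lba, fingerprint(b)) for lba, b in source_blocks.items()]
    missing = dest.check_pairs(pairs)
    for lba in missing:
        block = source_blocks[lba]
        pba = dest.by_hash.get(fingerprint(block))
        if pba is None:
            pba = dest.store(block)
        dest.logical_map[lba] = pba
    return missing
```

Only the addresses returned by `rebuild` require a data transfer; every other block costs one address and digest pair on the wire.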

    Supporting deduplication in object storage using subset hashes

    Publication No.: US11385817B2

    Publication date: 2022-07-12

    Application No.: US17028312

    Filing date: 2020-09-22

    Applicant: VMware, Inc.

    Abstract: The present disclosure is related to methods, systems, and machine-readable media for supporting deduplication in object storage using subset hashes. A plurality of hashes of a plurality of blocks of a plurality of log segments can be received from a software defined data center, wherein each block corresponds to a respective logical address. Each of the plurality of logical addresses can be associated with a respective sequentially-allocated chunk identifier in a logical map. A subset hash comprising a hash of a subset of the plurality of blocks can be determined that corresponds to a contiguous range of the plurality of logical addresses. A search of a hash map for the subset hash can be performed to determine if the subset hash is a duplicate. The subset of the plurality of blocks can be deduplicated responsive to a determination that the subset hash is a duplicate.
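
One way to picture the subset-hash lookup is the sketch below. It rests on several assumptions that the abstract does not state: subsets are fixed-size aligned runs of `run` blocks, the subset hash is a SHA-256 over the per-block hashes, and the hash map stores the first chunk identifier of each run. The `ingest` function and its parameters are hypothetical.

```python
import hashlib


def subset_hash(block_hashes):
    """Hash a run of per-block hashes into a single digest (assumed scheme)."""
    h = hashlib.sha256()
    for bh in block_hashes:
        h.update(bh)
    return h.digest()


def ingest(block_hashes, logical_map, hash_map, next_chunk_id, base_lba=0, run=4):
    """Map contiguous logical addresses to sequentially allocated chunk IDs,
    reusing the chunk IDs of an earlier run when its subset hash matches."""
    for start in range(0, len(block_hashes), run):
        group = block_hashes[start:start + run]
        key = subset_hash(group)
        first = hash_map.get(key)
        if first is None:
            # New data: allocate a fresh sequential range of chunk IDs.
            first = next_chunk_id
            hash_map[key] = first
            next_chunk_id += len(group)
        for offset in range(len(group)):
            logical_map[base_lba + start + offset] = first + offset
    return next_chunk_id
```

Because chunk identifiers are allocated sequentially, a single hash-map hit in this sketch deduplicates the whole run rather than one block at a time.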

    Data encryption in a two-tier storage system

    Publication No.: US11379383B2

    Publication date: 2022-07-05

    Application No.: US17002649

    Filing date: 2020-08-25

    Applicant: VMware, Inc.

    Abstract: A method for encrypting data blocks is provided. The method receives a plurality of data blocks and encrypts each data block using an LBA of the data block as a tweak. The method writes the plurality of encrypted data blocks to physical blocks of the plurality of physical disks. The method then performs deduplication on the physical disks by determining that first and second physical blocks in the physical disks are duplicates, decrypting encrypted data in the first physical block using a first LBA associated with the first physical block as the tweak, and re-encrypting decrypted data in the first physical block using a PBA associated with the first physical block as the tweak. When reading the data back, either the LBA or PBA is used as the tweak, depending on whether the data was encrypted using LBA or re-encrypted using PBA during the deduplication process.
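
The tweak switch during deduplication can be illustrated with the toy model below. The cipher here is a deliberately insecure XOR keystream derived from SHA-256; it only stands in for a real tweakable cipher and must not be used to protect data. The `Store` class, the tweak strings `lba:N` and `pba:N`, and the plaintext comparison used to detect duplicates are all assumptions made for the sketch.

```python
import hashlib


def _keystream(key: bytes, tweak: str, n: int) -> bytes:
    out = b""
    counter = 0
    while len(out) < n:
        out += hashlib.sha256(key + tweak.encode() + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:n]


def crypt(key: bytes, tweak: str, data: bytes) -> bytes:
    """Toy tweakable cipher (NOT secure): XOR, so it encrypts and decrypts."""
    return bytes(a ^ b for a, b in zip(data, _keystream(key, tweak, len(data))))


class Store:
    def __init__(self, key: bytes):
        self.key = key
        self.physical = {}     # pba -> (ciphertext, tweak used)
        self.lba_to_pba = {}

    def write(self, lba, pba, plaintext):
        """New writes are encrypted with the LBA as the tweak."""
        tweak = f"lba:{lba}"
        self.physical[pba] = (crypt(self.key, tweak, plaintext), tweak)
        self.lba_to_pba[lba] = pba

    def read(self, lba):
        """Decrypt with whichever tweak (LBA or PBA) the block was stored under."""
        ciphertext, tweak = self.physical[self.lba_to_pba[lba]]
        return crypt(self.key, tweak, ciphertext)

    def dedup(self, keep_pba, drop_pba):
        """If both blocks hold the same data, re-encrypt one under its PBA
        and point every LBA of the other block at it."""
        keep_ct, keep_tweak = self.physical[keep_pba]
        drop_ct, drop_tweak = self.physical[drop_pba]
        plain = crypt(self.key, keep_tweak, keep_ct)
        if plain != crypt(self.key, drop_tweak, drop_ct):
            return False
        new_tweak = f"pba:{keep_pba}"
        self.physical[keep_pba] = (crypt(self.key, new_tweak, plain), new_tweak)
        for lba, pba in self.lba_to_pba.items():
            if pba == drop_pba:
                self.lba_to_pba[lba] = keep_pba
        del self.physical[drop_pba]
        return True
```

Re-encrypting under the PBA makes the shared block independent of any single LBA, which is why several logical addresses can read it back after deduplication.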

    Asynchronous unmap service

    Publication No.: US11360678B1

    Publication date: 2022-06-14

    Application No.: US17179977

    Filing date: 2021-02-19

    Applicant: VMware, Inc.

    Abstract: In one set of embodiments, a computer system can periodically run an unmap service configured to scan a subset of bitmaps maintained by a file system of the computer system. As part of scanning each bitmap in the subset, the unmap service can, for each bit in the bitmap: (1) check whether the bit indicates that a corresponding physical block address (PBA) on the storage backend is currently free; (2) upon determining that the bit indicates the PBA is currently free, identify an extent within the bitmap where the PBA resides; (3) check whether an unmap indicator associated with the extent indicates that at least one free PBA in the extent is not currently unmapped in the storage backend; and (4) upon determining that the unmap indicator indicates at least one free PBA in the extent is not currently unmapped in the storage backend, add the PBA to a list of PBAs to be unmapped.
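
Steps (1) through (4) can be sketched as a single scan over one bitmap. The representation is an assumption: the bitmap is a list of booleans where `True` means the PBA is free, extents are fixed-size runs of `extent_size` bits, and `needs_unmap` holds one unmap indicator per extent. The function name and parameters are hypothetical.

```python
def scan_bitmap(bitmap, base_pba, extent_size, needs_unmap):
    """Return the PBAs that should be queued for unmapping.

    bitmap       -- list of bool, True means the PBA is currently free
    base_pba     -- PBA corresponding to bit 0 of this bitmap
    extent_size  -- number of bits per extent
    needs_unmap  -- per-extent indicator: True if at least one free PBA
                    in the extent is not yet unmapped in the backend
    """
    to_unmap = []
    for i, free in enumerate(bitmap):
        if not free:                      # (1) skip PBAs that are in use
            continue
        extent = i // extent_size         # (2) find the extent holding the PBA
        if needs_unmap[extent]:           # (3) check the unmap indicator
            to_unmap.append(base_pba + i) # (4) queue the PBA for unmapping
    return to_unmap
```

The per-extent indicator lets the scan skip free blocks in extents that are already fully unmapped, so repeated runs do not reissue the same unmap requests.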

    Distributed transactions with redo-only write-ahead log

    Publication No.: US11294864B2

    Publication date: 2022-04-05

    Application No.: US14716834

    Filing date: 2015-05-19

    Applicant: VMware, Inc.

    Inventor: Wenguang Wang

    Abstract: Examples perform transactions across a distributed system of elements, such as nodes, computing devices, objects, and virtual machines. The elements of the distributed system maintain data (e.g., tables) which include information on transactions previously received and the source of the transactions. A first element of the distributed system transmits a transaction, the identifier (ID) of the first element, and a transaction ID to a plurality of second elements. The second elements compare the transaction ID to the maximum transaction ID associated with the first element and stored in the tables to determine whether the transaction is the most recent and should be performed, or whether the transaction has already been performed and should not be re-performed. In this manner, undo logs are not needed.
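
The comparison against the maximum transaction ID can be sketched as below. The `Element` class, its `receive` method, and the assumption that transaction IDs from one source increase monotonically are illustrative choices, not details from the patent.

```python
class Element:
    """A receiving element that applies each transaction at most once."""

    def __init__(self):
        self.max_txn = {}   # source element ID -> highest transaction ID applied
        self.state = {}     # the data the transactions modify

    def receive(self, source_id, txn_id, updates):
        """Apply `updates` unless this transaction was already performed."""
        if txn_id <= self.max_txn.get(source_id, -1):
            return False                   # replayed transaction: ignore it
        self.state.update(updates)
        self.max_txn[source_id] = txn_id
        return True
```

Because a replayed transaction is simply ignored, a sender recovering from a crash can resend its redo log without any element having to roll anything back.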

    Tiering Data to a Cold Storage Tier of Cloud Object Storage

    Publication No.: US20220066882A1

    Publication date: 2022-03-03

    Application No.: US17002577

    Filing date: 2020-08-25

    Applicant: VMware, Inc.

    Abstract: Techniques for tiering data to a cold storage tier of a cloud object storage platform are provided. In one set of embodiments, a computer system can identify one or more old snapshots of a data set that reside in a first storage tier of the cloud object storage platform, where the one or more old snapshots are snapshots that are unlikely to be deleted from the cloud object storage platform within a period of N days. The computer system can further, for each snapshot in the one or more old snapshots: identify one or more data blocks in the snapshot that are superseded by a more recent snapshot in the one or more old snapshots; write the one or more data blocks to a second (i.e., cold) storage tier of the cloud object storage platform that has a lower storage cost than the first storage tier; and cause the one or more data blocks to be deleted from the first storage tier.
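
The per-snapshot loop might look like the sketch below. It assumes each old snapshot is a mapping from logical block address to the key of the block written in that snapshot, that snapshots are ordered oldest to newest, and that a block is "superseded" when a later old snapshot wrote the same address. The two tiers are modeled as dictionaries, and the function name is hypothetical.

```python
def tier_old_snapshots(old_snapshots, hot, cold):
    """Move superseded blocks of old snapshots from the hot to the cold tier.

    old_snapshots -- list, oldest first, of {lba: block_key}
    hot, cold     -- dicts block_key -> bytes standing in for the two tiers
    """
    moved = []
    for i, snap in enumerate(old_snapshots):
        # Addresses rewritten by any more recent old snapshot.
        later_lbas = set()
        for newer in old_snapshots[i + 1:]:
            later_lbas.update(newer)
        for lba, key in snap.items():
            if lba in later_lbas and key in hot:
                cold[key] = hot.pop(key)   # write to cold, delete from hot
                moved.append(key)
    return moved
```

Blocks that no later snapshot has overwritten stay in the first tier, since in this model they are still part of the most recent view of the data set.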

    Efficient segment cleaning employing remapping of data blocks in log-structured file systems of distributed data systems

    Publication No.: US11262919B2

    Publication date: 2022-03-01

    Application No.: US16914166

    Filing date: 2020-06-26

    Applicant: VMware, Inc.

    Abstract: Client data is structured as a set of data blocks. A first subset of data blocks is stored on a current segment of the disks. A second subset of data blocks is stored on a previous segment. A request to clean client data is received, including a request to update the current segment to include the second subset of data blocks. The second subset of data blocks is accessed and transmitted from a lower layer to a higher system layer. Parity data is generated at the higher layer. The parity data is transmitted to the lower layer. The lower layer updates second mapping data. In the updated mapping of the second mapping data, each local address that references a data block of the second subset of data blocks is included in the current segment of the plurality of disks. The lower layer writes the parity data in the current segment.
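
The division of work between the two layers can be sketched as follows. The sketch assumes single-parity XOR over equal-length blocks, which the abstract does not specify, and models the lower layer's mapping as local address to segment ID. The `LowerLayer` class and the `clean` function are hypothetical names.

```python
from functools import reduce


def xor_parity(blocks):
    """Higher layer: byte-wise XOR parity over equal-length blocks (assumed scheme)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


class LowerLayer:
    def __init__(self):
        self.mapping = {}   # local address -> segment ID
        self.blocks = {}    # local address -> block data
        self.parity = {}    # segment ID -> parity block

    def read(self, addrs):
        return [self.blocks[a] for a in addrs]

    def remap(self, addrs, segment, parity):
        """Update the mapping so `addrs` belong to `segment`, then write parity.
        The block data itself is not copied."""
        for a in addrs:
            self.mapping[a] = segment
        self.parity[segment] = parity


def clean(lower, current, previous):
    """Fold the blocks of `previous` into `current` by remapping them."""
    moved = [a for a, s in lower.mapping.items() if s == previous]
    resident = [a for a, s in lower.mapping.items() if s == current]
    # Blocks travel up to the higher layer only so parity can be computed.
    parity = xor_parity(lower.read(resident + moved))
    lower.remap(moved, current, parity)
    return moved
```

The point of the remapping is that the data blocks are read once for the parity computation but never rewritten; only the mapping and the parity change.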

    Enhancing efficiency of segment cleaning for a log-structured file system

    Publication No.: US20220058161A1

    Publication date: 2022-02-24

    Application No.: US16999994

    Filing date: 2020-08-21

    Applicant: VMware, Inc.

    Abstract: The efficiency of segment cleaning for a log-structured file system (LFS) is enhanced at least by storing additional information in a segment usage table (SUT). Live blocks (representing portions of stored objects) in an LFS are determined based at least on the SUT. Chunk identifiers associated with the live blocks are read. The live blocks are coalesced at least by writing at least a portion of the live blocks into at least one new segment. A blind update of at least a portion of the chunk identifiers in a chunk map is performed to indicate the new segment. The blind update includes writing to the chunk map without reading from the chunk map. In some examples, the objects comprise virtual machine disks (VMDKs) and the SUT changes between a list format and a bitmap format, to minimize size.
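
A compact sketch of SUT-driven cleaning with a blind chunk-map update follows. The data layout is assumed: the SUT maps each segment ID to the set of live block offsets, each stored block carries its chunk identifier, and the chunk map maps chunk ID to a (segment, offset) pair. The function name and parameters are hypothetical, and the list/bitmap format switch mentioned in the abstract is not modeled.

```python
def clean_segments(sut, segments, chunk_map, victims, new_segment_id):
    """Coalesce the live blocks of `victims` into one new segment.

    sut       -- segment ID -> set of live block offsets
    segments  -- segment ID -> list of (chunk_id, data)
    chunk_map -- chunk ID -> (segment ID, offset)
    """
    new_blocks = []
    for sid in victims:
        # Liveness comes from the SUT alone; the chunk map is never consulted.
        for offset in sorted(sut[sid]):
            chunk_id, data = segments[sid][offset]
            new_offset = len(new_blocks)
            new_blocks.append((chunk_id, data))
            # Blind update: overwrite the entry without reading the old value.
            chunk_map[chunk_id] = (new_segment_id, new_offset)
        del segments[sid]
        del sut[sid]
    segments[new_segment_id] = new_blocks
    sut[new_segment_id] = set(range(len(new_blocks)))
    return new_segment_id
```

Since the chunk identifiers are read from the blocks themselves, the cleaner in this sketch never needs a chunk-map lookup, which is the saving the blind update is meant to provide.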
