Managing metadata for a backup data storage

    Publication No.: US11221944B1

    Publication date: 2022-01-11

    Application No.: US17002667

    Filing date: 2020-08-25

    Applicant: VMware, Inc.

    Abstract: A method for managing metadata for data stored in a cloud storage is provided. The method receives, at a first of a plurality of metadata servers, information associated with an object stored in the cloud storage, the information comprising a plurality of LBAs for where the object is stored. Each metadata server allocates contiguous chunk IDs for a group of objects. The method generates a new chunk ID for the object, which is a combination of a unique fixed value and a monotonically incrementing local value associated with each LBA, such that a first LBA is mapped to a first chunk ID having a first local value and a next LBA is mapped to a second chunk ID having the first local value incremented as a second local value. The method stores the new chunk ID and other metadata in one or more tables stored in a metadata storage.
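The chunk ID scheme in this abstract can be illustrated with a minimal sketch. The bit layout (`LOCAL_BITS`), the class name `ChunkIdAllocator`, and the use of an integer `server_id` as the "unique fixed value" are assumptions made for illustration; the abstract does not specify how the two parts are combined.

```python
from dataclasses import dataclass

# Assumed layout: high bits hold the server's fixed value,
# low bits hold the monotonically incrementing local value.
LOCAL_BITS = 48


@dataclass
class ChunkIdAllocator:
    """Hypothetical per-metadata-server allocator of contiguous chunk IDs."""
    server_id: int        # unique fixed value for this server (assumption)
    next_local: int = 0   # monotonically incrementing local value

    def allocate(self, lbas):
        """Map each LBA, in order, to the next chunk ID for this server."""
        mapping = {}
        for lba in lbas:
            mapping[lba] = (self.server_id << LOCAL_BITS) | self.next_local
            self.next_local += 1
        return mapping


allocator = ChunkIdAllocator(server_id=3)
first_object = allocator.allocate([100, 101, 102])
second_object = allocator.allocate([500])
```

Because the local value only ever increments, consecutive LBAs of one object receive consecutive chunk IDs, and IDs from different servers cannot collide as long as their fixed values differ.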

    EFFICIENT EXPORT OF SNAPSHOT CHANGES IN A STORAGE SYSTEM

    Publication No.: US20220004461A1

    Publication date: 2022-01-06

    Application No.: US16920490

    Filing date: 2020-07-03

    Applicant: VMware, Inc.

    Abstract: Techniques for efficiently exporting snapshot changes are provided. In some embodiments, a system may receive a first snapshot of a data set in a storage system and a second snapshot of the data set in the storage system. The system may further generate actions based on differences between the first snapshot and the second snapshot to produce a list of actions, wherein a modification to a file or directory path having a first directory location includes a first action to rename a file from the first directory location to a temporary storage location and a second action to rename the file from the temporary storage location to a second directory location; and provide the generated actions to a backup system. The backup system may apply the generated actions to a first backup associated with the first snapshot to produce a second backup associated with the second snapshot.
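The two-step rename through a temporary location can be sketched as follows. This is an illustration only: snapshots are modeled as dictionaries from path to a file identifier, the names `rename_actions`, `apply_actions`, and `tmp_dir` are hypothetical, and only moved files are handled (creations, deletions, and content changes are ignored).

```python
def rename_actions(snap1, snap2, tmp_dir="/.tmp"):
    """Emit rename actions that turn snap1's layout into snap2's.

    snap1 and snap2 map path -> file identifier (for example an inode number).
    """
    old_path = {fid: path for path, fid in snap1.items()}
    moved = [
        (old_path[fid], new_path, fid)
        for new_path, fid in sorted(snap2.items())
        if fid in old_path and old_path[fid] != new_path
    ]
    # First action per file: move it out of its first directory location.
    actions = [("rename", old, f"{tmp_dir}/{fid}") for old, _, fid in moved]
    # Second action per file: move it from the temporary location to its target.
    actions += [("rename", f"{tmp_dir}/{fid}", new) for _, new, fid in moved]
    return actions


def apply_actions(backup, actions):
    """Replay the actions against a backup modeled the same way as a snapshot."""
    backup = dict(backup)
    for _, src, dst in actions:
        backup[dst] = backup.pop(src)
    return backup


snap1 = {"/a": 1, "/b": 2}
snap2 = {"/a": 2, "/b": 1}   # the two files swapped names
actions = rename_actions(snap1, snap2)
second_backup = apply_actions(snap1, actions)
```

One plausible benefit of the intermediate location, shown by the swap above, is that no rename ever targets a path that is still occupied; the abstract itself does not state the motivation.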

    Enhanced hash calculation in distributed datastores

    Publication No.: US11204706B2

    Publication date: 2021-12-21

    Application No.: US16827648

    Filing date: 2020-03-23

    Applicant: VMware, Inc.

    Abstract: A method for generating one or more hashes for one or more data blocks is provided. The method receives a data block to write on at least one physical disk of a set of physical disks associated with a set of host machines. The method then calculates a hash for the received data block and writes a first entry to a data log in a cache disk, the first entry comprising a first header and data indicative of the received block, the first header comprising the hash. The method further writes the data to the at least one physical disk as part of data blocks of a stripe, and stores the hash in a summary block on the at least one physical disk. The summary block is associated with the data blocks of the stripe stored on the at least one physical disk.
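The write path described here can be sketched roughly as below. The hash function (SHA-256), the in-memory lists standing in for the cache-disk data log, the stripe, and the summary block, and the names `Store`, `LogEntry`, `write`, and `flush` are all assumptions for illustration; the abstract does not name a hash algorithm or an on-disk layout.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class LogEntry:
    header: dict   # carries the hash computed at write time
    data: bytes    # the received block


@dataclass
class Store:
    data_log: list = field(default_factory=list)  # stands in for the cache-disk log
    stripe: list = field(default_factory=list)    # data blocks on the physical disk
    summary: list = field(default_factory=list)   # summary block: one hash per data block

    def write(self, block: bytes) -> str:
        """Hash the block once and log it with the hash in the entry header."""
        digest = hashlib.sha256(block).hexdigest()
        self.data_log.append(LogEntry({"hash": digest}, block))
        return digest

    def flush(self) -> None:
        """Move logged blocks into the stripe and copy their hashes to the summary."""
        for entry in self.data_log:
            self.stripe.append(entry.data)
            self.summary.append(entry.header["hash"])  # reused, not recomputed
        self.data_log.clear()


store = Store()
digest = store.write(b"block-0")
store.flush()
```

In this sketch the hash is computed once on the write and carried in the log header, so the flush to the stripe does not need to read and rehash the data.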

    Systems and methods of resyncing data in erasure-coded objects with multiple failures

    Publication No.: US11182250B1

    Publication date: 2021-11-23

    Application No.: US16920005

    Filing date: 2020-07-02

    Applicant: VMware, Inc.

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for resynchronizing data in a storage system. One of the methods includes determining that a particular disk of a capacity object of a storage system is out-of-sync and that a primary disk is unavailable; and for each segment of one or more segments of the capacity object: generating a first version of the column of the segment corresponding to the unavailable primary disk; determining whether the data integrity token in the column summary of the generated first version is valid; and in response to determining that the data integrity token is valid, resynchronizing the column of the segment corresponding to the particular disk using i) the primary columns of the segment corresponding to each available primary disk and ii) the first version of the column of the segment corresponding to the unavailable primary disk.

    SUPPORTING DISTRIBUTED AND LOCAL OBJECTS USING A MULTI-WRITER LOG-STRUCTURED FILE SYSTEM

    Publication No.: US20210334236A1

    Publication date: 2021-10-28

    Application No.: US16857517

    Filing date: 2020-04-24

    Applicant: VMware, Inc.

    Abstract: Supporting distributed and local objects using a multi-writer log-structured file system (LFS) includes, on a node, receiving incoming data from each of a plurality of local objects; coalescing the received data; determining whether the coalesced data comprises a full segment of data; based at least on the coalesced incoming data comprising a full segment, writing at least a first portion of the coalesced data to a first storage of the LFS, wherein the coalesced data comprises the first portion and a remainder portion; writing the remainder portion to a second storage of the LFS; acknowledging the writing to the objects; determining whether at least a full segment of data has accumulated in the second storage; and, based at least on determining that at least a full segment has accumulated, writing at least a portion of the accumulated data as one or more full segments of data to the first storage.
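The coalescing flow in this abstract can be sketched as follows. The tiny `SEGMENT_SIZE`, the in-memory stand-ins for the first and second storage, and the class name `MultiWriterLog` are assumptions chosen for illustration, not details from the patent.

```python
SEGMENT_SIZE = 8  # bytes; unrealistically small so the example is easy to follow


class MultiWriterLog:
    """Hypothetical model of full-segment writes with a remainder buffer."""

    def __init__(self):
        self.first_storage = []    # receives full segments only
        self.second_storage = b""  # accumulates remainder portions

    def write(self, incoming):
        """incoming: list of byte strings, one per local object."""
        coalesced = b"".join(incoming)
        full = len(coalesced) - len(coalesced) % SEGMENT_SIZE
        # Write the full-segment portion straight to the first storage.
        for start in range(0, full, SEGMENT_SIZE):
            self.first_storage.append(coalesced[start:start + SEGMENT_SIZE])
        # The remainder goes to the second storage.
        self.second_storage += coalesced[full:]
        # Once a full segment has accumulated there, move it to the first storage.
        while len(self.second_storage) >= SEGMENT_SIZE:
            self.first_storage.append(self.second_storage[:SEGMENT_SIZE])
            self.second_storage = self.second_storage[SEGMENT_SIZE:]
        return len(incoming)  # number of writers to acknowledge


log = MultiWriterLog()
acks_first = log.write([b"abcde", b"fghijk"])  # 11 bytes: one segment, 3 left over
leftover_after_first = log.second_storage
acks_second = log.write([b"lmnop"])            # 5 more bytes complete a segment
```

After the second call the leftover bytes from both writes have formed a full segment, so the first storage only ever holds whole segments.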

    Large range lookups for Bϵ-tree
    Type: Granted invention patent

    Publication No.: US11093471B2

    Publication date: 2021-08-17

    Application No.: US16000142

    Filing date: 2018-06-05

    Applicant: VMware, Inc.

    Abstract: Embodiments herein are directed towards systems and methods for performing range lookups in Bε-trees. One example method involves receiving a request to return key-value pairs within a range of keys from the Bε-tree. The Bε-tree includes a plurality of nodes, each node being associated with a buffer that stores key-value pairs. The method further involves determining a fractional size of the range of keys. The method further involves, for each level of the Bε-tree, obtaining from within one or more buffers of one or more nodes of the level, a set of key-value pairs within the range of keys up to a size equal to the fractional size and transferring the set of key-value pairs to a result data structure. The method further involves sorting and merging all key-value pairs in the result data structure and returning the result data structure in response to the request.

    CPU-efficient cache replacement with two-phase eviction

    Publication No.: US11080189B2

    Publication date: 2021-08-03

    Application No.: US16256726

    Filing date: 2019-01-24

    Applicant: VMware, Inc.

    Abstract: The present disclosure provides techniques for managing a cache of a computer system using a cache management data structure. The cache management data structure includes a cold queue, a ghost queue, and a hot queue. The techniques herein improve the functioning of the computer because management of the cache management data structure can be performed in parallel with multiple cores or multiple processors, because a sequential scan will only pollute (i.e., add unimportant memory pages to) the cold queue, and to an extent, the ghost queue, but not the hot queue, and also because the cache management data structure has lower memory requirements and lower CPU overhead on cache hit than some prior art algorithms.
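The abstract names the three queues but not the admission or eviction rules. The sketch below uses a simplified 2Q-style policy as a stand-in, which is an assumption and not necessarily the patented two-phase eviction: new pages enter the cold queue, keys evicted from the cold queue are remembered in the ghost queue, and a page is admitted to the hot queue only when it is requested again while its key is in the ghost queue. The class name `ThreeQueueCache` and the capacities are hypothetical.

```python
from collections import OrderedDict


class ThreeQueueCache:
    """Simplified 2Q-style cache with cold, ghost, and hot queues (illustrative)."""

    def __init__(self, cold_cap, hot_cap, ghost_cap):
        self.cold = OrderedDict()   # key -> value, pages seen once
        self.hot = OrderedDict()    # key -> value, pages seen again after eviction
        self.ghost = OrderedDict()  # key -> None, keys of pages evicted from cold
        self.cold_cap, self.hot_cap, self.ghost_cap = cold_cap, hot_cap, ghost_cap

    def get(self, key):
        if key in self.hot:
            self.hot.move_to_end(key)  # refresh recency in the hot queue
            return self.hot[key]
        return self.cold.get(key)      # None on a miss

    def put(self, key, value):
        if key in self.hot:
            self.hot[key] = value
            self.hot.move_to_end(key)
        elif key in self.cold:
            self.cold[key] = value
        elif key in self.ghost:
            # Re-referenced after eviction: promote to the hot queue.
            del self.ghost[key]
            self.hot[key] = value
            if len(self.hot) > self.hot_cap:
                self.hot.popitem(last=False)
        else:
            self.cold[key] = value
            if len(self.cold) > self.cold_cap:
                evicted_key, _ = self.cold.popitem(last=False)
                self.ghost[evicted_key] = None
                if len(self.ghost) > self.ghost_cap:
                    self.ghost.popitem(last=False)


cache = ThreeQueueCache(cold_cap=2, hot_cap=2, ghost_cap=4)
for key in ("a", "b", "c"):
    cache.put(key, key.upper())   # "a" is pushed from cold into ghost
cache.put("a", "A")               # ghost hit: "a" is admitted to hot
for i in range(10):
    cache.put(f"scan{i}", i)      # a sequential scan of one-time pages
```

Under this policy the scan churns through the cold and ghost queues only, which matches the scan-resistance property the abstract describes: the hot queue still holds `"a"` afterward.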
