DYNAMIC GROWTH OF DATA CACHES USING BACKGROUND PROCESSES FOR HASH BUCKET GROWTH

    Publication number: US20240070080A1

    Publication date: 2024-02-29

    Application number: US17900642

    Application date: 2022-08-31

    Applicant: VMware, Inc.

    CPC classification number: G06F12/0864 G06F2212/1016 G06F2212/604

    Abstract: The disclosure describes growing a data cache using a background hash bucket growth process. Based on a cache growth instruction, a first memory portion is allocated to the data buffer of the data cache and a second memory portion is allocated to the metadata buffer of the data cache. The quantity of hash buckets in the hash bucket buffer is increased and the background hash bucket growth process is initiated, wherein the process is configured to rehash hash bucket entries of the hash bucket buffer into the increased quantity of hash buckets. A data entry is stored in the data buffer using the allocated first memory portion of the data cache and metadata associated with the data entry is stored using the allocated second memory portion of the metadata buffer, wherein a hash bucket entry associated with the data entry is stored in the increased quantity of hash buckets.
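
The hash bucket growth described in this abstract can be illustrated with a minimal sketch of incremental rehashing. This is not the patented implementation: the class `GrowableHashIndex`, its methods, and the choice to double the bucket count are assumptions made for illustration, and the data and metadata buffers are left out. In a real system `rehash_step` would be driven by a background process with appropriate locking; here it is called directly so the behavior is deterministic.

```python
class GrowableHashIndex:
    """Toy hash-bucket index that grows without rehashing everything at once.

    All names are hypothetical; this only sketches the idea of adding
    buckets first and moving existing entries over in small steps.
    """

    def __init__(self, num_buckets=4):
        self.buckets = [[] for _ in range(num_buckets)]
        self.old_count = num_buckets      # bucket count before the latest growth
        self.rehash_cursor = num_buckets  # old buckets below this index are rehashed

    def growing(self):
        return self.rehash_cursor < self.old_count

    def _bucket_for(self, key):
        h = hash(key)
        old_idx = h % self.old_count
        if old_idx >= self.rehash_cursor:
            # The old bucket has not been rehashed yet, so the entry still
            # lives (or must be placed) there.
            return old_idx
        return h % len(self.buckets)

    def put(self, key, value):
        bucket = self.buckets[self._bucket_for(key)]
        for i, (k, _) in enumerate(bucket):
            if k == key:
                bucket[i] = (key, value)
                return
        bucket.append((key, value))

    def get(self, key):
        for k, v in self.buckets[self._bucket_for(key)]:
            if k == key:
                return v
        return None

    def grow(self):
        """Double the number of buckets and mark every old bucket as pending."""
        if self.growing():
            raise RuntimeError("previous growth has not finished")
        self.old_count = len(self.buckets)
        self.buckets.extend([] for _ in range(self.old_count))
        self.rehash_cursor = 0

    def rehash_step(self):
        """Rehash one old bucket; return False when there is nothing left to do."""
        if not self.growing():
            return False
        entries = self.buckets[self.rehash_cursor]
        self.buckets[self.rehash_cursor] = []
        for key, value in entries:
            self.buckets[hash(key) % len(self.buckets)].append((key, value))
        self.rehash_cursor += 1
        if self.rehash_cursor == self.old_count:
            # Growth finished: the enlarged table is now the baseline.
            self.old_count = len(self.buckets)
            self.rehash_cursor = self.old_count
        return True
```

Lookups and inserts keep working while growth is in progress because `_bucket_for` checks whether the entry's old bucket has already been rehashed before choosing where to look.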

    EFFICIENT REPLICATION OF FILE CLONES

    Publication number: US20220414064A1

    Publication date: 2022-12-29

    Application number: US17357044

    Application date: 2021-06-24

    Applicant: VMware, Inc.

    Abstract: A method for managing replication of cloned files is provided. Embodiments include determining, at a source system, that a first file has been cloned to create a second file. Embodiments include sending, from the source system to a replica system, an address of the first extent and an indication that a status of the first extent has changed from non-cloned to cloned. Embodiments include changing, at the replica system, a status of a second extent associated with a replica of the first file on the replica system from non-cloned to cloned and creating a mapping of the address of the first extent to an address of the second extent on the replica system. Embodiments include creating, at the replica system, a replica of the second file comprising a reference to the address of the second extent on the replica system.
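
The replica-side bookkeeping in this abstract can be sketched roughly as follows. The abstract does not say how the replica locates its own extent for a given source extent, so this sketch assumes the message also carries a file name and an extent index; `ReplicaSystem` and all of its methods and fields are hypothetical.

```python
class ReplicaSystem:
    """Toy replica that tracks extent status and a source-to-replica address map."""

    def __init__(self):
        self.files = {}          # file name -> list of replica extent addresses
        self.extent_status = {}  # replica extent address -> "non-cloned" or "cloned"
        self.addr_map = {}       # source extent address -> replica extent address

    def add_file(self, name, replica_addrs):
        """Register an already replicated file and its extents."""
        self.files[name] = list(replica_addrs)
        for addr in replica_addrs:
            self.extent_status[addr] = "non-cloned"

    def on_extent_cloned(self, file_name, extent_index, source_addr):
        """Handle the source's notice that one of its extents became cloned."""
        replica_addr = self.files[file_name][extent_index]
        self.extent_status[replica_addr] = "cloned"
        self.addr_map[source_addr] = replica_addr

    def create_clone_replica(self, new_file, source_addrs):
        """Create the clone's replica by reference, without copying extent data."""
        self.files[new_file] = [self.addr_map[a] for a in source_addrs]
```

The point of the address map is that the clone's replica reuses extents the replica system already holds, so no file data needs to be sent again for the cloned file.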

    FAST ALGORITHM TO FIND FILE SYSTEM DIFFERENCE FOR DEDUPLICATION

    Publication number: US20210064580A1

    Publication date: 2021-03-04

    Application number: US16552965

    Application date: 2019-08-27

    Applicant: VMware, Inc.

    Abstract: The disclosure provides techniques for deduplicating files. The techniques include, upon creating or modifying a file, placing a logical timestamp of the current logical time within a queue associated with the directory of the file. The techniques further include placing the logical timestamp within a queue of each parent directory of the directory of the file. To determine a set of files for deduplication, the techniques disclosed herein identify files that have been modified within a logical time range. The set of files modified within the logical time range is identified by traversing directories of a storage system, the directories being organized within a tree structure. If a directory's queue does not contain a timestamp that is within the logical time range, then all child directories can be skipped over for further processing, such that no files within the child directories end up being within the set of files for deduplication.
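
The pruned traversal described in this abstract can be sketched as below. The `Directory` class and the `touch` and `modified_between` functions are hypothetical names, and keeping a per-file modification time inside each directory is an assumption, since the abstract does not say how individual files are matched once a directory qualifies.

```python
from collections import deque


class Directory:
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []
        self.files = {}        # file name -> logical time of last modification
        self.stamps = deque()  # logical timestamps seen in this subtree
        if parent is not None:
            parent.children.append(self)


def touch(directory, file_name, now):
    """Record a create/modify at logical time `now` in the directory and all ancestors."""
    directory.files[file_name] = now
    d = directory
    while d is not None:
        d.stamps.append(now)
        d = d.parent


def modified_between(root, lo, hi):
    """Return sorted (directory, file) pairs modified in [lo, hi], pruning cold subtrees."""
    result = []
    stack = [root]
    while stack:
        d = stack.pop()
        if not any(lo <= t <= hi for t in d.stamps):
            continue  # nothing below this directory changed in the range
        result.extend((d.name, f) for f, t in d.files.items() if lo <= t <= hi)
        stack.extend(d.children)
    return sorted(result)
```

Because every timestamp is also pushed to each ancestor, a directory whose queue has no timestamp in the range guarantees that none of its descendants do either, which is what makes skipping the whole subtree safe.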

    FAULT-TOLERANT UPLOADING OF DATA TO A DISTRIBUTED STORAGE SYSTEM

    Publication number: US20220121532A1

    Publication date: 2022-04-21

    Application number: US17072961

    Application date: 2020-10-16

    Applicant: VMware, Inc.

    Abstract: Techniques for increasing the efficiency of storing data objects in the object storage of a software-defined data center (SDDC) are provided. The techniques include the efficient storage of data, while enabling a snapshot of each update to the data. The snapshots of the data may be efficiently recovered via the techniques. Difference-level mappings for each snapshot are encoded in compact self-balancing data trees included in the object's metadata. The metadata mappings include mappings between various address spaces employed by the SDDC, as well as the address spaces employed by data stores that store the data on a physical medium. Because the metadata is efficiently structured, the metadata for an object may be cached for quick lookups during data access and/or snapshot recovery. The techniques also provide low-latency recovery and/or system rollback in the event of any failure in the SDDC, including when the failure occurs while uploading a snapshot.

    SYSTEM AND METHOD OF A HIGHLY CONCURRENT CACHE REPLACEMENT ALGORITHM

    Publication number: US20210141728A1

    Publication date: 2021-05-13

    Application number: US16679570

    Application date: 2019-11-11

    Applicant: VMware, Inc.

    Abstract: Disclosed are a method and system for managing multi-threaded concurrent access to a cache data structure. The cache data structure includes a hash table and three queues. The hash table includes a list of elements for each hash bucket with each hash bucket containing a mutex object and elements in each of the queues containing lock objects. Multiple threads can each lock a different hash bucket to have access to the list, and multiple threads can each lock a different element in the queues. The locks permit highly concurrent access to the cache data structure without conflict. Also, atomic operations are used to obtain pointers to elements in the queues so that a thread can safely advance each pointer. Race conditions that are encountered with locking an element in the queues or entering an element into the hash table are detected, and the operation encountering the race condition is retried.
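
The per-bucket locking in this abstract can be illustrated with a small sketch. It covers only the hash table part: the three queues, their per-element locks, the atomic pointer updates, and the retry on race conditions are not modeled. `Bucket`, `StripedTable`, and `fill_concurrently` are hypothetical names.

```python
import threading


class Bucket:
    def __init__(self):
        self.lock = threading.Lock()  # one mutex per hash bucket
        self.items = []               # list of (key, value) pairs


class StripedTable:
    """Hash table in which threads touching different buckets never contend."""

    def __init__(self, num_buckets=16):
        self.buckets = [Bucket() for _ in range(num_buckets)]

    def _bucket(self, key):
        return self.buckets[hash(key) % len(self.buckets)]

    def put(self, key, value):
        b = self._bucket(key)
        with b.lock:  # only this bucket's list is locked
            for i, (k, _) in enumerate(b.items):
                if k == key:
                    b.items[i] = (key, value)
                    return
            b.items.append((key, value))

    def get(self, key, default=None):
        b = self._bucket(key)
        with b.lock:
            for k, v in b.items:
                if k == key:
                    return v
        return default


def fill_concurrently(table, num_threads=4, per_thread=250):
    """Insert disjoint key ranges from several threads and wait for them all."""
    def worker(tid):
        for i in range(per_thread):
            table.put((tid, i), tid * per_thread + i)

    threads = [threading.Thread(target=worker, args=(t,)) for t in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
```

Locking at bucket granularity means two threads block each other only when their keys hash to the same bucket, rather than on every access to the table.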

    SYSTEM AND METHOD FOR REDUCING READ AMPLIFICATION OF ARCHIVAL STORAGE USING PROACTIVE CONSOLIDATION

    Publication number: US20220197861A1

    Publication date: 2022-06-23

    Application number: US17131155

    Application date: 2020-12-22

    Applicant: VMware, Inc.

    Abstract: A system and method for managing snapshots of storage objects in a storage system use a consolidation operation to reduce read amplification for snapshots of a storage object, which are stored as storage service objects in log segments in the storage system according to a log-structured file system. The consolidation operation involves identifying target log segments among the log segments that include live blocks that are associated with the latest snapshot of the storage object and determining the number of the live blocks included in each of the target log segments. Based on the number of the live blocks in each of the target log segments, candidate consolidation log segments are determined from the target log segments. The live blocks in the candidate consolidation log segments are then consolidated to new log segments, which are uploaded to the storage system as new storage service objects.
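
The candidate selection in this abstract can be sketched as follows. The abstract only says that candidates are chosen based on the number of live blocks per target segment; the fixed threshold `max_live` used here is an assumed policy, and `pick_candidates`, `consolidate`, and their parameters are hypothetical names.

```python
def pick_candidates(segments, live_blocks, max_live):
    """Pick sparsely populated target segments.

    segments:    dict of segment id -> list of block ids stored in that segment
    live_blocks: set of block ids referenced by the latest snapshot
    max_live:    assumed threshold; target segments with at most this many
                 live blocks become consolidation candidates
    """
    live_counts = {}
    for seg_id, blocks in segments.items():
        n = sum(1 for b in blocks if b in live_blocks)
        if n > 0:  # only segments holding live blocks are targets
            live_counts[seg_id] = n
    return sorted(s for s, n in live_counts.items() if n <= max_live)


def consolidate(segments, live_blocks, candidates, segment_size):
    """Pack the live blocks of the candidate segments into new, dense segments."""
    live = [b for s in candidates for b in segments[s] if b in live_blocks]
    return [live[i:i + segment_size] for i in range(0, len(live), segment_size)]
```

For example, if three live blocks are spread across two mostly dead segments, reading the latest snapshot has to fetch both segments; after consolidation the same three blocks sit in one new segment, which is the reduction in read amplification the abstract refers to.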

    DISTRIBUTED OBJECT STORAGE SUPPORTING DIFFERENCE-LEVEL SNAPSHOTS

    Publication number: US20220121365A1

    Publication date: 2022-04-21

    Application number: US17072904

    Application date: 2020-10-16

    Applicant: VMware, Inc.

    Abstract: Techniques for increasing the efficiency of storing data objects in the object storage of a software-defined data center (SDDC) are provided. The techniques include the efficient storage of data, while enabling a snapshot of each update to the data. The snapshots of the data may be efficiently recovered via the techniques. Difference-level mappings for each snapshot are encoded in compact self-balancing data trees included in the object's metadata. The metadata mappings include mappings between various address spaces employed by the SDDC, as well as the address spaces employed by data stores that store the data on a physical medium. Because the metadata is efficiently structured, the metadata for an object may be cached for quick lookups during data access and/or snapshot recovery. The techniques also provide low-latency recovery and/or system rollback in the event of any failure in the SDDC.
