EFFICIENT SEGMENT CLEANING EMPLOYING REMAPPING OF DATA BLOCKS IN LOG-STRUCTURED FILE SYSTEMS OF DISTRIBUTED DATA SYSTEMS

    Publication number: US20210382634A1

    Publication date: 2021-12-09

    Application number: US16914166

    Application date: 2020-06-26

    Applicant: VMware, Inc.

    Abstract: Client data is structured as a set of data blocks. A first subset of the data blocks is stored on a current segment of a plurality of disks. A second subset of the data blocks is stored on a previous segment. A request to clean the client data is received, including a request to update the current segment to include the second subset of data blocks. The second subset of data blocks is accessed and transmitted from a lower layer of the system to a higher layer. Parity data is generated at the higher layer and transmitted to the lower layer. The lower layer updates second mapping data. In the updated second mapping data, each local address that references a data block of the second subset is included in the current segment of the plurality of disks. The lower layer writes the parity data in the current segment.
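
    The remapping flow in this abstract can be sketched roughly as follows. This is a minimal illustration and not the patented implementation: the class `LowerLayer`, the functions `xor_parity` and `clean_by_remapping`, the `(segment, offset)` address tuples, and the use of plain XOR as the parity scheme are all assumptions made for the example.

```python
from functools import reduce


def xor_parity(blocks):
    """Higher layer (assumed scheme): XOR equal-sized blocks into one parity block."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


class LowerLayer:
    """Hypothetical lower layer: maps local addresses to physical slots."""

    def __init__(self):
        self.physical = {}  # physical slot -> block bytes
        self.mapping = {}   # (segment, offset) local address -> physical slot

    def write(self, local_addr, slot, data):
        self.physical[slot] = data
        self.mapping[local_addr] = slot

    def read(self, local_addr):
        return self.physical[self.mapping[local_addr]]

    def remap(self, old_addr, new_addr):
        # Only the reference moves; the block payload is not copied.
        self.mapping[new_addr] = self.mapping.pop(old_addr)


def clean_by_remapping(lower, live_prev_addrs, current_segment, next_offset, parity_slot):
    # 1. Live blocks travel from the lower layer to the higher layer.
    blocks = [lower.read(addr) for addr in live_prev_addrs]
    # 2. The higher layer generates parity over those blocks.
    parity = xor_parity(blocks)
    # 3. The lower layer remaps each block into the current segment.
    for addr in live_prev_addrs:
        lower.remap(addr, (current_segment, next_offset))
        next_offset += 1
    # 4. The lower layer writes the parity into the current segment.
    lower.write((current_segment, next_offset), parity_slot, parity)
    return parity
```

    In this sketch the previous segment's local addresses disappear from the mapping after cleaning, while the physical slots holding the payloads stay where they were.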

    FAULT-TOLERANT UPLOADING OF DATA TO A DISTRIBUTED STORAGE SYSTEM

    Publication number: US20220121532A1

    Publication date: 2022-04-21

    Application number: US17072961

    Application date: 2020-10-16

    Applicant: VMware, Inc.

    Abstract: Techniques for increasing the efficiency of storing data objects in the object storage of a software-defined data center (SDDC) are provided. The techniques store data efficiently while enabling a snapshot of each update to the data. The snapshots may be efficiently recovered via the techniques. Difference-level mappings for each snapshot are encoded in compact self-balancing data trees included in the object's metadata. The metadata mappings include mappings between the various address spaces employed by the SDDC, as well as the address spaces employed by the data stores that hold the data on physical media. Because the metadata is efficiently structured, the metadata for an object may be cached for quick lookups during data access and/or snapshot recovery. The techniques also provide low-latency recovery and/or system rollback in the event of any failure in the SDDC, including when the failure occurs while uploading a snapshot.
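
    One common way to make an upload recoverable is to publish a "committed" pointer only after every chunk has landed, and to discard partial uploads on failure. The abstract does not say that the patent works this way; the sketch below is an assumed illustration, and `ObjectStore`, `upload_snapshot`, and the `fail_at` fault-injection parameter are hypothetical names.

```python
class ObjectStore:
    """Hypothetical object store with a pointer to the last complete snapshot."""

    def __init__(self):
        self.objects = {}      # (snapshot id, chunk index) -> chunk bytes
        self.committed = None  # id of the last fully uploaded snapshot


def upload_snapshot(store, snap_id, chunks, fail_at=None):
    """Upload all chunks, then commit; roll back staged chunks on failure.

    fail_at is a test hook (an assumption of this sketch) that simulates a
    failure just before the chunk with that index is uploaded.
    """
    staged = []
    try:
        for i, chunk in enumerate(chunks):
            if i == fail_at:
                raise OSError("simulated upload failure")
            key = (snap_id, i)
            store.objects[key] = chunk
            staged.append(key)
        # Commit point: readers only ever see fully uploaded snapshots.
        store.committed = snap_id
        return True
    except OSError:
        # Rollback: drop the partial upload, keep the previous commit.
        for key in staged:
            del store.objects[key]
        return False
```

    With this structure a failed upload leaves the store pointing at the previous complete snapshot, so recovery needs no scan of partial state.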

    DYNAMIC GROWTH OF DATA CACHES USING BACKGROUND PROCESSES FOR HASH BUCKET GROWTH

    Publication number: US20240070080A1

    Publication date: 2024-02-29

    Application number: US17900642

    Application date: 2022-08-31

    Applicant: VMware, Inc.

    CPC classification number: G06F12/0864 G06F2212/1016 G06F2212/604

    Abstract: The disclosure describes growing a data cache using a background hash bucket growth process. Based on a cache growth instruction, a first memory portion is allocated to the data buffer of the data cache and a second memory portion is allocated to the metadata buffer of the data cache. The quantity of hash buckets in the hash bucket buffer is increased and the background hash bucket growth process is initiated, wherein the process is configured to rehash the hash bucket entries of the hash bucket buffer into the increased quantity of hash buckets. A data entry is stored in the data buffer using the allocated first memory portion and metadata associated with the data entry is stored in the metadata buffer using the allocated second memory portion, wherein a hash bucket entry associated with the data entry is stored in the increased quantity of hash buckets.
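
    The growth sequence described above can be sketched as an incremental rehash. This is a simplified illustration under stated assumptions: `GrowableCache` and its methods are hypothetical, the buffers are plain Python lists, the "background process" is modeled as a `rehash_step` call that a real system would presumably drive from a background thread, and updates and eviction are left out.

```python
class GrowableCache:
    """Hypothetical cache: data buffer, metadata buffer, hash bucket buffer."""

    def __init__(self, capacity, bucket_count):
        self.data = [None] * capacity
        self.meta = [None] * capacity
        self.used = 0
        self.buckets = [[] for _ in range(bucket_count)]
        self.old_count = bucket_count  # bucket count before the last growth
        self.cursor = bucket_count     # next old bucket to rehash (done if == old_count)

    def grow(self, extra_capacity, new_bucket_count):
        while self.rehash_step():      # finish any earlier growth first
            pass
        self.data.extend([None] * extra_capacity)  # first memory portion
        self.meta.extend([None] * extra_capacity)  # second memory portion
        self.old_count = len(self.buckets)
        self.buckets.extend([] for _ in range(new_bucket_count - self.old_count))
        self.cursor = 0                # start the background rehash

    def rehash_step(self):
        """Rehash one old bucket; return False when nothing is left to do."""
        if self.cursor >= self.old_count:
            return False
        entries, self.buckets[self.cursor] = self.buckets[self.cursor], []
        for key, slot in entries:
            self.buckets[hash(key) % len(self.buckets)].append((key, slot))
        self.cursor += 1
        return True

    def put(self, key, value):
        if self.used == len(self.data):
            raise MemoryError("cache full; grow first")
        slot, self.used = self.used, self.used + 1
        self.data[slot] = value
        self.meta[slot] = {"key": key}
        # New entries always go into the increased quantity of buckets.
        self.buckets[hash(key) % len(self.buckets)].append((key, slot))

    def get(self, key):
        # Check the post-growth bucket, then the pre-growth one if not yet rehashed.
        for count in (len(self.buckets), self.old_count):
            for k, slot in self.buckets[hash(key) % count]:
                if k == key:
                    return self.data[slot]
        return None
```

    Because `get` consults both the new and the old bucket index, lookups stay correct while the rehash is only partly complete, which is what allows the growth to proceed in the background.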

    EFFICIENT SEGMENT CLEANING EMPLOYING LOCAL COPYING OF DATA BLOCKS IN LOG-STRUCTURED FILE SYSTEMS OF DISTRIBUTED DATA SYSTEMS

    Publication number: US20210382826A1

    Publication date: 2021-12-09

    Application number: US16914171

    Application date: 2020-06-26

    Applicant: VMware, Inc.

    Abstract: Client data is structured as a set of data blocks. A first subset of data blocks is stored on a current segment of a plurality of disks. A second subset of data blocks is stored on a previous segment. A request to clean client data is received. The request includes a request to update the current segment to include the second subset of data blocks. The second subset of data blocks is accessed and transmitted from a lower layer to a higher layer of the system. Parity data is generated at the higher layer. The parity data is transmitted to the lower layer. The lower layer is employed to generate a local copy of the second subset of data blocks. Each local address that references the local copy of the second subset of data blocks is included in the current segment. The parity data is written in the current segment.
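
    A rough sketch of the local-copy variant follows. It is an illustration only: `DiskLayer`, `xor_parity`, and `clean_by_local_copy` are hypothetical names, segments are modeled as in-memory lists, and XOR stands in for whatever parity scheme the patent actually uses.

```python
from functools import reduce


def xor_parity(blocks):
    """Higher layer (assumed scheme): XOR equal-sized blocks into one parity block."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))


class DiskLayer:
    """Hypothetical lower layer: each segment is an append-only list of blocks."""

    def __init__(self):
        self.segments = {}  # segment id -> list of block bytes

    def read(self, segment, index):
        return self.segments[segment][index]

    def append(self, segment, data):
        blocks = self.segments.setdefault(segment, [])
        blocks.append(data)
        return (segment, len(blocks) - 1)  # new local address

    def local_copy(self, src_segment, indexes, dst_segment):
        # The copy happens inside the lower layer; the payload is not sent back down.
        return [self.append(dst_segment, self.read(src_segment, i)) for i in indexes]


def clean_by_local_copy(disk, prev_segment, live_indexes, current_segment):
    # Live blocks go up to the higher layer so that it can compute parity.
    blocks = [disk.read(prev_segment, i) for i in live_indexes]
    parity = xor_parity(blocks)
    # Only the parity travels back down; the data is copied locally.
    new_addrs = disk.local_copy(prev_segment, live_indexes, current_segment)
    parity_addr = disk.append(current_segment, parity)
    return new_addrs, parity_addr
```

    Unlike a remapping approach, the blocks here are physically duplicated into the current segment, and every returned local address lies in that segment.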

    DISTRIBUTED OBJECT STORAGE SUPPORTING DIFFERENCE-LEVEL SNAPSHOTS

    Publication number: US20220121365A1

    Publication date: 2022-04-21

    Application number: US17072904

    Application date: 2020-10-16

    Applicant: VMware, Inc.

    Abstract: Techniques for increasing the efficiency of storing data objects in the object storage of a software-defined data center (SDDC) are provided. The techniques store data efficiently while enabling a snapshot of each update to the data. The snapshots may be efficiently recovered via the techniques. Difference-level mappings for each snapshot are encoded in compact self-balancing data trees included in the object's metadata. The metadata mappings include mappings between the various address spaces employed by the SDDC, as well as the address spaces employed by the data stores that hold the data on physical media. Because the metadata is efficiently structured, the metadata for an object may be cached for quick lookups during data access and/or snapshot recovery. The techniques also provide low-latency recovery and/or system rollback in the event of any failure in the SDDC.
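
    The idea of difference-level mappings can be illustrated with a small sketch in which each snapshot records only the address mappings that changed since the previous one. The class `SnapshotChain` and its methods are hypothetical, and a plain dict stands in for the compact self-balancing trees named in the abstract.

```python
class SnapshotChain:
    """Hypothetical per-object metadata: one difference map per snapshot."""

    def __init__(self):
        self.snapshots = []  # oldest to newest; each maps logical block -> physical address
        self.pending = {}    # mappings written since the last snapshot

    def write(self, logical_block, physical_addr):
        self.pending[logical_block] = physical_addr

    def snapshot(self):
        """Freeze the pending differences and return the new snapshot id."""
        self.snapshots.append(self.pending)
        self.pending = {}
        return len(self.snapshots) - 1

    def resolve(self, logical_block, snap_id):
        """Find the physical address of a block as of the given snapshot."""
        for diff in reversed(self.snapshots[: snap_id + 1]):
            if logical_block in diff:
                return diff[logical_block]
        return None  # block was never written up to that snapshot
```

    Recovering an older snapshot then amounts to resolving addresses with an earlier `snap_id`; no full copy of the mapping is stored per snapshot.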

    EFFICIENT ERASURE-CODED STORAGE IN DISTRIBUTED DATA SYSTEMS

    Publication number: US20210382858A1

    Publication date: 2021-12-09

    Application number: US16894663

    Application date: 2020-06-05

    Applicant: VMware, Inc.

    Abstract: Techniques for efficiently storing client data blocks on a distributed-computing system are provided. The system includes a fast performance tier and a large capacity tier. The capacity tier stores the client data blocks in erasure-encoded data stripes. The performance tier stores logical map data, including an address map indicating a correspondence between logical addresses associated with a first layer of the system and physical addresses associated with a second layer. A method includes receiving a request to include additional client data blocks in the client data blocks. The request indicates logical addresses for the additional blocks. Corresponding physical addresses for the additional blocks are determined. Each additional block is stored at its corresponding physical address. Additional logical map data is stored in the performance tier. Storing the additional logical map data includes updating the address map to indicate the correspondence between the logical addresses and the physical addresses for the additional blocks.
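
    The write path in this abstract can be sketched as follows. This is a toy model under stated assumptions: `TieredStore` is a hypothetical name, both tiers are in-memory structures, a stripe holds `stripe_width` data blocks plus one XOR parity block as a stand-in for a real erasure code, and parity is updated per block for brevity where a real system would more likely write full stripes.

```python
class TieredStore:
    """Hypothetical two-tier store with a logical-to-physical address map."""

    def __init__(self, stripe_width=2):
        self.k = stripe_width
        self.stripes = []      # capacity tier: [{"data": [blocks], "parity": bytes}]
        self.logical_map = {}  # performance tier: logical address -> (stripe, index)

    def add_blocks(self, blocks):
        """blocks maps each logical address to an equal-sized bytes payload."""
        for logical_addr, data in blocks.items():
            # Determine the physical address: next free slot in the open stripe.
            if not self.stripes or len(self.stripes[-1]["data"]) == self.k:
                self.stripes.append({"data": [], "parity": bytes(len(data))})
            stripe = self.stripes[-1]
            # Store the block and fold it into the stripe's parity.
            stripe["data"].append(data)
            stripe["parity"] = bytes(a ^ b for a, b in zip(stripe["parity"], data))
            # Update the address map kept in the performance tier.
            self.logical_map[logical_addr] = (len(self.stripes) - 1, len(stripe["data"]) - 1)

    def read(self, logical_addr):
        stripe_id, index = self.logical_map[logical_addr]
        return self.stripes[stripe_id]["data"][index]
```

    Keeping the address map in the fast tier means a read costs one map lookup plus one block fetch from the capacity tier.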
