Low-latency shared memory channel across address spaces in a computing system

    Publication No.: US11513832B2

    Publication date: 2022-11-29

    Application No.: US17013727

    Filing date: 2020-09-07

    Applicant: VMware, Inc.

    Abstract: Examples provide a method of communication between a client driver and a filesystem server. The client driver executes in a virtual machine (VM) and the filesystem server executes in a hypervisor. The method includes: allocating, by the client driver, shared memory in an address space of the VM for the communication; sending identification information for the shared memory from the client driver to the filesystem server through an inter-process communication channel between the client driver and the filesystem server; identifying, by the filesystem server in cooperation with a kernel of the hypervisor, the shared memory within an address space of the hypervisor, based on the identification information, to create a shared memory channel; sending commands from the client driver to the filesystem server through the shared memory channel; and receiving completion messages for the commands from the filesystem server to the client driver through the shared memory channel.
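
    Sketch: The handshake and message flow in this abstract can be illustrated with a small simulation. This is not the patented implementation: the `Ring` class, the slot sizes, the `guest_memory` dictionary (standing in for the hypervisor kernel's lookup of VM memory), and the message contents are all hypothetical. A single process plays both the client driver and the filesystem server.

```python
import struct

SLOTS = 8                    # ring capacity (hypothetical)
SLOT_SIZE = 64               # bytes per message slot (hypothetical)
HDR = struct.Struct("<II")   # head (consumer index), tail (producer index)
RING_BYTES = HDR.size + SLOTS * SLOT_SIZE


class Ring:
    """Single-producer/single-consumer ring laid out inside a shared buffer.

    Index wraparound and memory barriers are omitted in this sketch.
    """

    def __init__(self, buf, offset):
        self.buf, self.off = buf, offset

    def push(self, payload: bytes) -> bool:
        head, tail = HDR.unpack_from(self.buf, self.off)
        if tail - head == SLOTS or len(payload) > SLOT_SIZE - 2:
            return False  # ring full or message too large
        slot = self.off + HDR.size + (tail % SLOTS) * SLOT_SIZE
        struct.pack_into("<H", self.buf, slot, len(payload))
        self.buf[slot + 2: slot + 2 + len(payload)] = payload
        HDR.pack_into(self.buf, self.off, head, tail + 1)
        return True

    def pop(self):
        head, tail = HDR.unpack_from(self.buf, self.off)
        if head == tail:
            return None  # ring empty
        slot = self.off + HDR.size + (head % SLOTS) * SLOT_SIZE
        (n,) = struct.unpack_from("<H", self.buf, slot)
        payload = bytes(self.buf[slot + 2: slot + 2 + n])
        HDR.pack_into(self.buf, self.off, head + 1, tail)
        return payload


# Client driver: allocate shared memory in the (simulated) VM address space.
guest_memory = {}                      # stand-in for the kernel's view of VM memory
region_id = "vm1:region0"              # hypothetical identification information
guest_memory[region_id] = bytearray(2 * RING_BYTES)
client_view = guest_memory[region_id]

# Identification information travels over the ordinary IPC channel.
ipc_message = {"region": region_id, "size": 2 * RING_BYTES}

# Filesystem server: resolve the identification to the same memory.
server_view = guest_memory[ipc_message["region"]]

client_cmds, client_done = Ring(client_view, 0), Ring(client_view, RING_BYTES)
server_cmds, server_done = Ring(server_view, 0), Ring(server_view, RING_BYTES)

client_cmds.push(b"READ /etc/hosts")   # command through the shared memory channel
cmd = server_cmds.pop()
server_done.push(b"OK " + cmd)         # completion message back to the client
completion = client_done.pop()
```

    After setup, commands and completions move only through the shared buffer; the IPC channel is used once, to hand over the region identifier.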

Snapshot space reporting using a probabilistic data structure

    Publication No.: US20220342848A1

    Publication date: 2022-10-27

    Application No.: US17239239

    Filing date: 2021-04-23

    Applicant: VMware, Inc.

    Abstract: The present disclosure is related to methods, systems, and machine-readable media for snapshot space reporting. A first probabilistic data structure can be created for a first snapshot of a virtual computing instance (VCI) in a file system based on a hash of physical block numbers of a plurality of blocks of the first snapshot. A second probabilistic data structure can be created for a second snapshot of the VCI based on a hash of physical block numbers of a plurality of blocks of the second snapshot. A space report can be determined for the first and second snapshots based on the first probabilistic data structure and the second probabilistic data structure, wherein the space report is indicative of the storage space occupied by the first and second snapshots. A file system function can be performed by reference to the space report.
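
    Sketch: The abstract does not name a specific probabilistic data structure. As one possible illustration, the code below uses a linear-counting bitmap: each physical block number is hashed to one bit, the number of distinct blocks is estimated from the fraction of zero bits, and the union of two snapshots is the bitwise OR of their bitmaps. The bitmap size `M`, the 4 KiB block size, and all function names are assumptions.

```python
import hashlib
import math

M = 1 << 16  # bits per sketch (hypothetical; larger M lowers the error)


def make_sketch(physical_block_numbers):
    """Hash each physical block number to one bit of an M-bit bitmap."""
    bits = 0
    for pbn in physical_block_numbers:
        digest = hashlib.blake2b(pbn.to_bytes(8, "little"), digest_size=8).digest()
        bits |= 1 << (int.from_bytes(digest, "little") % M)
    return bits


def estimate_count(bits):
    """Linear-counting estimate of distinct items; fails if the bitmap saturates."""
    zeros = M - bin(bits).count("1")
    return -M * math.log(zeros / M)


def space_report(sketch_a, sketch_b, block_size=4096):
    a = estimate_count(sketch_a)
    b = estimate_count(sketch_b)
    union = estimate_count(sketch_a | sketch_b)   # OR of bitmaps = sketch of the union
    shared = max(a + b - union, 0.0)              # inclusion-exclusion
    return {
        "total_blocks": union,
        "shared_blocks": shared,
        "total_bytes": union * block_size,
    }


# Two snapshots sharing blocks 3000..4999 (made-up block numbers).
report = space_report(make_sketch(range(0, 5000)), make_sketch(range(3000, 8000)))
```

    The estimates are approximate by design: the report trades exactness for a fixed-size summary that can be combined without rescanning block lists.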

    Combining the metadata and data address spaces of a distributed storage object via a composite object configuration tree

    Publication No.: US11474719B1

    Publication date: 2022-10-18

    Application No.: US17320023

    Filing date: 2021-05-13

    Applicant: VMware, Inc.

    Abstract: Techniques for combining the metadata and data address spaces of a distributed storage object are provided. In one set of embodiments, a distributed storage system can receive a request to provision a storage object. In response, the distributed storage system can create, in accordance with an erasure coding scheme, one or more capacity components for holding data of the storage object; create, in accordance with a mirroring scheme having an equivalent level of fault tolerance as the erasure coding scheme, one or more metadata components for holding metadata of the storage object; and create a composite object configuration tree for the storage object that includes first and second subtrees, where the first subtree comprises an indication of the mirroring scheme and references to the one or more metadata components, and where the second subtree comprises an indication of the erasure coding scheme and references to the one or more capacity components.
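
    Sketch: A composite configuration tree with two subtrees can be modeled as a small data structure. The class names, the RAID-5 (3+1) and two-way mirror layouts, and the component naming are assumptions chosen so that both subtrees tolerate one failure; the abstract only requires that the fault-tolerance levels be equivalent.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Component:
    name: str
    host: str


@dataclass
class Subtree:
    scheme: str                 # indication of the protection scheme
    failures_tolerated: int
    components: List[Component]


@dataclass
class CompositeObjectConfig:
    object_id: str
    metadata: Subtree           # first subtree: mirroring scheme
    capacity: Subtree           # second subtree: erasure coding scheme


def provision(object_id: str, hosts: List[str]) -> CompositeObjectConfig:
    """Create capacity and metadata components and tie them into one tree."""
    if len(hosts) < 4:
        raise ValueError("this sketch needs at least 4 hosts for a 3+1 layout")
    capacity = Subtree(
        "RAID-5 (3+1)", 1,
        [Component(f"{object_id}-cap{i}", h) for i, h in enumerate(hosts[:4])],
    )
    # A two-way mirror also tolerates one failure, matching the capacity subtree.
    metadata = Subtree(
        "RAID-1 (2-way mirror)", 1,
        [Component(f"{object_id}-meta{i}", h) for i, h in enumerate(hosts[:2])],
    )
    return CompositeObjectConfig(object_id, metadata, capacity)


cfg = provision("obj1", ["h1", "h2", "h3", "h4"])
```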

    Issuing efficient writes to erasure coded objects in a distributed storage system via adaptive logging

    Publication No.: US11467746B2

    Publication date: 2022-10-11

    Application No.: US17089605

    Filing date: 2020-11-04

    Applicant: VMware, Inc.

    Abstract: Techniques for issuing efficient writes to an erasure coded storage object in a distributed storage system via adaptive logging are provided. In one set of embodiments, a node of the system can receive a write request for updating one or more logical data blocks of the storage object and determine whether a size of the one or more logical data blocks meets or exceeds a threshold size. Upon determining that the size of the one or more logical data blocks meets or exceeds the threshold size, the node can allocate a segment in a capacity object of the storage object, write the one or more logical data blocks via a full stripe write to the segment, and write metadata for the one or more logical data blocks to a log record in a log of a metadata object of the storage object. The metadata written to the log record can include mappings between logical block addresses (LBAs) of the one or more logical data blocks and physical block addresses (PBAs) where the one or more logical data blocks reside in the segment.
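
    Sketch: The size-based decision can be illustrated as follows. The threshold value, class name, and record layout are hypothetical. The abstract describes only the large-write path; the small-write branch here (logging the data together with its metadata) is an assumption added to make the example complete.

```python
THRESHOLD_BLOCKS = 4  # hypothetical threshold, e.g. one full stripe of data blocks


class StorageObject:
    def __init__(self):
        self.capacity_segments = []  # capacity object: one list of blocks per segment
        self.metadata_log = []       # log of the metadata object

    def write(self, lba_to_data: dict) -> str:
        if len(lba_to_data) >= THRESHOLD_BLOCKS:
            # Large write: allocate a segment and write the blocks as a full stripe.
            seg_id = len(self.capacity_segments)
            self.capacity_segments.append(list(lba_to_data.values()))
            # Log only metadata: LBA -> PBA, with PBA as (segment, offset).
            mapping = {lba: (seg_id, off) for off, lba in enumerate(lba_to_data)}
            self.metadata_log.append({"type": "metadata", "lba_to_pba": mapping})
            return "full_stripe"
        # Assumed small-write path (not described in the abstract).
        self.metadata_log.append({"type": "data", "blocks": dict(lba_to_data)})
        return "logged"


obj = StorageObject()
small = obj.write({7: b"x"})
large = obj.write({10: b"a", 11: b"b", 12: b"c", 13: b"d"})
```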

    Merge updates for key value stores
    Type: Granted invention patent

    Publication No.: US11436353B2

    Publication date: 2022-09-06

    Application No.: US15703706

    Filing date: 2017-09-13

    Applicant: VMware, Inc.

    Abstract: Embodiments of the present disclosure relate to techniques for performing a merge update for a database. In particular, certain embodiments of a method include generating a message comprising a first key and a first transaction associated with the first key, the first transaction indicating a transaction to perform other than for key-value pairs comprising the first key. The method further includes storing the message in a database. The method further includes merging the message with a first key-value pair stored in the database, the first key-value pair comprising the first key. The method further includes performing the first transaction based on merging the message with the first key-value pair.
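
    Sketch: The general idea of deferring work until a stored message is merged with its key-value pair can be shown with a toy store. This is a simplification and not the claimed method: the transaction is modeled as a plain function applied to the current value, and `MergeStore`, `Message`, and the reference-count example are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Message:
    key: str
    transaction: Callable[[int], int]  # work to perform when the merge happens


class MergeStore:
    def __init__(self):
        self.pairs: Dict[str, int] = {}
        self.pending: List[Message] = []

    def put(self, key: str, value: int) -> None:
        self.pairs[key] = value

    def post(self, msg: Message) -> None:
        """Store the message without reading the current value."""
        self.pending.append(msg)

    def merge(self) -> None:
        """Merge each stored message with its key-value pair, running its transaction."""
        for msg in self.pending:
            self.pairs[msg.key] = msg.transaction(self.pairs.get(msg.key, 0))
        self.pending.clear()


store = MergeStore()
store.put("refcount:42", 1)
store.post(Message("refcount:42", lambda v: v + 1))
store.post(Message("refcount:42", lambda v: v + 1))
before = store.pairs["refcount:42"]   # messages are stored but not yet applied
store.merge()
after = store.pairs["refcount:42"]
```

    Posting a message avoids a read-modify-write at update time; the cost is paid later, when the merge runs.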

    Log-structured formats for managing archived storage of objects

    Publication No.: US11436102B2

    Publication date: 2022-09-06

    Application No.: US16998060

    Filing date: 2020-08-20

    Applicant: VMware, Inc.

    Abstract: Solutions for managing archived storage include receiving, at a first node, a snapshot comprising object data (e.g., a virtual machine disk snapshot) from a second node (e.g., a software defined data center), and storing the snapshot in a tiered structure that includes a data tier and a metadata tier. Snapshots may be used for fail-over operations and/or backups, to support disaster recovery. The data tier comprises a log-structured file system (LFS), and the metadata tier comprises a content addressable storage (CAS) identifying addresses within the LFS. The metadata tier also comprises a logical layer indicating content in the CAS. Segment cleaning of the data tier is performed using a segment usage table (SUT). Some examples include performing a fail-over operation from the second node to a third node using at least the stored snapshot for workload recovery. In some examples, the CAS comprises a log-structured merge-tree (LSM-tree).
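
    Sketch: The relationship between the data tier, the CAS, the logical layer, and the segment usage table can be illustrated with in-memory stand-ins. Everything here is an assumption for illustration: the LFS is an append-only list, the CAS is a dictionary rather than an LSM-tree, and the segment size and method names are hypothetical.

```python
import hashlib

SEGMENT_BLOCKS = 4  # blocks per LFS segment (hypothetical)


class Archive:
    def __init__(self):
        self.lfs = []       # data tier: append-only log, address = list index
        self.cas = {}       # metadata tier: content hash -> LFS address
        self.logical = {}   # logical layer: (snapshot id, LBA) -> content hash
        self.sut = {}       # segment usage table: segment id -> live block count

    def store_snapshot(self, snap_id, blocks: dict) -> None:
        for lba, data in blocks.items():
            h = hashlib.sha256(data).hexdigest()
            if h not in self.cas:                 # new content: append to the log
                addr = len(self.lfs)
                self.lfs.append(data)
                self.cas[h] = addr
                seg = addr // SEGMENT_BLOCKS
                self.sut[seg] = self.sut.get(seg, 0) + 1
            self.logical[(snap_id, lba)] = h

    def read(self, snap_id, lba) -> bytes:
        return self.lfs[self.cas[self.logical[(snap_id, lba)]]]

    def delete_snapshot(self, snap_id) -> None:
        for key in [k for k in self.logical if k[0] == snap_id]:
            h = self.logical.pop(key)
            if h not in self.logical.values():    # no snapshot references it anymore
                addr = self.cas.pop(h)
                self.sut[addr // SEGMENT_BLOCKS] -= 1

    def cleaning_candidate(self) -> int:
        """Segment with the fewest live blocks: cheapest to clean."""
        return min(self.sut, key=self.sut.get)


archive = Archive()
archive.store_snapshot("s1", {0: b"a", 1: b"b", 2: b"c", 3: b"d"})
archive.store_snapshot("s2", {0: b"a", 1: b"x", 2: b"y"})
archive.delete_snapshot("s1")
```

    Block `b"a"` is stored once and shared by both snapshots, so deleting `s1` leaves it live while the other three blocks in segment 0 become dead.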

    Upgrading on-disk format without service interruption

    Publication No.: US11334482B2

    Publication date: 2022-05-17

    Application No.: US16933162

    Filing date: 2020-07-20

    Applicant: VMware, Inc.

    Abstract: A logical map represents fragments from separate versions of a data object. Migration of data from a first (old) version to the second (new) version happens gradually, where write operations go to the new version of the data object. The logical map initially points to the old data object, but is updated to point to the portions of the new data object as write operations are performed on the new data object. A background migration copies data from the old data object to the new data object.
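
    Sketch: The gradual switch from the old version to the new one can be illustrated with a per-block logical map. The class and method names are hypothetical, and the per-block granularity is an assumption; the abstract speaks of fragments without fixing their size.

```python
class MigratingObject:
    def __init__(self, old_blocks: dict):
        self.old = dict(old_blocks)      # data object in the old on-disk format
        self.new = {}                    # data object in the new on-disk format
        # The logical map initially points every block at the old object.
        self.logical_map = {lba: "old" for lba in self.old}

    def write(self, lba: int, data: bytes) -> None:
        """Writes always go to the new version; the map is updated to match."""
        self.new[lba] = data
        self.logical_map[lba] = "new"

    def read(self, lba: int) -> bytes:
        source = self.new if self.logical_map[lba] == "new" else self.old
        return source[lba]

    def migrate_step(self) -> bool:
        """Background migration: copy one remaining block; False when done."""
        for lba, version in self.logical_map.items():
            if version == "old":
                self.new[lba] = self.old[lba]
                self.logical_map[lba] = "new"
                return True
        return False


obj = MigratingObject({0: b"a", 1: b"b", 2: b"c"})
obj.write(1, b"B")                       # foreground write during the upgrade
mid_upgrade = [obj.read(i) for i in range(3)]
while obj.migrate_step():
    pass
after_upgrade = [obj.read(i) for i in range(3)]
```

    Reads return the same data before, during, and after the migration, which is what allows the upgrade to proceed without interrupting service.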

    Using segment pre-allocation to support large segments

    Publication No.: US11334276B2

    Publication date: 2022-05-17

    Application No.: US16842635

    Filing date: 2020-04-07

    Applicant: VMware, Inc.

    Abstract: Techniques for supporting large segments when issuing writes to an erasure coded storage object in a distributed storage system are provided. In one set of embodiments, a node of the system can pre-allocate a segment of space in a capacity object of the storage object, receive a write request for updating a logical data block of the storage object, write data/metadata for the block to a record in a data log of a metadata object of the storage object, place the block in an in-memory bank, and determine whether the in-memory bank has become full. If so, the node can compute/fill-in one or more parity blocks for each stripe of the storage object in the in-memory bank and write, based on a next sub-segment pointer pointing to a free sub-segment of the pre-allocated segment, the contents of the in-memory bank via a full stripe write to the free sub-segment.
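
    Sketch: The write path in this abstract (log the block, place it in the bank, flush a full bank to the next free sub-segment) can be illustrated as below. The 3+1 stripe layout, bank and segment sizes, and the use of XOR parity as the erasure code are assumptions; handling of a fully consumed segment is omitted.

```python
from functools import reduce

DATA_BLOCKS_PER_STRIPE = 3   # hypothetical 3 data + 1 parity layout
STRIPES_PER_BANK = 2         # hypothetical bank size
SUB_SEGMENTS = 4             # sub-segments in the pre-allocated segment


def xor_parity(blocks):
    """Byte-wise XOR of equally sized blocks (stand-in for the erasure code)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


class Writer:
    def __init__(self):
        self.segment = [None] * SUB_SEGMENTS   # pre-allocated segment in the capacity object
        self.next_sub_segment = 0              # pointer to the next free sub-segment
        self.data_log = []                     # data log of the metadata object
        self.bank = []                         # in-memory bank

    def write(self, lba: int, data: bytes) -> None:
        self.data_log.append((lba, data))      # log data/metadata first
        self.bank.append(data)
        if len(self.bank) == DATA_BLOCKS_PER_STRIPE * STRIPES_PER_BANK:
            self._flush()

    def _flush(self) -> None:
        stripes = []
        for i in range(0, len(self.bank), DATA_BLOCKS_PER_STRIPE):
            data = self.bank[i:i + DATA_BLOCKS_PER_STRIPE]
            stripes.append(data + [xor_parity(data)])   # fill in the parity block
        # Full-stripe write of the whole bank to the free sub-segment.
        self.segment[self.next_sub_segment] = stripes
        self.next_sub_segment += 1
        self.bank = []


writer = Writer()
for n in range(1, 7):
    writer.write(lba=n, data=bytes([n] * 4))
```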
