Storing data in a log-structured format in a two-tier storage system

    Publication number: US11803469B2

    Publication date: 2023-10-31

    Application number: US17410673

    Filing date: 2021-08-24

    Applicant: VMware, Inc.

    CPC classification number: G06F12/0804 G06F12/1009 G06F16/2246 G06F2212/1032

    Abstract: The disclosure herein describes storing data using a capacity data storage tier and a smaller performance data storage tier. The capacity data storage tier includes capacity data storage hardware configured to store log-structured leaf pages (LLPs), and the performance data storage tier includes performance data storage hardware. A virtual address table (VAT) includes a set of virtual address entries referencing the LLPs. A tree-structured index includes index nodes referencing the set of virtual address entries of the VAT. Data to be stored is received, and at least a first portion of metadata associated with the received data is stored in the LLPs using the VAT, and at least a second portion of metadata associated with the received data is stored in the performance data storage tier. The architecture reduces space usage of the performance data storage tier.
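
    Illustrative sketch: the indirection described in the abstract can be pictured as an index that points at VAT entries rather than at physical locations, so rewriting a leaf page only repoints one VAT entry. The code below is a minimal sketch under stated assumptions, not the patented implementation. The names `CapacityTier`, `TwoTierStore`, `put`, and `get` are hypothetical, a plain dict stands in for the tree-structured index, and a Python list stands in for the log on the capacity tier.

```python
from dataclasses import dataclass, field


@dataclass
class CapacityTier:
    """Append-only log standing in for the capacity tier (hypothetical)."""
    log: list = field(default_factory=list)

    def append(self, page: dict) -> int:
        self.log.append(dict(page))
        return len(self.log) - 1  # physical offset of the new leaf page


@dataclass
class TwoTierStore:
    capacity: CapacityTier = field(default_factory=CapacityTier)
    vat: dict = field(default_factory=dict)    # virtual address -> log offset
    index: dict = field(default_factory=dict)  # key -> virtual address (stand-in for the tree index)
    next_vaddr: int = 0

    def put(self, key: str, value: str) -> None:
        vaddr = self.index.get(key)
        if vaddr is None:
            # First write of this key: allocate a new virtual address.
            vaddr = self.next_vaddr
            self.next_vaddr += 1
            self.index[key] = vaddr
        # Log-structured write: append a new page, then repoint the VAT entry.
        self.vat[vaddr] = self.capacity.append({"key": key, "value": value})

    def get(self, key: str) -> str:
        return self.capacity.log[self.vat[self.index[key]]]["value"]
```

    In this sketch, writing the same key twice appends two pages to the log but leaves the index entry unchanged; only the VAT entry moves to the newer page.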

Micro-batching metadata updates to reduce transaction journal overhead during snapshot deletion

    Publication number: US11797214B2

    Publication date: 2023-10-24

    Application number: US17646993

    Filing date: 2022-01-04

    Applicant: VMware, Inc.

    CPC classification number: G06F3/0652 G06F3/064 G06F3/0604 G06F3/0679

    Abstract: A method for deleting one or more snapshots using micro-batch processing is provided. Metadata for the snapshots is stored in one or more logical maps, whose logical map extents map logical block addresses (LBAs) to middle block addresses (MBAs), and in a middle map, whose middle map extents map MBAs to physical block addresses (PBAs) of the physical locations where data blocks are written. The method includes receiving a request to delete the one or more snapshots; identifying one or more middle map extents exclusively owned by the snapshots requested to be deleted; adding the MBAs of the identified middle map extents to a batch; determining a first micro-batch that includes a first subset of the MBAs in the batch, namely the MBAs less than a first upper bound MBA; and using a first transaction to delete the middle map extents corresponding to the first subset of MBAs in the first micro-batch.
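
    Illustrative sketch: the micro-batching step can be pictured as slicing a sorted batch of MBAs at successive upper bounds and handling each slice in its own transaction. The code below is a minimal sketch under stated assumptions, not the patented method. The function names and the rule for choosing the bound (lowest pending MBA plus a fixed `span`) are hypothetical, and a dict stands in for the middle map.

```python
def micro_batches(mbas, span):
    """Yield (upper_bound, batch) pairs; each batch holds the MBAs below its bound."""
    if span <= 0:
        raise ValueError("span must be positive")
    pending = sorted(mbas)
    i = 0
    while i < len(pending):
        upper_bound = pending[i] + span  # hypothetical rule for picking the bound
        batch = []
        while i < len(pending) and pending[i] < upper_bound:
            batch.append(pending[i])
            i += 1
        yield upper_bound, batch


def delete_extents(middle_map, mbas, span):
    """Delete middle map extents, counting one simulated transaction per micro-batch."""
    transactions = 0
    for _, batch in micro_batches(mbas, span):
        for mba in batch:  # a real system would commit these deletes as one transaction
            middle_map.pop(mba, None)
        transactions += 1
    return transactions
```

    Grouping MBAs that are close together is the design choice illustrated here: each simulated transaction touches a narrow range of the middle map instead of keys scattered across it.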

    Probabilistic algorithm to check whether a file is unique for deduplication

    Publication number: US11669495B2

    Publication date: 2023-06-06

    Application number: US16552908

    Filing date: 2019-08-27

    Applicant: VMware, Inc.

    CPC classification number: G06F16/1752 G06F16/152

    Abstract: Disclosed techniques include deduplication. The techniques include determining whether a file is unique and, depending on the result, deduplicating either only part of the file or the entire file. The first chunk of a file is processed to determine whether the hash of the chunk is already in a chunk hash table; if it is not, a percentage of the file's chunks is processed in the same way. If any of the chunk hashes is already in the chunk hash table, then at least some of the file has been previously deduplicated, and the file is not unique within the storage system. If none of the processed chunks has a hash that is already in the chunk hash table, then the file is considered unique within the chunk store and only a partial percentage of the file's chunks is deduplicated. Not all of a unique file's chunks are deduplicated.
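
    Illustrative sketch: the uniqueness check can be pictured as hashing the first chunk plus a sample of the remaining chunks and looking each hash up in the chunk hash table. The code below is a minimal sketch under stated assumptions, not the patented algorithm. The function names, the fixed-size chunking, the use of SHA-256, and the deterministic every-Nth sampling are all hypothetical choices, and a Python set stands in for the chunk hash table.

```python
import hashlib


def chunk_hashes(data: bytes, chunk_size: int):
    """Split data into fixed-size chunks and return one SHA-256 hex digest per chunk."""
    return [hashlib.sha256(data[i:i + chunk_size]).hexdigest()
            for i in range(0, len(data), chunk_size)]


def looks_unique(data: bytes, chunk_table: set, chunk_size: int = 4, sample_every: int = 2) -> bool:
    """Return True if no sampled chunk hash is already present in chunk_table."""
    hashes = chunk_hashes(data, chunk_size)
    if not hashes:
        return True
    # The first chunk is always checked, followed by every Nth remaining chunk.
    sampled = hashes[:1] + hashes[1::sample_every]
    # any() stops at the first hit, so a match on the first chunk ends the check early.
    return not any(h in chunk_table for h in sampled)
```

    Because only a sample is checked, the answer is probabilistic: a file whose single duplicate chunk falls outside the sample is still reported as unique.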

    Failure analysis system for a distributed storage system

    Publication number: US11599435B2

    Publication date: 2023-03-07

    Application number: US16540080

    Filing date: 2019-08-14

    Applicant: VMware, Inc.

    Abstract: A failure analysis system identifies a root cause of a failure (or other health issue) in a virtualized computing environment and provides a recommendation for remediation. The failure analysis system uses a model-based reasoning (MBR) approach that involves building a model describing the relationships/dependencies of elements in the various layers of the virtualized computing environment, and the model is used by an inference engine to generate facts and rules for reasoning to identify an element in the virtualized computing environment that is causing the failure. Then the failure analysis system uses a decision tree analysis (DTA) approach to perform a deep diagnosis of the element, by traversing a decision tree that was generated by combining the rules for reasoning provided by the MBR approach, in conjunction with examining data collected by health monitors. The result of the DTA approach is then used to generate the recommendation for remediation.
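
    Illustrative sketch: the two stages can be pictured as first narrowing the failure to an element using a dependency model, then walking a decision tree against health monitor data to reach a recommendation. The code below is a heavily simplified sketch under stated assumptions, not the patented system. It replaces the inference engine with a single hard-coded rule (an unhealthy element whose dependencies are all healthy is a root cause candidate), the decision tree is written by hand rather than generated from rules, and every name and data value is hypothetical.

```python
def root_cause(depends_on, unhealthy):
    """Return the unhealthy elements whose own dependencies are all healthy."""
    return sorted(e for e in unhealthy
                  if not any(d in unhealthy for d in depends_on.get(e, [])))


def diagnose(node, monitor_data):
    """Walk a decision tree until a leaf (a recommendation string) is reached."""
    while isinstance(node, dict):
        # Each inner node names one monitor reading to check; missing readings count as False.
        branch = "yes" if monitor_data.get(node["check"], False) else "no"
        node = node[branch]
    return node


# Hypothetical dependency model: a VM depends on a datastore, which depends on a disk.
DEPENDS_ON = {"vm": ["datastore"], "datastore": ["disk"], "disk": []}

# Hypothetical decision tree for diagnosing a disk.
DISK_TREE = {
    "check": "smart_errors",
    "yes": "replace disk",
    "no": {"check": "latency_high", "yes": "rebalance", "no": "no action"},
}
```

    With all three elements reported unhealthy, `root_cause(DEPENDS_ON, {"vm", "datastore", "disk"})` narrows the failure to the disk, and `diagnose(DISK_TREE, ...)` then maps the disk's monitor readings to a recommendation.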

Embedded reference counts for file clones

    Publication number: US20230028391A1

    Publication date: 2023-01-26

    Application number: US17960023

    Filing date: 2022-10-04

    Applicant: VMware, Inc.

    Abstract: Techniques for efficiently managing a file clone from a filesystem that supports efficient volume snapshots are provided. In some embodiments, a system may receive an instruction to remove the file clone from the filesystem. The file clone may be a point-in-time copy of metadata of an original file. For a file map entry in a filesystem tree associated with the file clone, where the file map entry indicates a data block, the system may further decrement a reference count in a reference count entry associated with the file map entry. The reference count entry may be stored in the filesystem tree according to a key, and the key may comprise an identification of the original file. The system may further reclaim the data block in a storage system when the reference count is zero.
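
    Illustrative sketch: clone removal can be pictured as walking the clone's file map entries, decrementing the matching reference count entries that live in the same tree, and reclaiming any block whose count reaches zero. The code below is a minimal sketch under stated assumptions, not the patented implementation. A dict with tuple keys stands in for the filesystem tree, and the key layouts `("map", file_id, offset)` and `("ref", original_id, offset)` are hypothetical.

```python
def remove_clone(tree, free_blocks, clone_id, original_id):
    """Drop a clone's file map entries and reclaim blocks that nothing references."""
    # Collect the keys first so the dict is not mutated while it is being iterated.
    map_keys = [k for k in tree if k[0] == "map" and k[1] == clone_id]
    for key in map_keys:
        _, _, offset = key
        block = tree.pop(key)
        ref_key = ("ref", original_id, offset)  # the key embeds the original file's id
        tree[ref_key] -= 1
        if tree[ref_key] == 0:
            del tree[ref_key]
            free_blocks.add(block)
```

    Keying the reference count by the original file's id, as the abstract describes, keeps the count in the same tree as the file map entries it belongs to.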
