-
Publication No.: US20220121365A1
Publication Date: 2022-04-21
Application No.: US17072904
Filing Date: 2020-10-16
Applicant: VMware, Inc.
Inventor: Wenguang WANG , Vamsidhar GUNTURU , Junlong GAO , Ilya LANGUEV , Petr VANDROVEC , Maxime AUSTRUY , Ilia SOKOLINSKI , Satish PUDI
Abstract: Techniques for more efficiently storing data objects in the object storage of a software-defined data center (SDDC) are provided. The techniques store data efficiently while enabling a snapshot of each update to the data. The snapshots may be efficiently recovered via the techniques. Difference-level mappings for each snapshot are encoded in compact self-balancing data trees included in the object's metadata. The metadata mappings include mappings between the various address spaces employed by the SDDC, as well as the address spaces employed by the data stores that store the data on physical media. Because the metadata is efficiently structured, the metadata for an object may be cached for quick lookups during data access and/or snapshot recovery. The techniques also provide low-latency recovery and/or system rollback in the event of any failure in the SDDC.
-
Publication No.: US20210382858A1
Publication Date: 2021-12-09
Application No.: US16894663
Filing Date: 2020-06-05
Applicant: VMware, Inc.
Inventor: Wenguang WANG , Vamsidhar GUNTURU , Eric KNAUFT , Pascal RENAULD
IPC: G06F16/18 , G06F16/182 , G06F16/17 , G06F16/22 , G06F16/2457 , G06F12/02 , G06F12/0817
Abstract: Techniques for efficiently storing client data blocks on a distributed-computing system are provided. The system includes a fast performance tier and a large capacity tier. The capacity tier stores the client data blocks in erasure-encoded data stripes. The performance tier stores logical map data, including an address map indicating a correspondence between logical addresses associated with a first layer of the system and physical addresses associated with a second layer. A method includes receiving a request to include additional client data blocks in the client data blocks. The request indicates logical addresses for the additional blocks. Corresponding physical addresses for the additional blocks are determined. Each additional block is stored at its corresponding physical address. Additional logical map data is stored in the performance tier. Storing the additional logical map data includes updating the address map to indicate the correspondence between the logical addresses and the physical addresses for the additional blocks.
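Illustrative sketch (not taken from the publication): the write path in this abstract can be pictured as an address map that is updated whenever additional blocks are stored. The class name `LogicalMap`, the append-only allocator, and the in-memory dictionaries standing in for the two tiers are all assumptions made for illustration; erasure coding of the stripes is omitted.

```python
class LogicalMap:
    """Toy address map: logical block address -> physical block address."""

    def __init__(self):
        self.address_map = {}    # stands in for logical map data in the performance tier
        self.capacity_tier = {}  # stands in for the capacity tier: physical address -> data
        self.next_physical = 0   # hypothetical append-only physical allocator

    def write_blocks(self, blocks):
        """Store additional blocks, given as {logical_address: bytes}."""
        for logical, data in blocks.items():
            physical = self.next_physical       # determine the physical address
            self.next_physical += 1
            self.capacity_tier[physical] = data  # store the block at that address
            self.address_map[logical] = physical  # record the correspondence

    def read(self, logical):
        """Resolve a logical address through the map and return the block."""
        return self.capacity_tier[self.address_map[logical]]
```

A read goes through the map first, so a block can be moved in the capacity tier by changing only its map entry.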
-
Publication No.: US20210349793A1
Publication Date: 2021-11-11
Application No.: US16870861
Filing Date: 2020-05-08
Applicant: VMware, Inc.
Inventor: Wenguang WANG , Vamsi GUNTURU , Enning XIANG , Eric KNAUFT
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for resynchronizing data in a storage system. One of the methods includes determining that a particular disk of a capacity object of a storage system was offline for an interval of time, wherein the capacity object comprises a plurality of segments, and wherein the storage system comprises a segment usage table identifying a linked list of particular segments of the capacity object that are in use; determining a time point at which the particular disk went offline; determining one or more first segments of the capacity object that were modified after the time point, wherein determining one or more first segments comprises determining each segment of the segment usage table having a transaction ID that is larger than the time point; and resynchronizing, for each first segment, a portion of the particular disk corresponding to the first segment.
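Illustrative sketch (not taken from the publication): selecting the segments to resynchronize amounts to filtering the segment usage table by transaction ID. The `SegmentEntry` type and the use of a plain Python list in place of the linked list of in-use segments are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class SegmentEntry:
    segment_id: int
    txn_id: int  # transaction ID of the segment's most recent modification


def segments_to_resync(segment_usage_table, offline_txn_id):
    """Return IDs of segments modified after the disk went offline.

    offline_txn_id is the transaction ID standing in for the time point
    at which the disk went offline.
    """
    return [e.segment_id for e in segment_usage_table if e.txn_id > offline_txn_id]
```

Only the returned segments need to be copied to the returning disk; segments with older transaction IDs are already up to date on it.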
-
Publication No.: US20210349790A1
Publication Date: 2021-11-11
Application No.: US16870801
Filing Date: 2020-05-08
Applicant: VMware, Inc.
Inventor: Wenguang WANG , Enning XIANG , Vamsi GUNTURU , Eric KNAUFT , Pascal RENAULD
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for resynchronizing data in a storage system. One of the methods includes determining that a particular primary disk of a capacity object of a storage system has failed, wherein the capacity object comprises a plurality of segments, and wherein each segment comprises: a plurality of primary columns each corresponding to a respective primary disk of the capacity object, and a plurality of parity columns each corresponding to a respective parity disk of the capacity object; and resynchronizing, for each segment of one or more segments of the capacity object, the primary column of the segment corresponding to the particular primary disk using i) the primary columns of the segment corresponding to each other primary disk of the capacity object, ii) one or more parity columns of the segment, and iii) the column summaries of the segment.
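Illustrative sketch (not taken from the publication): rebuilding a lost primary column from the surviving columns can be shown with a single XOR parity column. This is a deliberate simplification; the abstract describes multiple parity columns and column summaries, neither of which is modeled here, and the function names are hypothetical.

```python
from functools import reduce


def xor_columns(columns):
    """Byte-wise XOR of equally sized columns (each a bytes object)."""
    return bytes(reduce(lambda a, b: a ^ b, group) for group in zip(*columns))


def rebuild_column(surviving_primary, parity):
    """Recover the missing primary column of one segment.

    With XOR parity, the missing column equals the XOR of the parity
    column and every surviving primary column.
    """
    return xor_columns(surviving_primary + [parity])
```

Running `rebuild_column` once per affected segment reconstructs the failed disk's contents segment by segment.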
-
Publication No.: US20210294502A1
Publication Date: 2021-09-23
Application No.: US16827692
Filing Date: 2020-03-23
Applicant: VMware, Inc.
Inventor: Wenguang WANG , Eric KNAUFT , Vamsi GUNTURU , Pascal RENAULD
Abstract: A method for encrypting data in one or more data blocks is provided. The method receives a first data block to be written to a physical storage that includes one or more physical disks. The method applies a first random tweak to data indicative of the first data block to generate a first encrypted data block, and writes the first encrypted data block and the first random tweak to a first physical block of the physical storage. The method receives a second data block to be written to the physical storage. The method then applies a second random tweak, different than the first random tweak, to data indicative of the second data block to generate a second encrypted data block, and writes the second encrypted data block and the second random tweak to a second physical block of the physical storage.
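Illustrative sketch (not taken from the publication): the structural idea is that every write draws a fresh random tweak and stores it next to the ciphertext, so identical plaintext blocks produce different ciphertext. The cipher below is a toy SHA-256 keystream chosen only to keep the example dependency-free; it is an assumption, not the cipher used in the publication, and it is not suitable for real encryption. All function names are hypothetical.

```python
import hashlib
import os


def _keystream(key, tweak, length):
    """Toy keystream derived from key, tweak, and a counter (NOT secure)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + tweak + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]


def encrypt_block(key, block):
    """Encrypt one block under a fresh random tweak; return (tweak, ciphertext)."""
    tweak = os.urandom(16)  # new random tweak for every write
    stream = _keystream(key, tweak, len(block))
    return tweak, bytes(a ^ b for a, b in zip(block, stream))


def decrypt_block(key, tweak, ciphertext):
    """Decrypt using the tweak that was stored alongside the ciphertext."""
    stream = _keystream(key, tweak, len(ciphertext))
    return bytes(a ^ b for a, b in zip(ciphertext, stream))
```

Because the tweak is persisted with the block, the reader needs only the key and the physical block contents to decrypt.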
-
Publication No.: US20210294499A1
Publication Date: 2021-09-23
Application No.: US16827618
Filing Date: 2020-03-23
Applicant: VMware, Inc.
Inventor: Wenguang WANG , Vamsi GUNTURU
IPC: G06F3/06
Abstract: A method for performing write operations on a set of one or more physical disks of a set of one or more host machines is provided. The method receives a data block to write on at least one physical disk in the set of physical disks and generates a first set of one or more compressed sectors based on the received data block. The method writes (i) a first entry having a first header and the first set of compressed sectors to a data log that is maintained in a cache, and (ii) the first set of compressed sectors to a bank in memory. The method further determines if a size of data including compressed sectors in the bank satisfies a threshold, and when the size of data in the bank satisfies the threshold, writes the data to the at least one physical disk in the set of physical disks.
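Illustrative sketch (not taken from the publication): the write path can be pictured as compress, log, stage in a bank, and flush once the bank reaches a threshold. The sector size, the threshold, the header fields, and the `WriteBuffer` class are assumptions made for illustration; `zlib` stands in for whatever compressor the system uses.

```python
import zlib

SECTOR_SIZE = 512       # assumed sector size in bytes
FLUSH_THRESHOLD = 4096  # assumed bank size that triggers a flush


class WriteBuffer:
    def __init__(self):
        self.data_log = []  # stands in for the data log kept in the cache
        self.bank = []      # in-memory bank of compressed sectors
        self.disk = []      # stands in for the physical disk

    def write_block(self, lba, block):
        compressed = zlib.compress(block)
        # Round the compressed payload up to a whole number of sectors.
        padded_len = -(-len(compressed) // SECTOR_SIZE) * SECTOR_SIZE
        sectors = compressed.ljust(padded_len, b"\0")
        header = {"lba": lba, "length": len(compressed)}
        self.data_log.append((header, sectors))  # (i) log entry: header + sectors
        self.bank.append(sectors)                # (ii) stage sectors in the bank
        if sum(len(s) for s in self.bank) >= FLUSH_THRESHOLD:
            self.disk.append(b"".join(self.bank))  # one large write to disk
            self.bank.clear()
```

Batching compressed sectors this way turns many small writes into fewer, larger ones, while the log entry preserves each write until the bank is flushed.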
-
Publication No.: US20210117443A1
Publication Date: 2021-04-22
Application No.: US16658172
Filing Date: 2019-10-21
Applicant: VMware, Inc.
Inventor: Haoran ZHENG , Wenguang WANG , Tao XIE , Yizheng CHEN
Abstract: A distributed storage system, such as a distributed storage system in a virtualized computing environment, stores data in storage nodes as immutable key-value entries. A coordinator storage node creates a key-value entry and attempts to store the key-value entry in the coordinator storage node and in neighbor storage nodes. If the storage of the key-value entry in the coordinator storage node and in the neighbor storage nodes is successful, the coordinator storage node pushes the key-value entry to other storage nodes in the distributed storage system for storage as replicas.
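Illustrative sketch (not taken from the publication): the two-phase flow is to store on the coordinator and its neighbors first, and to push to the remaining nodes only if that succeeds. The `StorageNode` class, the `healthy` flag used to simulate failures, and the absence of any rollback after a partial first phase are assumptions made for illustration.

```python
class StorageNode:
    def __init__(self, name, healthy=True):
        self.name = name
        self.healthy = healthy  # hypothetical flag to simulate a failed node
        self.store = {}

    def put(self, key, value):
        """Store an immutable entry; an existing key is never overwritten."""
        if not self.healthy:
            return False
        self.store.setdefault(key, value)
        return True


def coordinated_put(coordinator, neighbors, others, key, value):
    """Return True once the entry is stored on the coordinator and neighbors."""
    first_wave = [coordinator] + neighbors
    if not all(node.put(key, value) for node in first_wave):
        return False  # no push to other nodes; cleanup is not modeled here
    for node in others:
        node.put(key, value)  # push replicas to the rest of the system
    return True
```

Because entries are immutable, a replica that receives the same key twice can safely ignore the second copy, which is why `put` uses `setdefault`.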
-
Publication No.: US20210064581A1
Publication Date: 2021-03-04
Application No.: US16552976
Filing Date: 2019-08-27
Applicant: VMware, Inc.
Inventor: Wenguang WANG , Junlong GAO , Marcos K. AGUILERA , Richard P. SPILLANE , Christos KARAMANOLIS , Maxime AUSTRUY
IPC: G06F16/174 , G06F16/13 , G06F16/172
Abstract: The present disclosure provides techniques for deduplicating files. The techniques include creating a cache or subset of a large data structure. The large data structure organizes information by random hash values. The random hash values result in a random organization of information within the data structure, with the information spanning a large number of storage blocks within a storage system. The cache, however, is within memory and is small relative to the data structure. The cache is created so as to contain information that is likely to be needed during deduplication of a file. Having needed information within memory rather than in storage results in faster read and write operations to that information, improving the performance of a computing system.
-
Publication No.: US20210064579A1
Publication Date: 2021-03-04
Application No.: US16552908
Filing Date: 2019-08-27
Applicant: VMware, Inc.
Inventor: Wenguang WANG , Junlong GAO , Marcos K. AGUILERA , Richard P. SPILLANE , Christos KARAMANOLIS , Maxime AUSTRUY
IPC: G06F16/174 , G06F16/14
Abstract: Disclosed techniques include deduplication. Techniques include determining whether a file is unique, and depending on whether the file is unique, deduplicating only part of the file or the entire file. The techniques include processing the first chunk of a file to determine whether the hash of the chunk is already within a chunk hash table, and if not, then a percentage of the chunks of the file is similarly processed. If any of the hashes of the chunks are already in the chunk hash table, then at least some of the file has been previously deduplicated, and the file is not unique within the storage system. If none of the processed chunks have a hash that is already in the chunk hash table, then the file is considered to be unique within the chunk store and only a partial percentage of the file's chunks are deduplicated. Not all of a unique file's chunks are deduplicated.
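Illustrative sketch (not taken from the publication): the uniqueness test checks the first chunk, then a sample of the remaining chunks, against the chunk hash table. Fixed-size chunking, SHA-256, the `sample_every` stride, and the use of a Python set as the chunk hash table are assumptions made for illustration.

```python
import hashlib


def chunk_hashes(data, chunk_size):
    """Hash each fixed-size chunk of the file contents."""
    return [
        hashlib.sha256(data[i:i + chunk_size]).hexdigest()
        for i in range(0, len(data), chunk_size)
    ]


def file_is_unique(data, chunk_hash_table, chunk_size=4, sample_every=2):
    """Return False as soon as a sampled chunk hash is already known."""
    hashes = chunk_hashes(data, chunk_size)
    if not hashes:
        return True
    if hashes[0] in chunk_hash_table:  # check the first chunk
        return False
    sampled = hashes[1::sample_every]  # then only a fraction of the rest
    return not any(h in chunk_hash_table for h in sampled)
```

A caller would fully deduplicate a file for which this returns `False`, and deduplicate only a sampled subset of chunks for a file that appears unique.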
-
Publication No.: US20210064522A1
Publication Date: 2021-03-04
Application No.: US16552954
Filing Date: 2019-08-27
Applicant: VMware, Inc.
Inventor: Wenguang WANG , Junlong GAO , Marcos K. AGUILERA , Richard P. SPILLANE , Christos KARAMANOLIS , Maxime AUSTRUY
Abstract: The present disclosure provides techniques for deallocating previously allocated storage blocks. The techniques include obtaining a list of chunk IDs to analyze, choosing a chunk ID, and determining the storage blocks spanned by the chunk corresponding to the chosen chunk ID. The techniques further include determining whether any file references any storage blocks spanned by the chunk. The determining may be performed by comparing an internal reference count to a total reference count, where the internal reference count is the number of references to the storage block by a chunk ID data structure. If no files reference any of the storage blocks spanned by the chunk, then all the storage blocks of the chunk can be deallocated.
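Illustrative sketch (not taken from the publication): the reference-count comparison reduces to checking, for every block a chunk spans, that the total reference count equals the internal count, meaning no file holds a reference. The function name and the dictionaries mapping block IDs to counts are assumptions made for illustration.

```python
def chunk_can_be_deallocated(block_ids, total_refs, internal_refs):
    """Return True if no file references any block spanned by the chunk.

    total_refs[b]    : all references to block b
    internal_refs[b] : references to block b from the chunk ID data structure
    A block is referenced by a file exactly when total_refs[b] > internal_refs[b].
    """
    return all(total_refs[b] == internal_refs[b] for b in block_ids)
```

If the check fails for even one block, the chunk is kept, since a file still depends on part of it.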