EFFICIENT WRITE-BACK FOR JOURNAL TRUNCATION
    Invention publication

    Publication No.: US20240078179A1

    Publication date: 2024-03-07

    Application No.: US17929197

    Filing date: 2022-09-01

    Applicant: VMware, Inc.

    IPC classification: G06F12/0804 G06F12/0882

    CPC classification: G06F12/0804 G06F12/0882

    Abstract: A method for efficient write-back for journal truncation is provided. The method includes maintaining, in a memory of a computing system, a journal including a plurality of records, where each record indicates a transaction associated with one or more pages in an ordered data structure. The method also includes maintaining a dirty list including an entry for each page indicated by a record in the journal. Each entry in the dirty list includes a respective first log sequence number (LSN) associated with a least recent record of the plurality of records that indicates the page and a respective second LSN associated with a most recent record of the plurality of records that indicates the page. The method includes determining to truncate the journal. The method includes identifying one or more records, of the plurality of records, from the journal to write back to a disk, where the identifying is based on the dirty list.
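
The dirty list described in this abstract can be illustrated with a minimal Python sketch. The class and method names (`Journal`, `DirtyEntry`, `pages_to_write_back`) are hypothetical. The selection rule, which picks every page whose first LSN is at or below the truncation point, is an assumption made for illustration, since the abstract does not state the exact rule.

```python
from dataclasses import dataclass


@dataclass
class DirtyEntry:
    first_lsn: int  # LSN of the least recent journal record that indicates the page
    last_lsn: int   # LSN of the most recent journal record that indicates the page


class Journal:
    """In-memory journal plus a dirty list keyed by page id (illustrative only)."""

    def __init__(self):
        self.records = {}   # lsn -> list of page ids touched by that transaction
        self.dirty = {}     # page id -> DirtyEntry
        self.next_lsn = 1

    def append(self, pages):
        """Log one transaction and update the dirty list for each page it touches."""
        lsn = self.next_lsn
        self.next_lsn += 1
        self.records[lsn] = list(pages)
        for page in pages:
            entry = self.dirty.get(page)
            if entry is None:
                self.dirty[page] = DirtyEntry(first_lsn=lsn, last_lsn=lsn)
            else:
                entry.last_lsn = lsn
        return lsn

    def pages_to_write_back(self, truncate_lsn):
        """Assumed rule: a page must reach disk before records up to
        truncate_lsn are dropped if its oldest logged change is among them."""
        return sorted(
            page for page, entry in self.dirty.items()
            if entry.first_lsn <= truncate_lsn
        )


journal = Journal()
journal.append([1, 2])   # LSN 1
journal.append([2, 3])   # LSN 2
journal.append([4])      # LSN 3
candidates = journal.pages_to_write_back(truncate_lsn=2)
```

Because the dirty list already holds the first and last LSN per page, the candidate set is found without rescanning the journal records themselves.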

    Fast algorithm to find file system difference for deduplication

    Publication No.: US11775484B2

    Publication date: 2023-10-03

    Application No.: US16552965

    Filing date: 2019-08-27

    Applicant: VMware, Inc.

    Abstract: The disclosure provides techniques for deduplicating files. The techniques include, upon creating or modifying a file, placing a logical timestamp of the current logical time within a queue associated with the directory of the file. The techniques further include placing the logical timestamp within a queue of each parent directory of the directory of the file. To determine a set of files for deduplication, the techniques disclosed herein identify files that have been modified within a logical time range. The set of files modified within the logical time range is identified by traversing directories of a storage system, the directories being organized within a tree structure. If a directory's queue does not contain a timestamp that is within the logical time range, then all of its child directories can be skipped during further processing, such that no files within the child directories end up in the set of files for deduplication.
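
The pruned traversal described in this abstract can be sketched in Python as follows. The names `Directory`, `touch`, and `modified_files` are hypothetical, and storing a per-file last-modification time is an assumption added so the sketch can decide which individual files fall in the range.

```python
class Directory:
    """A directory node holding a queue of logical timestamps (illustrative)."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []
        self.files = {}   # file name -> logical time of last modification (assumption)
        self.queue = []   # logical timestamps of changes anywhere in this subtree
        if parent is not None:
            parent.children.append(self)


def touch(directory, filename, logical_time):
    """Record a create/modify event in the directory and in every ancestor."""
    directory.files[filename] = logical_time
    node = directory
    while node is not None:
        node.queue.append(logical_time)
        node = node.parent


def modified_files(root, lo, hi):
    """Return files modified in [lo, hi], skipping subtrees with no timestamp in range."""
    result = []
    stack = [root]
    while stack:
        d = stack.pop()
        if not any(lo <= t <= hi for t in d.queue):
            continue  # nothing below this directory changed in the range
        for name, t in d.files.items():
            if lo <= t <= hi:
                result.append(f"{d.name}/{name}")
        stack.extend(d.children)
    return sorted(result)


root = Directory("root")
a = Directory("a", root)
b = Directory("b", root)
touch(a, "x", 1)
touch(b, "y", 5)
touch(a, "z", 6)
recent = modified_files(root, 4, 6)
```

A production queue would presumably be bounded or ordered so the range check is cheaper than the linear scan used here.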

    EFFICIENT JOURNAL LOG RECORD FOR COPY-ON-WRITE B+ TREE OPERATION

    Publication No.: US20230177069A1

    Publication date: 2023-06-08

    Application No.: US17643268

    Filing date: 2021-12-08

    Applicant: VMware, Inc.

    IPC classification: G06F16/27 G06F16/22

    CPC classification: G06F16/27 G06F16/2246

    Abstract: A method for copy-on-write (COW) operations generally includes receiving a write request to a first node in an ordered data structure and updating a write-ahead log (WAL) record associated with the COW operation with, instead of the content of the first node, a physical disk address of a second node owned by the run point in the ordered data structure that is a parent node of the first node, a pointer to the first node in the second node, a physical disk address of the first node, and a physical disk address of a third node. A metadata table record for a snapshot that owns the first node may be updated with a log sequence number (LSN) of the COW operation. A method for deleting a snapshot includes determining whether the COW operation recorded in the WAL record for the LSN is completed before deleting the snapshot.
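
The shape of such a log record, and the check made before deleting a snapshot, might look like the Python sketch below. All names (`CowLogRecord`, `SnapshotTable`, `completed_lsn`) are hypothetical. Treating the third node as the new copy of the first node, and tracking completion with a single high-water-mark LSN, are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CowLogRecord:
    """WAL record that stores addresses rather than the content of the copied node."""
    lsn: int
    parent_addr: int     # physical disk address of the parent (second) node
    child_slot: int      # which pointer in the parent refers to the first node
    old_node_addr: int   # physical disk address of the first node
    new_node_addr: int   # physical disk address of the third node (assumed: the copy)


class SnapshotTable:
    """Per-snapshot metadata holding the LSN of the latest COW on a node it owns."""

    def __init__(self):
        self.cow_lsn = {}        # snapshot id -> LSN of the recorded COW operation
        self.completed_lsn = 0   # assumption: COW operations complete in LSN order

    def record_cow(self, snapshot_id, record):
        self.cow_lsn[snapshot_id] = record.lsn

    def can_delete(self, snapshot_id):
        """A snapshot is safe to delete once its recorded COW operation has completed."""
        return self.cow_lsn.get(snapshot_id, 0) <= self.completed_lsn


table = SnapshotTable()
rec = CowLogRecord(lsn=7, parent_addr=4096, child_slot=2,
                   old_node_addr=8192, new_node_addr=12288)
table.record_cow("snap-1", rec)
before = table.can_delete("snap-1")   # COW for LSN 7 not yet complete
table.completed_lsn = 7
after = table.can_delete("snap-1")
```

The record holds four small integers plus the LSN, which is the point of logging addresses instead of node content.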

    Scalable segment cleaning for a log-structured file system

    Publication No.: US11494110B2

    Publication date: 2022-11-08

    Application No.: US16999897

    Filing date: 2020-08-21

    Applicant: VMware, Inc.

    IPC classification: G06F3/06 G06F16/17

    Abstract: Scalable segment cleaning for log-structured file systems (LFSs) includes determining counts of segment cleaners and virtual nodes, with each virtual node being associated with a plurality of objects. Each virtual node is assigned to a selected segment cleaner. Based at least on the assignments, segment cleaning of the objects is performed, for each virtual node, by the assigned segment cleaner. A portion, less than all, of the virtual nodes is reassigned to a newly selected segment cleaner based on a change of the count of the segment cleaners and/or a change of the count of the virtual nodes. Based at least on the reassignments, segment cleaning of the objects is performed, for each reassigned virtual node, by the reassigned segment cleaner. In some examples, the objects comprise virtual machine disks (VMDKs) and the segment cleaning uses a segment usage table (SUT) to track segment usage and identify segment cleaning candidates.
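
One way to obtain the property that only a portion of the virtual nodes moves when the cleaner count changes is rendezvous (highest-random-weight) hashing. The abstract does not name an assignment scheme, so the Python sketch below is a swapped-in technique for illustration, and the function names are hypothetical.

```python
import hashlib


def _score(vnode, cleaner):
    """Deterministic pseudo-random weight for a (virtual node, cleaner) pair."""
    digest = hashlib.sha256(f"{vnode}:{cleaner}".encode()).digest()
    return int.from_bytes(digest[:8], "big")


def assign(vnodes, cleaners):
    """Assign each virtual node to the cleaner with the highest weight for it."""
    return {v: max(cleaners, key=lambda c: _score(v, c)) for v in vnodes}


vnodes = list(range(64))
before = assign(vnodes, ["c0", "c1", "c2"])
after = assign(vnodes, ["c0", "c1", "c2", "c3"])

# Only virtual nodes whose best-scoring cleaner is now "c3" change owner.
moved = [v for v in vnodes if before[v] != after[v]]
```

With this scheme, adding a cleaner moves virtual nodes only onto the new cleaner, and the rest keep their existing assignment, so cleaning state for unaffected nodes does not need to be handed over.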

    Shrinking segment cleaning algorithm in an object storage

    Publication No.: US11435935B2

    Publication date: 2022-09-06

    Application No.: US17100663

    Filing date: 2020-11-20

    Applicant: VMware, Inc.

    IPC classification: G06F3/06

    Abstract: A method for cleaning an object storage having a plurality of segments is provided. Each segment includes an identifier through which the segment is accessed. The method identifies a first segment in the plurality of segments. The first segment includes a first identifier and a first size. The method determines that a utilization ratio for the first segment is below a threshold. As a result, the method generates a second segment from the first segment, such that the second segment includes a second identifier that is the same as the first identifier and a second size that is smaller than the first size. The method then writes the second segment to the object storage.
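
The shrinking step can be sketched in Python as below. The `Segment` layout (a list of blocks with parallel liveness flags), the dictionary standing in for the object storage, and the 0.5 default threshold are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    segment_id: str
    blocks: list   # payload blocks, each a bytes object
    live: list     # parallel flags: True if the block is still referenced

    @property
    def size(self):
        return sum(len(b) for b in self.blocks)

    def utilization(self):
        """Fraction of the segment's bytes that are still live."""
        total = self.size
        live_bytes = sum(len(b) for b, ok in zip(self.blocks, self.live) if ok)
        return live_bytes / total if total else 0.0


def shrink_if_sparse(store, segment_id, threshold=0.5):
    """Rewrite a sparse segment under the SAME identifier, keeping only live blocks."""
    seg = store[segment_id]
    if seg.utilization() >= threshold:
        return False
    kept = [b for b, ok in zip(seg.blocks, seg.live) if ok]
    store[segment_id] = Segment(segment_id, kept, [True] * len(kept))
    return True


# A dict stands in for the object storage (key = segment identifier).
store = {"seg-1": Segment("seg-1",
                          [b"aaaa", b"bbbb", b"cccc", b"dddd"],
                          [True, False, False, False])}
shrunk = shrink_if_sparse(store, "seg-1")
```

Keeping the identifier unchanged means references that address the segment by identifier stay valid, although any offsets into the segment would need to be remapped, which this sketch does not cover.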

    SCALABLE SEGMENT CLEANING FOR A LOG-STRUCTURED FILE SYSTEM

    Publication No.: US20220057955A1

    Publication date: 2022-02-24

    Application No.: US16999897

    Filing date: 2020-08-21

    Applicant: VMware, Inc.

    IPC classification: G06F3/06 G06F16/17

    Abstract: Scalable segment cleaning for log-structured file systems (LFSs) includes determining counts of segment cleaners and virtual nodes, with each virtual node being associated with a plurality of objects. Each virtual node is assigned to a selected segment cleaner. Based at least on the assignments, segment cleaning of the objects is performed, for each virtual node, by the assigned segment cleaner. A portion, less than all, of the virtual nodes is reassigned to a newly selected segment cleaner based on a change of the count of the segment cleaners and/or a change of the count of the virtual nodes. Based at least on the reassignments, segment cleaning of the objects is performed, for each reassigned virtual node, by the reassigned segment cleaner. In some examples, the objects comprise virtual machine disks (VMDKs) and the segment cleaning uses a segment usage table (SUT) to track segment usage and identify segment cleaning candidates.

    Writing data to an LSM tree file structure using consistent cache staging

    Publication No.: US11620261B2

    Publication date: 2023-04-04

    Application No.: US16213815

    Filing date: 2018-12-07

    Applicant: VMware, Inc.

    Abstract: The disclosure herein describes writing data to a log-structured merge (LSM) tree file system on an object storage platform. Write data instructions indicating data for writing to the LSM tree file system are received. Based on the received instructions, the data is written to a first data cache, which serves as the live data cache. Based on an instruction to transfer data in the live data cache to the LSM tree file system, the first data cache is converted to a stable cache. A second data cache configured as a live data cache is then generated based on cloning the first data cache. The data in the first data cache is then written to the LSM tree file system. Use of a stable cache and a cloned live data cache enables data to be written to the file system from the stable cache in parallel with the handling of write data instructions by the live data cache.
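
The live/stable cache hand-off can be sketched in Python as below. The `StagedWriter` class, its method names, and the plain dictionaries standing in for the caches and the LSM tree file system are hypothetical, and the sketch runs the two phases sequentially rather than on separate threads.

```python
class StagedWriter:
    """Stages writes in a live cache and flushes from a frozen stable cache."""

    def __init__(self):
        self.live = {}      # cache that accepts incoming writes
        self.stable = None  # frozen cache currently being flushed, if any
        self.lsm = {}       # stand-in for the LSM tree file system

    def write(self, key, value):
        """Handle a write data instruction against the live cache."""
        self.live[key] = value

    def begin_flush(self):
        """Convert the live cache to a stable cache and clone a new live cache."""
        self.stable = self.live
        self.live = dict(self.stable)  # clone, so new writes never touch the stable copy

    def finish_flush(self):
        """Write the stable cache's contents to the file system and release it."""
        self.lsm.update(self.stable)
        self.stable = None


writer = StagedWriter()
writer.write("a", 1)
writer.begin_flush()
writer.write("b", 2)             # accepted while the flush is still in progress
stable_view = dict(writer.stable)
live_view = dict(writer.live)
writer.finish_flush()
```

Because the stable cache is never modified after the hand-off, the data written to the file system is a consistent point-in-time view even while new writes keep arriving.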

    SCALABLE SEGMENT CLEANING FOR A LOG-STRUCTURED FILE SYSTEM

    Publication No.: US20230067709A1

    Publication date: 2023-03-02

    Application No.: US18048170

    Filing date: 2022-10-20

    Applicant: VMware, Inc.

    IPC classification: G06F3/06 G06F16/17

    Abstract: Scalable segment cleaning for log-structured file systems (LFSs) includes determining counts of segment cleaners and virtual nodes, with each virtual node being associated with a plurality of objects. Each virtual node is assigned to a selected segment cleaner. Based at least on the assignments, segment cleaning of the objects is performed, for each virtual node, by the assigned segment cleaner. A portion, less than all, of the virtual nodes is reassigned to a newly selected segment cleaner based on a change of the count of the segment cleaners and/or a change of the count of the virtual nodes. Based at least on the reassignments, segment cleaning of the objects is performed, for each reassigned virtual node, by the reassigned segment cleaner. In some examples, the objects comprise virtual machine disks (VMDKs) and the segment cleaning uses a segment usage table (SUT) to track segment usage and identify segment cleaning candidates.

    Efficient garbage collection of variable size chunking deduplication

    Publication No.: US11461229B2

    Publication date: 2022-10-04

    Application No.: US16552954

    Filing date: 2019-08-27

    Applicant: VMware, Inc.

    IPC classification: G06F3/06 G06F12/02

    Abstract: The present disclosure provides techniques for deallocating previously allocated storage blocks. The techniques include obtaining a list of chunk IDs to analyze, choosing a chunk ID, and determining the storage blocks spanned by the chunk corresponding to the chosen chunk ID. The techniques further include determining whether any file references any storage blocks spanned by the chunk. The determining may be performed by comparing an internal reference count to a total reference count, where the internal reference count is the number of references to the storage block by a chunk ID data structure. If no files reference any of the storage blocks spanned by the chunk, then all the storage blocks of the chunk can be deallocated.
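
The reference-count comparison can be sketched in Python as below. The function name `collect` and the dictionary-based tables are hypothetical stand-ins for the chunk ID data structure and the block reference counts. The sketch assumes a block is unreferenced by files exactly when its total reference count equals its internal reference count.

```python
def collect(chunk_ids, chunk_blocks, total_refs, internal_refs, free_block):
    """Deallocate every chunk none of whose storage blocks is referenced by a file.

    chunk_blocks:  chunk id -> list of storage block numbers spanned by the chunk
    total_refs:    block number -> total number of references to the block
    internal_refs: block number -> references held by the chunk ID data structure
    free_block:    callback that deallocates one storage block
    """
    freed = []
    for chunk_id in chunk_ids:
        blocks = chunk_blocks[chunk_id]
        # If the counts match, every reference is internal, so no file uses the block.
        if all(total_refs[b] == internal_refs[b] for b in blocks):
            for b in blocks:
                free_block(b)
            freed.append(chunk_id)
    return freed


chunk_blocks = {"c1": [10, 11], "c2": [12]}
total_refs = {10: 1, 11: 1, 12: 2}      # block 12 is also referenced by a file
internal_refs = {10: 1, 11: 1, 12: 1}
released = []
freed_chunks = collect(["c1", "c2"], chunk_blocks, total_refs,
                       internal_refs, released.append)
```

The chunk is treated as a unit: a single block still referenced by a file keeps all of the chunk's blocks allocated.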