Patent search ap:("NetApp Inc.") AND inv:"Wei Sun" Page 3

21.

Invention publication
COMBINED GARBAGE COLLECTION AND DATA INTEGRITY CHECKING FOR A DISTRIBUTED KEY-VALUE STORE

Status: Under examination (published)

Publication (announcement) number: US20230145784A1

Publication (announcement) date: 2023-05-11

Application number: US17680484

Application date: 2022-02-25

Applicant: NetApp, Inc.

Inventor: Wei Sun, Mark David Olson, Anil Paul Thoppil

IPC: G06F12/02, G06F16/22

CPC classification number: G06F12/0253, G06F16/2246, G06F16/2272

Abstract: Systems and methods are described for a streamlined garbage collection process during which data integrity checking is also performed for a distributed key-value (KV) store utilized by a distributed storage management system. According to one embodiment, by making use of full or truncated block IDs (rather than an intermediate probabilistic data structure, such as a Bloom filter) for garbage collection, data integrity checking can be performed concurrently almost for free. During garbage collection, a block ID compare list is compared to block IDs within the distributed KV store. If a particular block ID is present in the distributed KV store but is missing from the block ID compare list, the corresponding data block represents garbage to be collected. If the particular block ID is present in the block ID compare list but missing from the distributed KV store, a data integrity error has been identified.
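The comparison described in this abstract can be read as two set differences. The following is a minimal sketch under that reading, not the patented implementation; the function name `compare_block_ids` and the sample block IDs are hypothetical.

```python
def compare_block_ids(compare_list, kv_store_ids):
    """Classify block IDs by comparing the in-use list against the KV store.

    Both arguments are iterables of block IDs (full or truncated).
    Hypothetical helper for illustration only.
    """
    in_use = set(compare_list)
    stored = set(kv_store_ids)
    garbage = stored - in_use   # in the KV store, but no longer referenced
    missing = in_use - stored   # referenced, but absent: data integrity error
    return garbage, missing


garbage, missing = compare_block_ids(
    compare_list=["a1", "b2", "c3"],
    kv_store_ids=["a1", "b2", "d4"],
)
print(sorted(garbage))  # ['d4']
print(sorted(missing))  # ['c3']
```

Because exact IDs are compared rather than a probabilistic filter, the second difference comes out of the same pass as the first, which is the sense in which the integrity check is "almost for free".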

22.

Invention application
COMBINED GARBAGE COLLECTION AND DATA INTEGRITY CHECKING FOR A DISTRIBUTED KEY-VALUE STORE

Status: In force

Publication (announcement) number: US20240385959A1

Publication (announcement) date: 2024-11-21

Application number: US18786848

Application date: 2024-07-29

Applicant: NetApp, Inc.

Inventor: Wei Sun, Mark David Olson, Anil Paul Thoppil

IPC: G06F12/02, G06F16/22

Abstract: Systems and methods are described for a streamlined garbage collection process during which data integrity checking is also performed for a distributed key-value (KV) store utilized by a distributed storage system. According to one embodiment, by making use of full or truncated block IDs (rather than an intermediate probabilistic data structure, such as a Bloom filter) for garbage collection, data integrity checking can be performed concurrently almost for free. During garbage collection, a block ID compare list may be compared to block IDs within the distributed KV store. If a particular block ID is present in the distributed KV store but is missing from the block ID compare list, the corresponding data block represents garbage to be collected. If the particular block ID is present in the block ID compare list but missing from the distributed KV store, a data integrity error has been identified.

23.

Invention grant
Combined garbage collection and data integrity checking for a distributed key-value store

Status: In force

Publication (announcement) number: US12066933B2

Publication (announcement) date: 2024-08-20

Application number: US17680484

Application date: 2022-02-25

Applicant: NetApp, Inc.

Inventor: Wei Sun, Mark David Olson, Anil Paul Thoppil

IPC: G06F12/02, G06F16/22

CPC classification number: G06F12/0253, G06F16/2246, G06F16/2272

Abstract: Systems and methods are described for a streamlined garbage collection process during which data integrity checking is also performed for a distributed key-value (KV) store utilized by a distributed storage management system. According to one embodiment, by making use of full or truncated block IDs (rather than an intermediate probabilistic data structure, such as a Bloom filter) for garbage collection, data integrity checking can be performed concurrently almost for free. During garbage collection, a block ID compare list is compared to block IDs within the distributed KV store. If a particular block ID is present in the distributed KV store but is missing from the block ID compare list, the corresponding data block represents garbage to be collected. If the particular block ID is present in the block ID compare list but missing from the distributed KV store, a data integrity error has been identified.

24.

Invention publication
GARBAGE COLLECTION AND BIN SYNCHRONIZATION FOR DISTRIBUTED STORAGE ARCHITECTURE

Status: Under examination (published)

Publication (announcement) number: US20240220106A1

Publication (announcement) date: 2024-07-04

Application number: US18607651

Application date: 2024-03-18

Applicant: NetApp, Inc.

Inventor: Manan Dahyabhai Patel, Wei Sun

IPC: G06F3/06

CPC classification number: G06F3/0608, G06F3/0652, G06F3/067

Abstract: Techniques are provided for implementing garbage collection and bin synchronization for a distributed storage architecture of worker nodes managing distributed storage composed of bins of blocks. As the distributed storage architecture scales out to accommodate more storage and worker nodes, garbage collection used to free unused blocks becomes unmanageable and slow. Accordingly, garbage collection is improved by utilizing heuristics to dynamically speed up or slow down garbage collection and to set sizes for subsets of a bin to process instead of the entire bin. This ensures that garbage collection does not use stale information about which blocks are in use, and that garbage collection neither unduly impacts client I/O processing nor, conversely, falls behind. Garbage collection can be incorporated into a bin sync process to improve the efficiency of the bin sync process so that unused blocks are not needlessly copied by the bin sync process.
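The abstract does not say which heuristics are used, so the sketch below is purely illustrative: it assumes two hypothetical signals (`client_io_load` and `gc_backlog`, both between 0 and 1) and made-up thresholds, and shows how a subset size could be adjusted and a bin walked in subsets rather than as a whole.

```python
def next_subset_size(current, client_io_load, gc_backlog,
                     min_size=64, max_size=4096):
    """Pick how many blocks of a bin to process in the next GC step.

    Hypothetical heuristic: halve the subset when client I/O is heavy,
    double it when garbage collection is falling behind.
    """
    if client_io_load > 0.8:      # assumed threshold: clients are busy
        proposed = current // 2
    elif gc_backlog > 0.5:        # assumed threshold: GC is behind
        proposed = current * 2
    else:
        proposed = current
    return max(min_size, min(max_size, proposed))


def bin_subsets(block_ids, size):
    """Yield consecutive subsets of a bin so the whole bin is never scanned at once."""
    for start in range(0, len(block_ids), size):
        yield block_ids[start:start + size]
```

Processing one subset at a time keeps the in-use information for each subset fresh, since it can be gathered immediately before that subset is collected.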

25.

Invention grant
Use of cluster-level redundancy within a cluster of a distributed storage management system to address node-level errors

Status: In force

Publication (announcement) number: US11983080B2

Publication (announcement) date: 2024-05-14

Application number: US17680621

Application date: 2022-02-25

Applicant: NetApp, Inc.

Inventor: Wei Sun, Anil Paul Thoppil, Anne Maria Vasu

IPC: G06F11/10, G06F3/06, G06F11/16, G06F11/30, G06F16/27

CPC classification number: G06F11/1662, G06F3/0622, G06F3/064, G06F3/0679, G06F11/1088, G06F11/3034, G06F16/27

Abstract: Systems and methods that make use of cluster-level redundancy within a distributed storage management system to address various node-level error scenarios are provided. Rather than making use of a generalized one-size-fits-all approach in an effort to reduce complexity, an approach tailored to the node-level error scenario at issue may be performed to avoid doing more than necessary. According to one embodiment, responsive to identification of a failed RAID stripe by a node of a cluster of a distributed storage management system, for each block ID of multiple block IDs associated with the failed RAID stripe, a data block is restored corresponding to the block ID by reading the data block from another node of the cluster having a redundant copy of the data block; and writing the redundant copy of the data block to a storage area of the node that is unaffected by the failed RAID stripe.
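The per-block restore loop in this abstract can be sketched as follows. This is a simplified illustration, assuming plain dictionaries stand in for the other cluster node and for the unaffected local storage area; the function name and its return value are hypothetical.

```python
def restore_failed_stripe(block_ids, replica_node, local_healthy_area):
    """Restore each block of a failed RAID stripe from a redundant copy.

    replica_node and local_healthy_area are plain dicts standing in for
    another cluster node and an unaffected local storage area.
    Returns the block IDs for which no redundant copy was found.
    """
    unrecovered = []
    for block_id in block_ids:
        data = replica_node.get(block_id)     # read from the other node
        if data is None:
            unrecovered.append(block_id)      # no redundant copy available
            continue
        local_healthy_area[block_id] = data   # write outside the failed stripe
    return unrecovered
```

Only the blocks associated with the failed stripe are touched, which reflects the stated goal of avoiding more work than the error scenario requires.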

26.

Invention grant
Defragmentation for log structured merge tree to improve read and write amplification

Status: In force

Publication (announcement) number: US11971859B2

Publication (announcement) date: 2024-04-30

Application number: US17732046

Application date: 2022-04-28

Applicant: NetApp Inc.

Inventor: Anil Paul Thoppil, Wei Sun, Meera Odugoudar, Szu-Wen Kuo, Santhosh Selvaraj

IPC: G06F16/17, G06F16/174, G06F16/182

CPC classification number: G06F16/1748, G06F16/182

Abstract: Techniques are provided for implementing a defragmentation process during a merge operation performed by a re-compaction process upon a log structured merge tree. The log structured merge tree is used to store keys of key-value pairs within a key-value store. As the log structured merge tree fills with keys over time, the re-compaction process is performed to merge keys down to lower levels of the log structured merge tree to re-compact the keys. Re-compaction can result in fragmentation because there is a lack of spatial locality in where the re-compaction operations rewrite the keys within storage. Fragmentation increases read and write amplification when accessing the keys stored in different locations within the storage. Accordingly, the defragmentation process is performed during a last merge operation of the re-compaction process in order to store keys together within the storage, thus reducing read and write amplification when accessing the keys.
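As a rough illustration of storing keys together during a merge, the sketch below merges several sorted runs of keys and assigns each distinct key the next consecutive slot. The slot numbers are a hypothetical stand-in for storage locations, and the function name is invented for this example; the abstract does not describe the actual on-disk layout.

```python
import heapq


def merge_contiguously(runs):
    """Merge sorted key runs into one run laid out in consecutive slots.

    runs is a list of sorted lists of keys. The result maps each distinct
    key to a slot number; neighbouring keys receive neighbouring slots.
    """
    layout = {}
    for key in heapq.merge(*runs):
        if key not in layout:          # drop duplicate keys across runs
            layout[key] = len(layout)  # next consecutive slot
    return layout


print(merge_contiguously([["a", "c"], ["b", "c", "d"]]))
# {'a': 0, 'b': 1, 'c': 2, 'd': 3}
```

With keys laid out in sorted, adjacent slots, a range read touches one contiguous region instead of several scattered ones.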

27.

Invention grant
Garbage collection and bin synchronization for distributed storage architecture

Status: In force

Publication (announcement) number: US11934656B2

Publication (announcement) date: 2024-03-19

Application number: US17717454

Application date: 2022-04-11

Applicant: NetApp Inc.

Inventor: Manan Dahyabhai Patel, Wei Sun

IPC: G06F3/06

CPC classification number: G06F3/0608, G06F3/0652, G06F3/067

Abstract: Techniques are provided for implementing garbage collection and bin synchronization for a distributed storage architecture of worker nodes managing distributed storage composed of bins of blocks. As the distributed storage architecture scales out to accommodate more storage and worker nodes, garbage collection used to free unused blocks becomes unmanageable and slow. Accordingly, garbage collection is improved by utilizing heuristics to dynamically speed up or slow down garbage collection and to set sizes for subsets of a bin to process instead of the entire bin. This ensures that garbage collection does not use stale information about which blocks are in use, and that garbage collection neither unduly impacts client I/O processing nor, conversely, falls behind. Garbage collection can be incorporated into a bin sync process to improve the efficiency of the bin sync process so that unused blocks are not needlessly copied by the bin sync process.

28.

Invention grant
Use of cluster-level redundancy within a cluster of a distributed storage management system to address node-level errors

Status: In force

Publication (announcement) number: US11934280B2

Publication (announcement) date: 2024-03-19

Application number: US17680631

Application date: 2022-02-25

Applicant: NetApp, Inc.

Inventor: Wei Sun, Anil Paul Thoppil, Anne Maria Vasu

IPC: G06F3/06, G06F11/10, G06F11/16, G06F11/30, G06F16/27

CPC classification number: G06F11/1662, G06F3/0622, G06F3/064, G06F3/0679, G06F11/1088, G06F11/3034, G06F16/27

Abstract: Systems and methods that make use of cluster-level redundancy within a distributed storage management system to address various node-level error scenarios are provided. Rather than using a generalized one-size-fits-all approach to reduce complexity, an approach tailored to the node-level error scenario at issue may be performed to avoid doing more than necessary. According to one embodiment, responsive to identifying a missing branch of a tree implemented by a KV store of a first node of a cluster of a distributed storage management system, a branch resynchronization process may be performed, including, for each block ID in the range of block IDs of the missing branch (i) reading a data block corresponding to the block ID from a second node of the cluster that maintains redundant information relating to the block ID; and (ii) restoring the block ID within the KV store by writing the data block to the first node.

29.

Invention publication
USE OF CLUSTER-LEVEL REDUNDANCY WITHIN A CLUSTER OF A DISTRIBUTED STORAGE MANAGEMENT SYSTEM TO ADDRESS NODE-LEVEL ERRORS

Status: Under examination (published)

Publication (announcement) number: US20240028486A1

Publication (announcement) date: 2024-01-25

Application number: US18478149

Application date: 2023-09-29

Applicant: NetApp, Inc.

Inventor: Wei Sun, Anil Paul Thoppil, Anne Maria Vasu

IPC: G06F11/16, G06F16/27, G06F11/10, G06F11/30, G06F3/06

CPC classification number: G06F11/1662, G06F16/27, G06F11/1088, G06F11/3034, G06F3/0622, G06F3/064, G06F3/0679

Abstract: Systems and methods that make use of cluster-level redundancy within a distributed storage management system to address various node-level error scenarios are provided. According to one embodiment, a first node of multiple nodes of a distributed storage system, represented in the form of a cluster of the multiple nodes, identifies the potential existence of an error associated with a Redundant Array of Independent Disks (RAID) stripe. A list of block identifiers (IDs) associated with the RAID stripe may then be identified. Rather than performing a traditional RAID recovery/reconstruction approach that is resource intensive in nature and that requires an excessive amount of rebuild time, a more efficient RAID stripe resynchronization process may be performed to restore data associated with the RAID stripe.

30.

Invention publication
USE OF CLUSTER-LEVEL REDUNDANCY WITHIN A CLUSTER OF A DISTRIBUTED STORAGE MANAGEMENT SYSTEM TO ADDRESS NODE-LEVEL ERRORS

Status: Under examination (published)

Publication (announcement) number: US20230153214A1

Publication (announcement) date: 2023-05-18

Application number: US17680653

Application date: 2022-02-25

Applicant: NetApp, Inc.

Inventor: Wei Sun, Anil Paul Thoppil, Anne Maria Vasu

IPC: G06F11/16, G06F11/30, G06F11/10, G06F16/27

CPC classification number: G06F11/1662, G06F11/3034, G06F11/1088, G06F16/27

Abstract: Systems and methods that make use of cluster-level redundancy within a distributed storage management system to address various node-level error scenarios are provided. According to one embodiment, a KV store of a node of a cluster of a distributed storage management system manages storage of data blocks as values and corresponding block IDs as keys. Data integrity errors are reported to the first node in the form of a list of missing block IDs that are in use but missing from the KV store. A metadata resynchronization process may then be caused to be performed, including for each block ID in the list of missing block IDs: (i) reading a data block corresponding to the block ID from another node of the cluster that maintains redundant information relating to the block ID; and (ii) restoring the block ID within the KV store by writing the data block to the node.
