-
Publication No.: US11835990B2
Publication date: 2023-12-05
Application No.: US17680653
Filing date: 2022-02-25
Applicant: NetApp, Inc.
Inventor: Wei Sun , Anil Paul Thoppil , Anne Maria Vasu
CPC classification number: G06F11/1662 , G06F3/064 , G06F3/0622 , G06F3/0679 , G06F11/1088 , G06F11/3034 , G06F16/27
Abstract: Systems and methods that make use of cluster-level redundancy within a distributed storage management system to address various node-level error scenarios are provided. According to one embodiment, a KV store of a node of a cluster of a distributed storage management system manages storage of data blocks as values and corresponding block IDs as keys. Data integrity errors are reported to the first node in the form of a list of missing block IDs that are in use but missing from the KV store. A metadata resynchronization process may then be caused to be performed, including for each block ID in the list of missing block IDs: (i) reading a data block corresponding to the block ID from another node of the cluster that maintains redundant information relating to the block ID; and (ii) restoring the block ID within the KV store by writing the data block to the node.
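The metadata resynchronization loop in this abstract can be illustrated with a minimal sketch. It is not the patented implementation: the `Node` class, the in-memory dictionary standing in for the KV store, and the function name `resync_missing_blocks` are all hypothetical, and the sketch assumes that any peer holding a block ID holds a valid redundant copy.

```python
from typing import Dict, Iterable, List


class Node:
    """Hypothetical cluster node; kv_store maps block ID -> data block."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.kv_store: Dict[str, bytes] = {}


def resync_missing_blocks(
    target: Node, peers: Iterable[Node], missing_ids: Iterable[str]
) -> List[str]:
    """Restore each missing block ID on `target` from a peer's redundant copy.

    Returns the block IDs that no peer could supply.
    """
    peer_list = list(peers)
    unrecovered: List[str] = []
    for block_id in missing_ids:
        # (i) read the data block from another node that holds the block ID
        source = next((p for p in peer_list if block_id in p.kv_store), None)
        if source is None:
            unrecovered.append(block_id)
            continue
        # (ii) restore the block ID by writing the data block to the target
        target.kv_store[block_id] = source.kv_store[block_id]
    return unrecovered


# Example usage with made-up block IDs.
node_a, node_b = Node("a"), Node("b")
node_b.kv_store = {"blk1": b"x", "blk2": b"y"}
left_over = resync_missing_blocks(node_a, [node_b], ["blk1", "blk3"])
```

Returning the unrecovered IDs rather than raising lets a caller decide whether a partial restore is acceptable; that choice is an assumption of this sketch, not something the abstract specifies.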
-
Publication No.: US20250094295A1
Publication date: 2025-03-20
Application No.: US18962013
Filing date: 2024-11-27
Applicant: NetApp, Inc.
Inventor: Wei Sun , Anil Paul Thoppil , Anne Maria Vasu
Abstract: Systems and methods that make use of cluster-level redundancy within a distributed storage management system to address various node-level error scenarios are provided. According to one embodiment, an instance of a key-value (KV) store of a first node of a plurality of nodes of a cluster of a distributed storage system manages storage of data blocks as values and corresponding block identifiers (IDs) as keys. A list of missing block IDs that are in use for one or more volumes associated with the first node but that are missing from the instance of the KV store are identified by performing a data integrity check on the instance of the KV store. After identifying the list of missing block IDs, instead of treating the first node as failed, restoring the missing block IDs by writing redundant data blocks retrieved from other nodes within the cluster to the first node.
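The data integrity check described here amounts to comparing the block IDs that volumes reference against the keys actually present in the KV store instance. The following is a rough sketch under that reading; the function name `find_missing_block_ids` and the `volumes` mapping (volume name to referenced block IDs) are hypothetical and chosen only for illustration.

```python
from typing import Dict, Iterable, List


def find_missing_block_ids(
    volumes: Dict[str, Iterable[str]], kv_store: Dict[str, bytes]
) -> List[str]:
    """List block IDs that are in use by a volume but absent from the KV store.

    Each missing ID is reported once, in first-seen order.
    """
    missing: List[str] = []
    seen = set()
    for block_ids in volumes.values():
        for block_id in block_ids:
            if block_id not in kv_store and block_id not in seen:
                seen.add(block_id)
                missing.append(block_id)
    return missing


# Example usage with made-up volumes and block IDs.
volumes = {"vol1": ["a", "b"], "vol2": ["b", "c"]}
kv_store = {"a": b"1"}
missing_ids = find_missing_block_ids(volumes, kv_store)
```

The resulting list is what a restore step would consume, so the node can be repaired from redundant copies on other nodes instead of being treated as failed.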
-
Publication No.: US12164397B2
Publication date: 2024-12-10
Application No.: US18478149
Filing date: 2023-09-29
Applicant: NetApp, Inc.
Inventor: Wei Sun , Anil Paul Thoppil , Anne Maria Vasu
Abstract: Systems and methods that make use of cluster-level redundancy within a distributed storage management system to address various node-level error scenarios are provided. According to one embodiment, a first node of multiple nodes of distributed storage system represented in a form of a cluster of the multiple of nodes, identifies the potential existence of an error associated with a Redundant Array of Independent Disks (RAID) stripe. A list of block identifiers (IDs) associated with the RAID stripe may then be identified. Rather than performing a traditional RAID recovery/reconstruction approach that is resource intensive in nature and that requires an excessive amount of rebuild time, a more efficient RAID stripe resynchronization process may be performed to restore data associated with the RAID stripe.
-
Publication No.: US20240220377A1
Publication date: 2024-07-04
Application No.: US18608742
Filing date: 2024-03-18
Applicant: NetApp, Inc.
Inventor: Wei Sun , Anil Paul Thoppil , Anne Maria Vasu
CPC classification number: G06F11/1662 , G06F3/0622 , G06F3/064 , G06F3/0679 , G06F11/1088 , G06F11/3034 , G06F16/27
Abstract: Systems and methods that make use of cluster-level redundancy within a distributed storage management system to address various node-level error scenarios are provided. Rather than using a generalized one-size-fits-all approach to reduce complexity, an approach tailored to the node-level error scenario at issue may be performed to avoid doing more than necessary. According to one embodiment, after identifying a missing branch of a tree implemented by a KV store of a first node of a cluster of a distributed storage management system, a branch resynchronization process may be performed, including, for each block ID in the range of block IDs of the missing branch (i) reading a data block corresponding to the block ID from a second node of the cluster that maintains redundant information relating to the block ID; and (ii) restoring the block ID within the KV store by writing the data block to the first node.
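Branch resynchronization differs from the list-based approach in that the unit of repair is a contiguous range of block IDs covered by the missing tree branch. A minimal sketch follows, assuming integer block IDs, a half-open range `[low, high)`, and plain dictionaries in place of the two nodes' KV stores; none of these details come from the abstract, and `resync_branch` is a hypothetical name.

```python
from typing import Dict, Tuple


def resync_branch(
    first_kv: Dict[int, bytes],
    second_kv: Dict[int, bytes],
    id_range: Tuple[int, int],
) -> int:
    """Copy every block whose ID falls in [low, high) from the second node.

    Returns the number of block IDs restored on the first node.
    """
    low, high = id_range
    restored = 0
    for block_id in sorted(second_kv):
        if low <= block_id < high:
            # (i) read from the second node, (ii) write to the first node
            first_kv[block_id] = second_kv[block_id]
            restored += 1
    return restored


# Example usage: the branch covering IDs 100..199 is missing on the first node.
first = {5: b"a"}
second = {5: b"a", 100: b"b", 150: b"c", 200: b"d"}
count = resync_branch(first, second, (100, 200))
```

Only IDs inside the branch's range are touched, which reflects the abstract's point about avoiding more work than the error scenario requires.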
-
Publication No.: US12253920B2
Publication date: 2025-03-18
Application No.: US18608742
Filing date: 2024-03-18
Applicant: NetApp, Inc.
Inventor: Wei Sun , Anil Paul Thoppil , Anne Maria Vasu
Abstract: Systems and methods that make use of cluster-level redundancy within a distributed storage management system to address various node-level error scenarios are provided. Rather than using a generalized one-size-fits-all approach to reduce complexity, an approach tailored to the node-level error scenario at issue may be performed to avoid doing more than necessary. According to one embodiment, after identifying a missing branch of a tree implemented by a KV store of a first node of a cluster of a distributed storage management system, a branch resynchronization process may be performed, including, for each block ID in the range of block IDs of the missing branch (i) reading a data block corresponding to the block ID from a second node of the cluster that maintains redundant information relating to the block ID; and (ii) restoring the block ID within the KV store by writing the data block to the first node.
-
Publication No.: US20230153213A1
Publication date: 2023-05-18
Application No.: US17680621
Filing date: 2022-02-25
Applicant: NetApp, Inc.
Inventor: Wei Sun , Anil Paul Thoppil , Anne Maria Vasu
CPC classification number: G06F11/1662 , G06F11/3034 , G06F16/27
Abstract: Systems and methods that make use of cluster-level redundancy within a distributed storage management system to address various node-level error scenarios are provided. Rather than making use of a generalized one-size-fits-all approach in an effort to reduce complexity, an approach tailored to the node-level error scenario at issue may be performed to avoid doing more than necessary. According to one embodiment, responsive to identification of a failed RAID stripe by a node of a cluster of a distributed storage management system, for each block ID of multiple block IDs associated with the failed RAID stripe, a data block is restored corresponding to the block ID by reading the data block from another node of the cluster having a redundant copy of the data block; and writing the redundant copy of the data block to a storage area of the node that is unaffected by the failed RAID stripe.
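The RAID stripe case adds one constraint to the read-and-rewrite loop: the restored copy must land in a storage area that the failed stripe does not affect. The sketch below models that with a hypothetical `StripedNode` class that tracks which stripe holds each block ID; the class, the `resync_failed_stripe` function, and the idea of naming a single healthy stripe as the destination are assumptions made for illustration only.

```python
from typing import Dict, List


class StripedNode:
    """Hypothetical node that records which stripe stores each block ID."""

    def __init__(self) -> None:
        self.stripes: Dict[int, Dict[str, bytes]] = {}
        self.location: Dict[str, int] = {}  # block ID -> stripe ID

    def write(self, stripe_id: int, block_id: str, data: bytes) -> None:
        self.stripes.setdefault(stripe_id, {})[block_id] = data
        self.location[block_id] = stripe_id


def resync_failed_stripe(
    node: StripedNode,
    peer_blocks: Dict[str, bytes],
    failed_stripe: int,
    healthy_stripe: int,
) -> List[str]:
    """Rewrite blocks of the failed stripe into a healthy stripe from a peer copy."""
    if healthy_stripe == failed_stripe:
        raise ValueError("destination must be unaffected by the failed stripe")
    # Snapshot the affected IDs first, since write() updates node.location.
    affected = [b for b, s in node.location.items() if s == failed_stripe]
    restored: List[str] = []
    for block_id in affected:
        data = peer_blocks.get(block_id)  # read the redundant copy
        if data is None:
            continue  # no redundant copy available from this peer
        node.write(healthy_stripe, block_id, data)
        restored.append(block_id)
    return restored


# Example usage with made-up stripe and block IDs.
node = StripedNode()
node.write(0, "blk1", b"corrupt")
node.write(0, "blk2", b"corrupt")
node.write(1, "blk3", b"ok")
restored_ids = resync_failed_stripe(node, {"blk1": b"good", "blk2": b"good"}, 0, 2)
```

Blocks on unaffected stripes are never read or rewritten, in contrast to a full stripe reconstruction.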
-
Publication No.: US20230152986A1
Publication date: 2023-05-18
Application No.: US17680631
Filing date: 2022-02-25
Applicant: NetApp, Inc.
Inventor: Wei Sun , Anil Paul Thoppil , Anne Maria Vasu
IPC: G06F3/06
CPC classification number: G06F3/0622 , G06F3/064 , G06F3/0679
Abstract: Systems and methods that make use of cluster-level redundancy within a distributed storage management system to address various node-level error scenarios are provided. Rather than using a generalized one-size-fits-all approach to reduce complexity, an approach tailored to the node-level error scenario at issue may be performed to avoid doing more than necessary. According to one embodiment, responsive to identifying a missing branch of a tree implemented by a KV store of a first node of a cluster of a distributed storage management system, a branch resynchronization process may be performed, including, for each block ID in the range of block IDs of the missing branch (i) reading a data block corresponding to the block ID from a second node of the cluster that maintains redundant information relating to the block ID; and (ii) restoring the block ID within the KV store by writing the data block to the first node.
-
Publication No.: US11983080B2
Publication date: 2024-05-14
Application No.: US17680621
Filing date: 2022-02-25
Applicant: NetApp, Inc.
Inventor: Wei Sun , Anil Paul Thoppil , Anne Maria Vasu
CPC classification number: G06F11/1662 , G06F3/0622 , G06F3/064 , G06F3/0679 , G06F11/1088 , G06F11/3034 , G06F16/27
Abstract: Systems and methods that make use of cluster-level redundancy within a distributed storage management system to address various node-level error scenarios are provided. Rather than making use of a generalized one-size-fits-all approach in an effort to reduce complexity, an approach tailored to the node-level error scenario at issue may be performed to avoid doing more than necessary. According to one embodiment, responsive to identification of a failed RAID stripe by a node of a cluster of a distributed storage management system, for each block ID of multiple block IDs associated with the failed RAID stripe, a data block is restored corresponding to the block ID by reading the data block from another node of the cluster having a redundant copy of the data block; and writing the redundant copy of the data block to a storage area of the node that is unaffected by the failed RAID stripe.
-
Publication No.: US11934280B2
Publication date: 2024-03-19
Application No.: US17680631
Filing date: 2022-02-25
Applicant: NetApp, Inc.
Inventor: Wei Sun , Anil Paul Thoppil , Anne Maria Vasu
CPC classification number: G06F11/1662 , G06F3/0622 , G06F3/064 , G06F3/0679 , G06F11/1088 , G06F11/3034 , G06F16/27
Abstract: Systems and methods that make use of cluster-level redundancy within a distributed storage management system to address various node-level error scenarios are provided. Rather than using a generalized one-size-fits-all approach to reduce complexity, an approach tailored to the node-level error scenario at issue may be performed to avoid doing more than necessary. According to one embodiment, responsive to identifying a missing branch of a tree implemented by a KV store of a first node of a cluster of a distributed storage management system, a branch resynchronization process may be performed, including, for each block ID in the range of block IDs of the missing branch (i) reading a data block corresponding to the block ID from a second node of the cluster that maintains redundant information relating to the block ID; and (ii) restoring the block ID within the KV store by writing the data block to the first node.
-
Publication No.: US20240028486A1
Publication date: 2024-01-25
Application No.: US18478149
Filing date: 2023-09-29
Applicant: NetApp, Inc.
Inventor: Wei Sun , Anil Paul Thoppil , Anne Maria Vasu
CPC classification number: G06F11/1662 , G06F16/27 , G06F11/1088 , G06F11/3034 , G06F3/0622 , G06F3/064 , G06F3/0679
Abstract: Systems and methods that make use of cluster-level redundancy within a distributed storage management system to address various node-level error scenarios are provided. According to one embodiment, a first node of multiple nodes of distributed storage system represented in a form of a cluster of the multiple of nodes, identifies the potential existence of an error associated with a Redundant Array of Independent Disks (RAID) stripe. A list of block identifiers (IDs) associated with the RAID stripe may then be identified. Rather than performing a traditional RAID recovery/reconstruction approach that is resource intensive in nature and that requires an excessive amount of rebuild time, a more efficient RAID stripe resynchronization process may be performed to restore data associated with the RAID stripe.