USE OF CLUSTER-LEVEL REDUNDANCY WITHIN A CLUSTER OF A DISTRIBUTED STORAGE MANAGEMENT SYSTEM TO ADDRESS NODE-LEVEL ERRORS

    公开(公告)号:US20250094295A1

    公开(公告)日:2025-03-20

    申请号:US18962013

    申请日:2024-11-27

    Applicant: NetApp, Inc.

    Abstract: Systems and methods that make use of cluster-level redundancy within a distributed storage management system to address various node-level error scenarios are provided. According to one embodiment, an instance of a key-value (KV) store of a first node of a plurality of nodes of a cluster of a distributed storage system manages storage of data blocks as values and corresponding block identifiers (IDs) as keys. A list of missing block IDs that are in use for one or more volumes associated with the first node but that are missing from the instance of the KV store are identified by performing a data integrity check on the instance of the KV store. After identifying the list of missing block IDs, instead of treating the first node as failed, restoring the missing block IDs by writing redundant data blocks retrieved from other nodes within the cluster to the first node.

    Use of cluster-level redundancy within a cluster of a distributed storage management system to address node-level errors

    公开(公告)号:US12164397B2

    公开(公告)日:2024-12-10

    申请号:US18478149

    申请日:2023-09-29

    Applicant: NetApp, Inc.

    Abstract: Systems and methods that make use of cluster-level redundancy within a distributed storage management system to address various node-level error scenarios are provided. According to one embodiment, a first node of multiple nodes of distributed storage system represented in a form of a cluster of the multiple of nodes, identifies the potential existence of an error associated with a Redundant Array of Independent Disks (RAID) stripe. A list of block identifiers (IDs) associated with the RAID stripe may then be identified. Rather than performing a traditional RAID recovery/reconstruction approach that is resource intensive in nature and that requires an excessive amount of rebuild time, a more efficient RAID stripe resynchronization process may be performed to restore data associated with the RAID stripe.

    Use of cluster-level redundancy within a cluster of a distributed storage management system to address node-level errors

    公开(公告)号:US12253920B2

    公开(公告)日:2025-03-18

    申请号:US18608742

    申请日:2024-03-18

    Applicant: NetApp, Inc.

    Abstract: Systems and methods that make use of cluster-level redundancy within a distributed storage management system to address various node-level error scenarios are provided. Rather than using a generalized one-size-fits-all approach to reduce complexity, an approach tailored to the node-level error scenario at issue may be performed to avoid doing more than necessary. According to one embodiment, after identifying a missing branch of a tree implemented by a KV store of a first node of a cluster of a distributed storage management system, a branch resynchronization process may be performed, including, for each block ID in the range of block IDs of the missing branch (i) reading a data block corresponding to the block ID from a second node of the cluster that maintains redundant information relating to the block ID; and (ii) restoring the block ID within the KV store by writing the data block to the first node.

    USE OF CLUSTER-LEVEL REDUNDANCY WITHIN A CLUSTER OF A DISTRIBUTED STORAGE MANAGEMENT SYSTEM TO ADDRESS NODE-LEVEL ERRORS

    公开(公告)号:US20230153213A1

    公开(公告)日:2023-05-18

    申请号:US17680621

    申请日:2022-02-25

    Applicant: NetApp, Inc.

    CPC classification number: G06F11/1662 G06F11/3034 G06F16/27

    Abstract: Systems and methods that make use of cluster-level redundancy within a distributed storage management system to address various node-level error scenarios are provided. Rather than making use of a generalized one-size-fits-all approach in an effort to reduce complexity, an approach tailored to the node-level error scenario at issue may be performed to avoid doing more than necessary. According to one embodiment, responsive to identification of a failed RAID stripe by a node of a cluster of a distributed storage management system, for each block ID of multiple block IDs associated with the failed RAID stripe, a data block is restored corresponding to the block ID by reading the data block from another node of the cluster having a redundant copy of the data block; and writing the redundant copy of the data block to a storage area of the node that is unaffected by the failed RAID stripe.

    USE OF CLUSTER-LEVEL REDUNDANCY WITHIN A CLUSTER OF A DISTRIBUTED STORAGE MANAGEMENT SYSTEM TO ADDRESS NODE-LEVEL ERRORS

    公开(公告)号:US20230152986A1

    公开(公告)日:2023-05-18

    申请号:US17680631

    申请日:2022-02-25

    Applicant: NetApp, Inc.

    CPC classification number: G06F3/0622 G06F3/064 G06F3/0679

    Abstract: Systems and methods that make use of cluster-level redundancy within a distributed storage management system to address various node-level error scenarios are provided. Rather than using a generalized one-size-fits-all approach to reduce complexity, an approach tailored to the node-level error scenario at issue may be performed to avoid doing more than necessary. According to one embodiment, responsive to identifying a missing branch of a tree implemented by a KV store of a first node of a cluster of a distributed storage management system, a branch resynchronization process may be performed, including, for each block ID in the range of block IDs of the missing branch (i) reading a data block corresponding to the block ID from a second node of the cluster that maintains redundant information relating to the block ID; and (ii) restoring the block ID within the KV store by writing the data block to the first node.

Patent Agency Ranking