GARBAGE COLLECTION AND BIN SYNCHRONIZATION FOR DISTRIBUTED STORAGE ARCHITECTURE

    Publication number: US20230325116A1

    Publication date: 2023-10-12

    Application number: US17717469

    Application date: 2022-04-11

    Applicant: NetApp Inc.

    Abstract: Techniques are provided for implementing garbage collection and bin synchronization for a distributed storage architecture of worker nodes managing distributed storage composed of bins of blocks. As the distributed storage architecture scales out to accommodate more storage and worker nodes, garbage collection used to free unused blocks becomes unmanageable and slow. Accordingly, garbage collection is improved by utilizing heuristics to dynamically speed up or slow down garbage collection and to set sizes for subsets of a bin to process instead of the entire bin. This ensures that garbage collection does not use stale information about which blocks are in use, does not unduly impact client I/O processing, and, conversely, does not fall behind. Garbage collection can be incorporated into a bin sync process to improve the efficiency of the bin sync process so that unused blocks are not needlessly copied by the bin sync process.
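
The pacing idea in this abstract can be illustrated with a small sketch. The abstract does not say which heuristics are used, so everything below (the `GcPacer` class, the latency and backlog signals, the halving and doubling rule, and all thresholds) is a hypothetical stand-in rather than the patented method. It only shows how a collector might shrink or grow the subset of a bin it processes per pass.

```python
from dataclasses import dataclass


@dataclass
class GcPacer:
    """Hypothetical pacing heuristic; names and thresholds are illustrative."""
    subset_size: int = 1024        # blocks of a bin processed per GC pass
    min_subset: int = 128
    max_subset: int = 16384
    busy_latency_ms: float = 5.0   # client latency above which GC backs off
    backlog_limit: int = 100_000   # unused-block estimate above which GC speeds up

    def adjust(self, client_latency_ms: float, backlog_blocks: int) -> int:
        """Return the subset size to use for the next pass."""
        if client_latency_ms > self.busy_latency_ms:
            # Client I/O is suffering: slow garbage collection down.
            self.subset_size = max(self.min_subset, self.subset_size // 2)
        elif backlog_blocks > self.backlog_limit:
            # Garbage collection is falling behind: speed it up.
            self.subset_size = min(self.max_subset, self.subset_size * 2)
        return self.subset_size


def bin_subsets(bin_blocks, subset_size):
    """Yield consecutive subsets of a bin instead of the entire bin."""
    for start in range(0, len(bin_blocks), subset_size):
        yield bin_blocks[start:start + subset_size]
```

Processing a bin in subsets keeps each pass short, so the in-use information consulted for a subset is less likely to be stale by the time its blocks are freed.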

    GARBAGE COLLECTION AND BIN SYNCHRONIZATION FOR DISTRIBUTED STORAGE ARCHITECTURE

    Publication number: US20230325081A1

    Publication date: 2023-10-12

    Application number: US17717454

    Application date: 2022-04-11

    Applicant: NetApp Inc.

    CPC classification number: G06F3/0608 G06F3/067 G06F3/0652

    Abstract: Techniques are provided for implementing garbage collection and bin synchronization for a distributed storage architecture of worker nodes managing distributed storage composed of bins of blocks. As the distributed storage architecture scales out to accommodate more storage and worker nodes, garbage collection used to free unused blocks becomes unmanageable and slow. Accordingly, garbage collection is improved by utilizing heuristics to dynamically speed up or slow down garbage collection and to set sizes for subsets of a bin to process instead of the entire bin. This ensures that garbage collection does not use stale information about which blocks are in use, does not unduly impact client I/O processing, and, conversely, does not fall behind. Garbage collection can be incorporated into a bin sync process to improve the efficiency of the bin sync process so that unused blocks are not needlessly copied by the bin sync process.

    ADJUSTMENT OF GARBAGE COLLECTION PARAMETERS IN A STORAGE SYSTEM

    Publication number: US20220197789A1

    Publication date: 2022-06-23

    Application number: US17691588

    Application date: 2022-03-10

    Applicant: NetApp, Inc.

    Abstract: A system, method, and machine-readable storage medium for performing garbage collection in a distributed storage system are provided. In some embodiments, an efficiency level of a garbage collection process is monitored. The garbage collection process may include removal of one or more data blocks of a set of data blocks that is referenced by a set of content identifiers. A set of slice services and the set of data blocks may reside in a cluster, and a set of probabilistic filters (e.g., Bloom filters) may indicate whether the set of data blocks is in use. At least one parameter of a probabilistic filter of the set of probabilistic filters may be adjusted (e.g., increased or reduced) if the efficiency level is below an efficiency threshold. Garbage collection may be performed on the set of data blocks in accordance with the set of probabilistic filters.
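
As a rough illustration of filter-driven garbage collection, the sketch below builds a Bloom filter from the content identifiers that are in use and frees only blocks the filter proves unused. The abstract does not define the efficiency metric or say which parameter is tuned, so `next_filter_bits`, its `removed / expected_unused` ratio, and the doubling rule are assumptions made for the example. The `BloomFilter` class is a generic textbook construction, not NetApp's implementation.

```python
import hashlib


class BloomFilter:
    """Textbook Bloom filter: no false negatives, some false positives."""

    def __init__(self, num_bits: int, num_hashes: int):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key: bytes):
        # Derive num_hashes bit positions by salting SHA-256 with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(i.to_bytes(4, "big") + key).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key: bytes) -> None:
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, key: bytes) -> bool:
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(key))


def collect(stored_ids, in_use_ids, num_bits, num_hashes=3):
    """Return the stored block IDs that the filter shows are not in use."""
    in_use = BloomFilter(num_bits, num_hashes)
    for cid in in_use_ids:
        in_use.add(cid)
    return [cid for cid in stored_ids if cid not in in_use]


def next_filter_bits(num_bits, removed, expected_unused, threshold=0.9):
    """Hypothetical tuning rule: grow the filter when too little was freed."""
    efficiency = removed / expected_unused if expected_unused else 1.0
    return num_bits * 2 if efficiency < threshold else num_bits
```

A false positive only causes an unused block to survive until a later pass, which is why an undersized filter shows up as low efficiency rather than as data loss.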

    Data tracking for efficient recovery of a storage array

    Type: Granted invention patent (in force)

    Publication number: US09547552B2

    Publication date: 2017-01-17

    Application number: US14567743

    Application date: 2014-12-11

    Applicant: NetApp, Inc.

    Abstract: A system and method for maintaining operation of a storage array with one or more failed storage devices and for quickly recovering when failing devices are replaced are provided. In some embodiments, the method includes receiving a data transaction directed to a volume and determining that a storage device associated with the volume is inoperable. In response to determining that the storage device is inoperable, a data extent is recorded in a change log in a storage controller cache. The data extent is associated with the data transaction and allocated to the storage device that is inoperable. The data transaction is performed using at least one other storage device associated with the volume, and data allocated to the storage device is subsequently reconstructed using the recorded data extent.
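
The change-log idea can be sketched with a deliberately simplified model. The example below assumes a two-device mirror and a device that returns with its earlier contents intact. Neither assumption comes from the abstract, which speaks more generally of a volume, a storage controller cache, and reconstruction. `MirroredVolume` and its methods are hypothetical names.

```python
class MirroredVolume:
    """Two-device mirror standing in for a real array (an assumption)."""

    def __init__(self):
        self.devices = {"d0": {}, "d1": {}}  # device -> {extent: data}
        self.failed = set()
        # (device, extent) pairs written while the device was inoperable;
        # this set plays the role of the change log in the controller cache.
        self.change_log = set()

    def write(self, extent, data):
        for name, store in self.devices.items():
            if name in self.failed:
                self.change_log.add((name, extent))  # record, do not write
            else:
                store[extent] = data                 # served by the other device

    def recover(self, name):
        """Rebuild only the logged extents of a returning device."""
        self.failed.discard(name)
        peer = next(d for d in self.devices if d != name)
        rebuilt = sorted(e for dev, e in self.change_log if dev == name)
        for extent in rebuilt:
            self.devices[name][extent] = self.devices[peer][extent]
        self.change_log = {(d, e) for d, e in self.change_log if d != name}
        return rebuilt
```

Because only logged extents are copied, recovery time in this model scales with the amount of data written during the outage rather than with the size of the device.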

    Data access request monitoring to reduce system resource use for background operations

    Type: Granted invention patent (in force)

    Publication number: US09367245B2

    Publication date: 2016-06-14

    Application number: US13871783

    Application date: 2013-04-26

    Applicant: NetApp, Inc.

    Abstract: An I/O processing stack includes a proxy that can provide processing services for access requests to initialized and uninitialized storage regions. For a write request, the proxy stores write information in a write metadata repository. If the write is requested for an address in an initialized storage region of the storage system, the proxy performs a write to the initialized region based on region information in the write I/O access request. If the write is requested for an address in an uninitialized storage region of the storage system, the proxy performs an on-demand initialization of the storage region and then performs a write to the storage region based on region information provided by the proxy.
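
The write path described here can be sketched as follows. `InitProxy`, its zero-fill initialization, and the rule that a write must fit inside one region are assumptions made to keep the example small. The abstract does not specify how regions are tracked or initialized.

```python
class InitProxy:
    """Hypothetical proxy that initializes storage regions on first write."""

    def __init__(self, region_size: int):
        self.region_size = region_size
        self.regions = {}          # region index -> bytearray (initialized only)
        self.write_metadata = []   # stands in for the write metadata repository

    def _initialize(self, region: int) -> None:
        # On-demand initialization: zero-fill the region (an assumption).
        self.regions[region] = bytearray(self.region_size)

    def write(self, address: int, data: bytes) -> int:
        region, offset = divmod(address, self.region_size)
        if offset + len(data) > self.region_size:
            raise ValueError("write must fit within one region in this sketch")
        self.write_metadata.append((address, len(data)))
        if region not in self.regions:
            self._initialize(region)   # uninitialized region: initialize first
        self.regions[region][offset:offset + len(data)] = data
        return region
```

A second write to the same region skips initialization, so the cost of initializing is paid once per region and only for regions that are actually used.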

    Background initialization for protection information enabled storage volumes

    Type: Granted invention patent (in force)

    Publication number: US09235471B2

    Publication date: 2016-01-12

    Application number: US13956013

    Application date: 2013-07-31

    Applicant: NetApp, Inc.

    Abstract: Technology is disclosed for performing background initialization on protection information enabled storage volumes or drives. In some embodiments, a storage controller generates multiple I/O requests for stripe segments of each drive (e.g., disk) of multiple drives of a RAID-based system (e.g., RAID-based disk array). The I/O requests are then sorted for each of the drives according to a pre-determined arrangement and initiated in parallel to the disks while enforcing the pre-determined arrangement. Sorting and issuing the I/O requests in the manner described herein can, for example, reduce drive head movement resulting in faster storage subsystem initialization.
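
A minimal sketch of the sort-then-issue pattern follows. The abstract only says the requests follow a pre-determined arrangement, so ascending logical block address (LBA) order is an assumption here, as are the function names and the `issue` callback. Each drive's queue is drained in order by its own worker thread, so drives run in parallel while per-drive ordering is preserved.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def plan_initialization(requests):
    """Group (drive, lba) requests per drive and sort each group by LBA."""
    per_drive = defaultdict(list)
    for drive, lba in requests:
        per_drive[drive].append(lba)
    return {drive: sorted(lbas) for drive, lbas in per_drive.items()}


def run(plan, issue):
    """Issue each drive's requests in sorted order, one worker per drive."""
    def drain(drive):
        # Sequential within a drive, which enforces the arrangement.
        return [issue(drive, lba) for lba in plan[drive]]

    with ThreadPoolExecutor(max_workers=max(1, len(plan))) as pool:
        return dict(zip(plan, pool.map(drain, plan)))
```

On rotating media, visiting segments in ascending address order is one way to limit head movement, which is the benefit the abstract mentions.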

    Use of cluster-level redundancy within a cluster of a distributed storage management system to address node-level errors

    Publication number: US12253920B2

    Publication date: 2025-03-18

    Application number: US18608742

    Application date: 2024-03-18

    Applicant: NetApp, Inc.

    Abstract: Systems and methods that make use of cluster-level redundancy within a distributed storage management system to address various node-level error scenarios are provided. Rather than using a generalized one-size-fits-all approach to reduce complexity, an approach tailored to the node-level error scenario at issue may be performed to avoid doing more than necessary. According to one embodiment, after identifying a missing branch of a tree implemented by a KV store of a first node of a cluster of a distributed storage management system, a branch resynchronization process may be performed, including, for each block ID in the range of block IDs of the missing branch (i) reading a data block corresponding to the block ID from a second node of the cluster that maintains redundant information relating to the block ID; and (ii) restoring the block ID within the KV store by writing the data block to the first node.
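
Steps (i) and (ii) of the branch resynchronization can be sketched with plain dictionaries standing in for the two nodes' KV stores. Treating block IDs as integers in a contiguous range is an assumption for illustration, and `resync_branch` is a hypothetical name.

```python
def resync_branch(local_kv, peer_kv, first_id, last_id):
    """Restore block IDs in [first_id, last_id] from a redundant peer."""
    restored = []
    for block_id in range(first_id, last_id + 1):
        if block_id in peer_kv:
            data = peer_kv[block_id]      # (i) read the block from the second node
            local_kv[block_id] = data     # (ii) write it back to the first node
            restored.append(block_id)
    return restored
```

Only the range covered by the missing branch is touched, which matches the abstract's point about not doing more work than the error scenario requires.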

    Garbage collection and bin synchronization for distributed storage architecture

    Publication number: US11941297B2

    Publication date: 2024-03-26

    Application number: US17717469

    Application date: 2022-04-11

    Applicant: NetApp Inc.

    Abstract: Techniques are provided for implementing garbage collection and bin synchronization for a distributed storage architecture of worker nodes managing distributed storage composed of bins of blocks. As the distributed storage architecture scales out to accommodate more storage and worker nodes, garbage collection used to free unused blocks becomes unmanageable and slow. Accordingly, garbage collection is improved by utilizing heuristics to dynamically speed up or slow down garbage collection and to set sizes for subsets of a bin to process instead of the entire bin. This ensures that garbage collection does not use stale information about which blocks are in use, does not unduly impact client I/O processing, and, conversely, does not fall behind. Garbage collection can be incorporated into a bin sync process to improve the efficiency of the bin sync process so that unused blocks are not needlessly copied by the bin sync process.

    USE OF CLUSTER-LEVEL REDUNDANCY WITHIN A CLUSTER OF A DISTRIBUTED STORAGE MANAGEMENT SYSTEM TO ADDRESS NODE-LEVEL ERRORS

    Publication number: US20230153213A1

    Publication date: 2023-05-18

    Application number: US17680621

    Application date: 2022-02-25

    Applicant: NetApp, Inc.

    CPC classification number: G06F11/1662 G06F11/3034 G06F16/27

    Abstract: Systems and methods that make use of cluster-level redundancy within a distributed storage management system to address various node-level error scenarios are provided. Rather than making use of a generalized one-size-fits-all approach in an effort to reduce complexity, an approach tailored to the node-level error scenario at issue may be performed to avoid doing more than necessary. According to one embodiment, responsive to identification of a failed RAID stripe by a node of a cluster of a distributed storage management system, for each block ID of multiple block IDs associated with the failed RAID stripe, a data block is restored corresponding to the block ID by reading the data block from another node of the cluster having a redundant copy of the data block; and writing the redundant copy of the data block to a storage area of the node that is unaffected by the failed RAID stripe.

    USE OF CLUSTER-LEVEL REDUNDANCY WITHIN A CLUSTER OF A DISTRIBUTED STORAGE MANAGEMENT SYSTEM TO ADDRESS NODE-LEVEL ERRORS

    Publication number: US20230152986A1

    Publication date: 2023-05-18

    Application number: US17680631

    Application date: 2022-02-25

    Applicant: NetApp, Inc.

    CPC classification number: G06F3/0622 G06F3/064 G06F3/0679

    Abstract: Systems and methods that make use of cluster-level redundancy within a distributed storage management system to address various node-level error scenarios are provided. Rather than using a generalized one-size-fits-all approach to reduce complexity, an approach tailored to the node-level error scenario at issue may be performed to avoid doing more than necessary. According to one embodiment, responsive to identifying a missing branch of a tree implemented by a KV store of a first node of a cluster of a distributed storage management system, a branch resynchronization process may be performed, including, for each block ID in the range of block IDs of the missing branch (i) reading a data block corresponding to the block ID from a second node of the cluster that maintains redundant information relating to the block ID; and (ii) restoring the block ID within the KV store by writing the data block to the first node.
