IN-MEMORY HASH ENTRIES AND HASHES USED TO IMPROVE KEY SEARCH OPERATIONS FOR KEYS OF A KEY VALUE STORE

    Publication number: US20230350810A1

    Publication date: 2023-11-02

    Application number: US17732098

    Application date: 2022-04-28

    Applicant: NetApp Inc.

    CPC classification number: G06F12/1018

    Abstract: Techniques are provided for implementing a hash building process and an append hash building process. The hash building process builds in-memory hash entries for bins of keys stored within sorted logs of a log structured merge tree used to store keys of a key-value store. The in-memory hash entries can be used to identify the starting locations of bins of keys within the log structured merge tree so that a key within a bin can be searched for from the starting location of the bin as opposed to having to search the entire log structured merge tree. The append hash building process builds two hashes that can be used to more efficiently locate keys and/or ranges of keys within an unsorted append log that would otherwise require a time-consuming binary search of the entire append log.
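
The bin-start lookup described in this abstract can be illustrated with a minimal sketch. It is not the patented implementation: the bin assignment (first byte of the key), the function names, and the use of a Python list as the sorted log are all assumptions made for illustration.

```python
from bisect import bisect_left


def bin_of(key: bytes) -> int:
    """Hypothetical bin assignment: the first byte of the key."""
    return key[0]


def build_bin_index(sorted_keys):
    """Map each bin to the position of its first key in the sorted log."""
    index = {}
    for pos, key in enumerate(sorted_keys):
        index.setdefault(bin_of(key), pos)
    return index


def find_key(sorted_keys, index, key):
    """Search only inside the key's bin; return its position or -1."""
    start = index.get(bin_of(key))
    if start is None:
        return -1
    # The bin ends where the next populated bin starts, or at the end of the log.
    later = [s for s in index.values() if s > start]
    end = min(later) if later else len(sorted_keys)
    pos = bisect_left(sorted_keys, key, start, end)
    return pos if pos < end and sorted_keys[pos] == key else -1
```

Because the keys are sorted and the assumed bin is a key prefix, each bin occupies one contiguous range, so the search touches only that range rather than the whole log.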

    Dynamic Transitioning of Protection Information in Array Systems
    Type: invention application (legal status: in force)

    Publication number: US20160378363A1

    Publication date: 2016-12-29

    Application number: US14746938

    Application date: 2015-06-23

    Applicant: NetApp, Inc.

    Abstract: A system, method, and computer program product are described for providing dynamic enabling and/or disabling of protection information (PI) in array systems during operation. A storage system receives a request to transition a volume from PI disabled to PI enabled during regular operation. The storage system synchronizes and purges the cache associated with the target volume. The storage system initiates an immediate availability format (IAF-PI) process to initialize PI for the associated data blocks of the volume's storage devices. The storage system continues receiving I/O requests as the IAF-PI process sweeps through the storage devices. The storage system inserts and checks PI for the write data as it is written to the storage devices. The storage system inserts PI for requested data above the IAF-PI boundary and checks PI for requested data below the IAF-PI boundary. The transition remains an online process that avoids downtime.
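
The boundary rule for requested data can be sketched as follows. This is an illustration only: it assumes that block addresses below the IAF-PI boundary have already been swept, and the function name, the tuple layout, and the treatment of the boundary block itself as "above" are hypothetical.

```python
def plan_read(start_lba, count, boundary):
    """Split a read into (action, first_lba, block_count) segments.

    Assumption: blocks with an address below `boundary` already carry PI
    and are checked; blocks at or above it get PI inserted.
    """
    if count <= 0:
        return []
    end = start_lba + count
    segments = []
    if start_lba < boundary:
        checked_end = min(end, boundary)
        segments.append(("check", start_lba, checked_end - start_lba))
    if end > boundary:
        insert_start = max(start_lba, boundary)
        segments.append(("insert", insert_start, end - insert_start))
    return segments
```

A request that straddles the boundary is split in two, so I/O can continue while the sweep advances.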

    USE OF CLUSTER-LEVEL REDUNDANCY WITHIN A CLUSTER OF A DISTRIBUTED STORAGE MANAGEMENT SYSTEM TO ADDRESS NODE-LEVEL ERRORS

    Publication number: US20250094295A1

    Publication date: 2025-03-20

    Application number: US18962013

    Application date: 2024-11-27

    Applicant: NetApp, Inc.

    Abstract: Systems and methods that make use of cluster-level redundancy within a distributed storage management system to address various node-level error scenarios are provided. According to one embodiment, an instance of a key-value (KV) store of a first node of a plurality of nodes of a cluster of a distributed storage system manages storage of data blocks as values and corresponding block identifiers (IDs) as keys. A list of missing block IDs that are in use for one or more volumes associated with the first node but that are missing from the instance of the KV store is identified by performing a data integrity check on the instance of the KV store. After the list of missing block IDs is identified, instead of treating the first node as failed, the missing block IDs are restored by writing redundant data blocks retrieved from other nodes within the cluster to the first node.
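
The integrity check and restore step can be sketched under strong simplifying assumptions: each KV store instance is modeled as a plain Python dict from block ID to data, and both function names are hypothetical rather than taken from the patent.

```python
def find_missing_block_ids(in_use_ids, kv_store):
    """Return the in-use block IDs that are absent from this node's KV store."""
    return sorted(bid for bid in set(in_use_ids) if bid not in kv_store)


def restore_missing(kv_store, missing_ids, peers):
    """Copy each missing block from the first peer that holds a redundant copy.

    Returns the IDs that no peer could supply.
    """
    unrestored = []
    for bid in missing_ids:
        for peer in peers:
            if bid in peer:
                kv_store[bid] = peer[bid]
                break
        else:
            unrestored.append(bid)
    return unrestored
```

In this sketch the node stays in service and only the missing entries are rewritten, which mirrors the abstract's point about not treating the node as failed.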

    Prefetching keys for garbage collection

    Publication number: US12204800B2

    Publication date: 2025-01-21

    Application number: US17732065

    Application date: 2022-04-28

    Applicant: NetApp Inc.

    Abstract: Techniques are provided for implementing a garbage collection process and a prediction read ahead mechanism to prefetch keys into memory to improve the efficiency and speed of the garbage collection process. A log structured merge tree is used to store keys of key-value pairs within a key-value store. If a key is no longer referenced by any worker nodes of a distributed storage architecture, then the key can be freed to store other data. Accordingly, garbage collection is performed to identify and free unused keys. The speed and efficiency of garbage collection are improved by dynamically adjusting the amount and rate at which keys are prefetched from disk and cached into faster memory for processing by the garbage collection process.
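
The abstract does not say how the prefetch amount is adjusted, so the following is only one possible feedback rule, with every name, threshold, and limit being a hypothetical choice: grow the batch when garbage collection consumed everything that was prefetched, and shrink it when less than half was used.

```python
def next_prefetch_size(current, consumed, prefetched, min_size=64, max_size=4096):
    """Pick the next prefetch batch size from the last round's usage.

    All thresholds here are illustrative assumptions, not values from the patent.
    """
    if consumed >= prefetched:
        # Garbage collection drained the cache: prefetch more next time.
        return min(current * 2, max_size)
    if consumed < prefetched // 2:
        # Most prefetched keys went unused: prefetch less next time.
        return max(current // 2, min_size)
    return current
```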

    Use of cluster-level redundancy within a cluster of a distributed storage management system to address node-level errors

    Publication number: US12164397B2

    Publication date: 2024-12-10

    Application number: US18478149

    Application date: 2023-09-29

    Applicant: NetApp, Inc.

    Abstract: Systems and methods that make use of cluster-level redundancy within a distributed storage management system to address various node-level error scenarios are provided. According to one embodiment, a first node of multiple nodes of a distributed storage system, represented in the form of a cluster of the multiple nodes, identifies the potential existence of an error associated with a Redundant Array of Independent Disks (RAID) stripe. A list of block identifiers (IDs) associated with the RAID stripe may then be identified. Rather than performing a traditional RAID recovery/reconstruction approach that is resource intensive in nature and that requires an excessive amount of rebuild time, a more efficient RAID stripe resynchronization process may be performed to restore data associated with the RAID stripe.

    DEFRAGMENTATION FOR LOG STRUCTURED MERGE TREE TO IMPROVE READ AND WRITE AMPLIFICATION

    Publication number: US20240281411A1

    Publication date: 2024-08-22

    Application number: US18648989

    Application date: 2024-04-29

    Applicant: NetApp, Inc.

    CPC classification number: G06F16/1748 G06F16/182

    Abstract: Techniques are provided for implementing a defragmentation process during a merge operation performed by a re-compaction process upon a log structured merge tree. The log structured merge tree is used to store keys of key-value pairs within a key-value store. As the log structured merge tree fills with keys over time, the re-compaction process is performed to merge keys down to lower levels of the log structured merge tree to re-compact the keys. Re-compaction can result in fragmentation because the re-compaction operations rewrite the keys within storage without spatial locality. Fragmentation increases read and write amplification when accessing the keys stored in different locations within the storage. Accordingly, the defragmentation process is performed during a last merge operation of the re-compaction process in order to store keys together within the storage, thus reducing read and write amplification when accessing the keys.

    GARBAGE COLLECTION AND BIN SYNCHRONIZATION FOR DISTRIBUTED STORAGE ARCHITECTURE

    Publication number: US20240220165A1

    Publication date: 2024-07-04

    Application number: US18607665

    Application date: 2024-03-18

    Applicant: NetApp Inc.

    Abstract: Techniques are provided for implementing garbage collection and bin synchronization for a distributed storage architecture of worker nodes managing distributed storage composed of bins of blocks. As the distributed storage architecture scales out to accommodate more storage and worker nodes, garbage collection used to free unused blocks becomes unmanageable and slow. Accordingly, garbage collection is improved by utilizing heuristics to dynamically speed up or slow down garbage collection and to set sizes for subsets of a bin to process instead of the entire bin. This ensures that garbage collection does not use stale information about what blocks are in-use, and ensures that garbage collection neither unduly impacts client I/O processing nor falls behind. Garbage collection can be incorporated into a bin sync process to improve the efficiency of the bin sync process so that unused blocks are not needlessly copied by the bin sync process.

    Adjustment of garbage collection parameters in a storage system

    Publication number: US11816029B2

    Publication date: 2023-11-14

    Application number: US17691588

    Application date: 2022-03-10

    Applicant: NetApp, Inc.

    Abstract: A system, method, and machine-readable storage medium for performing garbage collection in a distributed storage system are provided. In some embodiments, an efficiency level of a garbage collection process is monitored. The garbage collection process may include removal of one or more data blocks of a set of data blocks that is referenced by a set of content identifiers. A set of slice services and the set of data blocks may reside in a cluster, and a set of probabilistic filters (e.g., Bloom filters) may indicate whether the set of data blocks is in-use. At least one parameter of a probabilistic filter of the set of probabilistic filters may be adjusted (e.g., increased or reduced) if the efficiency level is below an efficiency threshold. Garbage collection may be performed on the set of data blocks in accordance with the set of probabilistic filters.
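
A minimal sketch of a Bloom filter with an adjustable size parameter follows. The class, the SHA-256 based hashing, and the rule in `adjust_num_bits` (which only grows the filter, although the abstract also allows reducing a parameter) are assumptions for illustration; the abstract does not define how the efficiency level is computed.

```python
import hashlib


class BloomFilter:
    """Small Bloom filter: may report false positives, never false negatives."""

    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item: bytes):
        # Derive one bit position per hash function by salting SHA-256.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(i.to_bytes(4, "big") + item).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item: bytes):
        for p in self._positions(item):
            self.bits[p // 8] |= 1 << (p % 8)

    def __contains__(self, item: bytes):
        return all(self.bits[p // 8] & (1 << (p % 8)) for p in self._positions(item))


def adjust_num_bits(num_bits, efficiency, threshold, factor=2):
    """Hypothetical rule: enlarge the filter when efficiency is below the threshold."""
    return num_bits * factor if efficiency < threshold else num_bits
```

Because a Bloom filter has no false negatives, a block whose content identifier is absent from the filter is safe to remove; a larger filter lowers the false-positive rate, so fewer unused blocks are wrongly kept.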
