Slice file recovery using dead replica slice files

    Publication No.: US12014056B2

    Publication date: 2024-06-18

    Application No.: US17893511

    Filing date: 2022-08-23

    Applicant: NetApp Inc.

    IPC classification: G06F3/06

    Abstract: Techniques are provided for repairing a primary slice file, affected by a storage device error, by using one or more dead replica slice files. The primary slice file is used by a node of a distributed storage architecture as an indirection layer between storage containers (e.g., a volume or LUN) and physical storage where data is physically stored. To improve resiliency of the distributed storage architecture, changes to the primary slice file are replicated to replica slice files hosted by other nodes. If a replica slice file falls out of sync with the primary slice file, then the replica slice file is considered dead (out of sync) and could potentially comprise stale data. If a storage device error affects blocks storing data of the primary slice file, then the techniques provided herein can repair the primary slice file using non-stale data from one or more dead replica slice files.
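
    The repair idea in this abstract can be illustrated with a minimal sketch. The abstract does not say how non-stale data is recognized, so everything below is an assumption made for illustration: the per-entry sequence number `seq`, the `last_synced_seq` watermark on each dead replica, the `last_write_seq` table of most recent writes per logical block address (LBA), and all class and function names are hypothetical, not taken from the patent.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Set


@dataclass(frozen=True)
class SliceEntry:
    """One slice-file mapping: logical block address -> block ID."""
    block_id: str
    seq: int  # hypothetical: sequence number of the write that set this entry


@dataclass
class ReplicaSliceFile:
    """A dead replica that applied writes up to `last_synced_seq` (assumed field)."""
    last_synced_seq: int
    entries: Dict[int, SliceEntry]


def repair_primary(
    primary: Dict[int, SliceEntry],
    damaged_lbas: Iterable[int],
    last_write_seq: Dict[int, int],
    dead_replicas: Iterable[ReplicaSliceFile],
) -> Set[int]:
    """Fill damaged primary entries from dead replicas; return LBAs left unrepaired."""
    replicas = list(dead_replicas)
    unrepaired: Set[int] = set()
    for lba in damaged_lbas:
        needed = last_write_seq.get(lba)
        candidate = None
        if needed is not None:
            for replica in replicas:
                entry = replica.entries.get(lba)
                # Assumed staleness rule: the replica was still in sync when
                # this LBA was last written, and it holds exactly that write.
                if (
                    entry is not None
                    and replica.last_synced_seq >= needed
                    and entry.seq == needed
                ):
                    candidate = entry
                    break
        if candidate is None:
            unrepaired.add(lba)
        else:
            primary[lba] = candidate
    return unrepaired


# LBA 1 and LBA 2 were lost to a device error; only LBA 1 is recoverable.
primary = {0: SliceEntry("blk-a", 1)}
replicas = [
    ReplicaSliceFile(last_synced_seq=5, entries={1: SliceEntry("blk-old", 3)}),
    ReplicaSliceFile(last_synced_seq=9, entries={1: SliceEntry("blk-new", 7)}),
]
unrepaired = repair_primary(primary, [1, 2], {1: 7, 2: 8}, replicas)
```

    In this sketch a dead replica is not discarded wholesale: an individual entry is still usable when the replica fell out of sync only after the last write to that address.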

    Mapping logical identifiers using multiple identifier spaces

    Publication No.: US10515055B2

    Publication date: 2019-12-24

    Application No.: US14859009

    Filing date: 2015-09-18

    Applicant: NetApp, Inc.

    Abstract: It is determined that a first data unit is to be written to a storage device and that the first data unit is associated with a first attribute. In response to determining that the first data unit is associated with the first attribute, a first identifier is selected from a first identifier space and the first identifier is associated with the first data unit. It is determined that a second data unit is to be written to the storage device and that the second data unit is associated with a second attribute. In response to determining that the second data unit is associated with the second attribute, a second identifier is selected from a second identifier space and the second identifier is associated with the second data unit.
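
    The attribute-driven selection described in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the class name `IdentifierSpaces`, the attribute names `user_data` and `metadata`, and the use of disjoint integer ranges as identifier spaces are all hypothetical.

```python
from typing import Dict, Iterator


class IdentifierSpaces:
    """Assigns each data unit an identifier from the space tied to its attribute."""

    def __init__(self, spaces: Dict[str, range]) -> None:
        # One independent iterator per identifier space (assumed: integer ranges).
        self._next: Dict[str, Iterator[int]] = {
            attribute: iter(space) for attribute, space in spaces.items()
        }
        self.assigned: Dict[str, int] = {}

    def assign(self, unit_name: str, attribute: str) -> int:
        """Select the next free identifier from the attribute's space."""
        if attribute not in self._next:
            raise KeyError(f"no identifier space for attribute {attribute!r}")
        try:
            identifier = next(self._next[attribute])
        except StopIteration:
            raise RuntimeError(
                f"identifier space for {attribute!r} is exhausted"
            ) from None
        self.assigned[unit_name] = identifier
        return identifier


# Two hypothetical attributes, each backed by its own identifier space.
spaces = IdentifierSpaces(
    {"user_data": range(0, 1000), "metadata": range(1000, 2000)}
)
first_id = spaces.assign("unit-1", "user_data")
second_id = spaces.assign("unit-2", "metadata")
third_id = spaces.assign("unit-3", "user_data")
```

    Because the spaces do not overlap in this sketch, the attribute of a data unit can be recovered from its identifier alone.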

    System and method for estimating storage savings from deduplication
    Document type: Granted patent (status: in force)

    Publication No.: US09152333B1

    Publication date: 2015-10-06

    Application No.: US13768191

    Filing date: 2013-02-15

    Applicant: NetApp, Inc.

    IPC classification: G06F12/00 G06F3/06

    Abstract: Techniques for a method of estimating deduplication potential are disclosed herein. The method includes steps of selecting randomly a plurality of data blocks from a data set as a sample of the data set, collecting fingerprints of the plurality of data blocks of the sample, identifying duplicates of fingerprints of the sample from the fingerprints of the plurality of data blocks, estimating a total number of unique fingerprints of the data set depending on a total number of the duplicates of fingerprints of the sample based on a probability of fingerprints from the data set colliding in the sample, and determining a total number of duplicates of fingerprints of the data set depending on the total number of the unique fingerprints of the data set.
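
    The sampling steps in this abstract can be sketched as below. The abstract does not give the collision-probability formula, so the estimator here is a swapped-in assumption: it supposes the data set holds D unique fingerprints that are equally frequent, in which case a sample of n blocks is expected to contain about D * (1 - (1 - 1/D)^n) distinct fingerprints, and solves that relation for D. The function names and the use of SHA-256 as the fingerprint are also illustrative choices.

```python
import hashlib
import random
from typing import Sequence


def estimate_unique(sample_size: int, distinct_in_sample: float, population: int) -> float:
    """Solve D * (1 - (1 - 1/D)**n) = d for D by bisection, capped at `population`."""
    n, d = sample_size, distinct_in_sample
    if d >= n:
        # No collisions in the sample: the model cannot bound D from above.
        return float(population)

    def expected_distinct(unique: float) -> float:
        return unique * (1.0 - (1.0 - 1.0 / unique) ** n)

    lo, hi = float(d), float(population)
    if expected_distinct(hi) <= d:
        return hi
    for _ in range(100):  # expected_distinct is increasing in `unique`
        mid = (lo + hi) / 2.0
        if expected_distinct(mid) < d:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def estimate_duplicates(blocks: Sequence[bytes], sample_size: int, seed: int = 0) -> float:
    """Estimate how many blocks in `blocks` duplicate another block."""
    rng = random.Random(seed)
    sample = rng.sample(blocks, min(sample_size, len(blocks)))
    fingerprints = [hashlib.sha256(block).digest() for block in sample]
    distinct = len(set(fingerprints))
    unique_total = estimate_unique(len(fingerprints), distinct, len(blocks))
    # Duplicates in the data set = total blocks - estimated unique fingerprints.
    return len(blocks) - unique_total


all_distinct = [bytes([i]) for i in range(200)]
all_identical = [b"x"] * 200
no_savings = estimate_duplicates(all_distinct, 50)
max_savings = estimate_duplicates(all_identical, 50)
```

    The sketch samples without replacement while the formula models draws with replacement, so the approximation is only reasonable when the sample is small relative to the data set. Skewed duplicate distributions would need a different estimator.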

    SLICE FILE RECOVERY USING DEAD REPLICA SLICE FILES

    Publication No.: US20240069743A1

    Publication date: 2024-02-29

    Application No.: US17893511

    Filing date: 2022-08-23

    Applicant: NetApp Inc.

    IPC classification: G06F3/06

    Abstract: Techniques are provided for repairing a primary slice file, affected by a storage device error, by using one or more dead replica slice files. The primary slice file is used by a node of a distributed storage architecture as an indirection layer between storage containers (e.g., a volume or LUN) and physical storage where data is physically stored. To improve resiliency of the distributed storage architecture, changes to the primary slice file are replicated to replica slice files hosted by other nodes. If a replica slice file falls out of sync with the primary slice file, then the replica slice file is considered dead (out of sync) and could potentially comprise stale data. If a storage device error affects blocks storing data of the primary slice file, then the techniques provided herein can repair the primary slice file using non-stale data from one or more dead replica slice files.

    SLICE FILE RECOVERY USING DEAD REPLICA SLICE FILES

    Publication No.: US20240338128A1

    Publication date: 2024-10-10

    Application No.: US18744814

    Filing date: 2024-06-17

    Applicant: NetApp, Inc.

    IPC classification: G06F3/06

    Abstract: Techniques are provided for repairing a primary slice file, affected by a storage device error, by using one or more dead replica slice files. The primary slice file is used by a node of a distributed storage architecture as an indirection layer between storage containers (e.g., a volume or LUN) and physical storage where data is physically stored. To improve resiliency of the distributed storage architecture, changes to the primary slice file are replicated to replica slice files hosted by other nodes. If a replica slice file falls out of sync with the primary slice file, then the replica slice file is considered dead (out of sync) and could potentially comprise stale data. If a storage device error affects blocks storing data of the primary slice file, then the techniques provided herein can repair the primary slice file using non-stale data from one or more dead replica slice files.

    MAPPING LOGICAL IDENTIFIERS USING MULTIPLE IDENTIFIER SPACES

    Publication No.: US20170083537A1

    Publication date: 2017-03-23

    Application No.: US14859009

    Filing date: 2015-09-18

    Applicant: NetApp, Inc.

    IPC classification: G06F17/30

    Abstract: It is determined that a first data unit is to be written to a storage device and that the first data unit is associated with a first attribute. In response to determining that the first data unit is associated with the first attribute, a first identifier is selected from a first identifier space and the first identifier is associated with the first data unit. It is determined that a second data unit is to be written to the storage device and that the second data unit is associated with a second attribute. In response to determining that the second data unit is associated with the second attribute, a second identifier is selected from a second identifier space and the second identifier is associated with the second data unit.

    SYSTEMS, METHODS, AND COMPUTER PROGRAM PRODUCTS PROVIDING CHANGE LOGGING IN A DEDUPLICATION PROCESS
    Document type: Patent application (status: in force)

    Publication No.: US20150026424A1

    Publication date: 2015-01-22

    Application No.: US14509892

    Filing date: 2014-10-08

    Applicant: NetApp, Inc.

    IPC classification: G06F3/06

    Abstract: A method performed in a network storage system, the method including receiving a plurality of data blocks at a secondary storage subsystem from a primary storage subsystem, generating a first log that includes a first plurality of entries, one entry for each of the data blocks, in which each entry of the first plurality of entries includes a name for a respective data block and a fingerprint of the respective data block, receiving metadata at the secondary storage subsystem from the primary storage subsystem, the metadata describing relationships between the plurality of blocks and a plurality of files, generating a second log that includes a second plurality of entries, and merging the first log with the second log to generate a change log.
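
    The two-log merge in this abstract can be sketched as follows. The abstract specifies that first-log entries hold a block name and a fingerprint, but it does not describe the second-log entry layout or the merge key. The sketch assumes second-log entries pair a block name with a file name and that the logs are joined on block name. `ChangeEntry`, `build_change_log`, and the SHA-256 fingerprint are hypothetical choices.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class ChangeEntry:
    """One merged record: which block, its fingerprint, and the file it belongs to."""
    block_name: str
    fingerprint: str
    file_name: str


def build_change_log(
    blocks: Iterable[Tuple[str, bytes]],
    block_to_file: Dict[str, str],
) -> List[ChangeEntry]:
    """Build both logs on the secondary side and merge them by block name."""
    # First log: one (name, fingerprint) entry per received data block.
    first_log = sorted(
        (name, hashlib.sha256(data).hexdigest()) for name, data in blocks
    )
    # Second log (assumed layout): one (block name, file name) entry per
    # block-to-file relationship described by the received metadata.
    second_log = sorted(block_to_file.items())
    files = dict(second_log)
    return [
        ChangeEntry(name, fingerprint, files[name])
        for name, fingerprint in first_log
        if name in files
    ]


received_blocks = [("b2", b"world"), ("b1", b"hello")]
received_metadata = {"b1": "f.txt", "b2": "g.txt"}
change_log = build_change_log(received_blocks, received_metadata)
```

    Sorting both logs on the same key before joining keeps the output order deterministic; a real deduplication pass would then group change-log entries by fingerprint.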
