Efficient recovery of erasure coded data

    Publication No.: US10353740B2

    Publication date: 2019-07-16

    Application No.: US15589909

    Filing date: 2017-05-08

    Applicant: NetApp, Inc.

    Abstract: To efficiently recover from a multiple storage node failure, a storage node concurrently restores data fragments to the multiple failed storage nodes, as opposed to restoring each node individually. In the virtual chunk service (VCS) based storage technique, storage nodes are restored as part of an erasure coding group (ECG) repair process. For each ECG being repaired, a storage node performing the restoration process reads data fragments from active nodes in the ECG and generates new data fragments to replace any lost data fragments. The node then stores one of the new data fragments on each of the failed storage nodes. By concurrently restoring data fragments to each failed storage node, the data fragments needed to repair each ECG are read only once, thereby conserving disk operations and network bandwidth.
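The read-once idea in this abstract can be sketched with a toy erasure code. This is a minimal illustration under stated assumptions, not the patented method: it assumes a Reed-Solomon-style code built from polynomial evaluation over the small prime field GF(257), and the names `encode`, `interpolate`, `repair_concurrently`, and `repair_one_node_at_a_time` are hypothetical. It counts fragment reads to compare repairing all failed nodes of one ECG in a single pass against repairing them one node at a time.

```python
P = 257  # toy prime field (assumption); production codes typically use GF(2^8)


def encode(data, n):
    """Treat `data` (k symbols) as polynomial coefficients and return
    n fragments: the polynomial's values at x = 1..n."""
    return {
        x: sum(c * pow(x, i, P) for i, c in enumerate(data)) % P
        for x in range(1, n + 1)
    }


def interpolate(points, x):
    """Lagrange-interpolate the polynomial through `points`, evaluate at x."""
    total = 0
    for xi, yi in points.items():
        num, den = 1, 1
        for xj in points:
            if xj != xi:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        # pow(den, -1, P) is the modular inverse (Python 3.8+)
        total = (total + yi * num * pow(den, -1, P)) % P
    return total


def repair_concurrently(survivors, lost, k):
    """Read k surviving fragments once, then regenerate every lost fragment."""
    sample = dict(list(survivors.items())[:k])  # the only reads performed
    rebuilt = {x: interpolate(sample, x) for x in lost}
    return rebuilt, len(sample)


def repair_one_node_at_a_time(survivors, lost, k):
    """Baseline: repeat the k reads for each failed node separately."""
    rebuilt, reads = {}, 0
    for x in lost:
        part, r = repair_concurrently(survivors, [x], k)
        rebuilt.update(part)
        reads += r
    return rebuilt, reads


# One ECG with k = 4 data symbols spread over n = 6 nodes; nodes 2 and 5 fail.
k, n = 4, 6
fragments = encode([10, 20, 30, 40], n)
lost = [2, 5]
survivors = {x: v for x, v in fragments.items() if x not in lost}

rebuilt, reads = repair_concurrently(survivors, lost, k)
_, baseline_reads = repair_one_node_at_a_time(survivors, lost, k)
print(rebuilt == {x: fragments[x] for x in lost}, reads, baseline_reads)
```

In this toy setup the concurrent repair performs four fragment reads for the ECG, while the node-by-node baseline performs eight, which is the saving the abstract attributes to reading each ECG's fragments only once.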

    VIRTUAL CHUNK SERVICE BASED DATA RECOVERY IN A DISTRIBUTED DATA STORAGE SYSTEM

    Type: Patent application (legal status: in force)

    Publication No.: US20160246677A1

    Publication date: 2016-08-25

    Application No.: US14696001

    Filing date: 2015-04-24

    Applicant: NetApp, Inc.

    Abstract: Technology is disclosed for storing data in a distributed storage system using a virtual chunk service (VCS). In the VCS based storage technique, a storage node (“node”) is split into multiple VCSs and each of the VCSs can be assigned a unique ID in the distributed storage. A set of VCSs from a set of nodes form a storage group, which also can be assigned a unique ID in the distributed storage. When a data object is received for storage, a storage group is identified for the data object, the data object is encoded to generate multiple fragments and each fragment is stored in a VCS of the identified storage group. The data recovery process is made more efficient by using metadata, e.g., VCS to storage node mapping, storage group to VCS mapping, VCS to objects mapping, which eliminates resource intensive read and write operations during recovery.
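The three metadata mappings named in this abstract can be pictured as plain dictionaries. The sketch below is illustrative only: the function `plan_recovery`, the identifiers such as `vcs1` and `nodeA`, and the idea of reassigning a lost VCS to a replacement node are assumptions, not details taken from the patent. It shows how a recovery plan could be derived from metadata lookups alone, without scanning stored data.

```python
def plan_recovery(failed_node, replacement_node,
                  vcs_to_node, group_to_vcs, vcs_to_objects):
    """Return (updated VCS-to-node map, repair plan) using metadata only."""
    updated = dict(vcs_to_node)
    plan = []
    for vcs, node in vcs_to_node.items():
        if node != failed_node:
            continue
        # Assumption: the VCS keeps its ID and is re-homed on the replacement.
        updated[vcs] = replacement_node
        for group, members in group_to_vcs.items():
            if vcs in members:
                plan.append({
                    "group": group,
                    "lost_vcs": vcs,
                    "read_from": [v for v in members if v != vcs],
                    "objects": vcs_to_objects.get(vcs, []),
                })
    return updated, plan


# Hypothetical metadata for one storage group spread over three nodes.
vcs_to_node = {"vcs1": "nodeA", "vcs2": "nodeB", "vcs3": "nodeC"}
group_to_vcs = {"group1": ["vcs1", "vcs2", "vcs3"]}
vcs_to_objects = {"vcs1": ["obj7", "obj9"]}

updated, plan = plan_recovery("nodeA", "nodeD",
                              vcs_to_node, group_to_vcs, vcs_to_objects)
print(updated)
print(plan)
```

The plan lists which peer VCSs to read from and which objects need new fragments, so the recovering node knows its work before touching any storage device.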

    METHODS FOR POLICY-BASED DATA TIERING USING A CLOUD ARCHITECTURE AND DEVICES THEREOF

    Type: Patent application (legal status: in force)

    Publication No.: US20160246517A1

    Publication date: 2016-08-25

    Application No.: US14627034

    Filing date: 2015-02-20

    Applicant: NetApp, Inc.

    Abstract: A method, non-transitory computer readable medium, and storage platform computing apparatus that obtains a lifecycle management policy and configuration information for a cloud repository identified in the lifecycle management policy. The configuration information includes at least one access parameter for the cloud repository. The lifecycle management policy is applied to determine when an object is required to be replicated to the cloud repository in response to a received write request. A request to store the object in the cloud repository is generated, when the object is determined to be required to be stored in the cloud repository, wherein the request includes the access parameter. The request is sent to the cloud repository using a representational state transfer (REST) interface associated with the cloud repository.
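The flow in this abstract (apply the policy on a write, then build a REST request that carries the access parameter) can be sketched as follows. Everything concrete here is an assumption made for illustration: the size-threshold rule, the bearer-token header, the endpoint URL, and the names `CloudRepository`, `LifecyclePolicy`, and `build_store_request` do not come from the patent. The request is only described as a dictionary and is never sent.

```python
from dataclasses import dataclass


@dataclass
class CloudRepository:
    endpoint: str       # hypothetical REST endpoint of the cloud repository
    access_token: str   # stands in for the abstract's "access parameter"


@dataclass
class LifecyclePolicy:
    repository: CloudRepository
    min_size_bytes: int  # hypothetical rule: replicate objects of this size or more

    def requires_cloud_copy(self, size_bytes):
        return size_bytes >= self.min_size_bytes


def build_store_request(policy, object_name, payload):
    """Describe the REST request for a written object, or return None when
    the policy does not require a copy in the cloud repository."""
    if not policy.requires_cloud_copy(len(payload)):
        return None
    repo = policy.repository
    return {
        "method": "PUT",
        "url": f"{repo.endpoint}/{object_name}",
        "headers": {"Authorization": f"Bearer {repo.access_token}"},
        "body": payload,
    }


policy = LifecyclePolicy(CloudRepository("https://repo.example/bucket", "TOKEN"), 8)
small = build_store_request(policy, "a.txt", b"tiny")        # below threshold
large = build_store_request(policy, "b.bin", b"0123456789")  # meets threshold
print(small)
print(large)
```

Keeping the access parameter in the repository configuration rather than in the caller means the write path only has to consult the policy to decide whether and where to replicate.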

    Erasure coding repair availability

    Type: Granted patent

    Publication No.: US10558538B2

    Publication date: 2020-02-11

    Application No.: US15820518

    Filing date: 2017-11-22

    Applicant: NetApp, Inc.

    Abstract: Distributed storage systems frequently use a centralized metadata repository that stores metadata in an eventually consistent distributed database. However, a metadata repository cannot be relied upon for determining which erasure coded fragments are lost because of one or more storage node failures. Instead, when recovering a failed storage node, a list of missing fragments is generated based on fragments stored in storage devices of available storage nodes. A storage node performing the recovery sends a request to one or more of the available storage nodes for a fragment list. The fragment list is generated not from a metadata database but by scanning storage devices for fragments related to the failed storage node. The storage node performing the recovery merges the retrieved lists to create a master list indicating the fragments that should be regenerated for recovery of the failed storage node(s).
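The scan-and-merge step in this abstract can be illustrated with sets. This is a sketch under assumptions: fragments are modeled as `(ecg_id, object_id)` pairs, a fragment counts as "related to the failed node" when its erasure coding group had a member on that node, and the function names are hypothetical. Merging matters because a single node's scan may itself be incomplete.

```python
def scan_for_related_fragments(fragments_on_disk, ecg_members, failed_node):
    """List the (ecg_id, object_id) pairs found on one node's devices whose
    erasure coding group also had a member on the failed node."""
    return {
        (ecg_id, object_id)
        for ecg_id, object_id in fragments_on_disk
        if failed_node in ecg_members[ecg_id]
    }


def merge_fragment_lists(fragment_lists):
    """Union the per-node lists into one sorted master list."""
    master = set()
    for fragment_list in fragment_lists:
        master |= fragment_list
    return sorted(master)


# Hypothetical layout: node n1 has failed; n2 and n3 are available.
ecg_members = {"ecg1": {"n1", "n2", "n3"}, "ecg2": {"n2", "n3", "n4"}}
disks = {
    "n2": [("ecg1", "objA"), ("ecg2", "objB")],                    # lacks objC
    "n3": [("ecg1", "objA"), ("ecg1", "objC"), ("ecg2", "objB")],
}

lists = [scan_for_related_fragments(found, ecg_members, "n1")
         for found in disks.values()]
master = merge_fragment_lists(lists)
print(master)
```

Here `objC` appears only in the scan from `n3`, so it reaches the master list only because the lists are merged; `objB` is excluded because its group never included the failed node.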

    SPACE RESERVATION FOR DISTRIBUTED STORAGE SYSTEMS

    Publication No.: US20190332304A1

    Publication date: 2019-10-31

    Application No.: US16505339

    Filing date: 2019-07-08

    Applicant: NETAPP, INC.

    Abstract: Techniques are described for reserving space on a destination node or volume to increase the likelihood of a successful data transfer in a distributed storage environment. A reservation may be retried at one or more other destinations if the reservation fails at a first destination. In some embodiments, the data-transfer process can be paused or terminated before data is transferred to one or more destinations if a reservation fails. Reserving space on a destination node or volume can increase the likelihood of a successful data transfer, which in turn can increase the likelihood of efficient resource usage in a storage system.
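The reserve-then-transfer pattern in this abstract can be sketched as below. The `Destination` class, its byte counters, and `reserve_before_transfer` are hypothetical names chosen for illustration; the abstract does not specify how reservations are tracked. The point shown is that a failed reservation leads to a retry at another destination, and that no transfer starts when every reservation fails.

```python
class Destination:
    """A node or volume with a simple free-space and reservation counter."""

    def __init__(self, name, free_bytes):
        self.name = name
        self.free_bytes = free_bytes
        self.reserved_bytes = 0

    def reserve(self, size):
        """Reserve `size` bytes if unreserved free space allows it."""
        if size > self.free_bytes - self.reserved_bytes:
            return False
        self.reserved_bytes += size
        return True


def reserve_before_transfer(size, destinations):
    """Try each destination in order; return the first that accepts the
    reservation, or None so the caller can pause or terminate the transfer."""
    for dest in destinations:
        if dest.reserve(size):
            return dest
    return None


dests = [Destination("vol1", 50), Destination("vol2", 200)]
chosen = reserve_before_transfer(100, dests)    # vol1 is too small, vol2 accepts
nothing = reserve_before_transfer(500, dests)   # no destination can hold this
print(chosen.name, nothing)
```

Because space is claimed before any bytes move, a transfer that cannot fit fails at reservation time instead of partway through the copy.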

    MANAGER ELECTION FOR ERASURE CODING GROUPS

    Type: Patent application

    Publication No.: US20190251009A1

    Publication date: 2019-08-15

    Application No.: US16391842

    Filing date: 2019-04-23

    Applicant: NetApp, Inc.

    Abstract: To ensure that there is an elected manager among storage nodes of an erasure coding group (“ECG”), an ECG manager (“ECGM”) election process is periodically performed among available storage nodes that are configured with the software to perform the services of an ECGM. When a storage node is activated, an ECGM process of the storage node begins executing and is assigned a process identifier (“PID”). A storage node can utilize a service query framework to identify other available storage nodes and retrieve their ECGM PIDs. The storage node then selects a PID according to a criterion and elects the storage node corresponding to the selected PID to be the acting ECGM. This process is performed periodically, so even if the acting ECGM storage node fails, a new ECGM is eventually selected from the available storage nodes.
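The periodic election in this abstract reduces to applying one deterministic rule to the set of PIDs each node can see. The abstract leaves the criterion open, so the sketch below assumes "lowest PID wins, ties broken by node name"; that rule and the function name `elect_ecgm` are illustrative choices only.

```python
def elect_ecgm(available_pids):
    """available_pids maps node name -> ECGM process ID for every node that
    answered the service query. Returns the elected node, or None if empty."""
    if not available_pids:
        return None
    # Assumed criterion: lowest PID, with the node name as a tie-breaker.
    return min(available_pids, key=lambda node: (available_pids[node], node))


pids = {"node1": 4312, "node2": 1207, "node3": 2950}
first = elect_ecgm(pids)      # election round with all nodes available

del pids["node2"]             # the acting ECGM node fails
second = elect_ecgm(pids)     # the next periodic round picks a new manager
print(first, second)
```

Since every node applies the same rule to the same inputs, nodes with a consistent view of the available PIDs reach the same result without extra coordination, and rerunning the election after a failure selects a replacement.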
