METHODS FOR FACILITATING BATCH ANALYTICS ON ARCHIVED DATA AND DEVICES THEREOF
    1.
    Invention application (in force)

    Publication No.: US20160070766A1

    Publication date: 2016-03-10

    Application No.: US14480151

    Filing date: 2014-09-08

    Applicant: NetApp, Inc.

    CPC classification number: G06F17/30073

    Abstract: A method, non-transitory computer readable medium, and archive node computing device that receives an indication of each of a plurality of archived files required to service a job from one of a plurality of compute node computing devices of an analytics tier. An optimized schedule for retrieving the archived files from one or more archive storage devices of an archive tier is generated. The optimized schedule is provided to the one of the plurality of compute node computing devices. Requests for the archived files are received from the one of the plurality of compute node computing devices and at least one other of the plurality of compute node computing devices, wherein the requests are sent according to the optimized schedule.

    MIGRATING DEDUPLICATED DATA
    2.
    Invention application (in force)

    Publication No.: US20140114933A1

    Publication date: 2014-04-24

    Application No.: US13655287

    Filing date: 2012-10-18

    Applicant: NetApp, Inc.

    Abstract: Methods and apparatuses for efficiently migrating deduplicated data are provided. In one example, a data management system includes a data storage volume, a memory including machine executable instructions, and a computer processor. The data storage volume includes data objects and free storage space. The computer processor executes the instructions to perform deduplication of the data objects and determine migration efficiency metrics for groups of the data objects. Determining the migration efficiency metrics includes determining, for each group, a relationship between the free storage space that will result if the group is migrated from the volume and the resources required to migrate the group from the volume.
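
    Example sketch: the abstract does not define the migration efficiency metric beyond "a relationship" between freed space and migration cost. The Python below assumes one plausible reading, a ratio of bytes freed to bytes moved, where blocks shared with another group free nothing. The names `migration_efficiency`, `groups`, and `BLOCK_SIZE` are hypothetical and do not come from the patent.

```python
from typing import Dict, Set

BLOCK_SIZE = 4096  # bytes per block; an assumed value for illustration


def migration_efficiency(groups: Dict[str, Set[int]],
                         block_size: int = BLOCK_SIZE) -> Dict[str, float]:
    """Return freed-bytes / moved-bytes for each group of deduplicated objects."""
    metrics = {}
    for name, blocks in groups.items():
        # Blocks still referenced by another group stay on the volume after migration.
        others = set().union(*(b for n, b in groups.items() if n != name))
        freed = len(blocks - others) * block_size
        moved = len(blocks) * block_size  # assumed cost: every block must be copied
        metrics[name] = freed / moved if moved else 0.0
    return metrics


# Each group maps to the set of block IDs its objects reference.
groups = {
    "A": {1, 2, 3, 4},
    "B": {3, 4, 5},
    "C": {6, 7},
}
metrics = migration_efficiency(groups)
best = max(metrics, key=metrics.get)  # group with the best payoff per byte moved
```

    Under these assumptions, group C shares no blocks and so frees everything it moves, while A and B free only their unshared blocks.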

    Deduplicating data for a data storage system using similarity determinations

    Publication No.: US09933970B2

    Publication date: 2018-04-03

    Application No.: US14928848

    Filing date: 2015-10-30

    Applicant: NetApp, Inc.

    CPC classification number: G06F3/0641 G06F3/0608 G06F3/0686

    Abstract: A method and system for deduplicating data for a data storage system using similarity determinations are described. A tape library is arranged in a hierarchy of tape groups and tape plexes. Tape groups are an admin-visible entity and comprise multiple tape plexes (at least equal to the number of replicas in a tape group). Tape plexes in turn comprise multiple tape cartridges. Data files and objects received within a time period are initially staged in a disk cache where they are logically segregated into cliques based on their expected deduplication ratios. These cliques are then evaluated for the amount of duplication they have with data existing in tape plexes. Based on the number of replicas being written, the top few tape plexes are selected from within the tape group. The cliques are deduplicated with data on the selected tape plexes, compressed, and written to tape.

    Methods for managing read access of objects in storage media and devices thereof

    Publication No.: US09851908B2

    Publication date: 2017-12-26

    Application No.: US14164028

    Filing date: 2014-01-24

    Applicant: NetApp, Inc.

    Abstract: A method, device and non-transitory computer readable medium that manages read access includes organizing a plurality of requests for objects on one or more storage media, such as tapes or spin-down disks, based on at least a deadline for each of the plurality of requests. One of one or more replicas for each of the objects on the one or more storage media is selected based on one or more factors. An initial schedule for read access is generated based at least on the deadline for each of the plurality of requests, the selected one of the replicas for each of the objects, and availability of one or more drives. The initial schedule for read access on the one or more of the drives for each of the plurality of requests for the objects is provided.
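
    Example sketch: the abstract names three inputs to the initial schedule (request deadlines, a selected replica per object, and drive availability) without fixing an algorithm. The Python below assumes earliest-deadline-first ordering and picks the replica with the shortest estimated read time; the `Request` class, the cost model, and `initial_schedule` are hypothetical.

```python
import heapq
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Request:
    obj: str
    deadline: float             # time by which the read should finish
    replicas: Dict[str, float]  # medium id -> estimated read time (assumed cost model)


def initial_schedule(requests: List[Request],
                     num_drives: int) -> List[Tuple[str, str, int, float, float]]:
    """Return (object, medium, drive, start, finish) tuples in deadline order."""
    # Each heap entry is (time the drive becomes free, drive index).
    drives = [(0.0, d) for d in range(num_drives)]
    heapq.heapify(drives)
    schedule = []
    for req in sorted(requests, key=lambda r: r.deadline):
        # Replica selection: the only factor used here is estimated read time.
        medium = min(req.replicas, key=req.replicas.get)
        free_at, drive = heapq.heappop(drives)  # earliest available drive
        finish = free_at + req.replicas[medium]
        schedule.append((req.obj, medium, drive, free_at, finish))
        heapq.heappush(drives, (finish, drive))
    return schedule


requests = [
    Request("x", 30, {"tape1": 10, "tape2": 8}),
    Request("y", 10, {"tape3": 5}),
    Request("z", 20, {"tape1": 6}),
]
schedule = initial_schedule(requests, num_drives=2)
```

    This sketch does not check whether a finish time misses its deadline; a fuller scheduler would presumably revise the initial schedule when that happens.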

    Risk based rebuild of data objects in an erasure coded storage system

    Publication No.: US10514984B2

    Publication date: 2019-12-24

    Application No.: US15055484

    Filing date: 2016-02-26

    Applicant: NetApp, Inc.

    Abstract: A rebuild node of a storage system can assess risk of the storage system not being able to provide a data object. The rebuild node(s) uses information about data object fragments to determine health of a data object, which relates to the risk assessment. The rebuild node obtains object fragment information from nodes throughout the storage system. With the object fragment information, the rebuild node(s) can assess object risk based, at least in part, on the object fragments indicated as existing by the nodes. To assess object risk, the rebuild node(s) treats absent object fragments (i.e., those for which an indication was not received) as lost. When too many object fragments are lost, an object cannot be rebuilt. The erasure coding technique dictates the threshold number of fragments for rebuilding an object. The risk assessment per object influences rebuild of the objects.
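
    Example sketch: the risk assessment in the abstract can be illustrated for an erasure code that splits an object into `total` fragments, any `needed` of which suffice to rebuild it. The Python below assumes risk is measured as the number of further fragments an object can lose before it becomes unrecoverable; `rebuild_order` and the report layout are hypothetical.

```python
from typing import Dict, Iterable, List, Set, Tuple


def rebuild_order(reports: Iterable[Dict[str, Set[int]]],
                  total: int, needed: int) -> Tuple[List[str], List[str]]:
    """Return (objects to rebuild, most at-risk first; objects that cannot be rebuilt)."""
    present: Dict[str, Set[int]] = {}
    for report in reports:  # one report per storage node
        for obj, fragments in report.items():
            present.setdefault(obj, set()).update(fragments)

    at_risk, unrecoverable = [], []
    for obj, fragments in present.items():
        # Fragments that no node reported are treated as lost.
        margin = len(fragments) - needed
        if margin < 0:
            unrecoverable.append(obj)       # below the erasure-coding threshold
        elif len(fragments) < total:
            at_risk.append((margin, obj))   # rebuildable, but missing fragments
    return [obj for _, obj in sorted(at_risk)], unrecoverable


# A 4-of-6 code: six fragments per object, any four rebuild it.
reports = [
    {"a": {0, 1, 2}, "b": {0, 1}, "d": {0, 1}},
    {"a": {3, 4, 5}, "b": {2, 3}, "c": {0, 1, 2}},
    {"c": {3, 4}},
]
order, lost = rebuild_order(reports, total=6, needed=4)
```

    Here object "b" has no spare fragments and is rebuilt before "c", which can still lose one; "a" is fully intact and "d" has fallen below the threshold.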

    Migrating deduplicated data
    6.
    Granted invention patent (in force)

    Publication No.: US08996478B2

    Publication date: 2015-03-31

    Application No.: US13655287

    Filing date: 2012-10-18

    Applicant: NetApp, Inc.

    Abstract: Methods and apparatuses for efficiently migrating deduplicated data are provided. In one example, a data management system includes a data storage volume, a memory including machine executable instructions, and a computer processor. The data storage volume includes data objects and free storage space. The computer processor executes the instructions to perform deduplication of the data objects and determine migration efficiency metrics for groups of the data objects. Determining the migration efficiency metrics includes determining, for each group, a relationship between the free storage space that will result if the group is migrated from the volume and the resources required to migrate the group from the volume.

    Archive log management for distributed database clusters

    Publication No.: US10901958B2

    Publication date: 2021-01-26

    Application No.: US15965127

    Filing date: 2018-04-27

    Applicant: NETAPP, INC.

    Abstract: Methods and systems for a distributed database cluster storing a plurality of replicas of a database are provided. One method includes locating, by a processor, a timestamp of a last stored record in a backup copy of the database from a plurality of logical partitions for a point in time restore operation; identifying, by the processor, an operation log for each logical partition with the last stored record, the operation log providing transaction details associated with the database; splitting, by the processor, the operation log for each logical partition by ignoring transactions that occurred prior to the timestamp of the last stored record; and using, by the processor, the split operation log for restoring the database to the point in time.
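
    Example sketch: the log-splitting step can be illustrated with per-partition lists of timestamped transactions. The Python below assumes the record at the last stored timestamp is already in the backup, so only strictly later transactions up to the restore point are kept; the record layout and `split_operation_logs` are hypothetical.

```python
from typing import Dict, List, Tuple

Transaction = Tuple[int, str]  # (timestamp, operation); an assumed record layout


def split_operation_logs(logs: Dict[str, List[Transaction]],
                         last_stored: Dict[str, int],
                         restore_to: int) -> Dict[str, List[Transaction]]:
    """Keep, per logical partition, the transactions to replay on top of the backup."""
    return {
        partition: [
            tx for tx in log
            # Skip what the backup already holds and anything past the restore point.
            if last_stored[partition] < tx[0] <= restore_to
        ]
        for partition, log in logs.items()
    }


logs = {
    "p0": [(100, "put a"), (105, "put b"), (110, "del a"), (120, "put c")],
    "p1": [(98, "put x"), (112, "put y")],
}
last_stored = {"p0": 105, "p1": 98}  # timestamp of the last record in each backup
replay = split_operation_logs(logs, last_stored, restore_to=115)
```

    Replaying the result on top of the backup copy would bring each partition to timestamp 115 without reapplying transactions the backup already contains.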

    DEDUPLICATING DATA FOR A DATA STORAGE SYSTEM USING SIMILARITY DETERMINATIONS

    Publication No.: US20170123711A1

    Publication date: 2017-05-04

    Application No.: US14928848

    Filing date: 2015-10-30

    Applicant: NetApp, Inc.

    CPC classification number: G06F3/0641 G06F3/0608 G06F3/0686

    Abstract: A method and system for deduplicating data for a data storage system using similarity determinations are described. A tape library is arranged in a hierarchy of tape groups and tape plexes. Tape groups are an admin-visible entity and comprise multiple tape plexes (at least equal to the number of replicas in a tape group). Tape plexes in turn comprise multiple tape cartridges. Data files and objects received within a time period are initially staged in a disk cache where they are logically segregated into cliques based on their expected deduplication ratios. These cliques are then evaluated for the amount of duplication they have with data existing in tape plexes. Based on the number of replicas being written, the top few tape plexes are selected from within the tape group. The cliques are deduplicated with data on the selected tape plexes, compressed, and written to tape.

    ARCHIVE LOG MANAGEMENT FOR DISTRIBUTED DATABASE CLUSTERS

    Publication No.: US20190332692A1

    Publication date: 2019-10-31

    Application No.: US15965127

    Filing date: 2018-04-27

    Applicant: NETAPP, INC.

    Abstract: Methods and systems for a distributed database cluster storing a plurality of replicas of a database are provided. One method includes locating, by a processor, a timestamp of a last stored record in a backup copy of the database from a plurality of logical partitions for a point in time restore operation; identifying, by the processor, an operation log for each logical partition with the last stored record, the operation log providing transaction details associated with the database; splitting, by the processor, the operation log for each logical partition by ignoring transactions that occurred prior to the timestamp of the last stored record; and using, by the processor, the split operation log for restoring the database to the point in time.

    RISK BASED REBUILD OF DATA OBJECTS IN AN ERASURE CODED STORAGE SYSTEM

    Publication No.: US20170249213A1

    Publication date: 2017-08-31

    Application No.: US15055484

    Filing date: 2016-02-26

    Applicant: NetApp, Inc.

    Abstract: A rebuild node of a storage system can assess risk of the storage system not being able to provide a data object. The rebuild node(s) uses information about data object fragments to determine health of a data object, which relates to the risk assessment. The rebuild node obtains object fragment information from nodes throughout the storage system. With the object fragment information, the rebuild node(s) can assess object risk based, at least in part, on the object fragments indicated as existing by the nodes. To assess object risk, the rebuild node(s) treats absent object fragments (i.e., those for which an indication was not received) as lost. When too many object fragments are lost, an object cannot be rebuilt. The erasure coding technique dictates the threshold number of fragments for rebuilding an object. The risk assessment per object influences rebuild of the objects.
