Scalable, distributed, fault-tolerant test framework

    Publication number: US09720818B2

    Publication date: 2017-08-01

    Application number: US14844795

    Filing date: 2015-09-03

    Applicant: NetApp, Inc.

    CPC classification number: G06F11/3688

    Abstract: A testing framework takes common functionality that testing scripts normally import on the client device and splits it into standalone, fault-tolerant, scalable services. Scripts use the functionality through APIs, so test drivers executing a test, building a test environment, or running other testing processes can access the services through an API. Each testing client and test driver therefore does not need to import the functionality separately and run it in the memory of the client device. Instead, multiple tests can share these services, allowing the testing services to be scaled across tests.

    Techniques for performing resynchronization on a clustered system

    Publication number: US09720752B2

    Publication date: 2017-08-01

    Application number: US14518422

    Filing date: 2014-10-20

    Applicant: NETAPP, INC.

    Abstract: Various embodiments are generally directed to an apparatus and method for receiving information to write on a clustered system comprising at least a first cluster and a second cluster, determining that a failure event has occurred on the clustered system creating unsynchronized information, the unsynchronized information comprising at least one of inflight information and dirty region information, and performing a resynchronization operation to synchronize the unsynchronized information on the first cluster and the second cluster based on log information in at least one of an inflight tracker log for the inflight information and a dirty region log for the dirty region information.
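
As a rough illustration of resynchronizing from a dirty region log, the sketch below copies only the regions marked dirty from one copy of the data to the other. It is a minimal sketch under stated assumptions, not the method claimed in the patent: the function name `resynchronize`, the fixed `region_size`, and the use of in-memory byte arrays in place of two clusters are all hypothetical.

```python
def resynchronize(primary, secondary, dirty_regions, region_size):
    """Copy only the regions listed in the dirty region log (hypothetical layout)."""
    for region in sorted(dirty_regions):
        start = region * region_size
        end = start + region_size
        # Overwrite the stale region on the secondary with the primary's data.
        secondary[start:end] = primary[start:end]
    # Once every dirty region is copied, the log can be emptied.
    dirty_regions.clear()


# Two copies that diverged in regions 1 and 3 after a simulated failure.
primary = bytearray(b"AAAABBBBCCCCDDDD")
secondary = bytearray(b"AAAAxxxxCCCCyyyy")
dirty = {1, 3}
resynchronize(primary, secondary, dirty, region_size=4)
```

Tracking dirty regions rather than individual writes keeps the log small, at the cost of recopying a whole region when any part of it changed.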

    Cluster configuration information replication

    Publication number: US09720626B2

    Publication date: 2017-08-01

    Application number: US14491879

    Filing date: 2014-09-19

    Applicant: NetApp Inc.

    CPC classification number: G06F3/067 G06F3/0617 G06F3/0629

    Abstract: One or more techniques and/or systems are provided for cluster configuration information replication, managing cluster-wide service agents, and/or for cluster-wide outage detection. In an example of cluster configuration information replication, a replication workflow corresponding to a storage operation implemented for a storage object (e.g., renaming of a volume) of a first cluster may be transferred to a second storage cluster for selective implementation. In an example of managing cluster-wide service agents, cluster-wide service agents are deployed to nodes of a cluster storage environment, where a master agent actively processes cluster service calls and standby agents passively wait for reassignment as a failover master in the event the master agent fails. In an example of cluster-wide outage detection, a cluster-wide outage may be determined for a cluster storage environment based upon a number of inaccessible nodes satisfying a cluster outage detection metric.
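
The outage check in the last example can be sketched as a count of inaccessible nodes compared against a metric. The abstract does not define the metric, so the fraction-based threshold, the function name `is_cluster_wide_outage`, and the node names below are assumptions made only for illustration.

```python
def is_cluster_wide_outage(node_reachable, threshold_fraction=0.5):
    """Return True when the share of inaccessible nodes meets the assumed threshold."""
    total = len(node_reachable)
    if total == 0:
        return False  # no nodes known, so nothing to declare an outage on
    inaccessible = sum(1 for ok in node_reachable.values() if not ok)
    return inaccessible / total >= threshold_fraction


# Hypothetical reachability map: three of four nodes cannot be contacted.
status = {"node-a": False, "node-b": False, "node-c": False, "node-d": True}
outage = is_cluster_wide_outage(status, threshold_fraction=0.75)
```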

    Methods to identify, handle and recover from suspect SSDs in a clustered flash array

    Publication number: US09710317B2

    Publication date: 2017-07-18

    Application number: US14673258

    Filing date: 2015-03-30

    Applicant: NetApp, Inc.

    Abstract: A technique predicts failure of one or more storage devices of a storage array serviced by a storage system and establishes one or more threshold conditions for replacing the storage devices. The predictive technique periodically monitors soft and hard failures of the storage devices (e.g., from Self-Monitoring, Analysis and Reporting Technology), as well as various usage counters pertaining to input/output (I/O) workloads and response times of the storage devices. A heuristic procedure may be performed that combines the monitored results to calculate the predicted failure and recommend replacement of the storage devices, using one or more thresholds based on current usage and failure patterns of the storage devices. In addition, one or more policies may be provided for replacing the storage devices in a cost-effective manner that ensures non-disruptive operation and/or replacement of the SSDs, while obviating a potential catastrophic scenario based on the usage and failure patterns of the storage devices.

    Distributed control protocol for high availability in multi-node storage cluster

    Publication number: US09692645B2

    Publication date: 2017-06-27

    Application number: US14244337

    Filing date: 2014-04-03

    Applicant: NetApp, Inc.

    Abstract: A distributed control protocol dynamically establishes high availability (HA) partner relationships for nodes in a cluster. An HA partner relationship may be established by copying (mirroring) information maintained in a non-volatile random access memory (NVRAM) of a node over an HA interconnect to the NVRAM of a partner node in the cluster. The distributed control protocol leverages a Cluster Liveliness and Availability Manager (CLAM) utility of a storage operating system executing on the nodes to rebalance NVRAM mirroring and alter HA partner relationships of the nodes in the cluster. The CLAM utility is configured to maintain various cluster-related issues, such as CLAM quorum events, addition or subtraction of a node in the cluster and other changes in configuration of the cluster. Notably, the CLAM utility is an event-based manager that implements the control protocol to keep the nodes informed of any cluster changes through event generation and propagation.

    Namespace mirroring in an expandable storage volume

    Publication number: US09684571B2

    Publication date: 2017-06-20

    Application number: US13875236

    Filing date: 2013-05-01

    Applicant: NetApp, Inc.

    CPC classification number: G06F11/20 G06F11/00 G06F11/1435

    Abstract: Technology for maintaining a backup of namespace metadata of an expandable storage volume is disclosed. In various embodiments, the expandable storage volume backs up metadata of a namespace constituent volume of the expandable storage volume into a namespace mirror volume. The namespace constituent volume is responsible for storing the metadata for data objects stored in multiple data constituent volumes of the expandable storage volume. In response to a signal indicating that the namespace constituent volume is unavailable, the namespace mirror volume replaces the role of the namespace constituent volume. The new namespace constituent volume continues to provide metadata for a data object of the data objects in response to an operation request for the data object.

    Dynamic protocol selection
    Type: Granted invention patent

    Publication number: US09674312B2

    Publication date: 2017-06-06

    Application number: US13930709

    Filing date: 2013-06-28

    Applicant: NetApp Inc.

    CPC classification number: H04L69/08

    Abstract: Dynamic selection of a protocol for communication between devices is disclosed. A first device may be connected to a second device by one or more communication links, such as a first communication link and a second communication link. Because the first device and the second device may not have pre-existing knowledge of what protocols are supported by the other device, the first device and the second device may perform protocol discovery by attempting protocols on the communication links in a coordinated manner. In this way, if a communication link becomes active between the first device and the second device, then a protocol attempted on the communication link may be supported by the first device and the second device, and thus may be used across the communication links. If multiple protocols are supported, then a preferred protocol is used across the communication links.
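
The discovery step described in this abstract can be sketched as trying candidate protocols in preference order until a link becomes active. This is only an illustrative sketch, not the claimed procedure: the function `discover_protocol`, the `link_comes_up` callback, and the placeholder protocol names are hypothetical, and the coordination between the two devices is reduced to a single callback.

```python
from typing import Callable, Iterable, Optional


def discover_protocol(
    candidates: Iterable[str],
    link_comes_up: Callable[[str], bool],
) -> Optional[str]:
    """Try protocols in preference order; return the first that activates the link."""
    for protocol in candidates:
        # An active link implies both devices support the attempted protocol.
        if link_comes_up(protocol):
            return protocol
    return None  # no common protocol was found


# Hypothetical peer that supports two of the three placeholder protocols.
peer_supported = {"proto_b", "proto_c"}
chosen = discover_protocol(
    ["proto_a", "proto_b", "proto_c"],
    lambda p: p in peer_supported,
)
```

Ordering the candidates by preference means that when several protocols are supported, the preferred one is the one selected.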
