Abstract:
Various embodiments are generally directed to an apparatus and method for receiving information to write on a clustered system comprising at least a first cluster and a second cluster, determining that a failure event has occurred on the clustered system creating unsynchronized information, the unsynchronized information comprising at least one of inflight information and dirty region information, and performing a resynchronization operation to synchronize the unsynchronized information on the first cluster and the second cluster based on log information in at least one of an inflight tracker log for the inflight information and a dirty region log for the dirty region information.
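The resynchronization step can be illustrated with a minimal Python sketch. It is not the claimed implementation: the `Cluster` class, the `resynchronize` function, and the modelling of both logs as sets of region indices are assumptions made only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    # Hypothetical model: region index -> bytes stored in that region.
    regions: dict = field(default_factory=dict)


def resynchronize(primary, secondary, inflight_log, dirty_region_log):
    """Copy every region named in either log from primary to secondary.

    Both logs are assumed to be sets of region indices that may differ
    between the two clusters after a failure event.
    """
    pending = set(inflight_log) | set(dirty_region_log)
    for region in sorted(pending):
        if region in primary.regions:
            secondary.regions[region] = primary.regions[region]
        else:
            # The region no longer exists on the primary; drop the stale copy.
            secondary.regions.pop(region, None)
    # Once the regions match, the log entries are no longer needed.
    inflight_log.clear()
    dirty_region_log.clear()
    return sorted(pending)
```

Only the regions recorded in the logs are copied, which is the point of keeping the logs: the second cluster can be brought back in line without comparing every region.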
Abstract:
Described herein are a method and apparatus for servicing software components of nodes of a cluster storage system. During data-access sessions with clients, client IDs and file handles for accessing files are produced and stored to clients and stored (as session data) to each node. A serviced node is taken offline, whereby network connections to clients are disconnected. Each disconnected client is configured to retain its client ID and file handles and attempt reconnections. Session data of the serviced node is made available to a partner node (by transferring session data to the partner node). After clients have reconnected to the partner node, the clients may use the retained client IDs and file handles to continue a data-access session with the partner node since the partner node has access to the session data of the serviced node and thus will recognize and accept the retained client ID and file handles.
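The session handover described above can be sketched as follows. The `Node` class, its methods, and `take_offline` are hypothetical names, and session data is assumed to be a simple mapping from client ID to the set of file handles issued to that client.

```python
class Node:
    def __init__(self, name):
        self.name = name
        # Assumed session data layout: client ID -> set of file handles.
        self.sessions = {}

    def open_session(self, client_id, file_handle):
        self.sessions.setdefault(client_id, set()).add(file_handle)

    def accepts(self, client_id, file_handle):
        """Return True if this node recognizes the client ID and file handle."""
        return file_handle in self.sessions.get(client_id, set())


def take_offline(serviced, partner):
    """Transfer the serviced node's session data to its partner node."""
    for client_id, handles in serviced.sessions.items():
        partner.sessions.setdefault(client_id, set()).update(handles)
    # The serviced node is offline and no longer holds live sessions.
    serviced.sessions = {}
```

A client that reconnects to the partner node with its retained client ID and file handle then passes `partner.accepts(...)` without opening a new session.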
Abstract:
Various embodiments are generally directed to techniques for handling errors affecting the at least partially parallel performance of data access commands between nodes of a storage cluster system. An apparatus may include a processor component of a first node, an access component to perform a command received from a client device via a network to alter client device data stored in a first storage device coupled to the first node, a replication component to transmit a replica of the command to a second node via the network to enable performance of the replica by the second node at least partially in parallel, an error component to retry transmission of the replica based on a failure indicated by the second node and a status component to select a status indication to transmit to the client device based on the indication of failure and results of retrial of transmission of the replica.
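A rough sketch of the retry and status-selection logic follows. The function name, the `max_retries` parameter, and the three status strings are assumptions, not terms from the abstract. For brevity the sketch performs the local command and the replica transmission one after the other, whereas the abstract describes at least partially parallel performance.

```python
def perform_with_replication(perform_local, send_replica, max_retries=2):
    """Perform a command locally, replicate it, and choose a client status.

    perform_local and send_replica are caller-supplied callables that
    return True on success and False on failure (hypothetical interface).
    """
    local_ok = perform_local()
    attempts = 0
    replica_ok = False
    # One initial transmission plus up to max_retries retransmissions.
    while attempts <= max_retries:
        attempts += 1
        if send_replica():
            replica_ok = True
            break
    # The status sent to the client depends on both outcomes.
    if local_ok and replica_ok:
        status = "success"
    elif local_ok:
        status = "success_unreplicated"
    else:
        status = "failure"
    return status, attempts
```

Returning the attempt count alongside the status lets a caller tell a clean success from one that needed retransmission.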
Abstract:
Various embodiments are generally directed to an apparatus and method to receive client traffic comprising information at a primary cluster of a clustered system over a communications link, perform a replication operation on the clustered system to replicate the information on a secondary cluster of the clustered system, and determine a client traffic throughput for the client traffic and a replication throughput for the replication operation. In some embodiments, the apparatus and method may include buffering one or more write operations to control the client traffic such that the client traffic throughput is less than or equal to the replication throughput for the replication operation.
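One possible way to buffer writes so that admitted client traffic never exceeds the replication throughput is sketched below. The `WriteThrottle` class and the byte-budget scheme are assumptions chosen for illustration; the abstract does not specify how the buffering is done.

```python
from collections import deque


class WriteThrottle:
    """Buffer client writes and admit them no faster than replication allows."""

    def __init__(self):
        self.buffer = deque()  # pending writes, oldest first

    def submit(self, write):
        """Queue a client write (a bytes object) instead of applying it at once."""
        self.buffer.append(write)

    def release(self, replication_throughput, interval_s):
        """Admit buffered writes within the byte budget for one interval.

        replication_throughput is in bytes per second (assumed to be
        measured elsewhere); interval_s is the interval length in seconds.
        """
        budget = replication_throughput * interval_s
        admitted = []
        # Preserve write order: stop at the first write that does not fit.
        while self.buffer and len(self.buffer[0]) <= budget:
            write = self.buffer.popleft()
            budget -= len(write)
            admitted.append(write)
        return admitted
```

This sketch leaves a write larger than a single interval's budget at the head of the buffer indefinitely; a fuller design would need to split such writes or carry unused budget forward.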