Abstract:
A host machine may host a virtual machine. Virtual machine reboot information, used to reboot the virtual machine in the event of a failure or restart of the virtual machine, may be identified (e.g., file system metadata buffers, a virtual non-volatile random access memory log, user data buffers, and/or data used to reboot the virtual machine, such as to perform a reboot mounting operation and/or a reboot replay operation of a volume of data associated with the virtual machine). The virtual machine reboot information may be cached within relatively fast host memory of the host machine (e.g., instead of merely within a slower hard drive or other storage device). In this way, the cached virtual machine reboot information may be quickly retrieved so that the virtual machine may be rebooted in a shorter amount of time.
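A minimal sketch of the caching idea above, in Python. All names (SlowStorage, RebootInfoCache, read_reboot_info) are hypothetical stand-ins, and the reboot information is modeled as a plain dictionary; the point is only the fast-path/slow-path split between host memory and a slower storage device.

```python
class SlowStorage:
    """Stand-in for a hard drive or other slower storage device."""
    def __init__(self, records):
        self._records = records

    def read_reboot_info(self, vm_id):
        return self._records[vm_id]          # high-latency read


class RebootInfoCache:
    """Keeps virtual machine reboot information resident in fast host memory."""
    def __init__(self, slow_storage):
        self._slow = slow_storage
        self._cache = {}                     # vm_id -> reboot information

    def put(self, vm_id, info):
        self._cache[vm_id] = info            # cache identified reboot info

    def get(self, vm_id):
        info = self._cache.get(vm_id)        # fast path: host memory
        if info is None:
            info = self._slow.read_reboot_info(vm_id)   # slow path
            self._cache[vm_id] = info
        return info


storage = SlowStorage({"vm1": {"nv_log": ["op1"], "metadata_buffers": []}})
cache = RebootInfoCache(storage)
cache.put("vm1", storage.read_reboot_info("vm1"))  # identify and cache
print(cache.get("vm1"))  # served from host memory, no storage-device read
```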
Abstract:
A method, non-transitory computer readable medium, and storage controller computing device that establish an application interface and a source interface to a programmable switch. A flow table of the programmable switch is updated to insert routing actions associated with the application and source interfaces. Next, whether an application request received from an application is locally serviceable is determined. When the determination indicates that the application request is not locally serviceable, a migration request for data associated with the application request is sent from the source interface to the programmable switch using a destination address of a source storage server. Additionally, a migration response to the migration request, including the data from the source storage server, is received on the source interface. The data is then stored locally in a destination storage server and is thereby migrated from the source storage server.
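The control flow in this abstract can be sketched as follows. The classes (ProgrammableSwitch, StorageServer, StorageController), the flow-table representation, and the addresses are hypothetical stand-ins, not the claimed implementation.

```python
class StorageServer:
    """Stand-in for a storage server holding key -> data blocks."""
    def __init__(self, data):
        self.data = data


class ProgrammableSwitch:
    """Holds a flow table of routing actions and forwards migration requests."""
    def __init__(self, servers):
        self.servers = servers               # destination address -> server
        self.flow_table = []                 # (interface, routing action)

    def insert_rule(self, interface, action):
        self.flow_table.append((interface, action))

    def migrate(self, dst_addr, key):
        # Deliver the migration request to the source storage server and
        # return the migration response carrying the requested data.
        return self.servers[dst_addr].data[key]


class StorageController:
    def __init__(self, switch, destination_server, source_addr):
        self.switch = switch
        self.local = destination_server
        self.source_addr = source_addr
        # Establish application and source interfaces via flow-table rules.
        switch.insert_rule("application-interface", "deliver-to-controller")
        switch.insert_rule("source-interface", "deliver-to-source")

    def handle(self, key):
        if key in self.local.data:           # locally serviceable?
            return self.local.data[key]
        # Not locally serviceable: send a migration request through the
        # switch, addressed to the source storage server, then store locally.
        data = self.switch.migrate(self.source_addr, key)
        self.local.data[key] = data          # data is now migrated
        return data


source = StorageServer({"blk9": b"payload"})
switch = ProgrammableSwitch({"10.0.0.1": source})
controller = StorageController(switch, StorageServer({}), "10.0.0.1")
print(controller.handle("blk9"))             # first call triggers migration
print(controller.handle("blk9"))             # now serviced locally
```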
Abstract:
Technology is disclosed for using a cache cluster of a cloud computing service (“cloud”) as a victim cache for a data storage appliance (“appliance”) implemented in the cloud. The cloud includes a cache cluster that acts as a primary cache for caching data of various services implemented in the cloud. By using the cache cluster as a victim cache for the appliance, the read throughput of the appliance is improved. The data blocks evicted from a primary cache of the appliance are stored in the cache cluster. These evicted data blocks are likely to be requested again, so storing them in the cache cluster can increase performance, e.g., input-output (I/O) throughput of the appliance. A read request for data can be serviced by retrieving the data from the cache cluster instead of a persistent storage medium of the appliance, which has higher read latency than the cache cluster.
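The victim-cache behavior described here maps onto a standard LRU-with-spill pattern. A sketch, assuming an LRU primary cache and modeling the cloud cache cluster and persistent medium as dictionaries (all names hypothetical):

```python
from collections import OrderedDict

class Appliance:
    """Appliance whose primary-cache evictions spill to a victim cache
    (the cloud's cache cluster, modeled here as a plain dict)."""
    def __init__(self, primary_capacity, victim_cache, persistent_storage):
        self.primary = OrderedDict()          # LRU primary cache
        self.capacity = primary_capacity
        self.victim = victim_cache            # cache cluster of the cloud
        self.persistent = persistent_storage  # higher read latency

    def read(self, block_id):
        if block_id in self.primary:          # primary-cache hit
            self.primary.move_to_end(block_id)
            return self.primary[block_id]
        if block_id in self.victim:           # victim-cache hit: avoids the
            data = self.victim.pop(block_id)  # slow persistent medium
        else:
            data = self.persistent[block_id]  # slow path
        self._admit(block_id, data)
        return data

    def _admit(self, block_id, data):
        self.primary[block_id] = data
        if len(self.primary) > self.capacity:
            # Evicted blocks are likely to be requested again, so store
            # them in the cache cluster instead of discarding them.
            evicted_id, evicted = self.primary.popitem(last=False)
            self.victim[evicted_id] = evicted


appliance = Appliance(2, victim_cache={}, persistent_storage={"a": 1, "b": 2, "c": 3})
for block in ("a", "b", "c"):                 # "a" is evicted to the victim cache
    appliance.read(block)
print(appliance.read("a"))                    # victim-cache hit, no slow read
```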
Abstract:
A live non-volatile (NV) replay technique enables a partner node to efficiently take over a failed node of a high-availability pair in a multi-node storage cluster by dynamically replaying operations synchronously logged in a non-volatile random access memory (NVRAM) of the partner node, while also providing high performance during normal operation. Dynamic live replay may be effected through interpretation of metadata describing the logged operations. The metadata may specify a location and type of each logged operation within a partner portion of the NVRAM, as well as any dependency between the logged operation and any other logged operations that would impose an ordering constraint. During normal operation, the partner node may consult the metadata to identify dependent logged operations and dynamically replay those operations to satisfy one or more requests. Upon failure of the node, the partner node may replay, in parallel, those logged operations having no imposed ordering constraint, thereby reducing the time needed to complete takeover of the failed node.
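The dependency-aware replay can be illustrated with a wave-by-wave scheme: each wave of mutually independent logged operations is replayed in parallel, and an operation only becomes ready once its dependencies have been applied. The log format here (op_id mapped to dependency ids and an apply callable) is a hypothetical stand-in for the NVRAM metadata.

```python
from concurrent.futures import ThreadPoolExecutor

def replay(log):
    """log: op_id -> {"deps": set of op_ids, "apply": callable}.
    Replays each wave of mutually independent operations in parallel."""
    done, remaining = set(), dict(log)
    with ThreadPoolExecutor() as pool:
        while remaining:
            # Ready ops: every dependency already replayed.
            ready = [op for op, rec in remaining.items()
                     if rec["deps"] <= done]
            if not ready:
                raise ValueError("cyclic dependency in log metadata")
            # No ordering constraint among `ready`: replay them in parallel.
            list(pool.map(lambda op: remaining[op]["apply"](), ready))
            done.update(ready)
            for op in ready:
                del remaining[op]


# Usage: op2 depends on op1; op3 is independent and replays alongside op1.
log = {
    "op1": {"deps": set(), "apply": lambda: print("replay op1")},
    "op2": {"deps": {"op1"}, "apply": lambda: print("replay op2")},
    "op3": {"deps": set(), "apply": lambda: print("replay op3")},
}
replay(log)
```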
Abstract:
A distributed control protocol dynamically establishes high availability (HA) partner relationships for nodes in a cluster. A HA partner relationship may be established by copying (mirroring) information maintained in a non-volatile random access memory (NVRAM) of a node over a HA interconnect to the NVRAM of a partner node in the cluster. The distributed control protocol leverages a Cluster Liveliness and Availability Manager (CLAM) utility of a storage operating system executing on the nodes to rebalance NVRAM mirroring and alter HA partner relationships of the nodes in the cluster. The CLAM utility is configured to track various cluster-related events, such as CLAM quorum events, the addition or removal of a node in the cluster, and other changes in the configuration of the cluster. Notably, the CLAM utility is an event-based manager that implements the control protocol to keep the nodes informed of any cluster changes through event generation and propagation.
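One way to picture the rebalancing is an event-based manager that recomputes partner assignments on membership changes and propagates each change to subscribers. The sketch below uses ring mirroring (node i mirrors its NVRAM to node i+1) as a stand-in policy; the names and the policy are assumptions, not the protocol's actual scheme.

```python
class ClamUtility:
    """Event-based manager: rebalances NVRAM mirroring on cluster changes
    and propagates each change to subscribers."""
    def __init__(self):
        self.nodes = []
        self.partners = {}                    # node -> node holding its mirror
        self.subscribers = []

    def subscribe(self, callback):
        self.subscribers.append(callback)

    def _rebalance(self):
        # Ring mirroring: node i mirrors its NVRAM to node (i + 1) % n.
        n = len(self.nodes)
        self.partners = ({self.nodes[i]: self.nodes[(i + 1) % n] for i in range(n)}
                         if n > 1 else {})

    def _emit(self, event):
        for callback in self.subscribers:     # event generation and propagation
            callback(event, dict(self.partners))

    def add_node(self, node):
        self.nodes.append(node)
        self._rebalance()
        self._emit(("node-added", node))

    def remove_node(self, node):
        self.nodes.remove(node)
        self._rebalance()
        self._emit(("node-removed", node))


clam = ClamUtility()
clam.subscribe(lambda event, partners: print(event, partners))
for n in ("n1", "n2", "n3"):
    clam.add_node(n)
clam.remove_node("n2")   # HA partner relationships are altered and announced
```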
Abstract:
Systems and methods for increasing high availability of data in a multi-node storage network are provided. Aspects may include allocating data and mirrored data associated with nodes in the storage network to storage units associated with the nodes. When additional nodes are added to the storage network, the data and mirrored data associated with the nodes may be dynamically reallocated among the storage units. Systems and methods for high availability takeover in a high availability multi-node storage network are also provided. Aspects may include detecting a fault associated with a node in the storage network and initiating a takeover routine in response to detecting the fault. The takeover routine may be implemented to reallocate data and mirrored data associated with the nodes in the storage network among the operable nodes and associated storage units.
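A sketch of both aspects under a deliberately simple round-robin placement policy (the abstract does not specify the actual allocation scheme, so this policy and all names are assumptions): each item's data and mirror land on distinct nodes, and the takeover routine reallocates everything across the surviving nodes.

```python
def allocate(nodes, items):
    """Place each item's data and its mirror on two distinct nodes
    (round-robin; requires at least two operable nodes)."""
    assert len(nodes) >= 2, "mirroring needs two or more nodes"
    placement = {}
    for i, item in enumerate(items):
        primary = nodes[i % len(nodes)]
        mirror = nodes[(i + 1) % len(nodes)]  # never the primary's node
        placement[item] = (primary, mirror)
    return placement


def takeover(placement, failed_node, nodes):
    """Takeover routine: reallocate data and mirrored data among the
    operable nodes after a fault is detected on failed_node."""
    survivors = [n for n in nodes if n != failed_node]
    return allocate(survivors, list(placement))


nodes = ["n1", "n2", "n3"]
placement = allocate(nodes, ["d1", "d2", "d3", "d4"])
print(placement)
print(takeover(placement, "n2", nodes))  # data redistributed to n1 and n3
```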