Abstract:
A computer-implemented method for shipping I/O operations to prevent replication failure may include 1) attempting to perform an I/O operation in a system configured to replicate data from a data cluster to another data cluster, 2) detecting a failure in at least part of the attempt to perform the I/O operation that threatens to fail the system's replication of data from the data cluster to the other data cluster, and, in response to detecting the failure, 3) shipping the I/O operation from a node originally responsible for servicing the I/O operation to another node to complete the I/O operation without failing the system's replication of data from the data cluster to the other data cluster. Various other methods, systems, and computer-readable media are also disclosed.
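The three steps above (attempt, detect failure, ship to another node) can be sketched as follows. This is only an illustration under stated assumptions, not the claimed implementation: `Node`, `IOFailure`, `submit_io`, and the `healthy` flag are hypothetical names, and the failure is simulated rather than detected on a real I/O path.

```python
class IOFailure(Exception):
    """Raised when a node cannot complete an I/O operation locally."""


class Node:
    def __init__(self, name, healthy=True):
        self.name = name
        self.healthy = healthy   # hypothetical flag simulating local I/O health
        self.completed = []      # operations this node has serviced

    def perform_io(self, op):
        if not self.healthy:
            raise IOFailure(f"{self.name} cannot service {op!r}")
        self.completed.append(op)
        return self.name


def submit_io(op, owner, peers):
    """Attempt the I/O on the owning node; on failure, ship it to a peer.

    Returns the name of the node that completed the operation.
    """
    try:
        return owner.perform_io(op)          # step 1: attempt
    except IOFailure as err:                 # step 2: detect the failure
        for peer in peers:                   # step 3: ship to another node
            try:
                return peer.perform_io(op)
            except IOFailure:
                continue
        raise IOFailure(f"no node could complete {op!r}") from err
```

For example, `submit_io("write-1", Node("a", healthy=False), [Node("b")])` completes the write on node `b` instead of surfacing the failure to the replication layer.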
Abstract:
A computer-implemented method for natural batching of I/O operations on a replication log may include: 1) identifying a replication log that records the order of writes within a cluster replication system, 2) determining that the replication log is unavailable, 3) queuing incoming I/O operations for the replication log in a single batch while the replication log is unavailable, 4) determining that the replication log has become available, 5) ceasing queuing of incoming I/O operations for the replication log based on the determination that the replication log has become available, and 6) grouping a plurality of I/O operations in the single batch for processing in parallel by assigning a same generation number to the plurality of I/O operations. Various other methods, systems, and computer-readable media are also disclosed.
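A minimal sketch of the batching idea, assuming an in-memory stand-in for the replication log: I/O operations that arrive while the log is unavailable are queued, and when the log becomes available again they all receive the same generation number. The class and method names (`ReplicationLogBatcher`, `mark_unavailable`, `mark_available`, `submit`) are hypothetical.

```python
class ReplicationLogBatcher:
    def __init__(self):
        self.available = True
        self.generation = 0
        self.pending = []   # I/Os queued while the log is unavailable
        self.logged = []    # (generation, op) pairs in log order

    def mark_unavailable(self):
        self.available = False

    def submit(self, op):
        """Log the op immediately, or queue it if the log is unavailable."""
        if not self.available:
            self.pending.append(op)
            return None
        return self._log([op])

    def mark_available(self):
        """Stop queuing and log everything queued so far as one batch."""
        self.available = True
        batch, self.pending = self.pending, []
        return self._log(batch) if batch else None

    def _log(self, ops):
        # Every op in the group shares one generation number, which marks
        # the ops as safe to process in parallel.
        self.generation += 1
        for op in ops:
            self.logged.append((self.generation, op))
        return self.generation
```

In this sketch the batch size is not configured anywhere; it falls out of how many writes happen to arrive while the log is busy.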
Abstract:
A method for storage reclamation in a shared storage device. The method includes executing a distributed computer system having a plurality of file systems accessing storage on a shared storage device, and initiating a reclamation operation by using a reclamation agent that accesses the shared storage device. The method further includes reading the file system data structure that represents unallocated storage blocks of one of the plurality of file systems that will undergo a reclamation operation. A plurality of I/O resources that are used to provide I/O to the unallocated storage blocks are then interrupted. Storage from the unallocated storage blocks is then reclaimed, and normal operation of the I/O resources that are used to provide I/O to the unallocated storage blocks is resumed.
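The sequence described above (read the free-block structure, interrupt I/O, reclaim, resume) can be sketched as below. All names are hypothetical, the free-block structure is modeled as a plain dictionary, and the shared device is modeled as a set of backed block numbers.

```python
from contextlib import contextmanager


class IOResource:
    """Stand-in for anything that can issue I/O to the blocks (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.paused = False


@contextmanager
def interrupted(resources):
    """Pause the given I/O resources and always resume them afterwards."""
    for r in resources:
        r.paused = True
    try:
        yield
    finally:
        for r in resources:
            r.paused = False


def reclaim_unallocated(free_map, resources, backed_blocks):
    """Reclaim device blocks that the file system reports as unallocated.

    free_map: dict mapping block number -> True if unallocated.
    backed_blocks: set of block numbers currently backed by the device;
    it is updated in place.
    """
    unallocated = [blk for blk, free in sorted(free_map.items()) if free]
    with interrupted(resources):
        reclaimed = [blk for blk in unallocated if blk in backed_blocks]
        backed_blocks.difference_update(reclaimed)
    return reclaimed
```

Using a context manager here is a design choice of the sketch: it guarantees that normal I/O resumes even if the reclaim step raises.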
Abstract:
A method for fast I/O path failure detection and cluster-wide failover. The method includes accessing a distributed computer system having a cluster including a plurality of nodes, and experiencing an I/O path failure for a storage device. An I/O failure message is generated in response to the I/O path failure. A cluster-wide I/O failure message that designates a faulted controller is then broadcast to the plurality of nodes. Upon receiving I/O failure responses from the plurality of nodes, an I/O queue message is broadcast to the nodes to cause the nodes to queue I/O through the faulted controller and switch to an alternate controller. Upon receiving I/O queue responses from the plurality of nodes, an I/O failover commit message is broadcast to the nodes to cause the nodes to commit to a failover and un-queue their I/O.
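The three broadcast rounds can be sketched as a synchronous simulation, assuming every node answers each message before the next one is sent. The message constants, `ClusterNode`, and `cluster_failover` are hypothetical names; a real cluster would exchange these messages over the network and handle timeouts.

```python
PHASES = ("IO_FAILURE", "IO_QUEUE", "IO_FAILOVER_COMMIT")


class ClusterNode:
    def __init__(self, name, active_controller="ctrl-A"):
        self.name = name
        self.active_controller = active_controller
        self.faulted = None
        self.queueing = False

    def handle(self, message, faulted, alternate):
        if message == "IO_FAILURE":
            self.faulted = faulted                 # learn which controller failed
        elif message == "IO_QUEUE":
            self.queueing = True                   # hold I/O bound for the faulted path
            self.active_controller = alternate     # switch to the alternate controller
        elif message == "IO_FAILOVER_COMMIT":
            self.queueing = False                  # commit and release queued I/O
        return self.name                           # acknowledgement


def cluster_failover(nodes, faulted, alternate):
    """Run the three rounds, advancing only after every node has responded."""
    expected = {n.name for n in nodes}
    completed = []
    for message in PHASES:
        acks = {n.handle(message, faulted, alternate) for n in nodes}
        if acks != expected:
            raise RuntimeError(f"missing responses for {message}")
        completed.append(message)
    return completed
```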
Abstract:
A computer-implemented method for reclaiming storage space from deleted volumes on thin-provisioned disks may include: 1) identifying a deleted volume, 2) identifying storage space on a thin-provisioned disk that was allocated to the deleted volume, 3) saving information that identifies the storage space, 4) identifying a policy that specifies reclaiming the storage space asynchronously with respect to the deleted volume, and then 5) reclaiming the storage space asynchronously with respect to deletion of the volume in accordance with the policy. Various other methods, systems, and computer-readable media are also disclosed.
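A rough sketch of the asynchronous flow, assuming the policy is a simple delay: deleting a volume only records which extents it held, and a separate reclaimer pass frees them later. `ReclaimPolicy`, `delay_seconds`, `ThinDisk`, and the method names are hypothetical, and time is passed in explicitly to keep the example deterministic.

```python
from dataclasses import dataclass, field


@dataclass
class ReclaimPolicy:
    delay_seconds: float   # hypothetical policy setting: how long to defer reclamation


@dataclass
class ThinDisk:
    allocated: set = field(default_factory=set)
    pending: list = field(default_factory=list)   # (due_time, extents) records

    def delete_volume(self, extents, policy, now):
        """Save information identifying the space; do not reclaim it yet."""
        self.pending.append((now + policy.delay_seconds, frozenset(extents)))

    def run_reclaimer(self, now):
        """Reclaim every saved extent whose policy delay has elapsed."""
        due = [rec for rec in self.pending if rec[0] <= now]
        self.pending = [rec for rec in self.pending if rec[0] > now]
        reclaimed = set()
        for _, extents in due:
            reclaimed |= extents
        self.allocated -= reclaimed
        return reclaimed
```

Because the volume deletion returns before any space is released, the delete itself stays fast, and the reclaimer can run when the policy allows.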
Abstract:
Storage systems and methods are presented. In one embodiment, a data storage resource management method comprises: performing a data update process, including communicating a data update input output packet between a primary storage resource and a secondary storage resource, wherein corresponding data updates in the secondary storage resource are a mirror of data updates in the primary storage resource; and performing a reclamation process, including: communicating reclamation information in a reclamation input output packet through the same interface as the data update input output packet, wherein the reclamation input output packet is communicated between the primary storage resource and the secondary storage resource; and reclaiming storage locations on the secondary storage resource in accordance with reclamation information in the reclamation input output packet communicated between the primary storage resource and secondary storage resource.
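One way to picture the shared interface is a single packet type with a kind field, so that reclamation information travels to the secondary over the same path as mirrored writes. The sketch below assumes a byte-addressed in-memory store; `IOPacket`, `apply_packet`, `SecondaryStore`, and `PrimaryStore` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class IOPacket:
    kind: str          # "update" or "reclaim"
    offset: int
    length: int
    data: bytes = b""


def apply_packet(blocks, packet):
    """Apply an update or reclaim packet to a dict of offset -> byte value."""
    if packet.kind == "update":
        for i, byte in enumerate(packet.data):
            blocks[packet.offset + i] = byte
    elif packet.kind == "reclaim":
        for off in range(packet.offset, packet.offset + packet.length):
            blocks.pop(off, None)
    else:
        raise ValueError(f"unknown packet kind: {packet.kind}")


class SecondaryStore:
    def __init__(self):
        self.blocks = {}

    def receive(self, packet):
        # The one entry point used for both data updates and reclamation.
        apply_packet(self.blocks, packet)


class PrimaryStore:
    def __init__(self, secondary):
        self.blocks = {}
        self.secondary = secondary

    def _send(self, packet):
        apply_packet(self.blocks, packet)   # apply locally
        self.secondary.receive(packet)      # mirror through the same interface

    def write(self, offset, data):
        self._send(IOPacket("update", offset, len(data), data))

    def reclaim(self, offset, length):
        self._send(IOPacket("reclaim", offset, length))
```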
Abstract:
A method for fast I/O path failure detection and cluster-wide failover. The method includes accessing a distributed computer system having a cluster including a plurality of nodes, and experiencing an I/O path failure for a storage device. An I/O failure message is generated in response to the I/O path failure. A cluster-wide I/O failure message that designates a faulted controller is then broadcast to the plurality of nodes. Upon receiving I/O failure responses from the plurality of nodes, an I/O queue message is broadcast to the nodes to cause the nodes to queue I/O through the faulted controller and switch to an alternate controller. Upon receiving I/O queue responses from the plurality of nodes, an I/O failover commit message is broadcast to the nodes to cause the nodes to commit to a failover and un-queue their I/O.
Abstract:
A computer-implemented method for reclaiming storage space on striped volumes may include: 1) identifying a volume striped across a set of storage devices, 2) identifying a reclamation request to reclaim storage space allocated to the striped volume and then, for at least one device in the set of storage devices, 3) identifying stripes of storage on the device that are covered by the reclamation request, 4) creating a consolidated reclamation request for the device that identifies each stripe of storage on the device that is covered by the reclamation request, and then 5) issuing the consolidated reclamation request to the device. Various other methods, systems, and computer-readable media are also disclosed.
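The per-device consolidation can be sketched with simple stripe arithmetic. This example assumes round-robin striping with a fixed stripe unit and, as a simplification not stated in the abstract, treats only stripe units that lie entirely inside the request as covered. The function name and parameters are hypothetical.

```python
from collections import defaultdict


def consolidate_reclaim(offset, length, stripe_size, n_devices):
    """Map a volume-level reclaim range to one consolidated request per device.

    Returns a dict of device index -> list of device-local stripe indices.
    """
    first = -(-offset // stripe_size)          # first fully covered stripe (ceiling)
    end = (offset + length) // stripe_size     # one past the last fully covered stripe
    per_device = defaultdict(list)
    for stripe in range(first, end):
        device = stripe % n_devices            # round-robin placement (assumed)
        local_stripe = stripe // n_devices     # stripe index on that device
        per_device[device].append(local_stripe)
    return dict(per_device)
```

With a 64-byte stripe unit on three devices, a request covering volume stripes 1 through 4 yields one request per device rather than four separate ones: `consolidate_reclaim(64, 256, 64, 3)` returns `{1: [0, 1], 2: [0], 0: [1]}`.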