Abstract:
A method for adjustable error correction in a storage cluster is provided. The method includes determining health of a non-volatile memory of a non-volatile solid-state storage unit of each of a plurality of storage nodes in a storage cluster on a basis of per flash package, per flash die, per flash plane, per flash block, or per flash page. The determining is performed by the storage cluster. The plurality of storage nodes is housed within a chassis that couples the storage nodes as the storage cluster. The method includes adjusting erasure coding across the plurality of storage nodes based on the health of the non-volatile memory and distributing user data throughout the plurality of storage nodes through the erasure coding. The user data is accessible via the erasure coding from a remainder of the plurality of storage nodes if any of the plurality of storage nodes are unreachable.
Abstract:
In some embodiments, a method for die-level monitoring is provided. The method includes distributing user data throughout a plurality of storage nodes through erasure coding, wherein the plurality of storage nodes are housed within a chassis that couples the storage nodes. Each of the storage nodes has a non-volatile solid-state storage with non-volatile memory and the user data is accessible via the erasure coding from a remainder of the storage nodes in event of two of the storage nodes being unreachable. The method includes producing diagnostic information that diagnoses the non-volatile memory on a basis of per package, per die, per plane, per block, or per page, the producing performed by each of the plurality of storage nodes. The method includes writing the diagnostic information to a memory in the storage cluster.
Abstract:
In some embodiments, a method for die-level monitoring is provided. The method includes distributing user data throughout a plurality of storage nodes through erasure coding, wherein the plurality of storage nodes are housed within a chassis that couples the storage nodes. Each of the storage nodes has a non-volatile solid-state storage with non-volatile memory and the user data is accessible via the erasure coding from a remainder of the storage nodes in event of two of the storage nodes being unreachable. The method includes producing diagnostic information that diagnoses the non-volatile memory on a basis of per package, per die, per plane, per block, or per page, the producing performed by each of the plurality of storage nodes. The method includes writing the diagnostic information to a memory in the storage cluster.
Abstract:
A method for erasure detection in a storage cluster is provided. The method includes establishing a connection, via a network, of a storage unit to one of a plurality of storage nodes of a storage cluster and determining, for at least one page of a storage memory of the storage unit, that the at least one page is erased. The storage unit is one of a plurality of storage units configured to store user data in memory of the storage units in accordance with direction from the plurality of storage nodes. The method includes communicating from the storage unit to the one of the plurality of storage nodes that the at least one page is erased.
Abstract:
A storage cluster is provided. The storage cluster includes a plurality of storage nodes, each of the plurality of storage nodes having nonvolatile solid-state memory and a plurality of operations queues coupled to the solid-state memory. The plurality of storage nodes is configured to distribute the user data and metadata throughout the plurality of storage nodes such that the plurality of storage nodes can access the user data with a failure of two of the plurality of storage nodes. Each of the plurality of storage nodes is configured to determine whether a read of 1 or more bits in the solid-state memory via a first path is within a latency budget. The plurality of storage nodes is configured to perform a read of user data or metadata via a second path, responsive to a determination that the read of the bit via the first path is not within the latency budget.
Abstract:
A method for adjustable error correction in a storage cluster is provided. The method includes determining health of a non-volatile memory of a non-volatile solid-state storage unit of each of a plurality of storage nodes in a storage cluster on a basis of per flash package, per flash die, per flash plane, per flash block, or per flash page. The determining is performed by the storage cluster. The plurality of storage nodes is housed within a chassis that couples the storage nodes as the storage cluster. The method includes adjusting erasure coding across the plurality of storage nodes based on the health of the non-volatile memory and distributing user data throughout the plurality of storage nodes through the erasure coding. The user data is accessible via the erasure coding from a remainder of the plurality of storage nodes if any of the plurality of storage nodes are unreachable.
Abstract:
A plurality of storage nodes within a single chassis is provided. The plurality of storage nodes is configured to communicate together as a storage cluster. The plurality of storage nodes has a non-volatile solid-state storage for user data storage. The plurality of storage nodes is configured to distribute the user data and metadata associated with the user data throughout the plurality of storage nodes, with erasure coding of the user data. The plurality of storage nodes is configured to recover from failure of two of the plurality of storage nodes by applying the erasure coding to the user data from a remainder of the plurality of storage nodes. The plurality of storage nodes is configured to detect an error and engage in an error recovery via one of a processor of one of the plurality of storage nodes, a processor of the non-volatile solid state storage, or the flash memory.
Abstract:
In some embodiments, a method for die-level monitoring is provided. The method includes distributing user data throughout a plurality of storage nodes through erasure coding, wherein the plurality of storage nodes are housed within a chassis that couples the storage nodes. Each of the storage nodes has a non-volatile solid-state storage with non-volatile memory and the user data is accessible via the erasure coding from a remainder of the storage nodes in event of two of the storage nodes being unreachable. The method includes producing diagnostic information that diagnoses the non-volatile memory on a basis of per package, per die, per plane, per block, or per page, the producing performed by each of the plurality of storage nodes. The method includes writing the diagnostic information to a memory in the storage cluster.
Abstract:
A storage cluster is provided. The storage cluster includes a plurality of storage nodes, each of the plurality of storage nodes having nonvolatile solid-state memory and a plurality of operations queues coupled to the solid-state memory. The plurality of storage nodes is configured to distribute the user data and metadata throughout the plurality of storage nodes such that the plurality of storage nodes can access the user data with a failure of two of the plurality of storage nodes. Each of the plurality of storage nodes is configured to determine whether a read of 1 or more bits in the solid-state memory via a first path is within a latency budget. The plurality of storage nodes is configured to perform a read of user data or metadata via a second path, responsive to a determination that the read of the bit via the first path is not within the latency budget.
Abstract:
A plurality of storage nodes within a single chassis is provided. The plurality of storage nodes is configured to communicate together as a storage cluster. The plurality of storage nodes has a non-volatile solid-state storage for user data storage. The plurality of storage nodes is configured to distribute the user data and metadata associated with the user data throughout the plurality of storage nodes, with erasure coding of the user data. The plurality of storage nodes is configured to recover from failure of two of the plurality of storage nodes by applying the erasure coding to the user data from a remainder of the plurality of storage nodes. The plurality of storage nodes is configured to detect an error and engage in an error recovery via one of a processor of one of the plurality of storage nodes, a processor of the non-volatile solid state storage, or the flash memory.