Abstract:
Data is stored on a non-volatile storage media in a sequential, log-based format. The formatted data defines an ordered sequence of storage operations performed on the non-volatile storage media. A storage layer maintains volatile metadata, which may include a forward index associating logical identifiers with respective physical storage units on the non-volatile storage media. The volatile metadata may be reconstructed from the ordered sequence of storage operations. Persistent notes may be used to maintain consistency between the volatile metadata and the contents of the non-volatile storage media. Persistent notes may identify data that does not need to be retained on the non-volatile storage media and/or is no longer valid.
Abstract:
An apparatus, system, and method are disclosed for caching data on a solid-state storage device. The solid-state storage device maintains metadata pertaining to cache operations performed on the solid-state storage device, as well as storage operations of the solid-state storage device. The metadata indicates what data in the cache is valid, as well as information about what data in the nonvolatile cache has been stored in a backing store. A backup engine works through units in the nonvolatile cache device and backs up the valid data to the backing store. During grooming operations, the groomer determines whether the data is valid and whether the data is discardable. Data that is both valid and discardable may be removed during the grooming operation. The groomer may also determine whether the data is cold in determining whether to remove the data from the cache device. The cache device may present to clients a logical space that is the same size as the backing store. The cache device may be transparent to the clients.
Abstract:
An apparatus, system, and method are disclosed for managing data in a solid-state storage device. A solid-state storage and solid-state controller are included. The solid-state storage controller includes a write data pipeline and a read data pipeline The write data pipeline includes a packetizer and an ECC generator. The packetizer receives a data segment and creates one or more data packets sized for the solid-state storage. The ECC generator generates one or more error-correcting codes (“ECC”) for the data packets received from the packetizer. The read data pipeline includes an ECC correction module, a depacketizer, and an alignment module. The ECC correction module reads a data packet from solid-state storage, determines if a data error exists using corresponding ECC and corrects errors. The depacketizer checks and removes one or more packet headers. The alignment module removes unwanted data, and re-formats the data as data segments of an object.
Abstract:
An apparatus, system, and method are disclosed for data storage with progressive redundant array of independent drives (“RAID”). A storage request receiver module, a striping module, a parity-mirror module, and a parity progression module are included. The storage request receiver module receives a request to store data of a file or of an object. The striping module calculates a stripe pattern for the data. The stripe pattern includes one or more stripes, and each stripe includes a set of N data segments. The striping module writes the N data segments to N storage devices. Each data segment is written to a separate storage device within a set of storage devices assigned to the stripe. The parity-mirror module writes a set of N data segments to one or more parity-mirror storage devices within the set of storage devices. The parity progression module calculates a parity data segment on each parity-mirror device in response to a storage consolidation operation, and stores the parity data segments. The storage consolidation operation is conducted to recover storage space and/or data on a parity-mirror storage device.
Abstract:
Data is stored on a non-volatile storage media in a sequential, log-based format. The formatted data defines an ordered sequence of storage operations performed on the non-volatile storage media. A virtual storage layer maintains volatile metadata, which may include a forward index associating logical identifiers with respective physical storage units on the non-volatile storage media. The volatile metadata may be reconstructed from the ordered sequence of storage operations. Persistent notes may be used to maintain consistency between the volatile metadata and the contents of the non-volatile storage media. Persistent notes may identify data that does not need to be retained on the non-volatile storage media and/or is no longer valid.
Abstract:
An apparatus, system, and method are disclosed for efficiently mapping virtual and physical addresses. A forward mapping module uses a forward map to identify physical addresses of data of a data segment from a virtual address. The data segment is identified in a storage request. The virtual addresses include discrete addresses within a virtual address space where the virtual addresses sparsely populate the virtual address space. A reverse mapping module uses a reverse map to determine a virtual address of a data segment from a physical address. The reverse map maps the data storage device into erase regions such that a portion of the reverse map spans an erase region of the data storage device erased together during a storage space recovery operation. A storage space recovery module uses the reverse map to identify valid data in an erase region prior to an operation to recover the erase region.
Abstract:
An apparatus, system, and method are disclosed for managing data with an empty data segment directive at the storage device. The apparatus, system, and method for managing data include a write request receiver module and a data segment token storage module. The write request receiver module receives a storage request from a requesting device. The storage request includes a request to store a data segment in a storage device. The data segment includes a series of repeated, identical characters or a series of repeated, identical character strings. The data segment token storage module stores a data segment token in the storage device. The data segment token includes at least a data segment identifier and a data segment length. The data segment token is substantially free of data from the data segment.
Abstract:
An apparatus, system, and method are disclosed for bad block remapping. A bad block identifier module identifies one or more data blocks on a solid-state storage element as bad blocks. A log update module writes at least a location of each bad block identified by the bad block identifier module into each of two or more redundant bad block logs. A bad block mapping module accesses at least one bad block log during a start-up operation to create in memory a bad block map. The bad block map includes a mapping between the bad block locations in the bad block log and a corresponding location of a replacement block for each bad block location. Data is stored in each replacement block instead of the corresponding bad block. The bad block mapping module creates the bad block map using one of a replacement block location and a bad block mapping algorithm.
Abstract:
An apparatus, system, and method are disclosed for storage space recovery after reaching a read count limit. A read module reads data in a storage division of solid-state storage. A read counter module then increments a read counter corresponding to the storage division. A read counter limit module determines if the read count exceeds a maximum read threshold, and if so, a storage division selection module selects the corresponding storage division for recovery. A data recovery module reads valid data packets from the selected storage division, stores the valid data packets in another storage division of the solid-state storage, and updates a logical index with a new physical address of the valid data.
Abstract:
An apparatus, system, and method are disclosed for a front-end, distributed redundant array of independent drives (“RAID”). A storage request receiver module receives a storage request to store object or file data in a set of autonomous storage devices forming a RAID group. The storage devices independently receive storage requests from a client over a network, and one or more of the storage devices are designated as parity-mirror storage devices for a stripe. The striping association module calculates a stripe pattern for the data. Each stripe includes N data segments, each associated with N storage devices. The parity-mirror association module associates a set of the N data segments with one or more parity-mirror storage devices. The storage request transmitter module transmits storage requests to each storage device. Each storage request is sufficient to store onto the storage device the associated data segments. The storage requests are substantially free of data.