Abstract:
Example apparatus and methods identify files that are so small or so large that they compromise the efficient operation of a file system that uses re-assignable one-to-one inodes and inode numbers. Small files are aggregated into collections of files and large files are subdivided into collections of smaller files. Information for locating multiple related files with fewer lookups is generated and stored in a folder. An inode having a new type of inode number is then created. The new type of inode number encodes information for finding the folder. The encoded information may include a folder identifier that acts as a primary key into a database that is configured to locate a member of the aggregated or subdivided files with a single lookup. A filter file system may be updated with the new inode. The new inode number is unique within the filter file system and may not be re-assigned.
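The following Python sketch illustrates one way such a folder-locating inode number could be encoded and resolved with a single lookup; the bit layout, field widths, and the folder_db mapping are illustrative assumptions rather than the patented format.

# Illustrative sketch only; not the patented implementation. The bit layout,
# field widths, and the folder_db lookup below are assumptions for demonstration.

FOLDER_BITS = 40   # assumed width of the folder-identifier field
MEMBER_BITS = 20   # assumed width of the member-index field
TYPE_FLAG = 1 << (FOLDER_BITS + MEMBER_BITS)  # marks the "new type" of inode number

def make_inode_number(folder_id: int, member_index: int) -> int:
    """Encode folder-locating information into a non-reassignable inode number."""
    assert folder_id < (1 << FOLDER_BITS) and member_index < (1 << MEMBER_BITS)
    return TYPE_FLAG | (folder_id << MEMBER_BITS) | member_index

def decode_inode_number(ino: int) -> tuple[int, int]:
    """Recover the folder identifier (primary key) and member index."""
    assert ino & TYPE_FLAG, "not a new-type inode number"
    return (ino >> MEMBER_BITS) & ((1 << FOLDER_BITS) - 1), ino & ((1 << MEMBER_BITS) - 1)

# Single-lookup resolution: the folder identifier acts as a primary key
# into a (hypothetical) database mapping folders to their member files.
folder_db = {7: {"members": ["a.log", "b.log", "c.log"]}}

ino = make_inode_number(folder_id=7, member_index=2)
folder_id, member = decode_inode_number(ino)
print(folder_db[folder_id]["members"][member])   # -> c.log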
Abstract:
Storage conditioning for a data storage system having D data storage devices (DSDs) is provided. E erasure codes (ECs) for an object are stored in the system, where D>E. A map of d E-sized vectors of the D DSDs is produced. Each DSD appears in e vectors. The ratio d/e is the reduced form of D/E. A hash value is produced for the object. A destination vector for storing the ECs is selected using the hash value according to a pre-determined, substantially uniform distribution. A compromised vector affected by a first DSD becoming unable to store ECs is identified. An intact vector that is not affected by the first DSD is identified. A complete set of ECs is produced from an incomplete set of ECs in the compromised vector, distributed to the intact vector, and then copied back when the compromised vector is once again intact.
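A minimal Python sketch of the vector map and hash-based destination selection described above; the round-robin map construction and the SHA-256 hash are assumptions chosen for illustration, not the patented scheme.

# Illustrative sketch under simplifying assumptions: vectors are built by a
# simple round-robin walk over the DSDs, and SHA-256 stands in for the hash.
import hashlib
from math import gcd
from itertools import cycle, islice

def build_vector_map(D: int, E: int) -> list[list[int]]:
    """Produce d vectors of E DSD indices, each DSD appearing in e vectors,
    where d/e is D/E in lowest terms."""
    g = gcd(D, E)
    d, e = D // g, E // g          # reduced form of D/E
    walk = cycle(range(D))
    # d * E slots total; each DSD fills (d * E) / D = e slots.
    return [list(islice(walk, E)) for _ in range(d)]

def destination_vector(obj_key: bytes, vectors: list[list[int]]) -> list[int]:
    """Select a destination vector for an object's ECs using a hash value,
    giving a substantially uniform distribution across vectors."""
    h = int.from_bytes(hashlib.sha256(obj_key).digest(), "big")
    return vectors[h % len(vectors)]

vecs = build_vector_map(D=10, E=4)      # d=5 vectors, each DSD in e=2 vectors
print(vecs)
print(destination_vector(b"object-42", vecs))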
Abstract:
Example apparatus and methods selectively replicate some erasure codes associated with a message and selectively distribute, without replicating, other erasure codes associated with the message. The message may have k symbols and n erasure codes may have been generated for the message, n>=k. In one embodiment, erasure codes that store plaintext information from the message (e.g., un-encoded symbols) may be replicated (e.g., sent to all devices using erasure codes associated with the message) while erasure codes that do not store plaintext information may be distributed (e.g., selectively moved to less than all devices) without being replicated. Some (e.g., less than k) erasure codes that do not store plaintext information may be stored unencrypted in the cloud. The generator matrix will not be stored in the cloud.
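A minimal Python sketch of the replicate-versus-distribute policy, assuming a systematic code in which the first k of the n erasure codes carry plaintext symbols; the device list and round-robin distribution are illustrative assumptions.

# Minimal sketch, assuming a systematic erasure code where the first k of the
# n erasure codes carry un-encoded (plaintext) symbols; the device set and the
# round-robin distribution policy below are illustrative assumptions.

def place_erasure_codes(codes: list[bytes], k: int, devices: list[str]) -> dict[str, list[int]]:
    """Replicate plaintext-bearing codes to every device; distribute the rest
    (without replication) across less than all devices."""
    placement = {dev: [] for dev in devices}
    for idx, _code in enumerate(codes):
        if idx < k:                             # plaintext (systematic) erasure code
            for dev in devices:                 # replicate to all devices
                placement[dev].append(idx)
        else:                                   # encoded-only erasure code
            dev = devices[idx % len(devices)]   # distribute to a single device
            placement[dev].append(idx)
    return placement

codes = [bytes([i]) * 4 for i in range(6)]      # n = 6 erasure codes
print(place_erasure_codes(codes, k=4, devices=["dev0", "dev1", "dev2"]))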
Abstract:
Methods, apparatus, and other embodiments associated with doubly distributing erasure encoded data in a data storage system are described. One example apparatus includes a set of data storage devices and a set of logics that includes an encoding logic that generates an erasure encoded object that includes code-words, and chunks the code-words into code-word chunks, and a distribution logic that interleaves members of the set of code-word chunks into a plurality of records, and distributes the records across the data storage devices and within individual data storage devices. Example apparatus may include a read logic that reads the plurality of stored records from the data storage devices, and ignores read errors, and a repair logic that monitors the set of data storage devices, replaces or repairs failing data storage devices, generates replacement records, and stores the replacement records on a replacement data storage device.
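The interleaving step can be pictured with a short Python sketch; the chunk size, record count, and round-robin layout below are assumptions, not the patented distribution scheme.

# Illustrative sketch: code-words are split into fixed-size chunks and the
# chunks are interleaved round-robin into records, so each record mixes chunks
# from many code-words. Chunk size, record count, and layout are assumptions.

def chunk(codeword: bytes, size: int) -> list[bytes]:
    return [codeword[i:i + size] for i in range(0, len(codeword), size)]

def interleave_into_records(codewords: list[bytes], chunk_size: int,
                            num_records: int) -> list[list[bytes]]:
    """Interleave code-word chunks across a plurality of records."""
    records = [[] for _ in range(num_records)]
    i = 0
    for cw in codewords:
        for piece in chunk(cw, chunk_size):
            records[i % num_records].append(piece)
            i += 1
    return records

codewords = [bytes([c]) * 8 for c in range(4)]       # four 8-byte code-words
records = interleave_into_records(codewords, chunk_size=2, num_records=3)
for r, rec in enumerate(records):
    # each record is then stored across, and within, the data storage devices
    print(f"record {r}: {rec}")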
Abstract:
Example apparatus and methods provide improved reclamation, garbage collection (GC), and defragmentation (defrag) for data storage devices, including solid state drives (SSDs) and shingled magnetic recording (SMR) drives. An erasure code (EC) layer is added to the SSD or SMR drive that facilitates logically or physically erasing data from the drive as a comprehensive GC or defrag operation. Erased data may be selectively recreated from the EC layer as needed. Pre-planned EC write zones may be established to further optimize GC and defrag. Recreated data may be written to selected locations to further optimize SSD and SMR performance. Erasure code data may be distributed to co-operating devices to further improve GC or defrag. Example apparatus and methods may also facilitate writing data to an SMR drive using tape or VTL applications or processes, providing a pseudo virtual tape library on the SMR drive.
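A toy Python sketch of the recreate-from-EC-layer idea, using single-block XOR parity as a stand-in for the erasure-code layer; the device names and dict structures are illustrative assumptions, and real EC layers, zone layouts, and GC policies are far richer.

# A minimal sketch of the idea only, using single-block XOR parity as a
# stand-in for the erasure-code (EC) layer.

def xor_blocks(blocks):
    out = bytearray(len(blocks[0]))
    for b in blocks:
        for i, byte in enumerate(b):
            out[i] ^= byte
    return bytes(out)

# A zone on the SSD/SMR drive, with EC data distributed to co-operating devices.
zone = {"blk0": b"AAAA", "blk1": b"BBBB", "blk2": b"CCCC"}
cooperating_devices = {"devA": zone["blk0"], "devB": zone["blk2"]}
ec_layer = {"parity": xor_blocks(list(zone.values()))}

# Comprehensive GC/defrag: logically erase the whole zone in one operation.
zone = {}

# Selectively recreate only the data that is needed, from the EC layer plus the
# codes held by co-operating devices, then write it to a freshly chosen location.
recreated_blk1 = xor_blocks([cooperating_devices["devA"],
                             cooperating_devices["devB"],
                             ec_layer["parity"]])
print(recreated_blk1)   # -> b'BBBB'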
Abstract:
Example apparatus and methods treat some erasure codes differently than other erasure codes. For example, erasure codes that are only involved in error-recovery may never be read and thus may be stored using a different approach than erasure codes that are involved in more regular data reading. If different types of data stores are available, then the erasure codes that are more likely to be read may be stored in data stores having a first (e.g., higher, faster) type of read performance while the erasure codes that are less likely to be read may be stored in data stores having a second (e.g., lower, slower, less expensive) type of read performance. Different data stores may be located on different data storage devices. Different data stores may even be located on a single data storage device.
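A minimal Python sketch of the tiered placement, assuming a systematic code in which only the first k erasure codes are read during normal operation; the two store names are illustrative assumptions.

# Minimal sketch, assuming a systematic code where the first k erasure codes are
# read during normal operation and the remaining n-k are read only for error
# recovery; "fast_store" and "slow_store" are hypothetical names.

def tier_erasure_codes(n: int, k: int) -> dict[str, list[int]]:
    """Map erasure-code indices to stores with different read performance."""
    return {
        "fast_store": list(range(k)),     # likely to be read: higher, faster read performance
        "slow_store": list(range(k, n)),  # recovery-only: lower-cost, slower store
    }

print(tier_erasure_codes(n=16, k=10))
# The two stores may live on different data storage devices, or may be two
# regions of a single data storage device.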
Abstract:
Embodiments include a data aware deduplicating object store. The data aware deduplicating object store includes a consistent hashing logic that manages a consistent hashing architecture for the object store. The consistent hashing architecture includes a metadata ring and a bulk ring. The consistent hashing architecture may be a multiple ring architecture comprising a metadata ring and two or more bulk rings. A bulk ring may include a key/value (k/v) data store, where a k/v data store stores a shard of an index and a reference count that facilitates an individual approach to garbage collection or data reclamation. The data aware deduplicating object store also includes a deduplication logic that provides data deduplication for data to be stored in the object store. The deduplication logic performs variable-length deduplication and provides a shared-nothing approach.
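A compact Python sketch of the ring placement, index sharding, and reference counting; simple modulo placement and fixed-size chunking stand in for a true consistent-hash ring and for variable-length deduplication, and all names are illustrative assumptions.

# Illustrative sketch only: modulo placement stands in for a consistent-hash
# ring with virtual nodes, and fixed-size chunks stand in for variable-length
# (content-defined) chunking.
import hashlib

def ring_slot(key: bytes, nodes: list[str]) -> str:
    """Place a key onto one node of a ring (simplified consistent hashing)."""
    h = int.from_bytes(hashlib.sha256(key).digest(), "big")
    return nodes[h % len(nodes)]

metadata_ring = ["meta0", "meta1"]
bulk_rings = {"bulk_ring_0": ["kv0", "kv1"], "bulk_ring_1": ["kv2", "kv3"]}

# Each k/v node holds a shard of the dedup index plus a reference count per chunk.
index_shards = {n: {} for ring in bulk_rings.values() for n in ring}

def put_object(name: str, data: bytes, chunk_size: int = 4) -> None:
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    for c in chunks:
        fingerprint = hashlib.sha256(c).hexdigest()
        ring = bulk_rings[ring_slot(c, sorted(bulk_rings))]
        shard = index_shards[ring_slot(c, ring)]
        if fingerprint in shard:
            shard[fingerprint]["refs"] += 1      # deduplicated: bump reference count
        else:
            shard[fingerprint] = {"data": c, "refs": 1}
    # object-to-chunk metadata would be recorded on the metadata ring (omitted)

put_object("obj1", b"ABCDABCD")
put_object("obj2", b"ABCD1234")
print(index_shards)    # the chunk b"ABCD" is stored once, with a reference count of 3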
Abstract:
Adaptive pre-fetching devices can predict data placement to improve the operating and/or electrical efficiency of a data storage system. A future input/output operation can be predicted from a current input/output operation, the state of the data storage apparatus, relationships between data currently being processed and data previously processed, or other factors. The apparatus and methods can improve data storage efficiency by selectively pre-fetching data, or by relocating data on the data storage apparatus, on the backing storage, or within a plurality of data storage apparatus, based on working set predictors, to reduce cache misses or to outperform fetches from the backing storage.
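One possible working-set predictor can be sketched in a few lines of Python; the first-order transition table, cache, and backing-store interfaces below are illustrative assumptions, not the patented predictor.

# Minimal sketch of one possible working-set predictor: a first-order
# transition table that pre-fetches the block most often observed to follow
# the current one.
from collections import defaultdict, Counter

class PrefetchingCache:
    def __init__(self, backing_store: dict):
        self.backing = backing_store
        self.cache = {}
        self.transitions = defaultdict(Counter)   # prev block -> next-block counts
        self.prev = None

    def read(self, block: int) -> bytes:
        if self.prev is not None:
            self.transitions[self.prev][block] += 1   # learn the access pattern
        data = self.cache.get(block) or self.backing[block]
        self.cache[block] = data
        # Predict and pre-fetch the likely next block ahead of a future request.
        if self.transitions[block]:
            predicted = self.transitions[block].most_common(1)[0][0]
            self.cache.setdefault(predicted, self.backing[predicted])
        self.prev = block
        return data

store = {i: bytes([i]) * 4 for i in range(8)}
cache = PrefetchingCache(store)
for blk in [1, 2, 3, 1, 2]:
    cache.read(blk)
# After seeing 1 -> 2 -> 3 once, a read of 1 pre-fetches 2 and a read of 2 pre-fetches 3.
print({k: dict(v) for k, v in cache.transitions.items()})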
Abstract:
Example methods and apparatus asynchronously verify data stored in a cloud data storage system. One embodiment comprises a monitoring circuit that determines if a data auditing condition associated with a cloud storage system or archived data stored in the cloud storage system has been met, a metadata mirror circuit that controls a metadata mirror to provide metadata, including a first checksum, associated with the archived data to the apparatus, a checksum circuit that computes a second checksum based on the archived data, a verification circuit that generates an audit of the first checksum and the second checksum by comparing the second checksum with the first checksum, and a reporting circuit that generates a log of the audit, that provides the log to the data storage system, and that provides a notification of a data integrity failure to a user associated with the archived data.
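A minimal Python sketch of the audit path only, with the metadata mirror, cloud store, and notification channel stubbed out; the dict layouts and the e-mail address are illustrative assumptions.

# Minimal sketch of the checksum audit; real circuits, the metadata mirror,
# the cloud store, and the notification channel are stubbed for illustration.
import hashlib

def audit_archived_object(name: str, cloud_store: dict, metadata_mirror: dict) -> dict:
    """Compare the stored (first) checksum from the metadata mirror with a freshly
    computed (second) checksum of the archived data, and log the result."""
    first = metadata_mirror[name]["checksum"]
    second = hashlib.sha256(cloud_store[name]).hexdigest()
    entry = {"object": name, "match": first == second,
             "first_checksum": first, "second_checksum": second}
    if not entry["match"]:
        # Notify the user associated with the archived data of the integrity failure.
        print(f"ALERT to {metadata_mirror[name]['owner']}: integrity failure for {name}")
    return entry

cloud_store = {"report.tar": b"archived bytes"}
metadata_mirror = {"report.tar": {"checksum": hashlib.sha256(b"archived bytes").hexdigest(),
                                  "owner": "user@example.com"}}
audit_log = [audit_archived_object("report.tar", cloud_store, metadata_mirror)]
print(audit_log)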