Abstract:
Variable length records can be accessed from an array of N+2 synchronous fixed block formatted DASDs in a single pass and in the presence of a single DASD failure if each record is partitioned into a variable number of K fixed length blocks, the blocks are written on the DASDs in column major order K modulo (N+1), the order is constrained such that the first block of each record resides on the (N+1)st DASD, a parity block for each column resides on an (N+2)nd DASD, and each parity block spans N blocks in the same column from the first N DASDs and one block one column offset thereto on the (N+1)st DASD.
Abstract:
A method and system for minimizing seek affinity and enhancing write sensitivity in a direct access storage device (DASD) array are disclosed. SEEK affinity and WRITE efficiency are preserved in which logical cylinders, as recorded on the DASD array, are managed as individual log structured files (LSF). Tracks or segments of data and parity blocks having the same or different parity group affinity and stored on the same or different DASD cylindrical addresses are written into a directory managed buffer. Blocks having the same parity affinity and written to counterpart cylinders are written out from the buffer to spare space reserved as part of each DASD cylinder. Otherwise, blocks are updated in place in their DASD array location.
Abstract:
The data contents of up to two concurrently failed or erased DASDs can be reconstituted where the data is distributed across M DASDs as an (M-1)*M block array and where (1) the (M-1)st DASD contains the simple parity taken over each of the array diagonals in diagonal major order in the same mode (odd/even) as that exhibited by the major diagonal of the array and (2) where the M-th DASD contains the simple even parity over each of the rows in row major order. Relatedly, short write updates require fewer operations for data blocks located off the major data array diagonal.
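A minimal sketch of the two parity columns described above, using single bits in place of blocks. The indexing follows the published EVENODD construction (an imaginary all-zero last row, with the major-diagonal parity `s` folded into every diagonal parity); treating that as the layout meant here is an assumption, and the function names are hypothetical.

```python
def encode_row_diag(data):
    """Return (row_parity, diag_parity) for a (m-1) x m array of bits.

    Assumption: m data columns and m-1 rows, as in EVENODD; the abstract's
    M DASDs would then be these m data columns plus the two parity columns.
    """
    rows = len(data)
    m = rows + 1
    assert all(len(r) == m for r in data)
    padded = data + [[0] * m]  # imaginary all-zero row m-1

    # Simple even parity over each row.
    row_parity = []
    for r in data:
        p = 0
        for bit in r:
            p ^= bit
        row_parity.append(p)

    # s is the parity of the major diagonal; it selects odd or even mode.
    s = 0
    for t in range(1, m):
        s ^= padded[m - 1 - t][t]

    # Parity over each remaining diagonal, taken in the mode given by s.
    diag_parity = []
    for i in range(rows):
        p = s
        for t in range(m):
            p ^= padded[(i - t) % m][t]
        diag_parity.append(p)
    return row_parity, diag_parity


def rebuild_column(data, row_parity, lost):
    """Rebuild one lost data column from row parity alone (single failure)."""
    rebuilt = []
    for i, r in enumerate(data):
        p = row_parity[i]
        for j, bit in enumerate(r):
            if j != lost:
                p ^= bit
        rebuilt.append(p)
    return rebuilt


data = [[1, 1, 0],
        [0, 1, 0]]
row_p, diag_p = encode_row_diag(data)
```

The sketch covers encoding and the single-failure case only. Recovering two failed columns needs both parity columns together, and in EVENODD that additionally requires m to be prime.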
Abstract:
A computing system includes plural nodes that are connected by a communications network. Each node comprises a communications interface that enables an exchange of messages with other nodes. A ready queue is maintained in a node and includes plural message entries, each message entry indicating an output message control data structure. The node further includes memory for storing plural output message control data structures, each including one or more chained further control data structures that define data comprising a message or a portion of a message that is to be dispatched. Control data structures that are chained from an output message control data structure exhibit a sequence dependency. A processor is controlled by the ready queue and enables dispatch of portions of the message designated by an output message control data structure and associated further control structures. The processor prevents dispatch of one portion of a message prior to dispatch of another portion of the message upon which the first portion is dependent even if message transmissions are interrupted.
Abstract:
A method and means for managing access to a logical track of KN blocks of which K are parity blocks. The KN blocks are distributed and stored in an array of N DASDs having K blocks per physical track per DASD. The array includes control means for securing synchronous access to selectable ones of the DASDs responsive to each access request. The method involves (a) formatting the blocks onto the array using a row major order modulus as the metric for balancing the data rate and concurrency (the number of DASDs bound per access) and (b) executing the random sequences of large and small access requests over the array.
Abstract:
A system and method are provided for use by software-implemented Redundant Array of Inexpensive Disks (RAID) arrays to achieve adequate performance and reliability, as well as to improve the performance of low cost hardware RAIDs. The enhancements to the basic RAID implementation speed up recovery time for software RAIDs. A method is provided for storing data in an array of storage devices. A plurality of block locations on the storage devices are logically arranged as a parity group wherein a parity block stored in a block location as part of a parity group is logically derived from the combination of data blocks stored in the parity group, and each block in a parity group is stored on a different storage device. A plurality of parity groups are grouped into a parity group set. A request is received to write a new data block to a block location on a storage device. The old data block stored at the block location is read. The new data block is written to the block location. When the parity group set is in an unmodified state prior to the current write, an indicator is written to the storage device that the parity group set is in a modified state. In a preferred embodiment, this enhancement uses a bit map stored on disk, called the Parity Group Set (PGS) bit map, to mark inconsistent parity groups, replacing the Non-Volatile Random Access Memory (NVRAM) used for similar purposes by hardware RAIDs. Further enhancements optimize sequential input/output (I/O) data streams.
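The write path described above can be sketched with a toy in-memory model. The class and attribute names are hypothetical, the list `pgs_bitmap` stands in for the on-disk bit map, and setting the indicator before the data is touched is an assumption about ordering that the abstract does not spell out. The parity update uses the standard XOR identity: new parity = old parity XOR old data XOR new data.

```python
from functools import reduce


def xor_blocks(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length blocks."""
    return bytes(x ^ y for x, y in zip(a, b))


class ParityGroupSetArray:
    """Toy model: each parity group is a list of equal-length data blocks."""

    def __init__(self, groups, groups_per_set):
        self.groups = [list(g) for g in groups]
        self.parity = [reduce(xor_blocks, g) for g in self.groups]
        self.groups_per_set = groups_per_set
        n_sets = -(-len(self.groups) // groups_per_set)  # ceiling division
        self.pgs_bitmap = [False] * n_sets  # False = unmodified
        self.bitmap_writes = 0  # counts simulated bit map writes to disk

    def write_block(self, group, index, new_data):
        pgs = group // self.groups_per_set
        # Mark the parity group set as modified only on its first write.
        if not self.pgs_bitmap[pgs]:
            self.pgs_bitmap[pgs] = True
            self.bitmap_writes += 1
        old_data = self.groups[group][index]   # read the old data block
        self.groups[group][index] = new_data   # write the new data block
        # Fold the change into the parity block.
        delta = xor_blocks(old_data, new_data)
        self.parity[group] = xor_blocks(self.parity[group], delta)


arr = ParityGroupSetArray(
    [[b"\x01\x02", b"\x04\x08"], [b"\x10\x20", b"\x40\x80"]],
    groups_per_set=2,
)
arr.write_block(0, 1, b"\xff\x00")
arr.write_block(1, 0, b"\x00\x00")
```

Both writes land in the same parity group set, so the bit map is written once. After a crash, only the parity groups in sets marked as modified would need their parity recomputed.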
Abstract:
Seek affinity is preserved in a segment oriented, cached, log structured array (LSA) of DASDs responsive to accesses dominated by sequential read and random writes of logical tracks stored in the segments. This is achieved by collecting all the write modified read active tracks and clean read active tracks either destaged from the cache or garbage collected from the LSA and rewriting them out to the LSA as segments into regions of contiguous segments of read active tracks. Also, all write modified read inactive tracks and clean read inactive tracks either destaged from cache or garbage collected from the LSA are collected and rewritten out to the LSA as segments into regions of contiguous segments of read inactive tracks. Garbage collection is initiated when the detected free space in a region falls below a threshold and continues until the collected segments exceed a second threshold. Alternatively, write age of logical tracks may be used instead of read activity so as to cluster LSA DASDs into a region of segments formed from old write active logical tracks and a region of current write active logical tracks.
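The two-threshold garbage collection trigger and the split into read active and read inactive regions might be sketched as follows. The function names, the dictionary shapes, and the choice to collect the least-utilized segments first are illustrative assumptions, not details taken from the abstract.

```python
def partition_tracks(tracks):
    """Split logical tracks into read-active and read-inactive groups.

    Each track is a dict with a boolean 'read_active' flag (hypothetical shape).
    Each group would then be rewritten into its own region of contiguous segments.
    """
    active = [t["id"] for t in tracks if t["read_active"]]
    inactive = [t["id"] for t in tracks if not t["read_active"]]
    return active, inactive


def garbage_collect(free_segments, candidates, start_threshold, stop_threshold):
    """Pick segments to collect from one region.

    Collection starts only when free space falls below start_threshold and
    continues until the number of collected segments exceeds stop_threshold.
    candidates maps segment id -> fraction of live data (assumed victim metric).
    """
    if free_segments >= start_threshold:
        return []
    collected = []
    for seg_id, _live in sorted(candidates.items(), key=lambda kv: kv[1]):
        collected.append(seg_id)
        if len(collected) > stop_threshold:
            break
    return collected


tracks = [
    {"id": "t1", "read_active": True},
    {"id": "t2", "read_active": False},
    {"id": "t3", "read_active": True},
]
segments = {"s1": 0.9, "s2": 0.1, "s3": 0.4, "s4": 0.2}
```

With `free_segments=3`, `start_threshold=5`, and `stop_threshold=2`, the sketch collects the three emptiest segments and then stops. With ample free space it collects nothing.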
Abstract:
Data regions, parity regions, and spare regions in a redundant array of storage units are distributed such that each storage unit in the array has the same number of parity regions before, during, and after a failure of one or more storage units such that there is a uniform workload distribution among the storage units during normal operation, during the rebuild process, and during operation after the rebuild and before repair or replacement of the failed unit. The array provides uniform workload distribution for one or more failures of storage units. The number of storage units and the number of storage regions per storage unit is specified once the number of regions in a parity group and the number of failures to be managed are specified. The data regions, parity regions, and spare regions are then placed to provide the uniform workload distribution. Data, parity, and spare regions also can be distributed across multiple redundant arrays to provide the failure-tolerant advantages with uniform workload distribution.
Abstract:
A method for managing cache accessing of CKD formatted records that uses a Predictive Track Table to reduce host delays resulting from cache write misses. Because a significant portion of CKD formatted DASD tracks contain records having no key fields, identical logical and physical cylinder and head (CCHH) fields and similar-sized data fields, a compact description of such records by record count and length data, indexed by track, can be quickly searched to determine the physical track location of a record update that misses the cache. The Predictive Track Table search is much faster than the host wait state imposed by access and search of the DASD to read the missing track into cache. If the updated record that misses cache is found within the set of records in the Predictive Track Table, then the update may be immediately written to cache and to a Non-Volatile Store (NVS) without a DASD read access. This update then may be later destaged asynchronously to the DASD from either the cache or the NVS. Otherwise, if not found in a predictive track, the update record is written directly to the disk and the cache, subject to the LRU/MRU discipline, incurring the normal cache write-miss host wait state.
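A rough sketch of the write-miss decision described above. The `Controller` class, its dictionaries, and the return labels are hypothetical. The sketch assumes records on a predictive track are numbered 1 through the record count and share one data length, and it assumes a cache hit is handled as a write to cache and NVS.

```python
from dataclasses import dataclass, field


@dataclass
class PredictiveTrack:
    record_count: int    # number of records on the track
    record_length: int   # common data-field length in bytes


@dataclass
class Controller:
    ptt: dict                                   # (cylinder, head) -> PredictiveTrack
    cache: dict = field(default_factory=dict)
    nvs: dict = field(default_factory=dict)
    dasd_writes: list = field(default_factory=list)

    def write(self, cchh, record_no, data):
        key = (cchh, record_no)
        if key in self.cache:
            # Assumed hit path: update cache and NVS, destage later.
            self.cache[key] = data
            self.nvs[key] = data
            return "cache_hit"
        track = self.ptt.get(cchh)
        if (track is not None
                and 1 <= record_no <= track.record_count
                and len(data) == track.record_length):
            # Predicted: no DASD read needed before accepting the update.
            self.cache[key] = data
            self.nvs[key] = data
            return "predicted"
        # Not predictable: write to disk and cache, paying the normal wait.
        self.dasd_writes.append(key)
        self.cache[key] = data
        return "write_miss"


ctl = Controller(ptt={(5, 2): PredictiveTrack(record_count=12, record_length=4096)})
first = ctl.write((5, 2), 3, b"x" * 4096)
second = ctl.write((5, 2), 3, b"y" * 4096)
third = ctl.write((9, 9), 1, b"z" * 10)
```

The first update misses the cache but matches the table entry, so it is accepted without a DASD read. The third falls through to the direct write path because its track has no table entry.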
Abstract:
One or more data-storing disk devices support logical tracks extending between radial recording zones of tracks in the device(s). Each data-storing disk in the device(s) is formatted into a plurality of radial recording zones of physical tracks, each radial recording zone having a like number of physical tracks; each physical track may be one circumvolution of a single spiral track. The physical tracks in the respective recording zones store a different number of data bytes. Each logical track includes a plurality of said physical tracks; at least one of the physical tracks in each of the logical tracks is in a different one of the radial recording zones in different ones of the devices or in a single device. Described are extended logical track and extended logical cylinder accessing methods and apparatus. Not all of the physical tracks of any of the devices or recording zones need be a member of any logical track.