Abstract:
Variable length records can be accessed from an array of N+2 synchronous fixed block formatted DASDs in a single pass and in the presence of a single DASD failure if each record is partitioned into a variable number of K fixed length blocks, the blocks are written on the DASDs in column major order K modulo (N+1), the order is constrained such that the first block of each record resides on the (N+1)st DASD, a parity block for each column resides on an (N+2)nd DASD, and each parity block spans N blocks in the same column from the first N DASDs and one block one column offset thereto on the (N+1)st DASD.
Abstract:
A method and system for minimizing seek affinity and enhancing write sensitivity in a direct access storage device (DASD) array are disclosed. SEEK affinity and WRITE efficiency are preserved by managing logical cylinders, as recorded on the DASD array, as individual log structured files (LSF). Tracks or segments of data and parity blocks having the same or different parity group affinity and stored on the same or different DASD cylindrical addresses are written into a directory managed buffer. Blocks having the same parity affinity and written to counterpart cylinders are written out from the buffer to spare space reserved as part of each DASD cylinder. Otherwise, blocks are updated in place in their DASD array location.
Abstract:
The data contents of up to two concurrently failed or erased DASDs can be reconstituted where the data is distributed across M DASDs as an (M-1)*M block array and where (1) the (M-1)st DASD contains the simple parity taken over each of the array diagonals in diagonal major order in the same mode (odd/even) as that exhibited by the major diagonal of the array and (2) where the M-th DASD contains the simple even parity over each of the rows in row major order. Relatedly, short write updates require fewer operations for data blocks located off the major data array diagonal.
Abstract:
A computing system includes plural nodes that are connected by a communications network. Each node comprises a communications interface that enables an exchange of messages with other nodes. A ready queue is maintained in a node and includes plural message entries, each message entry indicating an output message control data structure. The node further includes memory for storing plural output message control data structures, each including one or more chained further control data structures that define data comprising a message or a portion of a message that is to be dispatched. Control data structures that are chained from an output message control data structure exhibit a sequence dependency. A processor is controlled by the ready queue and enables dispatch of portions of the message designated by an output message control data structure and associated further control structures. The processor prevents dispatch of one portion of a message prior to dispatch of another portion of the message upon which the first portion is dependent even if message transmissions are interrupted.
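The sequence dependency described above can be illustrated with a minimal Python sketch. It is not the patented implementation: the names `ControlBlock`, `OutputMessage`, and `dispatch` are hypothetical, the "wire" is a plain list, and an interruption is modelled simply as a send budget running out. The point it shows is that dispatch always resumes at the first unsent block in the chain, so a later portion is never sent ahead of the portion it depends on.

```python
from collections import deque
from dataclasses import dataclass
from typing import Optional


@dataclass
class ControlBlock:
    """One chained control structure describing a portion of a message (hypothetical)."""
    payload: bytes
    next: Optional["ControlBlock"] = None
    sent: bool = False


@dataclass
class OutputMessage:
    """Output message control structure heading a chain of ControlBlocks (hypothetical)."""
    name: str
    head: ControlBlock


def dispatch(ready_queue, budget, wire):
    """Send at most `budget` portions, in chain order, from the ready queue.

    Running out of budget stands in for an interrupted transmission.
    """
    while ready_queue and budget > 0:
        msg = ready_queue[0]
        # Skip portions already sent; resume at the first unsent one.
        block = msg.head
        while block is not None and block.sent:
            block = block.next
        if block is None:
            # Every portion of this message is out; retire the queue entry.
            ready_queue.popleft()
            continue
        wire.append((msg.name, block.payload))
        block.sent = True
        budget -= 1


# Usage: message m1 has three dependent portions, m2 has one.
c = ControlBlock(b"C")
b = ControlBlock(b"B", c)
a = ControlBlock(b"A", b)
queue = deque([OutputMessage("m1", a), OutputMessage("m2", ControlBlock(b"X"))])
wire = []
dispatch(queue, 2, wire)   # "interrupted" after two portions
dispatch(queue, 5, wire)   # resumes with portion C, then moves on to m2
```

Keeping the `sent` flag on each block, rather than a cursor in the queue entry, is a design choice of this sketch; either would preserve the ordering.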
Abstract:
A method and means for managing access to a logical track of KN blocks of which K are parity blocks. The KN blocks are distributed and stored in an array of N DASDs having K blocks per physical track per DASD. The array includes control means for securing synchronous access to selectable ones of the DASDs responsive to each access request. The method involves (a) formatting the blocks onto the array using a row major order modulus as the metric for balancing the data rate and concurrency (the number of DASDs bound per access) and (b) executing the random sequences of large and small access requests over the array.
Abstract:
A controller for a disk array with parity and sparing includes a non-volatile cache memory and optimizes the destaging process for blocks from the cache memory to both maximize the cache hit ratio and minimize disk utilization. The invention provides a method for organizing the disk array into segments and dividing the cache memory into groups in order of least recently used memory locations and then determining metrics that permit the disk array controller to identify the cache memory locations having the most dirty blocks by segment and group and to identify the utilization rates of the disks. These characteristics are considered to determine when, what, and how to destage. For example, in terms of maximizing the cache hit ratio, when the percentage of dirty blocks in a particular group of the cache memory locations reaches a predetermined level, destaging is begun. The destaging operation continues until the percentage of dirty blocks decreases to a predetermined level. In terms of minimizing disk utilization, all of the dirty blocks in a segment having the most dirty blocks in a group are destaged.
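The start and stop thresholds for destaging can be sketched as a simple hysteresis loop. The following Python is illustrative only: the threshold values, the representation of a cache group as a list of `(segment_id, is_dirty)` pairs, and the function names are all assumptions, and disk utilization metrics are left out entirely.

```python
from collections import Counter

# Assumed thresholds; the abstract only says "a predetermined level".
HIGH_WATERMARK = 0.50  # start destaging at or above this dirty fraction
LOW_WATERMARK = 0.25   # stop once the dirty fraction is at or below this


def dirty_fraction(group):
    """Fraction of cache entries in one LRU group that are dirty."""
    if not group:
        return 0.0
    return sum(1 for _, dirty in group if dirty) / len(group)


def dirtiest_segment(group):
    """Segment with the most dirty blocks in this group, or None."""
    counts = Counter(seg for seg, dirty in group if dirty)
    return counts.most_common(1)[0][0] if counts else None


def destage(group):
    """Destage whole segments until the group falls to the low watermark.

    Marks destaged blocks clean in place and returns the segments written.
    """
    destaged = []
    if dirty_fraction(group) < HIGH_WATERMARK:
        return destaged  # not dirty enough to start
    while dirty_fraction(group) > LOW_WATERMARK:
        seg = dirtiest_segment(group)
        if seg is None:
            break
        # Write out every dirty block of the chosen segment together.
        for i, (s, dirty) in enumerate(group):
            if s == seg and dirty:
                group[i] = (s, False)
        destaged.append(seg)
    return destaged


group = [(0, True), (0, True), (0, True), (1, True),
         (1, True), (2, True), (2, False), (3, False)]
order = destage(group)  # segments chosen, dirtiest first
```

Destaging all dirty blocks of the dirtiest segment at once reflects the abstract's disk utilization goal; the gap between the two watermarks keeps the controller from starting and stopping on every write.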
Abstract:
A method for operating a synchronized array of fixed block (FBA) formatted Direct Access Storage Devices (DASDs) to store and update variable-length (CKD) formatted records. This method is suitable for use with DASDs that obtain high recording density by using read and write head technology requiring "micro-jogging" to adjust for differing read and write head alignment or banded disk architecture having a higher block count in the outer tracks than in the inner tracks. Magneto-resistive heads may require micro-jogging to realign the write head for recording after reading the physical track location. The invention employs a DASD staggered array architecture having logical tracks consisting of diagonal-major sequences of consecutive blocks arranged in a predetermined wrap-around manner such as a topological cylinder or torus. The minimum necessary number of DASDs (N) in the staggered array is limited by the fixed block size (B), the interblock gap size (G), the average DASD data transfer rate (D), and the micro-jog delay time (T). An (N+1)th DASD may be added to record the parity of each diagonal-major sequence for improved fault-tolerance.
Abstract:
A method and apparatus teaching insertion of addressing indirection to form and to access an array hierarchy expressly permitting the concurrency of a high level RAID array, the bandwidth and degraded mode operation sustainable by a lower level RAID array, and after a DASD failure minimum spanning involvement when the array is rebuilding and rewriting missing data to a spare logical device. Also, disclosed are the accessing of variable length records on the array hierarchy; array hierarchy in which RAID 5 arrays have dissimilar number of logic devices (lower level RAID arrays) and interleave depths; formation of logical arrays using fractional storage defined onto real DASD subsets; and the defining of logical devices onto DASDs distributed in the same or different physical clusters of DASDs and the rebuild operation thereof.
Abstract:
A method and means for encoding data written onto an array of M synchronous DASDs and for rebuilding onto spare DASD array capacity when up to two array DASDs fail. Data is mapped into the DASD array using an (M-1)*M data array as the storage model where M is a prime number. Pairs of simple parities are recursively encoded over data in respective diagonal major and intersecting row major order array directions. The encoding traverse covers a topologically cylindrical path. Rebuilding data upon unavailability of no more than two DASDs merely requires accessing the data array and repeating the encoding step where the diagonals are oppositely sloped and writing the rebuilt array back onto M DASDs inclusive of the spare capacity.
Abstract:
A system and method are provided for use by software-implemented Redundant Array of Inexpensive Disks (RAID) arrays to achieve adequate performance and reliability, as well as to improve the performance of low cost hardware RAIDs. The enhancements to the basic RAID implementation speed up recovery time for software RAIDs. A method is provided for storing data in an array of storage devices. A plurality of block locations on the storage devices are logically arranged as a parity group wherein a parity block stored in a block location as part of a parity group is logically derived from the combination of data blocks stored in the parity group, and each block in a parity group is stored on a different storage device. A plurality of parity groups are grouped into a parity group set. A request is received to write a new data block to a block location on a storage device. The old data block stored at the block location is read. The new data block is written to the block location. When the parity group set is in an unmodified state prior to the current write, an indicator is written to the storage device that the parity group set is in a modified state. In a preferred embodiment, this enhancement uses a bit map stored on disk, called the Parity Group Set (PGS) bit map, to mark inconsistent parity groups, replacing the Non-Volatile Random Access Memory (NVRAM) used for similar purposes by hardware RAIDs. Further enhancements optimize sequential input/output (I/O) data streams.
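The write path described above can be sketched in a few lines of Python. This is a simplified illustration under several assumptions that are not in the abstract: parity is plain XOR, the last disk is a dedicated parity disk, an in-memory list stands in for the on-disk PGS bit map, the bit is set before any data is touched, and the names `SoftwareRaid`, `xor_blocks`, and `bitmap_writes` are hypothetical.

```python
from functools import reduce


def xor_blocks(*blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))


class SoftwareRaid:
    """Toy array: disks[d][g] is the block of parity group g on disk d.

    Assumption: the last disk holds the parity block of every group.
    """

    def __init__(self, n_disks, n_groups, groups_per_set, block_size=4):
        self.disks = [[bytes(block_size) for _ in range(n_groups)]
                      for _ in range(n_disks)]
        self.groups_per_set = groups_per_set
        n_sets = -(-n_groups // groups_per_set)  # ceiling division
        self.pgs_bitmap = [False] * n_sets       # stands in for the on-disk bit map
        self.bitmap_writes = 0                   # counts extra "disk" writes

    def write(self, disk, group, new_data):
        pgs = group // self.groups_per_set
        # Mark the parity group set as modified only on its first write.
        if not self.pgs_bitmap[pgs]:
            self.pgs_bitmap[pgs] = True
            self.bitmap_writes += 1
        old_data = self.disks[disk][group]       # read old data
        old_parity = self.disks[-1][group]
        self.disks[disk][group] = new_data       # write new data
        # Read-modify-write parity update (standard XOR small-write rule).
        self.disks[-1][group] = xor_blocks(old_parity, old_data, new_data)

    def parity_ok(self, group):
        data = [d[group] for d in self.disks[:-1]]
        return xor_blocks(*data) == self.disks[-1][group]


raid = SoftwareRaid(n_disks=4, n_groups=8, groups_per_set=4)
raid.write(0, 1, b"\x01\x02\x03\x04")
raid.write(1, 1, b"\x10\x20\x30\x40")  # same set: no second bit map write
```

After a crash, only parity groups in sets whose bit is set would need to be checked, which is how a bit map of this kind can shorten recovery compared with scanning the whole array.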