Abstract:
A computing system includes plural nodes that are connected by a communications network. Each node comprises a communications interface that enables an exchange of messages with other nodes. A ready queue is maintained in a node and includes plural message entries, each message entry indicating an output message control data structure. The node further includes memory for storing plural output message control data structures, each including one or more chained further control data structures that define data comprising a message or a portion of a message that is to be dispatched. Control data structures that are chained from an output message control data structure exhibit a sequence dependency. A processor is controlled by the ready queue and enables dispatch of portions of the message designated by an output message control data structure and associated further control structures. The processor prevents dispatch of one portion of a message prior to dispatch of another portion of the message upon which the first portion is dependent, even if message transmissions are interrupted.
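The ordering guarantee described in this abstract can be illustrated with a minimal sketch. It is not the patented implementation: the names `OutputMessage`, `dispatch`, and `budget` are hypothetical, and an interruption is modelled simply as a send budget running out. The point it shows is that an interrupted message stays at the head of the ready queue with its position preserved, so a later portion is never sent before the portion it depends on.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class OutputMessage:
    """Hypothetical stand-in for an output message control data structure."""
    name: str
    chain: list            # chained portions, in required dispatch order (assumed non-empty)
    next_portion: int = 0  # index of the first portion not yet dispatched


def dispatch(ready_queue, send, budget):
    """Send at most `budget` portions, always in chain order.

    When the budget runs out mid-message (the modelled interruption), the
    message remains at the head of the queue and resumes from `next_portion`.
    """
    while ready_queue and budget > 0:
        msg = ready_queue[0]
        send(msg.name, msg.chain[msg.next_portion])
        msg.next_portion += 1
        budget -= 1
        if msg.next_portion == len(msg.chain):
            ready_queue.popleft()  # whole message dispatched


sent = []
queue = deque([
    OutputMessage("A", ["a0", "a1", "a2"]),
    OutputMessage("B", ["b0"]),
])

dispatch(queue, lambda name, part: sent.append((name, part)), budget=2)
first_pass = list(sent)  # transmission interrupted after two portions
dispatch(queue, lambda name, part: sent.append((name, part)), budget=5)
```

After the second call, `sent` lists the portions of message A in chain order followed by message B, even though the first call stopped partway through A.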
Abstract:
A system that enables pipelining of data to and from a memory includes multiple control block data structures which indicate amounts of data stored in the memory. An input port device receives data segments of a received data message, stores them in memory, and updates status information in the software control blocks only when determined quantities of the data segments are stored. An output port is responsive to a request for transmission of a portion of the received data and to a signal from the input port that at least a first control count of data segments of the received data are present in memory. The output port then outputs the stored data segments from memory but discontinues the action if, before the required portion of the received data is outputted, software control blocks indicate that no further stored data segments are available for outputting. The input port then updates the software control blocks when newly arrived and stored data segments reach a second control count value, the updating occurring irrespective of whether the determined quantity of the received data has been stored in memory.
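The interplay of the two control counts can be sketched roughly as follows. This is a simplified model under stated assumptions, not the patented design: the class `PipelinedBuffer`, its fields, and the rule "use the first control count until something has been published, then the second" are all hypothetical readings of the abstract. A single integer `published` stands in for the status information in the software control blocks.

```python
class PipelinedBuffer:
    """Hypothetical model of an input port and output port sharing a memory."""

    def __init__(self, first_count, second_count):
        self.memory = []           # data segments stored by the input port
        self.published = 0         # segment count visible in the control block
        self.sent = 0              # segments already output
        self.first_count = first_count
        self.second_count = second_count

    def store(self, segment):
        """Input port: store a segment; update the control block only at a threshold."""
        self.memory.append(segment)
        unpublished = len(self.memory) - self.published
        threshold = self.first_count if self.published == 0 else self.second_count
        if unpublished >= threshold:
            self.published = len(self.memory)

    def output(self, wanted):
        """Output port: emit up to `wanted` segments, stopping early when the
        control block shows that no further stored segments are available."""
        out = []
        while len(out) < wanted and self.sent < self.published:
            out.append(self.memory[self.sent])
            self.sent += 1
        return out


buf = PipelinedBuffer(first_count=3, second_count=2)
buf.store("s0")
buf.store("s1")
before_first_count = buf.output(4)   # nothing published yet
buf.store("s2")                      # first control count reached
after_first_count = buf.output(4)    # stops after three segments
buf.store("s3")
below_second_count = buf.output(1)   # "s3" is stored but not yet published
buf.store("s4")                      # second control count reached
after_second_count = buf.output(1)
```

Batching the control block updates this way means the output port can start before the whole message has arrived, while the input port avoids updating shared state on every segment.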
Abstract:
The data contents of up to two concurrently failed or erased DASDs can be reconstituted where the data is distributed across M DASDs as an (M-1)*M block array and where (1) the (M-1)st DASD contains the simple parity taken over each of the array diagonals in diagonal major order in the same mode (odd/even) as that exhibited by the major diagonal of the array and (2) where the M-th DASD contains the simple even parity over each of the rows in row major order. Relatedly, short write updates require fewer operations for data blocks located off the major data array diagonal.
Abstract:
A method and means for managing access to a logical track of KN blocks of which K are parity blocks. The KN blocks are distributed and stored in an array of N DASDs having K blocks per physical track per DASD. The array includes control means for securing synchronous access to selectable ones of the DASDs responsive to each access request. The method involves (a) formatting the blocks onto the array using a row major order modulus as the metric for balancing the data rate and concurrency (the number of DASDs bound per access) and (b) executing the random sequences of large and small access requests over the array.
Abstract:
In a log structured array (LSA) storage subsystem, a method for recovering from a storage device failure which incorporates the LSA write and garbage collection procedures, thereby simplifying the recovery process and eliminating the need for dedicated or distributed sparing schemes. Data is distributed across the array in N+P parity groups. Upon a device failure, each lost data block is reconstructed from the remaining blocks of its parity group. The reconstructed block is then placed in the subsystem write buffer to be processed with incoming write data, and new parity is generated for the remaining N-1 data blocks of the group. A lost parity block is replaced by first moving one of the data blocks of its parity group to the write buffer, and then generating new parity for the remaining N-1 data blocks. Also disclosed is a storage subsystem implementing the preceding recovery method.
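The reconstruction step relies on the standard property of XOR parity: any one block of a parity group equals the XOR of all the others. The sketch below illustrates that property and the regrouping described above. The helper names (`xor_blocks`, `write_buffer`) and the sample bytes are illustrative assumptions, and the LSA write and garbage collection machinery itself is not modelled.

```python
def xor_blocks(blocks):
    """Return the bytewise XOR of a non-empty list of equal-length blocks."""
    if not blocks:
        raise ValueError("need at least one block")
    length = len(blocks[0])
    if any(len(b) != length for b in blocks):
        raise ValueError("blocks must have equal length")
    out = bytearray(length)
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)


# An N+P parity group with N = 3 data blocks and one parity block (sample data).
data = [b"\x01\x02\x03\x04", b"\x10\x20\x30\x40", b"\xaa\xbb\xcc\xdd"]
parity = xor_blocks(data)

# The device holding data[1] fails: rebuild the block from the survivors.
lost_index = 1
remaining_data = [b for i, b in enumerate(data) if i != lost_index]
recovered = xor_blocks(remaining_data + [parity])

# The rebuilt block joins the write buffer (a plain list here), and new
# parity is generated for the remaining N-1 data blocks of the group.
write_buffer = [recovered]
new_parity = xor_blocks(remaining_data)
```

Because the rebuilt block is written out with ordinary incoming data, no dedicated spare device is needed in this scheme; the shrunken group simply carries parity over N-1 blocks.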
Abstract:
Variable length records can be accessed from an array of N+2 synchronous fixed block formatted DASDs in a single pass and in the presence of a single DASD failure if each record is partitioned into a variable number of K fixed length blocks, the blocks are written on the DASDs in column major order K modulo (N+1), the order is constrained such that the first block of each record resides on the (N+1)st DASD, a parity block for each column resides on an (N+2)nd DASD, and each parity block spans N blocks in the same column from the first N DASDs and one block one column offset thereto on the (N+1)st DASD.
Abstract:
A method and system for minimizing seek affinity and enhancing write sensitivity in a direct access storage device (DASD) array are disclosed. Seek affinity and write efficiency are preserved by managing logical cylinders, as recorded on the DASD array, as individual log structured files (LSF). Tracks or segments of data and parity blocks having the same or different parity group affinity and stored on the same or different DASD cylindrical addresses are written into a directory managed buffer. Blocks having the same parity affinity and written to counterpart cylinders are written out from the buffer to spare space reserved as part of each DASD cylinder. Otherwise, blocks are updated in place in their DASD array location.
Abstract:
A controller for a disk array with parity and sparing includes a non-volatile cache memory and optimizes the destaging process for blocks from the cache memory to both maximize the cache hit ratio and minimize disk utilization. The invention provides a method for organizing the disk array into segments and dividing the cache memory into groups in order of least recently used memory locations and then determining metrics that permit the disk array controller to identify the cache memory locations having the most dirty blocks by segment and group and to identify the utilization rates of the disks. These characteristics are considered to determine when, what, and how to destage. For example, in terms of maximizing the cache hit ratio, when the percentage of dirty blocks in a particular group of the cache memory locations reaches a predetermined level, destaging is begun. The destaging operation continues until the percentage of dirty blocks decreases to a predetermined level. In terms of minimizing disk utilization, all of the dirty blocks in a segment having the most dirty blocks in a group are destaged.
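The threshold-driven policy in this abstract can be sketched for a single cache group as follows. This is an illustration under assumptions rather than the controller's actual algorithm: `CacheBlock`, `destage_group`, and the two level parameters are hypothetical names, disk utilization is not modelled, and "destaging" is reduced to clearing a dirty flag.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class CacheBlock:
    segment: int   # disk array segment the cached block belongs to
    dirty: bool    # True if the block has not yet been written to disk


def dirty_fraction(group):
    """Fraction of blocks in a non-empty cache group that are dirty."""
    return sum(b.dirty for b in group) / len(group)


def destage_group(group, start_level, stop_level):
    """Destage whole segments, most-dirty first, between two thresholds.

    Assumes 0 <= stop_level <= start_level <= 1. Destaging begins only when
    the dirty fraction reaches `start_level` and continues until it has
    decreased to `stop_level`. Returns the segments destaged, in order.
    """
    destaged = []
    if dirty_fraction(group) < start_level:
        return destaged
    while dirty_fraction(group) > stop_level:
        counts = Counter(b.segment for b in group if b.dirty)
        segment, _ = counts.most_common(1)[0]
        for block in group:
            if block.segment == segment:
                block.dirty = False   # stands in for the write to disk
        destaged.append(segment)
    return destaged


group = [
    CacheBlock(0, True), CacheBlock(0, True), CacheBlock(0, True),
    CacheBlock(1, True), CacheBlock(1, False),
    CacheBlock(2, True), CacheBlock(2, False),
    CacheBlock(3, False),
]
destaged_segments = destage_group(group, start_level=0.5, stop_level=0.25)
```

In this example five of eight blocks are dirty, so destaging starts; writing out segment 0, which holds three dirty blocks, brings the group down to the stop level in a single pass. Destaging every dirty block of one segment together is what keeps the number of disk operations low.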
Abstract:
A method for operating a synchronized array of fixed block (FBA) formatted Direct Access Storage Devices (DASDs) to store and update variable-length (CKD) formatted records. This method is suitable for use with DASDs that obtain high recording density by using read and write head technology requiring "micro-jogging" to adjust for differing read and write head alignment or banded disk architecture having a higher block count in the outer tracks than in the inner tracks. Magneto-resistive heads may require micro-jogging to realign the write head for recording after reading the physical track location. The invention employs a DASD staggered array architecture having logical tracks consisting of diagonal-major sequences of consecutive blocks arranged in a predetermined wrap-around manner such as a topological cylinder or torus. The minimum necessary number of DASDs (N) in the staggered array is limited by the fixed block size (B), the interblock gap size (G), the average DASD data transfer rate (D), and the micro-jog delay time (T). An (N+1)th DASD may be added to record the parity of each diagonal-major sequence for improved fault-tolerance.
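One way to picture a diagonal-major sequence with wrap-around is the mapping below. It is only a sketch: the abstract does not give the exact layout, so the function `diagonal_location`, the `start_slot` parameter, and the rule "advance one DASD and one block slot per block, wrapping in both directions" are assumptions chosen to show the torus-like pattern. The relation between N, B, G, D, and T is not modelled.

```python
def diagonal_location(k, start_slot, n_dasds, slots_per_track):
    """Map the k-th block of a logical track to (dasd_index, block_slot).

    Assumed layout: consecutive blocks step to the next DASD and the next
    block slot, wrapping around in both dimensions (a torus).
    """
    return (k % n_dasds, (start_slot + k) % slots_per_track)


# A logical track of 8 blocks on 4 DASDs with 6 block slots per physical track.
layout = [diagonal_location(k, start_slot=0, n_dasds=4, slots_per_track=6)
          for k in range(8)]
```

With this staggering, each DASD in the example gets a gap of several block slots between the blocks it holds for one logical track, which is the kind of slack a head could use for a micro-jog while the other devices transfer data.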
Abstract:
A method and apparatus teaching insertion of addressing indirection to form and to access an array hierarchy expressly permitting the concurrency of a high level RAID array, the bandwidth and degraded mode operation sustainable by a lower level RAID array, and after a DASD failure minimum spanning involvement when the array is rebuilding and rewriting missing data to a spare logical device. Also, disclosed are the accessing of variable length records on the array hierarchy; array hierarchy in which RAID 5 arrays have dissimilar number of logic devices (lower level RAID arrays) and interleave depths; formation of logical arrays using fractional storage defined onto real DASD subsets; and the defining of logical devices onto DASDs distributed in the same or different physical clusters of DASDs and the rebuild operation thereof.