摘要:
An architecture for error log processing is provided. Each error log is given a defined priority and mapped to an error recovery procedure (ERP) to be run if the log is seen. The system has a plurality of software layers to process the errors. Each software layer processes the error independently. Errors are reported to a higher software stack when error recovery fails from the lower stack ERPs and recovery is non-transparent. If the system host identified for error processing fails, the control of the ERP is transferred during the failover process. Non-obvious failed component isolating ERPs are grouped to be run together to assist in isolating the failed component. Prioritization of the error systems may be based on a plurality of criteria. ERPs are assigned to run within a particular software stack.
摘要:
An apparatus for parity data management receives a write command and write data from a computing device. The apparatus also builds a parity control structure corresponding to updating a redundant disk array with the write data and stores the parity control structure in a persistent memory buffer of the computing device. The apparatus also updates the redundant disk array with the write data in accordance with a parity control map and restores the RAID controller parity map from the parity control structure as part of a data recovery operation if updating the redundant disk array with the write data is interrupted by a RAID controller failure resulting in a loss of the RAID controller parity map. In certain embodiments, the parity control structure is a RAID controller parity map.
摘要:
An apparatus, system, and method quickly backs up data in an emergency situation and reduces battery backup dependence. The apparatus may include a backup module and a dedicated computer readable storage device. The backup module interfaces with system memory and selectively transmits modified data to the storage device in response to a detected system failure. The dedicated storage device stores the modified data around the outer edge of a hard disk in order to increase write performance. The system may include the backup module, the storage device, a plurality of client devices, and a plurality of storage devices. The method includes storing modified and unmodified data, detecting a system failure, and transmitting modified data stored in a memory module to a dedicated computer readable backup device. Upon rebooting the device, the method may include restoring the modified data to the system memory and destaging the modified data to the plurality of storage devices.
摘要:
Provided is a method, system, and program for processing Input/Output (I/O) requests to a storage network including at least one storage device and at least two adaptors, wherein each adaptor is capable of communicating I/O requests to the at least one storage device. An error is detected in a system including a first adaptor, wherein the first adaptor is capable of communicating on the network after the error is detected. In response to detecting the error, a monitoring state is initiated to monitor I/O requests transmitted through a second adaptor. In response to receiving an I/O request, an I/O delay timer is started that is less than a system timeout period. After the system timeout period the error recovery process in the system including the first adaptor would complete. A reset request is sent to the first adaptor in response to detecting an expiration of one started I/O delay timer.
摘要:
For preventing data loss in storage systems a detection is made that a storage device in a plurality of storage devices is experiencing a malfunction. The type of malfunction is determined. A SMART rebuilding technique, a normal building technique, a data migration technique, or a user data backup technique is selected to preserve the data in the storage device based on the determined type of the malfunction. The selected technique is performed on the storage device.
摘要:
Provided are a computer program product, system and method for a redundant key server encryption environment. A key server transmits public keys associated with the key server and at least one device to at least one remote key server. The key server receives from the at least one remote key server public keys associated with the at least one remote key server. The key server receives a request for an encryption key from a requesting device comprising one of the at least one device and generates the encryption key for use by the requesting device to unlock a storage. The key server generates a first wrapped encryption key by encrypting the encryption key with a requesting device public key associated with the requesting device. The key server generates a second wrapped encryption key by encrypting the encryption key with a public key associated with the key server. At least one additional wrapped encryption key is generated for each of the at least one remote key server by encrypting the encryption key with the at least one public key provided by the at least one remote key server. The key server transmits the first, second and the at least one additional wrapped encryption key to the requesting device.
摘要:
For preventing data loss in storage systems, a detection is made that a storage device in a plurality of storage devices is experiencing a malfunction. The type of malfunction is determined. A SMART rebuilding technique, a normal building technique, a data migration technique, or a user data backup technique is selected to preserve the data in the storage device based on the determined type of the malfunction. The selected technique is performed on the storage device.
摘要:
Provided are a computer program product, system and method for a redundant key server encryption environment. A key server receives from at least one remote key server public keys associated with the at least one remote key server. The key server receives a request for an encryption key from a requesting device and generates the encryption key for use by the requesting device to unlock a storage. The key server generates a first wrapped encryption key by encrypting the encryption key with a requesting device public key, a second wrapped encryption key by encrypting the encryption key with a public key associated with the key server, and at least one additional wrapped encryption key by encrypting the encryption key with the at least one public key provided by the at least one remote key server. The key server transmits the generated keys to the requesting device.
摘要:
A computational device receives input information on characteristics of customer data, critical metadata, and non-critical metadata, and characteristics of disk array configurations, wherein customer data is to be stored encrypted, wherein critical metadata is to be stored non-encrypted, and wherein non-critical metadata is to be stored encrypted or non-encrypted. The computational device determines band boundary information based on the received input information. Encrypting disks with pre-established bands are created based on the band boundary information and the encrypting disks are pre-initialized.
摘要:
Provided are a method, system, and article of manufacture for using device status information to takeover control of devices assigned to a node. A first processing unit communicates with a second processing unit. The first processing unit uses a first device accessible to both the first and second processing units and the second processing unit uses a second device accessible to both the first and second processing units. The first processing unit receives status on the second device from the first device indicating whether the second device is available or unavailable. The first processing unit detects a failure of the second processing unit and determines from the received status on the second device whether the first device is available in response to detecting the failure of the second processing unit. The first processing unit configures the second device for use by the first processing unit in response to determining that the received status on the second device indicates that the second device is available and in response to detecting the failure.