Abstract:
The present invention is a method for providing error correction for an array of disks (828) using non-volatile random access memory (NV-RAM) (816). NV-RAM (816) is used to increase the speed of RAID recovery from disk errors. This is done by keeping a list of all blocks for which the parity is possibly inconsistent. This list is much smaller than the total number of parity blocks (820) in the RAID (828), which is in the range of hundreds of thousands. Knowing which parity blocks are possibly inconsistent makes it possible to fix only those few blocks identified in the list, resulting in a significant time savings. The technique for safely writing to a RAID with a broken disk is complicated. In this technique, data that can become corrupted is copied into NV-RAM (816) before the potentially corrupting operation is performed.
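The parity-list idea can be illustrated with a small sketch. It is not the patented implementation: it assumes a toy RAID-4 layout with XOR parity and one integer per block, and the names `TinyRaid`, `nvram_dirty`, and `crash_before_parity` are hypothetical. A Python set stands in for the NV-RAM list, which in a real system would survive a crash.

```python
from functools import reduce


class TinyRaid:
    """Toy RAID-4 model (assumption): data disks plus one parity disk."""

    def __init__(self, n_data_disks, n_stripes):
        self.data = [[0] * n_stripes for _ in range(n_data_disks)]
        self.parity = [0] * n_stripes
        # Stands in for the NV-RAM list of possibly inconsistent stripes.
        self.nvram_dirty = set()

    def _xor(self, stripe):
        # Parity of a stripe is the XOR of its data blocks.
        return reduce(lambda a, b: a ^ b, (disk[stripe] for disk in self.data))

    def write(self, disk, stripe, value, crash_before_parity=False):
        # Record the stripe BEFORE touching the disks.
        self.nvram_dirty.add(stripe)
        self.data[disk][stripe] = value
        if crash_before_parity:
            return  # simulated crash: parity is now stale, entry stays listed
        self.parity[stripe] = self._xor(stripe)
        # Data and parity agree again, so the entry can be dropped.
        self.nvram_dirty.discard(stripe)

    def recover(self):
        # Recompute parity only for the listed stripes, not the whole array.
        fixed = sorted(self.nvram_dirty)
        for stripe in fixed:
            self.parity[stripe] = self._xor(stripe)
        self.nvram_dirty.clear()
        return fixed


raid = TinyRaid(n_data_disks=3, n_stripes=1000)
raid.write(0, 5, 0b1010)
raid.write(1, 7, 0b0110, crash_before_parity=True)
repaired = raid.recover()  # only stripe 7 needs its parity recomputed
```

In this sketch recovery touches one stripe out of 1000, which is the source of the time savings the abstract describes.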
Abstract:
The present invention is a method for integrating a file system with a RAID array (1030) that exports precise information about the arrangement of data blocks in the RAID subsystem (1030). The system uses explicit knowledge of the underlying RAID disk layout to schedule disk allocation. The present invention uses a separate current-write-location (CWL) pointer for each disk (1022) in the disk array (1030), where the pointers simply advance through the disks (1022) as writes occur. The algorithm has two primary goals. The first goal is to keep the CWL pointers as close together as possible, thereby improving RAID (1030) efficiency by writing to multiple blocks in the stripe simultaneously. The second goal is to allocate adjacent blocks of a file on the same disk (1022), thereby improving read-back performance. The first goal is satisfied by always writing on the disk (1022) with the lowest CWL pointer. For the second goal, another disk (1024) is chosen only when the algorithm starts allocating space for a new file, or when it has allocated N blocks on the same disk (1022) for a single file. The result is that CWL pointers are never more than N blocks apart on different disks (1024), and large files have N consecutive blocks on the same disk (1022).
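The allocation rule can be sketched as follows. This is a simplified illustration under stated assumptions, not the patented algorithm: every disk starts empty, each CWL pointer advances by exactly one block per write, and ties between disks go to the lowest disk index. The function name `allocate` and its parameters are hypothetical.

```python
def allocate(files, n_disks, n):
    """Place file blocks on disks using per-disk CWL pointers.

    files: list of (name, block_count) pairs.
    Returns (placement, cwl); placement holds tuples of
    (file name, file block index, disk, block number on that disk).
    """
    cwl = [0] * n_disks
    placement = []
    for name, count in files:
        disk = None  # forces a fresh disk choice at the start of each file
        run = 0      # blocks written to the current disk for this file
        for i in range(count):
            if disk is None or run == n:
                # Goal 1: pick the disk with the lowest CWL pointer.
                disk = min(range(n_disks), key=lambda d: cwl[d])
                run = 0
            # Goal 2: stay on the same disk for up to n consecutive blocks.
            placement.append((name, i, disk, cwl[disk]))
            cwl[disk] += 1
            run += 1
    return placement, cwl


placement, cwl = allocate([("a", 5), ("b", 2)], n_disks=3, n=4)
# File "a" gets 4 consecutive blocks on disk 0, then moves to disk 1;
# file "b" starts on disk 2, the disk with the lowest pointer at that moment.
```

Because a disk is only ever chosen when its pointer is the lowest, and at most N blocks are written before the next choice, the pointers in this sketch stay within N blocks of each other.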
Abstract:
The present invention provides a method for keeping a file system in a consistent state and for creating read-only copies of a file system. Changes to the file system are tightly controlled. The file system progresses from one consistent state to another. The set of self-consistent blocks on disk that is rooted by the root inode is referred to as a consistency point. To implement consistency points, new data is written to unallocated blocks on disk. A new consistency point occurs when the fsinfo block (2440) is updated by writing a new root inode for the inode file (1210) into it. Thus, as long as the root inode is not updated, the state of the file system represented on disk does not change. The present invention also creates snapshots (Figure 22) that are read-only copies of the file system. A snapshot uses no disk space when it is initially created. It is designed so that many different snapshots can be created for the same file system. Unlike prior-art file systems that create a clone by duplicating the entire inode file and all of the indirect blocks, the present invention duplicates only the inode that describes the inode file. A multi-bit free-block map file (1630) is used to prevent data from being overwritten on disk.
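The interplay of consistency points, snapshots, and the multi-bit block map can be sketched in a few lines. This is a toy model under loud assumptions, not the patented design: the "root inode" is reduced to a single block holding a name-to-block table, there are no indirect blocks, and the class `TinyFS` with its fields `fsinfo_root`, `blkmap`, and `snapshots` is hypothetical. Bit 0 of each map entry marks use by the active file system and bit k marks use by snapshot k.

```python
class TinyFS:
    """Toy write-anywhere file system (assumption: one flat root table)."""

    def __init__(self, n_blocks):
        self.blocks = [None] * n_blocks
        self.blkmap = [0] * n_blocks  # one multi-bit entry per block
        self.snapshots = {}           # snapshot id -> root block number
        # Initial consistency point: an empty root table.
        self.fsinfo_root = self._alloc_write({})

    def _alloc_write(self, contents):
        # A block is free only if NO bit is set, so snapshot data is safe.
        b = self.blkmap.index(0)  # raises ValueError if the disk is full
        self.blocks[b] = contents
        self.blkmap[b] |= 1
        return b

    def write_file(self, name, data):
        old_root = self.fsinfo_root
        table = dict(self.blocks[old_root])
        old_data = table.get(name)
        # New data and the new root go to unallocated blocks only.
        table[name] = self._alloc_write(data)
        new_root = self._alloc_write(table)
        # Consistency point: the on-disk state changes in this one step.
        self.fsinfo_root = new_root
        # The active file system no longer uses the old blocks.
        for b in (old_root, old_data):
            if b is not None:
                self.blkmap[b] &= ~1

    def snapshot(self, snap_id):
        # Copy the active bit into the snapshot's bit; no block is allocated.
        for b, bits in enumerate(self.blkmap):
            if bits & 1:
                self.blkmap[b] |= 1 << snap_id
        self.snapshots[snap_id] = self.fsinfo_root

    def read(self, name, snap_id=None):
        root = self.fsinfo_root if snap_id is None else self.snapshots[snap_id]
        return self.blocks[self.blocks[root][name]]


fs = TinyFS(16)
fs.write_file("a", "v1")
fs.snapshot(1)
fs.write_file("a", "v2")  # the snapshot still sees "v1"
```

In this sketch, taking the snapshot only records a root block number and sets bits in the map, which mirrors the abstract's point that a snapshot uses no disk space when it is first created.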