Abstract:
The invention provides a method and system for recovery of file system data in file servers having mirrored file system volumes. The invention makes use of a “snapshot” feature of a robust file system (the “WAFL File System”) disclosed in the Incorporated Disclosures, to rapidly determined which of two or more mirrored volumes is most up-to-date, and which file blocks of the most recent mirrored volume have been changed from each one of the mirrored file systems. In a preferred embodiment, among a plurality of mirrored volumes, the invention rapidly determines which is the most up-to-date by examining a consistency point number maintained by the WAFL File System at each mirrored volume. The invention rapidly pairwise determines what blocks are shared between that most up-to-date mirrored volume and each other mirrored volume, in response to a snapshot of the file system maintained at each mirrored volume and are stored in common pairwise between each mirrored volume and the most up-to-date mirrored volume. The invention re synchronizes only those blocks that have been changed between the common snapshot and the most up-to-date snapshot.
Abstract:
The invention features a method for controlling storage of data in a plurality of storage devices each including storage blocks, for example, in a RAID array. The method includes receiving a plurality of write requests associated with data, and buffering the write requests. A file system defines a group of storage blocks, responsive to disk topology information. The group includes a plurality of storage blocks in each of the plurality of storage devices. Each data block of the data to be written is associated with a respective one of the storage blocks, for transmitting the association to the plurality of storage devices.
Abstract:
In one embodiment, a first storage device and a second storage device form a mirror. When the first storage device loses synchronization with the second storage device, data present in the second storage device but not in the first storage device are identified. The identified data are then copied to the first storage device. In one embodiment, a method of rebuilding data in a storage device includes the act of replacing a failed storage device with a replacement storage device. Up-to-date data for the failed storage device, which may be stored in a corresponding mirror, may then be copied to the replacement storage device. Thereafter, the replacement storage device and any other storage devices that have lost synchronization with their mirror are resynchronized.
Abstract:
The invention features a method for controlling storage of data in a plurality of storage devices each including storage blocks, for example, in a RAID array. The method includes receiving a plurality of write requests associated with data, and buffering the write requests. A file system defines a group of storage blocks, responsive to disk topology information. The group includes a plurality of storage blocks in each of the plurality of storage devices. Each data block of the data to be written is associated with a respective one of the storage blocks, for transmitting the association to the plurality of storage devices.
Abstract:
A method for storing data on a plurality of storage devices of a storage system is disclosed. The data is received as data blocks from a plurality of write requests. The data blocks are saved as buffered data for writing to the storage devices in a single write request. An indication is received indicating the available storage blocks on the plurality of storage devices which are available for writing. The buffered data is associated with selected storage blocks of the storage blocks which are available for writing. The buffered data is written to the selected storage blocks in a single write request.
Abstract:
A method and apparatus for a reliable data storage system using block level checksums appended to data blocks. Files are stored on hard disks in storage blocks, including data blocks and block-appended checksums. The block-appended checksum includes a checksum of the data block, a VBN, a DBN, and an embedded checksum for checking the integrity of the block-appended checksum itself. A file system includes file blocks with associated block-appended checksum to the data blocks. The file blocks with block-appended checksums are written to storage blocks. In a preferred embodiment a collection of disk drives are formatted with 520 bytes of data per sector. For each 4,096-byte file block, a corresponding 64-byte block-appended checksum is appended to the file block with the first 7 sectors including most of the file block data while the 8th sector includes the remaining file block data and the 64-byte block-appended checksum.
Abstract:
The invention provides a method and system for recovery of file system data in file servers having mirrored file system volumes. The invention makes use of a “snapshot” feature of a robust file system (the “WAFL File System”) disclosed in the Incorporated Disclosures, to rapidly determined which of two or more mirrored volumes is most up-to-date, and which file blocks of the most recent mirrored volume have been changed from each one of the mirrored file systems. In a preferred embodiment, among a plurality of mirrored volumes, the invention rapidly determines which is the most up-to-date by examining a consistency point number maintained by the WAFL File System at each mirrored volume. The invention rapidly pairwise determines what blocks are shared between that most up-to-date mirrored volume and each other mirrored volume, in response to a snapshot of the file system maintained at each mirrored volume and are stored in common pairwise between each mirrored volume and the most up-to-date mirrored volume. The invention re synchronizes only those blocks that have been changed between the common snapshot and the most up-to-date snapshot.
Abstract:
According to one or more embodiments of the present invention, a network cache intercepts data requested by a client from a remote server interconnected with the cache through one or more wide area network (WAN) links (e.g., for Wide Area File Services, or “WAFS”). The network cache stores the data and sends the data to the client. The cache may then intercept a first write request for the data from the client to the remote server, and determine one or more portions of the data in the write request that changed from the data stored at the cache (e.g., according to one or more hashes created based on the data). The network cache then sends a second write request for only the changed portions of the data to the remote server.
Abstract:
A semi-static distribution technique distributes parity across disks of an array. According to the technique, parity is distributed (assigned) across the disks of the array in a manner that maintains a fixed pattern of parity blocks among the stripes of the disks. When one or more disks are added to the array, the semi-static technique redistributes parity in a way that does not require recalculation of parity or moving of any data blocks. Notably, the parity information is not actually moved; the technique merely involves a change in the assignment (or reservation) for some of the parity blocks of each pre-existing disk to the newly added disk.
Abstract:
A technique efficiently corrects multiple storage device failures in a storage array. The storage array comprises a plurality of concatenated sub-arrays, wherein each sub-array includes a set of data storage devices and a local parity storage device that stores values used to correct a failure of a single device within a row of blocks, e.g., a row parity set, in the sub-array. Each sub-array is assigned diagonal parity sets identically, as if it were the only one present using a double failure protection encoding method. The array further includes a single, global parity storage device holding diagonal parity computed by logically adding together equivalent diagonal parity sets in each of the sub-arrays.