Abstract:
An improved method and apparatus for creating a snapshot of a file system. A record of which blocks are being used by a snapshot is included in the snapshot itself, allowing effectively instantaneous snapshot creation and deletion. The state of the active file system is described by a set of metafiles; in particular, a bitmap (henceforth the “active map”) describes which blocks are free and which are in use. The inode file describes which blocks are used by each file, including the metafiles. The inode file itself is described by a special root inode, also known as the “fsinfo block”. The system begins creating a new snapshot by making a copy of the root inode. This copy of the root inode becomes the root of the snapshot.
Abstract:
The present invention provides a system and method for restoring a single file from a snapshot without the need to copy every individual block or inode from the snapshot. A file restore process duplicates the inode of a file within the active file system and performs a reconciliation process between the blocks of the twin inode and the snapshot inode. If the file does not exist within the active file system, a new buffer tree is created that points to the data blocks stored in the snapshot.
Abstract:
An on-disk storage arrangement increases the number of persistent consistency point images (PCPIs) that may be maintained for a volume of a storage system. The on-disk storage arrangement comprises a novel volume information (volinfo) block representing a root of the volume; the volinfo block is stored at predefined locations on disk and comprises various system wide configuration data. The volinfo block further comprises a data structure configured to provide a level of indirection that increases the number of PCPIs maintainable by a file system executing on the storage system. To that end, the data structure may be organized as an array of pointers, wherein each pointer references a block containing a snapshot root, thereby enabling efficient access to each PCPI maintained by the file system.
Abstract:
A data storage system, such as RAID, upgraded dynamically including multiple stages, providing error checking data without taking the system off-line. Checksums are computed from the data and placed in block 63 of the same disk. The combination of parity bits across the parity disk, the remaining uncorrupted data in the data disks, and checksums within each disk includes sufficient information to enable restoration of corrupt data. The system is upgraded by reserving permanent checksum blocks, writing the checksums to a volume block number, and placing the checksums in permanently reserved checksum block locations after first moving data already there to unreserved blocks.
Abstract:
A file system layout apportions an underlying physical volume into one or more virtual volumes (vvols) of a storage system. The underlying physical volume is an aggregate comprising one or more groups of disks, such as RAID groups, of the storage system. The aggregate has its own physical volume block number (pvbn) space and maintains metadata, such as block allocation structures, within that pvbn space. Each vvol has its own virtual volume block number (vvbn) space and maintains metadata, such as block allocation structures, within that vvbn space. Notably, the block allocation structures of a vvol are sized to the vvol, and not to the underlying aggregate, to thereby allow operations that manage data served by the storage system (e.g., snapshot operations) to efficiently work over the vvols.
Abstract:
A system and method for performing a distributed consistency check of a clustered file system. File system functions for loading an inode and/or buffer tree are modified so that in response to either of these functions being invoked, a consistency check is performed. The consistency check verifies both local consistency on a node of the clustered file and a distributed check across the nodes of the storage system.
Abstract:
An on-disk storage arrangement increases the number of persistent consistency point images (PCPIs) that may be maintained for a volume of a storage system. The on-disk storage arrangement comprises a novel volume information (volinfo) block representing a root of the volume; the volinfo block is stored at predefined locations on disk and comprises various system wide configuration data. The volinfo block further comprises a data structure configured to provide a level of indirection that increases the number of PCPIs maintainable by a file system executing on the storage system. To that end, the data structure may be organized as an array of pointers, wherein each pointer references a block containing a snapshot root, thereby enabling efficient access to each PCPI maintained by the file system.
Abstract:
A file system receives a request to set a capacity guarantee for a virtual volume associated with a logical aggregation of physical storage. In response, the file system sets the capacity guarantee to indicate that the logical aggregation of physical storage is to provide a specified amount of space to the virtual volume. The amount of space provided to the virtual volume may be based, at least in part, on a guarantee type. The guarantee type may include, for example, volume, file, none, or partial.
Abstract:
A system and method for performing an on-line check of a file system modifies various function calls within a file system layer of a storage operating system so that each time the particular inode is retrieved using the modified function calls, a check is performed on the inode and associated buffer trees before returning the requested inode to the calling process.
Abstract:
A multi-stage technique invalidates and replaces loadable physical volume block numbers (pvbns) stored in indirect blocks of a dual vbn (“flexible”) virtual volume (vvol) of a storage system to enable efficient image transfers and/or fragmentation handling of the flexible vvol. Each loadable pvbn of a pvbn/virtual vbn (vvbn) block pointer pair is converted into a special block pointer having a predefined reserved value that provides a temporary “pvbn_unknown” placeholder until replaced by a real (actual) pvbn. The technique further allows the storage system to serve data from the flexible vvol using the placeholders while the actual pvbns are computed, thereby eliminating latencies associated with completion of actual pvbn replacement for the pvbn_unknown placeholders.