Abstract:
A storage system provides highly flexible data layouts that can be tailored to various different applications and use cases. The system dynamically balances performance with block sharing, based on service level objectives (SLOs). The system defines several types of data containers, including “regions”, “logical extents” and “slabs”. Each region includes one or more logical extents. Allocated to each logical extent is at least part of one or more slabs allocated to the region that includes the extent. Each slab is a set of blocks of storage from one or more physical storage devices. The slabs can be defined from a heterogeneous pool of physical storage. The system also maintains multiple “volumes” above the region layer. Each volume includes one or more logical extents from one or more regions. Layouts of the extents within the regions are not visible to any of the volumes.
Abstract:
A file system layout apportions an underlying physical volume into one or more virtual volumes (vvols) of a storage system. The underlying physical volume is an aggregate comprising one or more groups of disks, such as RAID groups, of the storage system. The aggregate has its own physical volume block number (pvbn) space and maintains metadata, such as block allocation structures, within that pvbn space. Each vvol has its own virtual volume block number (vvbn) space and maintains metadata, such as block allocation structures, within that vvbn space. Notably, the block allocation structures of a vvol are sized to the vvol, and not to the underlying aggregate, to thereby allow operations that manage data served by the storage system (e.g., snapshot operations) to efficiently work over the vvols.
Abstract:
A plurality of storage devices is organized into a physical volume called an aggregate, and the aggregate is organized into a global storage space, and a data block is resident on one of the storage devices of the plurality of storage devices. A plurality of virtual volumes is organized within the aggregate and he data block is allocated to a virtual volume. A physical volume block number (pvbn) is selected for the data block from a pvbn space of the aggregate, and virtual volume block number (vvbn) for the data block is selected from a vvbn space of the selected vvol. Both the selected pvbn and the selected vvbn are inserted in a parent block as block pointers to point to the allocated data block on the storage device.
Abstract:
A file system determines the relative vacancy of a collection of storage blocks, i.e., an “allocation area”. This is accomplished by recording an array of numbers, each of which describes the vacancy of a collection of storage blocks. The file system examines these numbers when attempting to record file blocks in relatively contiguous areas on a storage medium, such as a disk. When a request to write to disk occurs, the system determines the average vacancy of all of the allocation areas and queries the allocation areas for individual vacancy rates. The system preferably writes file blocks to the allocation areas that are above a threshold related to the average storage block vacancy of the file system.
Abstract:
The present invention relates to a system for restoring a file from a snapshot, where a version of the file exists in both an active file system and the snapshot. A twin inode is created in the active file system and comparisons are made between block pointers of the twin inode and the snapshot. If there is a match, the block pointer of the twin inode is moved to the active file system. If there is not a match, a determination is made whether the snapshot block pointer exists in the active file system. If the snapshot block pointer does not exist in the active file system, it is copied to the active file system. If it does exist, then the actual data block pointed to by the snapshot block pointer is copied to the active file system. In this way, a file may be restored without the need to always copy every individual data block or inode from the snapshot.
Abstract:
A system and method for enabling parallel replay of a backup memory log of client transaction request entries to a network storage appliance file system is provided. The backup memory is typically implemented as a non-volatile random access memory (NVRAM). An initiator establishes a swarm of messages with a plurality of transaction blocks pointing to logged request entries and related states associated therewith. The states represent the various phases of file system recovery and disk storage including a retrieval of disk information (data and meta-data), typically in the form of a LOAD, and a subsequent modify phase. The swarm is transferred to the file system for parallel disk information-retrieval in an interleaved process. Any transactions that cannot be performed due to a required prerequisite action (e.g. a prior file-create) are returned to the initiator for reloading once the prerequisite action has occurred.
Abstract:
A defragmentation technique determines the extent to which data blocks of a file are fragmented on disks of a computer and, in response, efficiently relocates those blocks if such relocation improves the on-disk layout of the file. Each indirect block of the file is examined and the current layout of the range of pointers referencing the data blocks is determined. In addition, the number of operations needed to retrieve those data blocks from disks is calculated. A potential new layout is then estimated based on an average fullness of the file system. If the potential new layout improves the fragmentation of the current layout, then the data blocks for that range are relocated, if there is sufficient free space on disk. Otherwise, the blocks are not relocated and the current on-disk layout of the file is maintained.
Abstract:
A system and method are provided to check consistency of an aggregate capable of supporting flexible volumes. The method includes identifying an inode having a flexible volume type present in the aggregate; determining whether the inode is identified in a metadata directory of the aggregate; and performing consistency check on the flexible volume associated with the inode.
Abstract:
A write allocation technique extends a conventional write allocation procedure employed by a write anywhere file system of a storage system. A write allocator of the file system implements the extended write allocation technique in response to an event in the file system. The extended write allocation technique efficiently allocates blocks, and frees blocks, to and from a virtual volume (vvol) of an aggregate. The aggregate is a physical volume comprising one or more groups of disks, such as RAID groups, underlying one or more vvols of the storage system. The aggregate has its own physical volume block number (pvbn) space and maintains metadata, such as block allocation structures, within that pvbn space. Each vvol also has its own virtual volume block number (vvbn) space and maintains metadata, such as block allocation structures, within that vvbn space. The inventive technique extends input/output efficiencies of the conventional write allocation procedure to comport with an extended file system layout of the storage system.
Abstract:
A method for transferring data of a hybrid virtual volume of a computer data storage system from a source to a destination is disclosed. The method first translates intermingled virtual and physical volume block numbers of the hybrid virtual volume into a data stream having only virtual volume block numbers. The method then sends the data stream to a destination computer.