Abstract:
A request is received to remove duplicate data. A log data container associated with a storage volume in a storage server is accessed. The log data container includes a plurality of entries. Each entry is identified by an extent identifier in a data structure stored in a volume associated with the storage server. For each entry in the log data container, a determination is made whether the entry matches another entry in the log data container. If the entry matches another entry in the log data container, a donor extent and a recipient extent are determined. If an external reference count associated with the recipient extent equals a first predetermined value, block sharing is performed for the donor extent and the recipient extent. A determination is made whether the reference count of the donor extent equals a second predetermined value. If the reference count of the donor extent equals the second predetermined value, the donor extent is freed.
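The matching pass over the log data container can be sketched as follows. This is a minimal illustration, not the claimed method: the `LogEntry` type, the use of a per-entry fingerprint to decide that two entries match, and the choice of the first-seen extent as donor are all assumptions made for the example. The reference-count checks that gate block sharing and freeing are left out because the abstract does not state the predetermined values.

```python
from collections import namedtuple

# Hypothetical log entry: an extent identifier plus a content fingerprint.
LogEntry = namedtuple("LogEntry", ["extent_id", "fingerprint"])


def find_duplicate_pairs(log_entries):
    """Return (donor_extent_id, recipient_extent_id) pairs for matching entries.

    Assumption: the first extent seen with a given fingerprint acts as the
    donor, and every later extent with the same fingerprint is a recipient.
    """
    first_seen = {}  # fingerprint -> extent_id of the first entry seen
    pairs = []
    for entry in log_entries:
        donor = first_seen.get(entry.fingerprint)
        if donor is None:
            first_seen[entry.fingerprint] = entry.extent_id
        else:
            pairs.append((donor, entry.extent_id))
    return pairs


log = [
    LogEntry(1, "a"),
    LogEntry(2, "b"),
    LogEntry(3, "a"),
    LogEntry(4, "a"),
]
pairs = find_duplicate_pairs(log)
```

A real implementation would normally confirm a fingerprint match with a byte-for-byte comparison before sharing blocks.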
Abstract:
An extent-based storage architecture is implemented by a storage server. The storage server generates a new extent identifier for cloning a source extent. The source extent is identified by a source extent identifier and stored at a source data structure that includes a length value providing the length of the source extent, an offset value, and a reference count value that provides the number of data containers that reference the source extent identifier. The storage server updates a data structure for a cloned version of the data container to store the new extent identifier, which points to the source extent identifier and includes an extent length value and an offset value different from the length value and the offset value of the source data structure.
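One way to picture the cloned extent identifier is an entry that points at the source entry and carries its own offset and length. The sketch below is illustrative only: the `ExtentEntry` and `ExtentMap` names, the `indirect` flag, and the decision to increment the source reference count on clone are assumptions, not details taken from the abstract.

```python
import itertools
from dataclasses import dataclass


@dataclass
class ExtentEntry:
    target: int            # physical block number, or a source extent ID if indirect
    offset: int            # offset into the target
    length: int            # number of blocks covered
    refcount: int = 1      # number of references to this entry
    indirect: bool = False # True when target is another extent ID


class ExtentMap:
    def __init__(self):
        self.entries = {}
        self._ids = itertools.count(1)

    def add(self, pvbn, length):
        """Register a source extent stored at physical block number pvbn."""
        eid = next(self._ids)
        self.entries[eid] = ExtentEntry(pvbn, 0, length)
        return eid

    def clone(self, source_id, offset, length):
        """Create a new extent ID that points at a sub-range of source_id."""
        src = self.entries[source_id]
        if offset < 0 or length <= 0 or offset + length > src.length:
            raise ValueError("clone range falls outside the source extent")
        src.refcount += 1  # assumption: a clone counts as one more reference
        eid = next(self._ids)
        self.entries[eid] = ExtentEntry(source_id, offset, length, indirect=True)
        return eid

    def resolve(self, eid):
        """Follow indirect entries down to (physical block number, length)."""
        entry = self.entries[eid]
        length = entry.length
        offset = 0
        while entry.indirect:
            offset += entry.offset
            entry = self.entries[entry.target]
        return entry.target + entry.offset + offset, length


extent_map = ExtentMap()
source = extent_map.add(pvbn=1000, length=8)
clone = extent_map.clone(source, offset=2, length=4)
```

No data blocks are copied here; the clone only adds a small map entry that refers back to the source extent.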
Abstract:
Overwriting part of compressed data without decompressing on-disk compressed data is implemented by receiving a write request for a block of data in a compression group from a client, wherein the compression group comprises a group of data blocks that is compressed, wherein the block of data is uncompressed. The storage server partially overwrites the compression group, wherein the compression group remains compressed while the partial overwriting is performed. The storage server determines whether the partially overwritten compression group including the uncompressed block of data should be compressed. The storage server defers compression of the partially overwritten compression group if the partially overwritten compression group should not be compressed. The storage server compresses the partially overwritten compression group if the partially overwritten compression group should be compressed.
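The deferred-compression idea can be illustrated with a small in-memory model. This is a sketch under assumptions: the overlay dictionary that holds uncompressed overwrites, the block size, and the threshold policy in `should_compress` are all hypothetical and stand in for whatever on-disk layout and policy the storage server actually uses. `zlib` is used only as a convenient compressor.

```python
import zlib

BLOCK_SIZE = 4  # deliberately tiny, for illustration only


class CompressionGroup:
    def __init__(self, blocks):
        # The group of blocks is compressed as a single unit.
        self.compressed = zlib.compress(b"".join(blocks))
        # Hypothetical overlay: block index -> uncompressed replacement data.
        self.overlay = {}

    def overwrite(self, index, data):
        """Partially overwrite the group without touching the compressed bytes."""
        assert len(data) == BLOCK_SIZE
        self.overlay[index] = data

    def read(self, index):
        if index in self.overlay:
            return self.overlay[index]
        raw = zlib.decompress(self.compressed)
        return raw[index * BLOCK_SIZE:(index + 1) * BLOCK_SIZE]

    def should_compress(self, threshold=2):
        # Assumed policy: recompress once enough blocks have been overwritten.
        return len(self.overlay) >= threshold

    def maybe_compress(self, threshold=2):
        """Recompress if the policy says so; otherwise defer and return False."""
        if not self.should_compress(threshold):
            return False
        raw = bytearray(zlib.decompress(self.compressed))
        for index, data in self.overlay.items():
            raw[index * BLOCK_SIZE:(index + 1) * BLOCK_SIZE] = data
        self.compressed = zlib.compress(bytes(raw))
        self.overlay.clear()
        return True


group = CompressionGroup([b"aaaa", b"bbbb", b"cccc"])
before = group.compressed
group.overwrite(1, b"XXXX")
deferred = not group.maybe_compress()
```

After the single overwrite, the compressed bytes are unchanged and recompression is deferred; a second overwrite would cross the assumed threshold and trigger it.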
Abstract:
An extent-based storage architecture is implemented by a storage server receiving a read request for an extent from a client, wherein the extent includes a group of contiguous blocks and the read request includes a file block number. The storage server retrieves an extent identifier from a first sorted data structure, wherein the storage server uses the received file block number to traverse the first sorted data structure to the extent identifier. The storage server retrieves a reference to the extent from a second sorted data structure, wherein the storage server uses the retrieved extent identifier to traverse the second sorted data structure to the reference, and wherein the second sorted data structure is global across a plurality of volumes. The storage server retrieves the extent from a storage device using the reference and returns the extent to the client.
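The two-step read path can be sketched with two sorted maps. This is an illustrative model, not the actual on-disk structures: the `SortedMap` helper, the tuple layouts, and the use of a Python list as the storage device are assumptions. The first map is per volume and keyed by starting file block number; the second is shared across volumes and keyed by extent identifier.

```python
import bisect


class SortedMap:
    """Minimal sorted key/value structure with a floor lookup."""

    def __init__(self, items):
        items = sorted(items, key=lambda kv: kv[0])
        self._keys = [k for k, _ in items]
        self._values = [v for _, v in items]

    def floor(self, key):
        """Return the (key, value) pair with the greatest key <= the given key."""
        i = bisect.bisect_right(self._keys, key) - 1
        if i < 0:
            raise KeyError(key)
        return self._keys[i], self._values[i]


def read_extent(volume_map, global_map, storage, fbn):
    # Step 1: file block number -> extent ID (per-volume structure).
    start_fbn, (extent_id, length) = volume_map.floor(fbn)
    if fbn >= start_fbn + length:
        raise KeyError(fbn)  # fbn falls in a hole between extents
    # Step 2: extent ID -> reference to the blocks (global structure).
    found_id, (pvbn, ref_length) = global_map.floor(extent_id)
    if found_id != extent_id:
        raise KeyError(extent_id)
    # Step 3: fetch the contiguous blocks from the storage device.
    return storage[pvbn:pvbn + ref_length]


# Hypothetical data: start FBN -> (extent ID, length in blocks).
volume_map = SortedMap([(0, (100, 4)), (4, (200, 2))])
# Hypothetical data: extent ID -> (physical block number, length in blocks).
global_map = SortedMap([(100, (10, 4)), (200, (50, 2))])
storage = list(range(64))  # stand-in device: block n holds the value n

extent = read_extent(volume_map, global_map, storage, fbn=5)
```

Because the second map is keyed only by extent identifier, extents can be relocated on disk by updating one global entry without touching any per-volume map.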
Abstract:
A method for efficiently handling partial write requests in a storage system includes allocating a new block of data for the new partial data, and allocating a record in an extent map to record the location of the new partial data block, the location of the old partial data block and the offset length for each data block. Data blocks can be repackaged in the background when system resources are available. A full, but misaligned write request is also efficiently handled by writing the new data to a newly allocated data block and allocating new records in an extent map to record information corresponding to two partial write operations.
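A partial write handled through an extent map might look like the sketch below. The record layout `(start, length, physical_block)`, the `Store` class, and the convention that new data is stored at the same in-block offset in the newly allocated block are assumptions for illustration. The old block is never modified; the map records which byte ranges come from which block.

```python
BLOCK_SIZE = 8  # deliberately tiny, for illustration only


class Store:
    def __init__(self):
        self.blocks = []      # physical blocks, append-only
        self.extent_map = {}  # logical block -> [(start, length, physical_block)]

    def _alloc(self, data):
        self.blocks.append(bytes(data))
        return len(self.blocks) - 1

    def write_full(self, lbn, data):
        assert len(data) == BLOCK_SIZE
        self.extent_map[lbn] = [(0, BLOCK_SIZE, self._alloc(data))]

    def write_partial(self, lbn, offset, data):
        """Write new partial data to a new block and split the old records."""
        end = offset + len(data)
        assert 0 <= offset and end <= BLOCK_SIZE
        padded = bytes(offset) + data + bytes(BLOCK_SIZE - end)
        new_block = self._alloc(padded)
        records = [(offset, len(data), new_block)]
        for start, length, phys in self.extent_map.get(lbn, []):
            rec_end = start + length
            if start < offset:   # part of the old record survives on the left
                records.append((start, min(rec_end, offset) - start, phys))
            if rec_end > end:    # part of the old record survives on the right
                right = max(start, end)
                records.append((right, rec_end - right, phys))
        self.extent_map[lbn] = sorted(records)

    def read(self, lbn):
        out = bytearray(BLOCK_SIZE)
        for start, length, phys in self.extent_map[lbn]:
            out[start:start + length] = self.blocks[phys][start:start + length]
        return bytes(out)


store = Store()
store.write_full(0, b"AAAAAAAA")
store.write_partial(0, 2, b"bbb")
```

A background repackaging pass could later read the logical block, write it to one fresh physical block, and collapse the records back to a single entry.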
Abstract:
A technique to name data is disclosed to allow preservation of storage efficiency over a link between a source and a destination in a replication relationship as well as in storage at the destination. The technique allows the source to send named data to the destination once and refer to it by name multiple times in the future, without having to resend the data. The technique also allows the transmission of data extents to be decoupled from the logical containers that refer to the data extents. Additionally, the technique allows a replication system to accommodate different extent sizes between replication source and destination while preserving storage efficiency.
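The "send once, refer by name afterwards" behavior can be sketched as follows. The abstract does not say how names are formed; deriving the name from a SHA-256 digest of the extent contents is an assumption made here, as are the `Destination` class and the `replicate` function.

```python
import hashlib


class Destination:
    def __init__(self):
        self.extents = {}     # name -> extent data, stored once
        self.containers = {}  # logical container -> ordered list of extent names

    def has(self, name):
        return name in self.extents

    def put_extent(self, name, data):
        self.extents[name] = data

    def put_reference(self, container, name):
        self.containers.setdefault(container, []).append(name)


def replicate(dest, container, chunks):
    """Replicate chunks into a container; return the payload bytes actually sent."""
    bytes_sent = 0
    for chunk in chunks:
        # Assumption: the name is a content hash of the extent.
        name = hashlib.sha256(chunk).hexdigest()
        if not dest.has(name):
            dest.put_extent(name, chunk)   # data crosses the link only once
            bytes_sent += len(chunk)
        dest.put_reference(container, name)  # later uses send only the name
    return bytes_sent


dest = Destination()
first = replicate(dest, "file1", [b"abc", b"abc", b"xyz"])
second = replicate(dest, "file2", [b"abc"])
```

Extent transfer and container references are separate calls, which mirrors the decoupling of data extents from the logical containers that refer to them.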
Abstract:
A method and apparatus for operating a data storage system is disclosed. An original active file system holds incoming write transactions. Data is written at a selected time to blocks in a data storage device of the original active file system, the data written to blocks which do not hold old data of the data storage system. Pointers to data of the original active file system are written at the selected time to the data storage device, the pointers written to blocks which do not hold old data of the data storage system, the pointers and previously saved data of the active file system forming a consistency point of the original active file system at the selected time. A new active file system is started using the consistency point of the original active file system at the selected time.
Abstract:
The invention provides an improved method and apparatus for creating a snapshot of a file system. A “copy-on-write” mechanism is used. The snapshot uses the same blocks as the active file system until the active file system is modified. Whenever a modification occurs, the modified data is copied to a new block and the old data is saved. In this way, the snapshot only uses space where it differs from the active file system, and the amount of work required to create the snapshot is small. A record of which blocks are being used by the snapshot is included in the snapshot itself, allowing effectively instantaneous snapshot creation and deletion. A snapshot can also be deleted instantaneously simply by discarding its root inode.
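A toy model of the snapshot behavior is shown below. It is a simplified sketch: the `FileSystem` class and its flat name-to-block "root" are assumptions, and copying the root dictionary stands in for copying a root inode. Modified data always goes to a new block, so blocks referenced by a snapshot are never overwritten.

```python
class FileSystem:
    def __init__(self):
        self.blocks = {}     # block number -> data
        self.next_block = 0
        self.active = {}     # active root: file name -> block number
        self.snapshots = {}  # snapshot name -> frozen copy of a root

    def write(self, name, data):
        """Write data to a newly allocated block; old blocks stay intact."""
        block_number = self.next_block
        self.next_block += 1
        self.blocks[block_number] = data
        self.active[name] = block_number

    def snapshot(self, snap_name):
        # Only the root is copied; data blocks are shared with the active file system.
        self.snapshots[snap_name] = dict(self.active)

    def delete_snapshot(self, snap_name):
        # Discarding the root is enough to drop the snapshot.
        del self.snapshots[snap_name]

    def read(self, name, snap_name=None):
        root = self.active if snap_name is None else self.snapshots[snap_name]
        return self.blocks[root[name]]


fs = FileSystem()
fs.write("a", b"v1")
fs.snapshot("s1")
shared = fs.snapshots["s1"]["a"] == fs.active["a"]
fs.write("a", b"v2")
```

Immediately after the snapshot is taken, both roots point at the same block; they diverge only once the active file system is modified. Reclaiming blocks that no root references any longer is outside the scope of this sketch.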