摘要:
An extent-based storage architecture is implemented by a storage server. The storage server generates a new extent identifier for cloning a source extent identified by a source extent identifier and stored at a source data structure that includes a length value providing a length of the source extent, an offset value and a reference count value that provides a number of data containers that reference the source extent identifier. The storage server updates a data structure for a cloned version of the data container for storing the new extent identifier that points to the source extent identifier and includes an extent length value and offset value different from length value and the offset value of the source data structure.
摘要:
A method for efficiently handling partial write requests in a storage system includes allocating a new block of data for the new partial data, and allocating a record in an extent map to record the location of the new partial data block, the location of the old partial data block and the offset length for each data block. Data blocks can be repackaged in the background when system resources are available. A full, but misaligned write request is also efficiently handled by writing the new data to a newly allocated data block and allocating new records in an extent map to record information corresponding to two partial write operations.
摘要:
Overwriting part of compressed data without decompressing on-disk compressed data is implemented by receiving a write request for a block of data in a compression group from a client, wherein the compression group comprises a group of data blocks that is compressed, wherein the block of data is uncompressed. The storage server partially overwrites the compression group, wherein the compression group remains compressed while the partial overwriting is performed. The storage server determines whether the partially overwritten compression group including the uncompressed block of data should be compressed. The storage server defers compression of the partially overwritten compression group if the partially overwritten compression group should not be compressed. The storage server compresses the partially overwritten compression group if the partially overwritten compression group should be compressed.
摘要:
A request is received to remove duplicate data. A log data container associated with a storage volume in a storage server is accessed. The log data container includes a plurality of entries. Each entry is identified by an extent identifier in a data structures stored in a volume associated with the storage server. For each entry in the log data container, a determination is made if the entry matches another entry in the log data container. If the entry matches another entry in the log data container, a determination is made of a donor extent and a recipient extent. If an external reference count associated with the recipient extent equals a first predetermined value, block sharing is performed for the donor extent and the recipient extent. A determination is made if the reference count of the donor extent equals a second predetermined value. If the reference count of the donor extent equals the second predetermined value, the donor extent is freed.
摘要:
A request is received to remove duplicate data. A log data container associated with a storage volume in a storage server is accessed. The log data container includes a plurality of entries. Each entry is identified by an extent identifier in a data structures stored in a volume associated with the storage server. For each entry in the log data container, a determination is made if the entry matches another entry in the log data container. If the entry matches another entry in the log data container, a determination is made of a donor extent and a recipient extent. If an external reference count associated with the recipient extent equals a first predetermined value, block sharing is performed for the donor extent and the recipient extent. A determination is made if the reference count of the donor extent equals a second predetermined value. If the reference count of the donor extent equals the second predetermined value, the donor extent is freed.
摘要:
An extent-based storage architecture is implemented by a storage server receiving a read request for an extent from a client, wherein the extent includes a group of contiguous blocks and the read request includes a file block number. The storage server retrieves an extent identifier from a first sorted data structure, wherein the storage server uses the received file block number to traverse the first sorted data structure to the extent identifier. The storage server retrieves a reference to the extent from a second sorted data structure, wherein the storage server uses the retrieved extent identifier to traverse the second sorted data structure to the reference, and wherein the second sorted data structure is global across a plurality of volumes. The storage server retrieves the extent from a storage device using the reference and returns the extent to the client.
摘要:
An extent-based storage architecture is implemented by a storage server receiving a read request for an extent from a client, wherein the extent includes a group of contiguous blocks and the read request includes a file block number. The storage server retrieves an extent identifier from a first sorted data structure, wherein the storage server uses the received file block number to traverse the first sorted data structure to the extent identifier. The storage server retrieves a reference to the extent from a second sorted data structure, wherein the storage server uses the retrieved extent identifier to traverse the second sorted data structure to the reference, and wherein the second sorted data structure is global across a plurality of volumes. The storage server retrieves the extent from a storage device using the reference and returns the extent to the client.
摘要:
Provided is a method and system for reducing duplicate buffers in buffer cache associated with a storage device. Reducing buffer duplication in a buffer cache includes accessing a file reference pointer associated with a file in a deduplicated filesystem when attempting to load a requested data block from the file into the buffer cache. To determine if the requested data block is already in the buffer cache, aspects of the invention compare a fingerprint that identifies the requested data block against one or more fingerprints identifying a corresponding one or more sharable data blocks in the buffer cache. A match between the fingerprint of the requested data block and the fingerprint from a sharable data block in the buffer cache indicates that the requested data block is already loaded in buffer cache. The sharable data block in buffer cache is used instead thereby reducing buffer duplication in the buffer cache.
摘要:
A method for information management comprises intercepting an output from an application; distributing packets according to a routing scheme, wherein the packets are associated with the output, and wherein distributing the packets may occur when the application is associated with a first operating system, and may also occur when the application is associated with a second operating system; and storing the packets.
摘要:
A method for information management comprises monitoring output from an application to an operating system, wherein the output is monitored substantially continuously; determining if a policy applies to data associated with the output; and executing the policy if the policy applies.