Abstract:
One or more techniques and/or systems are provided for performing host side deduplication. Host side deduplication may be performed upon writeable data within a write request received at a host computing device configured to access data stored by a storage server. The host side deduplication may be performed at the host computing device to determine whether the writeable data is already stored by the storage server, based upon querying a host side cache comprising data stored by the storage server and/or a data structure comprising unique signatures of data stored by the storage server. If the writeable data is already stored by the storage server, then a deduplication notification excluding the writeable data may be sent to the storage server; otherwise, a write command comprising the writeable data may be sent. Accordingly, unnecessary network traffic carrying redundant data already stored by the storage server may be reduced.
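The decision described in this abstract can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and does not come from the abstract: the `StorageServerStub` and `Host` classes, the use of SHA-256 as the unique signature, and the choice to have the notification carry only the signature.

```python
import hashlib


class StorageServerStub:
    """Hypothetical stand-in for the storage server; it only counts payload bytes."""

    def __init__(self):
        self.blocks = {}          # signature -> stored data
        self.bytes_received = 0   # payload bytes sent over the (simulated) network

    def write(self, signature, data):
        # Full write command: the writeable data itself crosses the network.
        self.blocks[signature] = data
        self.bytes_received += len(data)

    def dedup_notify(self, signature):
        # Deduplication notification: only the signature is sent, not the data.
        self.bytes_received += len(signature)


class Host:
    """Hypothetical host that keeps signatures of data known to be on the server."""

    def __init__(self, server):
        self.server = server
        self.known_signatures = set()

    def handle_write(self, data):
        signature = hashlib.sha256(data).digest()  # assumed signature function
        if signature in self.known_signatures:
            self.server.dedup_notify(signature)
            return "dedup"
        self.server.write(signature, data)
        self.known_signatures.add(signature)
        return "write"


server = StorageServerStub()
host = Host(server)
first = host.handle_write(b"x" * 4096)    # data not yet known: full write
second = host.handle_write(b"x" * 4096)   # same data again: notification only
```

In this sketch the second request sends a 32-byte signature instead of the 4096-byte block, which is the traffic reduction the abstract describes.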
Abstract:
Technology is disclosed for improving the storage efficiency and communication efficiency of a storage client device by maximizing the cache hit rate and minimizing data requests to the storage server. The storage server provides a duplication list to the storage client device. The duplication list contains references (e.g., storage addresses) to data blocks that contain duplicate data content. The storage client device uses the duplication list to improve the cache hit rate. The duplication list is pruned to contain only references to data blocks relevant to the storage client device. The storage server can prune the duplication list based on a working set of storage objects for a client. Alternatively, the storage server can prune the duplication list based on content characteristics, e.g., duplication degree and access frequency. Duplicate blocks to which the client does not have access can be excluded from the duplication list.
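A rough sketch of how a pruned duplication list might raise the cache hit rate follows. The representation is an assumption: the list is modeled as groups of storage addresses whose blocks hold identical content, and the function names `prune_by_working_set` and `cache_lookup` are hypothetical. Only pruning by working set is shown; pruning by duplication degree or access frequency would filter the groups differently.

```python
def prune_by_working_set(duplication_list, working_set):
    """Keep only addresses in the client's working set (server-side step).

    A group that ends up with fewer than two addresses is dropped, because
    it can no longer map one client-visible block to another.
    """
    pruned = []
    for group in duplication_list:
        kept = [addr for addr in group if addr in working_set]
        if len(kept) >= 2:
            pruned.append(kept)
    return pruned


def cache_lookup(cache, duplication_list, address):
    """Return cached data for `address`, or for any known duplicate of it."""
    if address in cache:
        return cache[address]
    for group in duplication_list:
        if address in group:
            for alias in group:
                if alias in cache:
                    return cache[alias]  # hit through a duplicate block
    return None  # miss: the client would have to ask the storage server


full_list = [[10, 20, 30], [40, 50], [60, 70]]
working_set = {10, 30, 40, 60, 70}      # addresses relevant to this client
pruned = prune_by_working_set(full_list, working_set)

cache = {10: b"data"}                   # only block 10 is cached on the client
hit = cache_lookup(cache, pruned, 30)   # block 30 duplicates block 10
miss = cache_lookup(cache, pruned, 60)  # neither 60 nor its duplicate is cached
```

Here the request for address 30 is served from the cached copy of address 10, so no request to the storage server is needed.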
Abstract:
A technique described herein performs peer to peer network write deduplication. A host system generates a fingerprint for data associated with a write request. The host system may then determine whether the generated fingerprint matches a local fingerprint stored in a local data structure or whether the generated fingerprint matches a global fingerprint associated with a global data structure, wherein the local fingerprint is associated with data previously written to the storage system by the host and wherein the global fingerprint is associated with data previously written to the storage system by a different host. If a match is found, the host system constructs a deduplication command utilizing a logical address corresponding to a storage location that stores the data. If a match is not found, a write command for the data of the write request is constructed and sent to the storage system.
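The lookup order described above (local fingerprints first, then global fingerprints from other hosts) can be sketched as follows. The command types `DedupCommand` and `WriteCommand`, the function `build_command`, the dictionary layout mapping a fingerprint to a logical address, and the use of SHA-256 are all assumptions for illustration. The abstract does not say how the global data structure is shared between hosts, so it is modeled here as a plain dictionary.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class DedupCommand:
    target_lba: int   # logical address the write request was aimed at
    source_lba: int   # logical address that already stores the same data


@dataclass(frozen=True)
class WriteCommand:
    target_lba: int
    data: bytes


def build_command(data, target_lba, local_fps, global_fps):
    """Choose between a deduplication command and a full write command.

    local_fps:  fingerprint -> logical address, for data written by this host
    global_fps: fingerprint -> logical address, for data written by other hosts
    """
    fingerprint = hashlib.sha256(data).hexdigest()  # assumed fingerprint function
    source = local_fps.get(fingerprint)
    if source is None:
        source = global_fps.get(fingerprint)
    if source is not None:
        return DedupCommand(target_lba, source)
    # No match: remember the fingerprint locally (an assumption of this sketch)
    # and send the data itself.
    local_fps[fingerprint] = target_lba
    return WriteCommand(target_lba, data)


local_fps = {}
global_fps = {hashlib.sha256(b"shared").hexdigest(): 900}  # written by another host

c1 = build_command(b"fresh", 100, local_fps, global_fps)   # no match anywhere
c2 = build_command(b"fresh", 101, local_fps, global_fps)   # matches local entry
c3 = build_command(b"shared", 102, local_fps, global_fps)  # matches global entry
```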
Abstract:
One or more techniques and/or systems are provided for coalescing sequences for host side deduplication. A host device may receive a write command from a client device. The write command may comprise a set of data blocks that are to be written to a storage device. The host device may perform host side deduplication by identifying one or more data blocks of the write command that comprise data already stored by the storage device as storage device data blocks. The host device may evaluate the one or more data blocks to identify adjacent data blocks. The host device may coalesce adjacent data blocks into a deduplication sequence. The host device may issue a host side write deduplication command to the storage device (e.g., through a storage controller) based upon the deduplication sequence, which may improve performance by reducing the number of commands issued to and/or processed by the storage device.
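The coalescing step might look like the sketch below. The abstract does not define "adjacent" precisely, so this sketch assumes that two deduplicated blocks are adjacent when both their target addresses in the write command and their matching addresses on the storage device are consecutive. The function name `coalesce` and the tuple layout are hypothetical.

```python
def coalesce(dedup_blocks):
    """Merge adjacent deduplicated blocks into deduplication sequences.

    dedup_blocks: iterable of (target_lba, source_lba) pairs, one per block of
    the write command whose data is already stored at source_lba.
    Returns a list of (target_start, source_start, length) sequences.
    """
    sequences = []
    for target, source in sorted(dedup_blocks):
        if sequences:
            t0, s0, length = sequences[-1]
            # Extend the current sequence only if both addresses continue it.
            if target == t0 + length and source == s0 + length:
                sequences[-1] = (t0, s0, length + 1)
                continue
        sequences.append((target, source, 1))
    return sequences


blocks = [(100, 500), (101, 501), (102, 502), (105, 900), (106, 300)]
sequences = coalesce(blocks)
```

With this input, five per-block deduplication commands collapse into three sequence-based commands; blocks 105 and 106 are consecutive in the write command but not on the storage device, so under the stated assumption they are not merged.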