Abstract:
One or more techniques and/or systems are provided for performing host side deduplication. Host side deduplication may be performed upon writeable data within a write request received at a host computing device configured to access data stored by a storage server. The host side deduplication may be performed at the host computing device to determine whether the writeable data is already stored by the storage server, based upon querying a host side cache comprising data stored by the storage server and/or a data structure comprising unique signatures of data stored by the storage server. If the writeable data is already stored by the storage server, then a deduplication notification excluding the writeable data may be sent to the storage server; otherwise, a write command comprising the writeable data may be sent. Accordingly, unnecessary network traffic carrying redundant data already stored by the storage server may be reduced.
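The decision described above can be illustrated with a minimal Python sketch. It assumes SHA-256 digests serve as the unique signatures and that the host keeps them in an in-memory set; the names `StorageServerStub`, `HostDeduplicator`, `dedup_notify`, and `handle_write` are hypothetical and do not come from the abstract.

```python
import hashlib


class StorageServerStub:
    """Hypothetical stand-in for the storage server; it only records what it receives."""

    def __init__(self):
        self.blocks = {}        # signature -> stored data
        self.addresses = {}     # address -> signature of the data stored there
        self.bytes_received = 0

    def write(self, address, data):
        # Full write command: the payload crosses the network.
        signature = hashlib.sha256(data).hexdigest()
        self.blocks[signature] = data
        self.addresses[address] = signature
        self.bytes_received += len(data)

    def dedup_notify(self, address, signature):
        # Deduplication notification: no payload, only a reference to existing data.
        self.addresses[address] = signature


class HostDeduplicator:
    """Host side check against a set of signatures (an assumed data structure)."""

    def __init__(self, server):
        self.server = server
        self.known_signatures = set()

    def handle_write(self, address, data):
        signature = hashlib.sha256(data).hexdigest()
        if signature in self.known_signatures:
            # Data is believed to be on the server already: send the notification only.
            self.server.dedup_notify(address, signature)
            return "dedup"
        # Otherwise send the write command with the data and remember its signature.
        self.server.write(address, data)
        self.known_signatures.add(signature)
        return "write"
```

In this sketch, a second write of identical data to a different address adds nothing to `bytes_received`, which is the traffic reduction the abstract refers to. A real system would also need to keep the host side signatures consistent with the server, which is not modeled here.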
Abstract:
Systems and methods for efficiently using solid-state devices are provided. Some embodiments provide for a data processing system that uses a non-volatile solid-state device as a circular log, with the goal of aligning data access patterns to the underlying, hidden device implementation in order to maximize performance. In addition, metadata can be interspersed with data in order to align data access patterns to the underlying device implementation. Multiple input/output (I/O) buffers can also be used to pipeline insertions of metadata and data into a linear log. The observed queuing behavior of the multiple I/O buffers can be used to determine when the utilization of the storage device is approaching saturation (e.g., in order to predict excessively long response times). The I/O load on the storage device may then be shed when utilization approaches saturation. As a result, the overall response time of the system is improved.
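The combination of a circular log, interleaved metadata, buffered insertion, and load shedding can be sketched roughly as follows. This is an illustrative Python model only: the device is simulated by a list, utilization is approximated as the fraction of I/O buffers waiting to be flushed, and the names `CircularLog`, `shed_threshold`, `append`, and `flush_one` are assumptions rather than terms from the abstract.

```python
from collections import deque


class CircularLog:
    """Toy model: a fixed-size list stands in for the solid-state device."""

    def __init__(self, capacity, num_buffers=2, buffer_size=4, shed_threshold=0.75):
        self.slots = [None] * capacity   # simulated device, written sequentially
        self.head = 0                    # next slot to write; wraps around
        self.num_buffers = num_buffers
        self.buffer_size = buffer_size   # records per I/O buffer
        self.shed_threshold = shed_threshold
        self.active = []                 # buffer currently being filled
        self.pending = deque()           # full buffers queued for the device

    def utilization(self):
        # Queue depth of the I/O buffers, used here as a proxy for device utilization.
        return len(self.pending) / self.num_buffers

    def append(self, key, value):
        # Shed load when the buffer queue suggests the device is near saturation.
        if self.utilization() >= self.shed_threshold:
            return False
        # Metadata is interleaved with the data it describes.
        self.active.append(("meta", key, len(value)))
        self.active.append(("data", value))
        if len(self.active) >= self.buffer_size:
            self.pending.append(self.active)
            self.active = []
        return True

    def flush_one(self):
        # Write one queued buffer sequentially, wrapping at the end of the log.
        if not self.pending:
            return 0
        buffer = self.pending.popleft()
        for record in buffer:
            self.slots[self.head] = record
            self.head = (self.head + 1) % len(self.slots)
        return len(buffer)
```

A caller would treat a `False` return from `append` as a signal to retry later or redirect the request; how load is actually shed is not specified in the abstract.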
Abstract:
A method includes receiving a write request to write a data block having a logical block address to a nonvolatile storage device, and writing a value of the data block to the nonvolatile storage device. The writing includes locating a position in a tree-based data structure that includes first and second nodes. The first node is configured to store a first set of data blocks having logical block addresses in a first numerical range, and the second node is configured to store a second set of data blocks having logical block addresses in a second numerical range. The position is located in the first node or the second node depending on the value of the logical block address. The writing further includes storing the value of the data block at that position in the tree-based data structure.
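A minimal Python sketch of the lookup described above, reduced to the two nodes the abstract mentions. It assumes the two ranges are contiguous and divided at a single split point; `RangeNode`, `TwoNodeTree`, `split`, and `max_lba` are hypothetical names introduced for illustration.

```python
class RangeNode:
    """Holds data blocks whose logical block addresses fall in [low, high)."""

    def __init__(self, low, high):
        self.low = low
        self.high = high
        self.blocks = {}  # logical block address -> value


class TwoNodeTree:
    """Smallest possible tree: two nodes covering adjacent LBA ranges (an assumption)."""

    def __init__(self, split, max_lba):
        self.split = split
        self.first = RangeNode(0, split)
        self.second = RangeNode(split, max_lba)

    def locate(self, lba):
        # Choose the node from the numerical value of the logical block address.
        if not 0 <= lba < self.second.high:
            raise ValueError(f"LBA {lba} is outside the addressable range")
        return self.first if lba < self.split else self.second

    def write(self, lba, value):
        node = self.locate(lba)
        node.blocks[lba] = value  # store the value at the located position
        return node

    def read(self, lba):
        return self.locate(lba).blocks.get(lba)
```

For example, with `TwoNodeTree(split=100, max_lba=200)`, a write to address 5 lands in the first node and a write to address 150 lands in the second. A fuller structure would add interior nodes and node splitting, which the abstract does not detail.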
Abstract:
Technology is disclosed for improving the storage efficiency and communication efficiency of a storage client device by maximizing the cache hit rate and minimizing data requests to the storage server. The storage server provides a duplication list to the storage client device. The duplication list contains references (e.g., storage addresses) to data blocks that contain duplicate data content. The storage client device uses the duplication list to improve its cache hit rate. The duplication list is pruned to contain only references to data blocks relevant to the storage client device. The storage server can prune the duplication list based on a working set of storage objects for the client. Alternatively, the storage server can prune the duplication list based on content characteristics, e.g., duplication degree and access frequency. Duplicate blocks to which the client does not have access can be excluded from the duplication list.
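One way to picture the duplication list is as groups of addresses that hold identical content. The Python sketch below assumes that representation, prunes by working set only, and lets the client cache treat every address in a group as an alias of the first; `prune_duplication_list` and `ClientCache` are hypothetical names, not part of the disclosed technology.

```python
def prune_duplication_list(dup_groups, working_set):
    """Server side: keep only addresses in the client's working set.

    A group is dropped once fewer than two relevant addresses remain,
    since a single address no longer describes a duplicate.
    """
    pruned = []
    for group in dup_groups:
        relevant = [addr for addr in group if addr in working_set]
        if len(relevant) >= 2:
            pruned.append(relevant)
    return pruned


class ClientCache:
    """Client side: addresses with duplicate content share one cache entry."""

    def __init__(self, dup_groups):
        self.data = {}
        self.alias = {}  # address -> canonical address (first in its group)
        for group in dup_groups:
            for addr in group:
                self.alias[addr] = group[0]

    def _key(self, addr):
        return self.alias.get(addr, addr)

    def put(self, addr, content):
        self.data[self._key(addr)] = content

    def get(self, addr):
        # Returns None on a miss, in which case the client would ask the server.
        return self.data.get(self._key(addr))
```

With groups `[[10, 20, 30], [40, 99]]` and a working set of `{10, 20, 40}`, pruning leaves `[[10, 20]]`; after block 10 is cached, a read of block 20 is a hit without contacting the server. Pruning by duplication degree or access frequency would follow the same shape with a different filter.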