Abstract:
A computer storage system includes a controller and a storage device array. The storage device array may include a first sub-array and a fast storage device sub-array. The first sub-array includes one or more first storage devices storing data. The fast storage device sub-array includes one or more fast storage devices storing a copy of the data stored in the first sub-array.
Abstract:
In one embodiment, a method of caching data writes data units into a write cache for eventual flushing to storage. The method sets a copy-to-read-cache flag for each data unit that is read from the write cache. Upon flushing each data unit to the storage, the method copies the data unit to a read cache if the flag for that data unit is set. In another embodiment, a method of caching data writes data units into a write cache and simulates a transfer policy for copying the data units from the write cache to a read cache, in order to determine a performance indicator for the transfer policy. Upon flushing each data unit, the method copies the data unit to the read cache if the performance indicator exceeds a threshold and the transfer policy calls for copying that data unit into the read cache.
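The first embodiment can be illustrated with a minimal sketch. The class name `WriteCache`, the use of plain dictionaries for the storage and the read cache, and the choice to clear the flag when a unit is overwritten are all assumptions made for illustration; the abstract does not specify them.

```python
class WriteCache:
    """Toy write cache that flags data units read before they are flushed."""

    def __init__(self, storage, read_cache):
        self.storage = storage        # dict standing in for backing storage (assumption)
        self.read_cache = read_cache  # dict standing in for the read cache (assumption)
        self.units = {}               # key -> [data, copy_to_read_cache flag]

    def write(self, key, data):
        # Assumption: a fresh write starts with the flag cleared.
        self.units[key] = [data, False]

    def read(self, key):
        entry = self.units.get(key)
        if entry is None:
            return None               # not present in the write cache
        entry[1] = True               # the unit was read: set the flag
        return entry[0]

    def flush(self):
        # Flush every unit to storage; copy flagged units to the read cache.
        for key, (data, flag) in self.units.items():
            self.storage[key] = data
            if flag:
                self.read_cache[key] = data
        self.units.clear()
```

In this sketch only units that were read while still in the write cache reach the read cache on flush, which matches the stated intent of keeping recently read data readable at cache speed.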
Abstract:
An embodiment of a method of cooperative caching for a distributed storage system begins with a step of requesting data from the storage devices that hold the data. The method continues with a step of receiving, from the storage devices, any cached blocks and the expected response times for providing non-cached blocks. The method concludes with a step of requesting a sufficient number of the non-cached blocks from the one or more storage devices expected to provide optimal performance.
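The selection step can be sketched as follows. The function name `choose_devices`, the reply layout, and the reading of "optimal performance" as "lowest expected response time" are assumptions; the abstract leaves the performance criterion open.

```python
def choose_devices(replies, needed):
    """Pick devices to fetch non-cached blocks from.

    replies: device name -> (cached block or None, expected response time).
    needed:  number of blocks required to satisfy the request.
    Returns (cached blocks by device, devices to fetch from).
    """
    cached = {d: blk for d, (blk, _) in replies.items() if blk is not None}
    missing = max(0, needed - len(cached))
    # Rank devices without a cached block by expected response time.
    candidates = sorted(
        (ms, d) for d, (blk, ms) in replies.items() if blk is None
    )
    if missing > len(candidates):
        raise ValueError("not enough devices to satisfy the request")
    return cached, [d for _, d in candidates[:missing]]
```

With one cached block already in hand and three blocks needed, the sketch asks only the two fastest remaining devices for their blocks.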
Abstract:
A method and apparatus are used to divide a storage volume into shards. The division uses a directed graph that has a vertex for each block in the storage volume and a directed edge between each pair of vertices, where an edge represents a shard of blocks. The method associates with each directed edge a weight that represents the dissimilarity of the shard of blocks between the corresponding pair of vertices, and selects a maximum number of shards (K) for dividing the storage volume. For a current vertex, the method identifies the minimum aggregate weight over combinations of no more than K shards, and repeats this identification for the vertices in the directed graph. Finally, the method picks the smallest aggregate weight associated with the last vertex, which determines a sharding that spans the storage volume and provides a minimal dissimilarity among no more than K shards of blocks.
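The procedure reads like a bounded-hop shortest-path computation, which can be sketched with dynamic programming. Several choices here are assumptions for illustration: vertices are modeled as the boundaries between blocks rather than the blocks themselves, each block is summarized by a single number, the volume is non-empty with K at least 1, and the dissimilarity function `spread` (max minus min) is a placeholder for whatever measure the method actually uses.

```python
import math

def shard(values, K, dissimilarity):
    """Split values into at most K contiguous shards with minimal total weight."""
    n = len(values)
    # best[k][j]: minimal aggregate weight covering blocks [0, j) with exactly k shards
    best = [[math.inf] * (n + 1) for _ in range(K + 1)]
    prev = [[None] * (n + 1) for _ in range(K + 1)]
    best[0][0] = 0.0
    for k in range(1, K + 1):
        for j in range(1, n + 1):
            for i in range(j):
                if best[k - 1][i] == math.inf:
                    continue
                # Edge (i, j) stands for the shard holding blocks i .. j-1.
                cost = best[k - 1][i] + dissimilarity(values[i:j])
                if cost < best[k][j]:
                    best[k][j] = cost
                    prev[k][j] = i
    # Smallest aggregate weight at the last vertex over all shard counts <= K.
    k_best = min(range(1, K + 1), key=lambda k: best[k][n])
    cuts, j, k = [], n, k_best
    while k > 0:
        i = prev[k][j]
        cuts.append((i, j))
        j, k = i, k - 1
    return best[k_best][n], cuts[::-1]

def spread(xs):
    # Placeholder dissimilarity: range of the per-block values (assumption).
    return max(xs) - min(xs)
```

Calling `shard([1, 1, 2, 9, 9, 10], 2, spread)` splits the six blocks into the half-open ranges `(0, 3)` and `(3, 6)`, grouping the three low values apart from the three high ones.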
Abstract:
Method and apparatus for distributing storage requests referencing a replicated data set to heterogeneous storage arrays. A workload includes related storage requests that have a common quality-of-service requirement. The performance levels of the storage arrays are monitored in processing the storage requests. The performance levels and quality-of-service requirements are used for distributing the storage requests between the storage arrays.
Abstract:
A method of reading data comprises sending read messages to storage devices holding a stripe and receiving at least a quorum of reply messages. The reply message from the storage device holding the data block includes the data block. The quorum satisfies a quorum condition: its size is a number such that any two selections of that number of stripe blocks intersect in at least the minimum number of stripe blocks needed to decode the stripe. A method of writing data comprises sending query messages to storage devices holding the stripe, receiving a query reply message from each of at least a first quorum of the storage devices, sending modify messages to the storage devices, and receiving a write reply message from each of at least a second quorum of the storage devices. The first and second quorums each meet the quorum condition.
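The quorum condition can be made concrete with a little arithmetic. Two selections of q blocks out of n overlap in at least 2q - n blocks, so requiring an overlap of m blocks gives q >= (n + m) / 2. The sketch below computes the smallest such q and checks it by brute force; the names `min_quorum` and `satisfies`, and the symbols n (blocks per stripe) and m (blocks needed to decode), are assumptions introduced for illustration.

```python
from itertools import combinations

def min_quorum(n, m):
    """Smallest q with 2*q - n >= m, that is, ceil((n + m) / 2)."""
    return (n + m + 1) // 2

def satisfies(n, m, q):
    """Brute-force check: every two q-subsets of n blocks share at least m blocks."""
    blocks = range(n)
    return all(
        len(set(a) & set(b)) >= m
        for a in combinations(blocks, q)
        for b in combinations(blocks, q)
    )
```

For a stripe of 5 blocks of which 3 are needed to decode, `min_quorum(5, 3)` gives 4, and the brute-force check confirms that quorums of 4 satisfy the condition while quorums of 3 do not.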
Abstract:
An embodiment of a method of writing data begins with a first step of generating a timestamp. A second step issues a query that includes the timestamp to each of a plurality of primary storage devices. The method continues with a third step of receiving a query reply from at least a quorum of the primary storage devices. The query replies indicate that the timestamp is later than an existing timestamp for the data. In a fourth step, the data is mirrored to secondary storage after receiving the query reply from at least the quorum of the primary storage devices. Upon receiving a mirror completion message from the secondary storage, a fifth step issues a write message that includes at least a portion of the data and the timestamp to each of the primary storage devices.
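The five steps can be sketched with in-memory stand-ins. Everything named here is hypothetical: the `Primary` and `Secondary` classes, the logical counter used as a timestamp source, and the synchronous calls that replace real messages. The quorum size is passed in by the caller because the abstract does not fix it.

```python
import itertools

_clock = itertools.count(1)  # hypothetical logical clock standing in for timestamps

class Primary:
    def __init__(self):
        self.ts, self.data = 0, None

    def query(self, ts):
        # Reply indicates whether ts is later than the existing timestamp.
        return ts > self.ts

    def write(self, ts, data):
        if ts > self.ts:
            self.ts, self.data = ts, data

class Secondary:
    def __init__(self):
        self.data = None

    def mirror(self, data):
        self.data = data
        return True  # stands in for the mirror completion message

def timestamped_write(primaries, secondary, data, quorum):
    ts = next(_clock)                                  # step 1: generate a timestamp
    ok = sum(1 for p in primaries if p.query(ts))      # steps 2-3: query, count replies
    if ok < quorum:
        return False
    if not secondary.mirror(data):                     # step 4: mirror to secondary
        return False
    for p in primaries:                                # step 5: write to the primaries
        p.write(ts, data)
    return True
```

The ordering is the point of the sketch: the secondary is updated only after a quorum of primaries has accepted the timestamp, and the primaries are written only after the mirror completes.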
Abstract:
An embodiment of a method of writing data includes issuing write messages to a replica set of storage devices. Write confirmations are received from at least a majority of the storage devices. An embodiment of a method of reading data includes issuing read messages to a replica set of storage devices. Read confirmations are received from at least a first majority of the storage devices. Read commit messages are issued to the storage devices. Commit confirmations are received from at least a second majority of the storage devices.
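A minimal sketch of the two methods follows. The `Replica` class, the integer version numbers, and the interpretation of the read commit phase as writing the newest observed value back to the replicas are assumptions; the abstract states only which messages are exchanged and how many confirmations are required.

```python
class Replica:
    def __init__(self, up=True):
        self.up, self.version, self.value = up, 0, None

    def store(self, version, value):
        if not self.up:
            return False              # no confirmation from a device that is down
        if version > self.version:
            self.version, self.value = version, value
        return True

    def load(self):
        return (self.version, self.value) if self.up else None

def majority(replicas):
    return len(replicas) // 2 + 1

def majority_write(replicas, version, value):
    acks = sum(r.store(version, value) for r in replicas)
    return acks >= majority(replicas)

def majority_read(replicas):
    """Return (ok, value); ok is False when either majority is not reached."""
    replies = [x for x in (r.load() for r in replicas) if x is not None]
    if len(replies) < majority(replicas):
        return False, None
    version, value = max(replies, key=lambda x: x[0])
    # Read commit phase (assumed): push the newest value back to the replicas.
    acks = sum(r.store(version, value) for r in replicas)
    if acks < majority(replicas):
        return False, None
    return True, value
```

Because any two majorities share at least one replica, a read that reaches a majority sees the value of the latest write that also reached a majority.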
Abstract:
Data blocks are read from a distributed cache. The distributed cache comprises m replicated caches, each replicated cache including a plurality of independent computing devices. Each independent computing device of the replicated caches holds a replica of a particular one of the m data blocks in memory. The m data blocks and p parity blocks are stored across m plus p independent computing devices. Each of the m plus p independent computing devices stores a single block selected from the m data blocks and the p parity blocks.
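The block layout can be illustrated with the simplest case of a single parity block (p = 1) computed by XOR. The abstract does not name the erasure code, so XOR parity, the function names, equal-length blocks, and the use of `None` to mark an unavailable device are all assumptions made for this sketch.

```python
from functools import reduce

def xor_blocks(blocks):
    """Byte-wise XOR of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

def place(data_blocks):
    """Return one block per device: the m data blocks plus one XOR parity block."""
    return list(data_blocks) + [xor_blocks(data_blocks)]

def read_block(devices, index):
    """Read data block `index`; an unavailable device is represented by None."""
    if devices[index] is not None:
        return devices[index]
    survivors = [b for i, b in enumerate(devices) if i != index]
    if any(b is None for b in survivors):
        raise ValueError("more than one block missing; one parity block cannot recover")
    # XOR of all surviving blocks reconstructs the single missing block.
    return xor_blocks(survivors)
```

With three data blocks placed on four devices, losing any one device still allows every data block to be read, either directly or by XOR of the survivors.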