Abstract:
The techniques and/or systems described herein implement erasure coding to generate various chunks for a data collection (e.g., data chunks and at least one encoding chunk). The chunks are then distributed and stored within an individual group (e.g., a pod) of storage units, where a pod of storage units is determined based on characteristics that affect an amount of time it takes to recover a data collection or to restore lost data.
Abstract:
In various embodiments, methods and systems for implementing distributed data object management are provided. The distributed data object management system includes a local metadata-consensus information store and one or more remote metadata-consensus information stores for metadata-consensus information and a local data store and one or more remote data stores for erasure coded fragments. For a write operation, corresponding metadata writes and data writes are performed in parallel using a metadata write path and a data write path, respectively, when writing to the local metadata-consensus information store and the one or more remote metadata-consensus information stores and the local data store and the one or more remote data stores. And, for a read operation, corresponding metadata reads and data reads are performed in parallel using a metadata read path and a data read path, respectively, when reading from the metadata-consensus information stores and the data stores.
Abstract:
Various embodiments, methods and systems for implementing distributed data synchronization in a distributed computing system, are provided. In operation, a data record of a first data set is accessed. The data record is encoded to generate, for a first distributed invertible bloom filter (“DIBF”) data structure, a first DIBF record. The first DIBF record comprises a data field and a quantifier field that includes a quantifier value, which represents a reference count for the first DIBF record. The first and second DIBF data structures are accessed and decoded based at least in part on computing a difference between a quantifier value in the first DIBF data structure and a quantifier value in the second DIBF data structure. A determination whether a match exists between the first DIBF data structure and second DIBF data structure is made based on computing the difference between the first and second DIBF data structures.
Abstract:
In various embodiments, methods and systems for implementing garbage collection in distributed storage systems are provided. The distributed storage system operates based on independent management of metadata of extent and stream data storage resources. A hybrid garbage collection system based on reference counting garbage collection operations and mark-and-sweep garbage collection operations is implemented. An extent lifetime table that tracks reference weights and mark sequences for extents is initialized and updated based on indications from extent managers and stream managers, respectively. Upon determining that an extent is to be handed-off from weighted reference counting garbage collection operations to mark-and-sweep garbage collection operations, a reference weight field for the extent is voided and a mark sequence field of the extent is updated. The mark sequence field is updated with a latest global sequence number. The mark-and-sweep garbage collection operations are utilized to reclaim the extent when the extent is no longer referenced.
Abstract:
In various embodiments, methods and systems for implementing distributed data object management are provided. The distributed data object management system includes a distributed storage system having a local metadata-consensus information store in and one or more remote metadata-consensus information stores. A metadata-consensus information store is configured to store metadata-consensus information. The metadata-consensus information corresponds to erasure coded fragments of a data object and instruct on how to manage the erasure coded fragments. The distributed storage system further includes a local data store and one or more remote data stores for the erasure coded fragments. The distributed data object management system includes a distributed data object manager for operations including, interface operations, configuration operations, write operations, read operations, delete operations, garbage collection operations and failure recovery operations. The distributed data object management system is operates based on metadata paths and data paths, operating in parallel, for write operations and read operations.
Abstract:
A “Media Sharer” operates within peer-to-peer (P2P) networks to provide a dynamic peer-driven system for streaming high quality multimedia content, such as a video-on-demand (VoD) service, to participating peers while minimizing server bandwidth requirements. In general, the Media Sharer provides a peer-assisted framework wherein participating peers assist the server in delivering on-demand media content to other peers. Participating peers cooperate to provide at least the same quality media delivery service as a pure server-client media distribution. However, given this peer cooperation, many more peers can be served with relatively little increase in server bandwidth requirements. Further, each peer limits its assistance to redistributing only portions of the media content that it also receiving. Peer upload bandwidth for redistribution is determined as a function of surplus peer upload capacity and content need of neighboring peers, with earlier arriving peers uploading content to later arriving peers.
Abstract:
The techniques and/or systems described herein implement erasure coding to generate various chunks for a data collection (e.g., data chunks and at least one encoding chunk). The chunks are then distributed and stored within an individual group (e.g., a pod) of storage units, where a pod of storage units is determined based on characteristics that affect an amount of time it takes to recover a data collection or to restore lost data.
Abstract:
A first set of replicated state machines includes a first state machine that compares a clock value included in a state update message incremented by a first amount, a clock value for the first state machine incremented by a second amount, and a current local wall clock value for the first state machine to determine a maximum value and assigns the maximum value as the clock value for the first state machine. Additionally, in response to a passage of an amount of time, the first state machine advances the clock value for the first state machine to its current local wall clock value and propagates this clock value to the other state machines in the first set of replicated state machines. The advancement of the clock value for all state machines even in the absence of state updates improves their ability to respond to distributed read requests.
Abstract:
Techniques are provided for the caching of content prior to the content being requested. A request for desired content may be received from a client application at a caching server. The request may also indicate additional content related to the desired content that may be subsequently requested by the client application. The indicated additional content (and the desired content, if not already cached) is retrieved from an origin server. The desired content is transmitted to the client application at the user device, and the additional content is cached at the caching server. Subsequently, a second request may be received from the client application that includes a request for the additional content. The additional content, which is now cached at the caching server, is served to the client application by the caching server in response to the second request (rather than being retrieved from the origin server).
Abstract:
In various embodiments, methods and systems for implementing a distributed metadata management system in distributed storage systems are provided. A distributed storage system operates based on data storage resources (e.g., extents and streams). The distributed metadata management system is implemented for extent and stream metadata to facilitate the scalability of metadata processing. The distributed storage system implements extent managers and stream managers that independently manage extent and stream metadata, respectively. The extent managers are associated with an extent table that stores extent metadata. The stream managers are associated with streams that store associations with extents. The distributed metadata management system can also utilize a bootstrap layer that leverages components of a legacy distributed storage system to facilitate distributed management of extent and stream metadata. The bootstrap layer is used to store the extent table as a system table and to persist the state of the stream manager as system streams.