Abstract:
Managing placement of object replicas is performed at a first instance of a distributed storage system. One or more journals are opened for storage of object chunks. Each journal is associated with a single placement policy. A first object is received comprising at least a first object chunk. The first object is associated with a first placement policy. The first object chunk is stored in a first journal whose associated placement policy matches the first placement policy. The first journal stores only object chunks for objects whose placement policies match the first placement policy. For the first journal, the receiving and storing operations are repeated for multiple objects whose associated placement policies match the first placement policy, until a first termination condition occurs. Then, the first journal is closed. Subsequently, the first journal is replicated to a second instance of the distributed storage system according to the first placement policy.
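The journal lifecycle described above (open, fill with chunks of one placement policy, close, replicate) might be sketched as follows. This is a minimal illustration, not the patented implementation: the names `Journal`, `Instance`, `replicate_closed`, the chunk-count termination condition, and the `targets_by_policy` mapping are all assumptions introduced for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Journal:
    """Append-only container for chunks that share one placement policy."""
    policy: str
    max_chunks: int  # hypothetical termination condition: a chunk count
    chunks: list = field(default_factory=list)
    closed: bool = False

    def is_full(self) -> bool:
        return len(self.chunks) >= self.max_chunks


class Instance:
    """One instance of the storage system (hypothetical model)."""

    def __init__(self, max_chunks: int = 3):
        self.max_chunks = max_chunks
        self.open_journals = {}    # policy name -> open Journal
        self.closed_journals = []  # journals eligible for replication

    def store_chunk(self, policy: str, chunk: bytes) -> Journal:
        # Use (or open) the journal whose policy matches the object's policy.
        journal = self.open_journals.get(policy)
        if journal is None:
            journal = Journal(policy, self.max_chunks)
            self.open_journals[policy] = journal
        journal.chunks.append(chunk)
        # Close the journal once the termination condition occurs.
        if journal.is_full():
            journal.closed = True
            self.closed_journals.append(self.open_journals.pop(policy))
        return journal


def replicate_closed(source: Instance, targets_by_policy: dict) -> int:
    """Copy each closed journal, as a whole, to the instances its policy names."""
    copies = 0
    for journal in source.closed_journals:
        for target in targets_by_policy.get(journal.policy, []):
            target.closed_journals.append(
                Journal(journal.policy, journal.max_chunks,
                        list(journal.chunks), True)
            )
            copies += 1
    return copies
```

Because a journal holds chunks of one policy only, the replication step can move the whole journal without inspecting individual objects.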
Abstract:
A method is performed by two or more devices of a group of devices in a distributed data replication system. The method includes receiving, at the two or more devices, a group of chunks having a same unique temporary identifier, where the group of chunks comprises an object to be uploaded; creating an entry for the object in a replicated index, where the entry is keyed by the unique temporary identifier, and where the replicated index is replicated at each of the two or more devices; and determining, by an initiating device of the two or more devices, that a union of the group of chunks contains all data of the object. The method also includes calculating a content-based identifier for the object; creating another entry for the object in the replicated index, where the other entry is keyed by the content-based identifier; and updating the replicated index to point from the unique temporary identifier to the content-based identifier.
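The hand-off from a temporary identifier to a content-based one can be illustrated with a small sketch. Everything here is assumed for the example: the index is a plain dictionary, chunks are keyed by byte offset, SHA-256 stands in for whatever content hash the system uses, and `assemble` and `finalize` are hypothetical helper names.

```python
import hashlib


def assemble(chunks: dict, size: int):
    """Return the object's bytes if the chunks cover [0, size), else None."""
    data = bytearray(size)
    covered = 0
    for offset in sorted(chunks):
        if offset > covered:
            return None  # a gap means some data has not arrived yet
        piece = chunks[offset]
        data[offset:offset + len(piece)] = piece
        covered = max(covered, offset + len(piece))
    return bytes(data[:size]) if covered >= size else None


def finalize(index: dict, temp_id: str, chunk_sets: list, size: int):
    """Re-key an upload by content once the union of chunks is complete."""
    union = {}
    for chunks in chunk_sets:  # one offset -> bytes mapping per device
        union.update(chunks)
    data = assemble(union, size)
    if data is None:
        return None  # upload still incomplete; index is left unchanged
    content_id = hashlib.sha256(data).hexdigest()  # assumed hash function
    index[content_id] = {"size": size}
    # The temporary entry now points at the content-based entry.
    index[temp_id] = {"points_to": content_id}
    return content_id
```

For instance, with `b"hello "` at offset 0 on one device and `b"world"` at offset 6 on another, `finalize` returns `None` while only the first device's chunks are visible and returns the SHA-256 digest of `b"hello world"` once both are.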
Abstract:
A server computer at a first storage sub-system of a distributed storage system receives from a client a first client request for an object. If the object is not present in the first storage sub-system, the server computer identifies a second storage sub-system of the distributed storage system as having a replica of the requested object, the requested object including content and metadata. The server computer submits an object replication request for the requested object to the second storage sub-system and independently receives the content and metadata of the requested object from the second storage sub-system. The server computer generates a new replica of the object at the first storage sub-system using the received metadata and content and returns the metadata of the new replica of the object to the client.
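A rough sketch of this fetch-on-miss flow is given below. The `SubSystem` class, its `send_metadata` and `send_content` methods, and `handle_client_request` are hypothetical names chosen for illustration; metadata and content are fetched through separate calls to mirror the "independently receives" wording, and real network transfer is omitted.

```python
class SubSystem:
    """A storage sub-system holding object metadata and content separately."""

    def __init__(self, name):
        self.name = name
        self.metadata = {}  # object_id -> metadata dict
        self.content = {}   # object_id -> bytes

    def has(self, object_id):
        return object_id in self.metadata and object_id in self.content

    def send_metadata(self, object_id):
        return dict(self.metadata[object_id])

    def send_content(self, object_id):
        return self.content[object_id]


def handle_client_request(local, peers, object_id):
    """Serve an object's metadata, replicating the object locally on a miss."""
    if not local.has(object_id):
        # Identify another sub-system that holds a replica.
        source = next((p for p in peers if p.has(object_id)), None)
        if source is None:
            return None
        # Metadata and content arrive through independent transfers.
        metadata = source.send_metadata(object_id)
        content = source.send_content(object_id)
        # Build the new local replica from what was received.
        local.metadata[object_id] = metadata
        local.content[object_id] = content
    return local.metadata[object_id]
```

After the first miss, later requests for the same object are answered from the local replica without contacting the second sub-system.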
Abstract:
A method is performed by a device of a group of devices in a distributed data replication system. The method includes storing an index of objects in the distributed data replication system, the index being replicated while the objects are stored locally by the devices of the distributed data replication system. The method also includes conducting a scan of at least a portion of the index and identifying, based on the scan, one or more redundant replicas of at least one of the objects. The method further includes de-duplicating the redundant replicas and updating the index to reflect the status of the redundant replicas.
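One possible reading of the index scan is sketched below. The abstract does not define what makes a replica redundant, so the example assumes a replica is redundant when a second live replica of the same object sits on the same device; the entry fields (`object_id`, `device`, `replica_id`, `status`) and the function name `deduplicate` are likewise hypothetical.

```python
def deduplicate(index: list) -> list:
    """Scan index entries, mark redundant replicas, and return their ids."""
    seen = set()
    removed = []
    for entry in index:
        if entry["status"] != "live":
            continue
        key = (entry["object_id"], entry["device"])
        if key in seen:
            # Assumed rule: a second live replica on the same device is redundant.
            entry["status"] = "deduplicated"
            removed.append(entry["replica_id"])
        else:
            seen.add(key)
    return removed
```

The scan works on the index alone, so redundant replicas can be found without reading the object data stored on each device.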
Abstract:
A method may be performed by a device of a group of devices in a distributed data replication system. The method may include storing objects in a data store, one or more of the objects being replicated with the distributed data replication system, and conducting a scan of the objects in the data store. The method may further include identifying one of the objects as not having a reference pointing to the object, storing a delete negotiation message as metadata associated with the one of the objects, and replicating the metadata with the delete negotiation message to one or more other devices of the group of devices.
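The scan-and-negotiate step can be illustrated roughly as follows. The data store layout (`refs` counter plus a `metadata` dict per object), the `delete_negotiation` key, and the functions `scan_for_orphans` and `replicate_metadata` are assumptions made for the sketch; how the devices reach agreement afterwards is outside what the abstract describes and is not modeled.

```python
def scan_for_orphans(store: dict, device_id: str) -> list:
    """Attach a delete negotiation message to every unreferenced object."""
    proposed = []
    for object_id, record in store.items():
        already_proposed = "delete_negotiation" in record["metadata"]
        if record["refs"] == 0 and not already_proposed:
            # The message is stored as ordinary object metadata.
            record["metadata"]["delete_negotiation"] = {"proposed_by": device_id}
            proposed.append(object_id)
    return proposed


def replicate_metadata(source: dict, peers: list, object_ids: list) -> None:
    """Copy the negotiation message to each peer that holds the object."""
    for peer in peers:
        for object_id in object_ids:
            if object_id in peer:
                message = source[object_id]["metadata"]["delete_negotiation"]
                peer[object_id]["metadata"]["delete_negotiation"] = dict(message)
```

Carrying the message in metadata means it travels over the same replication path as any other metadata change, with no separate deletion protocol in this sketch.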
Abstract:
Placement of object replicas in a distributed storage system includes, at a first instance, opening a journal for storage of object chunks. Each journal is associated with a single placement policy. An object is received, which comprises a chunk. The object has a placement policy, and the chunk comprises a plurality of storage blocks. The blocks are stored in a journal that matches the placement policy. Global metadata for the object is stored, which includes a list of chunks for the object. Local metadata for the chunk is stored, which includes a block list identifying each block of the plurality of blocks. The local metadata is associated with the journal. The journal is later closed. The journal is subsequently replicated to a second instance according to the placement policy. The global metadata is updated to reflect the replication, whereas the local metadata is unchanged by the replication.
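The split between global and local metadata might look like the sketch below. It is only an illustration under assumed names: `BlockJournal`, `BLOCK_SIZE`, the `journal_locations` map inside the global metadata, and `replicate` do not come from the source. The point it shows is that replication copies the journal with its block lists intact and touches only the global metadata.

```python
import copy

BLOCK_SIZE = 4  # hypothetical block size in bytes


class BlockJournal:
    """A journal that stores chunks as blocks plus per-chunk block lists."""

    def __init__(self, journal_id, policy):
        self.journal_id = journal_id
        self.policy = policy
        self.blocks = []          # block payloads in append order
        self.local_metadata = {}  # chunk_id -> list of block indexes

    def store_chunk(self, chunk_id, data):
        block_list = []
        for start in range(0, len(data), BLOCK_SIZE):
            self.blocks.append(data[start:start + BLOCK_SIZE])
            block_list.append(len(self.blocks) - 1)
        self.local_metadata[chunk_id] = block_list

    def read_chunk(self, chunk_id):
        return b"".join(self.blocks[i] for i in self.local_metadata[chunk_id])


def replicate(journal, global_metadata, target_instance):
    """Copy a closed journal to another instance and record its new location."""
    replica = copy.deepcopy(journal)  # local metadata travels unchanged
    locations = global_metadata["journal_locations"]
    locations.setdefault(journal.journal_id, set()).add(target_instance)
    return replica
```

Since the block lists refer to positions inside the journal rather than to an instance, they stay valid wherever the journal is copied.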
Abstract:
A system and method for generating replication requests for objects in a distributed storage system is provided. The replication requests are generated based at least in part on replication policies for the objects and a current state of the distributed storage system. A respective replication request for a respective object instructs a respective instance of the distributed storage system to replicate the respective object so as to at least partially satisfy a replication policy for the respective object. A respective replication policy includes criteria specifying at least the storage device types on which replicas of the object are to be stored. At least a subset of the replication requests is then distributed to the respective instances of the distributed storage system for execution.
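Request generation from policies and current state could be sketched as follows. The data shapes are assumptions: a policy maps each storage device type to a required replica count, the current state lists `(instance, device_type)` pairs per object, and each instance is assumed to hold at most one replica of an object. The names `generate_requests` and `distribute` are hypothetical.

```python
from collections import Counter, defaultdict


def generate_requests(policies: dict, replicas: dict, instances: dict) -> list:
    """Emit one request per replica still missing under each object's policy."""
    requests = []
    for object_id, policy in policies.items():
        current = replicas.get(object_id, [])
        have = Counter(device_type for _, device_type in current)
        used = {instance for instance, _ in current}
        for device_type, wanted in policy.items():
            missing = wanted - have[device_type]
            for instance in sorted(instances):
                if missing <= 0:
                    break
                # Skip instances that already hold a replica (assumed rule)
                # or that lack the required storage device type.
                if instance in used or device_type not in instances[instance]:
                    continue
                requests.append({"instance": instance, "object": object_id,
                                 "device_type": device_type})
                used.add(instance)
                missing -= 1
    return requests


def distribute(requests: list) -> dict:
    """Group requests by the instance that should execute them."""
    by_instance = defaultdict(list)
    for request in requests:
        by_instance[request["instance"]].append(request)
    return dict(by_instance)
```

If too few suitable instances exist, the function simply emits fewer requests, which matches the "at least partially satisfy" wording of the abstract.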