Abstract:
A method and system are provided for partitioning a file system. The system may include one or more server computer systems and a plurality of physical file systems. The physical file systems may be hosted by the one or more server computer systems. The physical file systems may be accessible to clients through a virtual file system having a single namespace. The virtual file system may include metadata which are partitioned across the plurality of physical file systems. The server computer systems may be configured to independently perform file system consistency checks on each of the physical file systems, in order to independently validate each partition of the metadata.
Abstract:
A system includes one or more processors configured to redistribute one or more originator data subsets among a plurality of originator nodes and determine data redistribution information pertaining to redistribution of the one or more originator data subsets among the plurality of originator nodes. The system further includes a communication interface configured to send data redistribution information to a replica system. The data redistribution information is used by the replica system to redistribute one or more corresponding replica data subsets among a plurality of replica nodes.
Abstract:
An indication is received that a data object is to be deleted, wherein the data object comprises data stored in a segment within a container. It is determined that no currently live data object references any segment within the container. The container is placed in a delete-ready but not yet reclaimable state.
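The state transition described above can be illustrated with a minimal sketch. The class names, the per-segment reference sets, and the two-state enum are assumptions made for illustration; the abstract does not say how references are tracked or when a delete-ready container later becomes reclaimable.

```python
from enum import Enum, auto


class ContainerState(Enum):
    ACTIVE = auto()
    DELETE_READY = auto()  # no live references, but space is not yet reclaimable


class Container:
    """Hypothetical container that tracks which live objects reference each segment."""

    def __init__(self, segment_ids):
        # Map each segment id to the set of live object ids that reference it.
        self.refs = {seg: set() for seg in segment_ids}
        self.state = ContainerState.ACTIVE

    def add_reference(self, segment_id, object_id):
        self.refs[segment_id].add(object_id)

    def delete_object(self, object_id):
        # Drop the deleted object's references from every segment.
        for holders in self.refs.values():
            holders.discard(object_id)
        # If no live object references any segment, mark the container delete-ready.
        if all(not holders for holders in self.refs.values()):
            self.state = ContainerState.DELETE_READY
        return self.state
```

In this sketch the container stays `ACTIVE` as long as any one of its segments is still referenced, and only moves to `DELETE_READY` once the last reference is gone.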
Abstract:
Data replication comprises: redistributing one or more originator data subsets among a plurality of originator nodes; determining data redistribution information pertaining to redistribution of the one or more originator data subsets among the plurality of originator nodes; and sending data redistribution information to a replica system. The data redistribution information is used by the replica system to redistribute one or more corresponding replica data subsets among a plurality of replica nodes; and the one or more corresponding replica data subsets are redistributed among the plurality of replica nodes without requiring the one or more originator data subsets to be sent to the replica system during redistribution.
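A minimal sketch of the idea, assuming that the redistribution information is a list of `(subset_id, source_node, destination_node)` moves and that replica nodes mirror originator nodes one-to-one. Both are illustrative assumptions; the abstract does not specify the format of the information or the node mapping.

```python
def redistribute(placement, moves):
    """Apply (subset_id, src_node, dst_node) moves to a node -> set-of-subsets map."""
    for subset_id, src, dst in moves:
        placement[src].remove(subset_id)
        placement.setdefault(dst, set()).add(subset_id)
    return placement


# The originator moves subset "s2" from node 0 to node 1 and records the move.
originator = {0: {"s1", "s2"}, 1: {"s3"}}
moves = [("s2", 0, 1)]
redistribute(originator, moves)

# Only `moves` (metadata) crosses the wire; the replica relocates its own local copy.
replica = {0: {"s1", "s2"}, 1: {"s3"}}
redistribute(replica, moves)
```

The point of the sketch is that the replica reaches the same layout as the originator by replaying the move list, so the subset data itself never has to be resent.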
Abstract:
Various methods and systems for implementing a file change log in a distributed file system are disclosed. In one embodiment, a method involves operating a distributed file system that presents a namespace and maintaining a file change log for the namespace. Operating the distributed file system involves executing an instance of a file system on each of several nodes. Maintaining the file change log can involve maintaining a single file change log for the namespace. Updates to the single file change log can be handled by a primary node or controlled using a locking mechanism. Alternatively, several private file change logs (e.g., one per node) can be maintained, and these private file change logs can be merged into a single file change log (e.g., by a primary node).
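The merge alternative can be sketched as follows. The record fields and the use of a timestamp as the ordering key are hypothetical; the abstract does not state how records from different nodes are ordered.

```python
import heapq
from typing import NamedTuple


class ChangeRecord(NamedTuple):
    timestamp: int  # hypothetical ordering key
    node: str
    path: str
    op: str


def merge_change_logs(private_logs):
    """Merge per-node logs (each already sorted by timestamp) into a single log."""
    return list(heapq.merge(*private_logs, key=lambda r: r.timestamp))
```

A primary node could call `merge_change_logs` periodically on the private logs it collects. Because `heapq.merge` consumes its inputs lazily, each private log only needs to be sorted on its own.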
Abstract:
Exemplary methods, apparatuses, and systems maintain a plurality of summary data structures corresponding to a plurality of logical file system namespaces, the namespaces representing a plurality of hierarchies of one or more directories having one or more files, each file being stored in the storage system as a plurality of segments in a deduplicated manner. In response to a request to estimate the storage usage of a first of the file system namespaces, a first of the summary data structures, corresponding to the first file system namespace, is identified, wherein the first summary data structure stores information summarizing deduplicated segments referenced by one or more files of the first file system namespace. The storage usage of the first file system namespace is estimated based on the first summary data structure and a global summary data structure, wherein the global summary data structure stores information summarizing deduplicated segments referenced by all of the file system namespaces.