Abstract:
A storage system writes an object across zones of a set of zones (“zone set”). Each zone of a zone set is contributed from an independently accessible storage medium. To create a zone set, the storage system arbitrarily selects disks to contribute a zone for membership in the zone set. This results in a fairly even distribution of zone sets throughout the storage system, which increases fault tolerance of the storage system. Although disk selection for zone set membership is arbitrary, the arbitrary selection can be from a pool of disks that satisfy one or more criteria (e.g., health or activity based criteria). In addition, weights can be assigned to disks to influence the arbitrary selection. Although weighting the selection or narrowing the pool of candidate disks reduces the arbitrariness, zone sets remain evenly distributed while accounting for client demand and/or disk health.
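A minimal Python sketch of how such weighted, criteria-filtered arbitrary selection might look; the disk attributes (health, weight) and the helper name are hypothetical, not taken from the described system:

    import random

    def select_zone_set(disks, zone_set_size, min_health=0.8):
        # Restrict the pool to disks that satisfy a health criterion.
        pool = [d for d in disks if d.health >= min_health]
        if len(pool) < zone_set_size:
            raise ValueError("not enough healthy disks for a zone set")
        # Weighted arbitrary selection: heavier weights are picked more
        # often, but any disk in the pool may contribute a zone.
        chosen = []
        candidates = list(pool)
        for _ in range(zone_set_size):
            weights = [d.weight for d in candidates]
            pick = random.choices(candidates, weights=weights, k=1)[0]
            chosen.append(pick)
            candidates.remove(pick)  # each disk contributes at most one zone
        return chosen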
Abstract:
A durable file system has been designed for storage devices that do not support write in place and/or that are susceptible to errors or failures. The durable file system also facilitates organization and access of large objects (e.g., gigabytes to terabytes in size). Since the write of a large object often involves multiple write operations, the writing is also referred to as “ingesting.” When ingesting an object, the durable file system writes the object with indexing information for the object to persistent storage across multiple zones that each map to an independently accessible storage medium (e.g., disks on different spindles). After persisting the indexing information with the object, the durable file system updates a file system index in working memory (e.g., non-volatile system memory) with the indexing information for the object.
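A minimal Python sketch of the described ingest ordering (persist the object and its indexing information, then update the in-memory index), assuming hypothetical append-only zone objects and an fs_index dictionary held in working memory; all names are illustrative:

    import json

    def ingest_object(obj_id, data, zone_set, fs_index, fragment_size=1 << 20):
        locations = []
        zones = zone_set.zones
        # Stripe the object across the zones of the zone set with
        # sequential (append-only) writes.
        for i in range(0, len(data), fragment_size):
            zone = zones[(i // fragment_size) % len(zones)]
            offset = zone.append(data[i:i + fragment_size])
            locations.append((zone.id, offset, min(fragment_size, len(data) - i)))
        # Persist indexing information with the object.
        index_record = {"object": obj_id, "locations": locations}
        zones[0].append(json.dumps(index_record).encode())
        # Only after the indexing information is durable is the file system
        # index in working memory (e.g., non-volatile system memory) updated.
        fs_index[obj_id] = index_record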
Abstract:
A durable file system has been designed for storage devices that do not support write in place and/or that are susceptible to errors or failures. The durable file system also facilitates organization and access of large objects (e.g., gigabytes to terabytes in size). The durable file system can efficiently reclaim storage space at zone set granularity since each constituent zone can be reclaimed concurrently when the zone set is chosen for space reclamation. Furthermore, space reclamation for the durable file system does not interfere with object availability because the object data is available throughout reclamation. The durable file system copies data of a live object to a different zone set and updates the file system index before reclaiming the target zone set (e.g., before resetting write pointers to the constituent zones).
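A minimal Python sketch of zone-set granularity reclamation under the described ordering (copy live data, update the index, then reset the constituent zones); the helper and record fields are hypothetical:

    def reclaim_zone_set(target, fs_index, allocate_zone_set):
        destination = allocate_zone_set()
        for obj_id, record in list(fs_index.items()):
            if record["zone_set"] != target.id:
                continue  # object does not live in the zone set being reclaimed
            data = target.read_object(record)          # still readable during reclamation
            new_record = destination.write_object(obj_id, data)
            fs_index[obj_id] = new_record              # index now points elsewhere
        # Only after all live data is relocated and re-indexed are the
        # constituent zones reclaimed; each zone can be reset concurrently.
        for zone in target.zones:
            zone.reset_write_pointer()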
Abstract:
A data management services architecture includes architectural components that run in both storage and compute domains. The architectural components redirect storage requests from the storage domain to the compute domain, manage resources allocated from the compute domain, ensure compliance with a policy that governs resource consumption, deploy program code for data management services, dispatch service requests to deployed services, and monitor deployed services. The architectural components also include a service map to locate program code for data management services, and service instance information for monitoring deployed services and dispatching requests to deployed services. Since deployed services can be stateless or stateful, the services architecture also includes state data for the stateful services, with supporting resources that can expand or contract based on policy and/or service demand. The architectural components also include containers for the deployed services.
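A minimal Python sketch of dispatch using a service map and service instance information, assuming hypothetical deployer, container, and instance interfaces; none of these names come from the described architecture:

    class ServiceDispatcher:
        def __init__(self, service_map, deployer):
            self.service_map = service_map   # service name -> program code location
            self.deployer = deployer         # deploys program code into a container
            self.instances = {}              # service name -> deployed instance info

        def dispatch(self, service_name, request):
            # Deploy the service on first use, recording instance information
            # so later requests and monitoring can find the running instance.
            if service_name not in self.instances:
                code_location = self.service_map[service_name]
                self.instances[service_name] = self.deployer.deploy(code_location)
            instance = self.instances[service_name]
            return instance.handle(request)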
Abstract:
A data storage system uses the free space that is not yet filled with data after the deployment of the data store. The free space is used to store additional ‘opportunistic’ protection information for stored data, possibly above and beyond the specified protection level. As the system fills up, the additional protection information is deleted to make room for more data and specified protection information.
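A minimal Python sketch of the opportunistic-protection idea, assuming a hypothetical store interface; the method names and the reclamation policy are illustrative:

    def add_opportunistic_protection(store, obj_id, extra_fragments):
        # If free space remains after the specified protection level is met,
        # write additional protection fragments and mark them reclaimable.
        if store.free_space() >= extra_fragments * store.fragment_size:
            store.write_extra_protection(obj_id, extra_fragments)
            store.mark_reclaimable(obj_id)

    def make_room(store, needed_bytes):
        # As the system fills, delete opportunistic protection first so new
        # data and its specified protection information can be stored.
        while store.free_space() < needed_bytes and store.has_reclaimable():
            store.delete_oldest_reclaimable()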
Abstract:
An archival cloud storage service can be created with cost efficient components for large scale data storage and can efficiently use these components. A frontend of the cloud storage service presents an asynchronous storage interface to consuming devices of the cloud storage service. Providing an asynchronous storage service interface avoids at least some of the state data overhead that accompanies a time constrained interface (e.g., a request-response based interface with timeouts in seconds). Backend nodes of the cloud storage service periodically query the frontend servers to select requests that the backend nodes can fulfill. Each backend node selects requests based on information about its own characteristics, which are likely dynamic. Thus, the storage system underlying the cloud storage service can be considered a self-organizing storage system.
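A minimal Python sketch of the backend polling model, assuming hypothetical frontend and node interfaces; the poll interval and method names are illustrative:

    import time

    def backend_work_loop(node, frontends, poll_interval_s=30):
        while True:
            for frontend in frontends:
                for request in frontend.list_pending_requests():
                    # Selection is driven by the node's own, likely dynamic,
                    # characteristics (e.g., free capacity and current load).
                    if node.can_fulfill(request):
                        if frontend.claim(request.id, node.id):  # avoid double work
                            node.enqueue(request)
            # Asynchronous model: no request-response timeout in seconds,
            # just periodic polling by the backend nodes.
            time.sleep(poll_interval_s)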
Abstract:
A rebuild node of a storage system can assess the risk of the storage system not being able to provide a data object. The rebuild node(s) uses information about data object fragments to determine the health of a data object, which informs the risk assessment. The rebuild node obtains object fragment information from nodes throughout the storage system. With the object fragment information, the rebuild node(s) can assess object risk based, at least in part, on the object fragments indicated as existing by the nodes. To assess object risk, the rebuild node(s) treats absent object fragments (i.e., those for which an indication was not received) as lost. When too many object fragments are lost, an object cannot be rebuilt. The erasure coding technique dictates the threshold number of fragments for rebuilding an object. The per-object risk assessment influences rebuild of the objects.
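A minimal Python sketch of per-object risk assessment from fragment reports; the risk formula and parameter names are illustrative, assuming an erasure code that needs rebuild_threshold of total_fragments fragments to rebuild an object:

    def assess_object_risk(reported_fragments, total_fragments, rebuild_threshold):
        # Fragments for which no indication was received are treated as lost.
        existing = len(reported_fragments)
        lost = total_fragments - existing
        if existing < rebuild_threshold:
            return float("inf"), lost        # object cannot be rebuilt
        # Fewer surviving fragments above the rebuild threshold means higher
        # risk, which raises this object's rebuild priority.
        spare = total_fragments - rebuild_threshold
        risk = 1.0 if spare == 0 else lost / spare
        return risk, lost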