Abstract:
Approaches to data flow bottleneck management using caching mechanisms in a distributed storage environment are disclosed. A request is received by a first data storage node having a first set of interface components, a first set of data management components, a first advisory cache, and a first set of data storage devices. The request has a corresponding file. The first advisory cache is checked for an entry corresponding to the file. The request is routed based on a file characteristic corresponding to the request if there is no corresponding entry in the first advisory cache or to a second data storage node based on the entry in the first advisory cache. Potential bottleneck conditions are monitored on the first node. An advisory cache entry in the first advisory cache is generated in response to determining that a bottleneck condition exists.
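The routing decision described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: the `StorageNode` class, the `queue_limit` threshold, and the byte-sum placement rule are all hypothetical stand-ins for the node, the bottleneck condition, and the file characteristic.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class StorageNode:
    """Hypothetical model of one data storage node with an advisory cache."""
    name: str
    queue_limit: int = 2  # assumed bottleneck threshold (pending requests)
    advisory_cache: Dict[str, str] = field(default_factory=dict)
    pending: List[str] = field(default_factory=list)

    def route(self, file_id: str, nodes: List[str]) -> str:
        """Return the name of the node that should serve the request."""
        # An advisory cache entry, if present, redirects the request.
        target = self.advisory_cache.get(file_id)
        if target is not None:
            return target
        # Otherwise route on a file characteristic (assumed here: a
        # deterministic byte sum of the file identifier).
        return nodes[sum(file_id.encode()) % len(nodes)]

    def monitor(self, file_id: str, alternate: str) -> None:
        """Generate an advisory entry when the assumed bottleneck condition holds."""
        if len(self.pending) > self.queue_limit:
            self.advisory_cache[file_id] = alternate
```

In this sketch a request for a file is routed by its characteristic until `monitor` observes more than `queue_limit` pending requests, after which the advisory entry sends later requests for that file to the alternate node.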
Abstract:
In one embodiment, distributed data storage systems and methods are described for integrating a change tracking manager with scalable databases. According to one embodiment, a computer-implemented method comprises managing storage of objects and continuously tracking changes of the objects in a distributed object storage database; creating a record for an object having an object name, the object being stored in a bucket of the distributed object storage database; linking the bucket to a peer bucket based on a directive; generating a peer marker field for the record to store one peer marker of multiple different peer markers depending on a relationship between the bucket and the peer bucket; and automatically adding a work item for the object to a secondary index of a chapter database based on the record being created in the bucket and the peer marker for the peer bucket.
Abstract:
A system, method, and machine-readable storage medium for maintaining object storage system data are provided. In some embodiments, an object manager may receive a request to perform an operation on an object having an object name. The object storage system includes a first database of a first type and a second database of a second type. The object manager may identify a first record stored in the first database. The first record includes a name marker indicating a range of object names covered by the second database and includes a file handle referencing the second database. The range of object names includes the object name. Additionally, the object manager may identify a second record stored in the second database. The second record includes the object name and includes a file handle referencing the object. The object manager may perform the operation on the object in accordance with the request.
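The two-level lookup described above can be sketched as follows. The layout is an assumption made for illustration: each name marker is taken to be the lowest object name covered by the referenced second database, and plain strings stand in for file handles. The function name `lookup` and both type aliases are hypothetical.

```python
import bisect
from typing import Dict, List, Optional, Tuple

# First database (assumed layout): (name_marker, handle) pairs sorted by marker,
# where each handle references one second database.
FirstDb = List[Tuple[str, str]]
# Second databases keyed by handle: object name -> handle referencing the object.
SecondDbs = Dict[str, Dict[str, str]]


def lookup(first_db: FirstDb, second_dbs: SecondDbs, name: str) -> Optional[str]:
    """Resolve an object name to its object handle via the two databases."""
    markers = [marker for marker, _ in first_db]
    # Find the last record whose name marker is <= the requested name.
    i = bisect.bisect_right(markers, name) - 1
    if i < 0:
        return None  # no second database covers this name
    _, db_handle = first_db[i]
    # The second record, if present, carries the handle of the object itself.
    return second_dbs[db_handle].get(name)
```

For example, with markers `""` and `"m"`, the name `"apple"` resolves through the first of the two second databases and `"pear"` through the second.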
Abstract:
Approaches for providing a non-disruptive file move are disclosed. A request to move a target file from a first constituent to a second constituent is received. The target file has an associated file handle. The target file in the first constituent is converted to a multipart file in the first constituent with a file location for a new file in the first constituent. The new file is created in the second constituent. Contents of the target file are moved to the new file on the second constituent while access via the associated file handle is maintained through the multipart file. The target file is deleted from the first constituent.
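The sequence of steps above can be sketched with in-memory dictionaries standing in for constituents. Everything named here (`Cluster`, `Multipart`, the `.moved` suffix) is hypothetical, and deletion of the target file is modeled, as an assumption, by releasing the local data while the multipart stub keeps the original file handle resolvable.

```python
from typing import Dict, Optional, Tuple

Location = Tuple[str, str]  # (constituent name, file name)


class Multipart:
    """Assumed stand-in for a multipart file: local data plus a remote location."""
    def __init__(self, local: Optional[bytes], remote: Location) -> None:
        self.local = local
        self.remote = remote


class Cluster:
    def __init__(self) -> None:
        self.constituents: Dict[str, Dict[str, object]] = {"first": {}, "second": {}}
        self.handles: Dict[int, Location] = {}

    def create(self, handle: int, constituent: str, name: str, data: bytes) -> None:
        self.constituents[constituent][name] = data
        self.handles[handle] = (constituent, name)

    def read(self, handle: int) -> bytes:
        constituent, name = self.handles[handle]
        entry = self.constituents[constituent][name]
        if isinstance(entry, Multipart):
            if entry.local is not None:
                return entry.local  # contents not yet released locally
            r_cons, r_name = entry.remote
            return self.constituents[r_cons][r_name]
        return entry

    def move(self, handle: int, dest: str) -> None:
        src, name = self.handles[handle]
        data = self.constituents[src][name]
        new_name = name + ".moved"
        # Convert the target file to a multipart file that records the new location.
        stub = Multipart(local=data, remote=(dest, new_name))
        self.constituents[src][name] = stub
        # Create the new file on the destination constituent.
        self.constituents[dest][new_name] = b""
        # Move the contents; reads through the handle keep working via the stub.
        self.constituents[dest][new_name] = data
        # Release the original contents on the source constituent.
        stub.local = None
```

The file handle table is never rewritten during `move`, which is the property the sketch is meant to show: a client holding the handle reads the same bytes before and after.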
Abstract:
In one embodiment, distributed data storage systems and methods integrate a change tracking manager with scalable databases. According to one embodiment, a computer-implemented method comprises integrating change tracking of storage objects into a distributed object storage database that includes a first database of a first type and one or more chapter databases of a second type, with the distributed object storage database supporting a primary lookup index and a secondary lookup index in order to locate a storage object. The method includes recording in a header of a chapter database a network topology for connecting a bucket having the chapter database to a first peer bucket when a new mirror to the first peer bucket is being established, and recording a first directive into the header of the chapter database to express a type of content to be mirrored from the bucket to the first peer bucket.
Abstract:
Techniques for adding a directory entry to an existing directory data structure maintained by a storage system for storing a plurality of directory entries are provided. A first storage index block is used for storing a pointer to a first hash value from among a plurality of hash values. A second storage index block is allocated when the first storage index block has reached a threshold level for storing pointers to hash values for the plurality of directory entries. A group of pointers including a pointer to a second hash value from among the plurality of hash values is selected. The group of pointers is stored in the second storage index block with a pointer to a third hash value from among the plurality of hash values such that the directory entry can be searched using the plurality of hash values.
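The block allocation described above can be sketched as follows. This is an illustration under assumptions, not the disclosed on-disk format: `THRESHOLD`, the use of `zlib.crc32` as the hash, and the rule for choosing which group of pointers moves (the half of the full block on the same side of the midpoint as the new hash) are all hypothetical choices.

```python
import bisect
import zlib
from typing import List, Tuple

THRESHOLD = 4  # assumed capacity of one index block

Pointer = Tuple[int, str]  # (hash value, directory entry name)


class DirectoryIndex:
    def __init__(self) -> None:
        # Blocks are kept in ascending, non-overlapping hash order.
        self.blocks: List[List[Pointer]] = [[]]

    @staticmethod
    def _hash(name: str) -> int:
        return zlib.crc32(name.encode())

    def _block_index(self, ptr: Pointer) -> int:
        minima = [block[0] for block in self.blocks if block]
        return max(bisect.bisect_right(minima, ptr) - 1, 0)

    def add(self, name: str) -> None:
        ptr = (self._hash(name), name)
        i = self._block_index(ptr)
        block = self.blocks[i]
        if len(block) < THRESHOLD:
            bisect.insort(block, ptr)
            return
        # The block has reached the threshold: allocate a second block and
        # store a selected group of pointers in it together with the new pointer.
        mid = len(block) // 2
        lower, upper = block[:mid], block[mid:]
        if ptr >= upper[0]:
            self.blocks[i] = lower
            self.blocks.insert(i + 1, sorted(upper + [ptr]))
        else:
            self.blocks[i] = upper
            self.blocks.insert(i, sorted(lower + [ptr]))

    def lookup(self, name: str) -> bool:
        ptr = (self._hash(name), name)
        block = self.blocks[self._block_index(ptr)]
        j = bisect.bisect_left(block, ptr)
        return j < len(block) and block[j] == ptr
```

Because the blocks stay ordered by hash value, a search needs to examine only the one block whose range covers the hash of the requested entry.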
Abstract:
A method includes receiving an atomic operation for execution, wherein the execution of the atomic operation is to access a data container stored in more than one data store device of a plurality of data store devices in a distributed storage system. The method includes executing, in response to receiving the atomic operation, a write-back cache operation for the data container to preclude access of the data container by a different operation prior to completion of the atomic operation. The method also includes executing the atomic operation, wherein executing the atomic operation comprises accessing the data container stored in the more than one data store device of the distributed storage system.
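The exclusion property described above can be sketched in a single process. The sketch assumes, purely for illustration, that the write-back cache operation behaves like an exclusive hold on the data container; a `threading.Lock` stands in for that hold, and the class and method names are hypothetical.

```python
import threading
from contextlib import contextmanager
from typing import Iterator, List


class DistributedContainer:
    """Hypothetical data container with one part per data store device."""

    def __init__(self, devices: int) -> None:
        self.parts: List[int] = [0] * devices
        # Stand-in for the write-back cache hold on the container (assumption).
        self._writeback = threading.Lock()

    @contextmanager
    def writeback_hold(self) -> Iterator[None]:
        """Preclude access by other operations until the hold is released."""
        with self._writeback:
            yield

    def atomic_increment(self) -> None:
        # The hold is taken before the operation touches any device part.
        with self.writeback_hold():
            for i in range(len(self.parts)):
                self.parts[i] += 1

    def snapshot(self) -> List[int]:
        # A different operation waits for any in-flight atomic operation.
        with self.writeback_hold():
            return list(self.parts)
```

With the hold in place, a concurrent `snapshot` never observes a state in which only some of the device parts have been updated.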
Abstract:
Approaches to data flow bottleneck management using caching mechanisms in a distributed storage environment are disclosed. A read request is received by a first data storage node having a first set of interface module(s), a first set of data management module(s), a first redirection layer, and a first set of data storage devices. The read request has a corresponding file to be read. The first redirection layer is checked for an entry corresponding to the file. The read request is routed based on a file characteristic corresponding to the read request if there is no corresponding entry in the first redirection layer or to a second data storage node based on the entry in the first redirection layer. Potential bottleneck conditions are monitored on the first node. A redirection layer entry in the first redirection layer is generated in response to determining that a bottleneck condition exists.