Abstract:
In some implementations, a B+tree (b plus tree) can provide concurrent access to data while modifying nodes of the B+tree. In some implementations, a top-down B+tree can be provided where nodes of the B+tree can be proactively merged, rebalanced and split to prevent recursive operations moving up the B+tree. In some implementations, node (or page) record data can be merged to consolidate record entries within nodes of the B+tree while only locking 1-3 nodes of the tree at the same time. In some implementations, record data can be merged across multiple nodes of the B+tree. In some implementations, ranges of data can be removed from the tree while only locking 1-3 nodes of the tree at the same time. In some implementations, a range of data can be replaced with new data while only locking 1-3 nodes of the tree at the same time.
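The proactive splitting described above can be sketched as follows. This is a minimal illustration, not the claimed implementation: the names `ORDER`, `Node`, `split_child`, `insert` and `leaf_keys` are hypothetical, locking is omitted, and only the split case is shown (merging and rebalancing are not). Because every full node is split on the way down, a split only ever touches the parent, the child and the new sibling, which is consistent with working on at most three nodes at a time.

```python
import bisect

ORDER = 4  # hypothetical maximum number of keys per node


class Node:
    def __init__(self, leaf=True):
        self.leaf = leaf
        self.keys = []
        self.children = []  # child nodes; empty for leaves


def split_child(parent, i):
    """Split the full child parent.children[i] into two nodes."""
    child = parent.children[i]
    mid = len(child.keys) // 2
    right = Node(leaf=child.leaf)
    if child.leaf:
        # Leaf split: the separator is copied up, the records stay in the leaves.
        right.keys = child.keys[mid:]
        child.keys = child.keys[:mid]
        separator = right.keys[0]
    else:
        # Internal split: the middle key moves up into the parent.
        separator = child.keys[mid]
        right.keys = child.keys[mid + 1:]
        right.children = child.children[mid + 1:]
        child.keys = child.keys[:mid]
        child.children = child.children[:mid + 1]
    parent.keys.insert(i, separator)
    parent.children.insert(i + 1, right)


def insert(root, key):
    """Insert key, splitting full nodes on the way down. Returns the root."""
    if len(root.keys) == ORDER:
        new_root = Node(leaf=False)
        new_root.children.append(root)
        split_child(new_root, 0)
        root = new_root
    node = root
    while not node.leaf:
        i = bisect.bisect_right(node.keys, key)
        if len(node.children[i].keys) == ORDER:
            # Split before descending, so no split ever propagates upward.
            split_child(node, i)
            if key >= node.keys[i]:
                i += 1
        node = node.children[i]
    bisect.insort(node.keys, key)
    return root


def leaf_keys(node):
    """Collect all record keys from the leaves, left to right."""
    if node.leaf:
        return list(node.keys)
    out = []
    for child in node.children:
        out.extend(leaf_keys(child))
    return out
```

A caller would start with `root = Node()` and reassign `root = insert(root, key)` for each key, since the root changes whenever it is split.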
Abstract:
In one embodiment, a non-transitory computer-readable medium stores instructions for implementing a file system, which include operations for acquiring an exclusive lock on a first node in an ordered tree data structure, and adding an identifier and index of the first node to a path data structure. If the value of the index in the first node is non-zero, then each exclusive lock acquired between the first node and the root of the tree data structure is released. In any case, the operation proceeds to a second node, which is addressed at the index on the first node. In one embodiment, operations further include acquiring an exclusive lock on the second node, and, if the second node is a leaf node, performing updates to the second node, and then releasing each exclusive lock in the data structure.
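The descent described above can be sketched as follows. This is an illustration under assumptions, not the claimed implementation: `Node` and `update` are hypothetical names, `threading.Lock` stands in for the exclusive lock, and each internal node is assumed to store the smallest key of each child so that the child index can be found with a binary search.

```python
import bisect
import threading


class Node:
    def __init__(self, node_id, keys=None, children=None):
        self.node_id = node_id
        self.keys = keys or []          # internal: smallest key of each child (assumption)
        self.children = children or []  # empty for a leaf
        self.records = {}               # leaf payload
        self.lock = threading.Lock()    # stands in for an exclusive lock


def update(root, key, value):
    """Descend to the leaf for key, update it, and return the final lock path."""
    path = []  # (node, index) for every internal node whose lock is still held
    node = root
    node.lock.acquire()
    while node.children:
        index = max(bisect.bisect_right(node.keys, key) - 1, 0)
        if index != 0:
            # Non-zero index: release the locks held above this node.
            for ancestor, _ in path:
                ancestor.lock.release()
            path.clear()
        path.append((node, index))
        node = node.children[index]
        node.lock.acquire()
    node.records[key] = value           # the leaf update itself
    node.lock.release()
    for ancestor, _ in path:
        ancestor.lock.release()
    return [(n.node_id, i) for n, i in path]
```

The returned list shows which internal nodes were still locked when the leaf was updated; a descent that takes a non-zero index at some level drops everything above that level.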
Abstract:
In one embodiment, two-phase mutation of an ordered tree data structure is performed, wherein a lock can be acquired on a first node in an ordered tree data structure, and an identifier for the first node can be added to a lock path data structure. A second node can also be locked, and an identifier for the second node can be added to the lock path data structure. Subsequently, a set of operations to perform on the ordered tree responsive to a modification of the second node can be determined for each node affected by the modification, and the operation for each node can be stored in the lock path data structure. Once the operations for the nodes have been determined, the operations listed in the lock path can be performed.
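A minimal sketch of the two phases, under assumptions: `Node`, `plan_insert` and `apply_plan` are hypothetical names, the lock path is a plain list of dictionaries, and internal nodes are assumed to store the smallest key of each child, so the only operation an ancestor may need is a separator update.

```python
import bisect
import threading


class Node:
    def __init__(self, node_id, keys, children=None):
        self.node_id = node_id
        self.keys = keys                # leaf: record keys; internal: smallest key of each child
        self.children = children or []
        self.lock = threading.Lock()


def plan_insert(root, key):
    """Phase 1: lock nodes on the way down and record one operation per node."""
    lock_path = []
    node = root
    while True:
        node.lock.acquire()
        if not node.children:
            lock_path.append({"node": node, "op": ("insert_key", key)})
            return lock_path
        index = max(bisect.bisect_right(node.keys, key) - 1, 0)
        # The separator changes only if the new key becomes the subtree minimum.
        if key < node.keys[index]:
            op = ("set_separator", index, key)
        else:
            op = ("none",)
        lock_path.append({"node": node, "op": op})
        node = node.children[index]


def apply_plan(lock_path):
    """Phase 2: perform the recorded operations, then release every lock."""
    for entry in lock_path:
        node, op = entry["node"], entry["op"]
        if op[0] == "insert_key":
            bisect.insort(node.keys, op[1])
        elif op[0] == "set_separator":
            node.keys[op[1]] = op[2]
    for entry in lock_path:
        entry["node"].lock.release()
```

Separating planning from execution means that no node is modified until the full set of affected nodes and their operations is known.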
Abstract:
In one embodiment, the correlation filter can use one of several data structures to track each migration unit and reject successive accesses within a period of time to each migration unit. In one embodiment, the correlation filter uses a space-efficient data structure, such as a hash-indexed correlation array, to store the addresses of referenced migration units and to filter accesses to a single migration unit that are correlated accesses resulting from multiple accesses to the same migration unit during a sequential I/O stream. In one embodiment, the correlation array has a global timeout, which resets each element to a default value, clearing all stored migration unit address values from the correlation array. In one embodiment, each element of the correlation array can time out separately.
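The global-timeout variant can be sketched as follows. The class name `CorrelationFilter`, the method `admit`, the slot count and the use of a simple modulo as the hash are all assumptions made for illustration; the caller supplies the current time so the behaviour is deterministic.

```python
class CorrelationFilter:
    def __init__(self, slots=64, timeout=1.0):
        self.slots = [None] * slots   # None is the default (empty) value
        self.timeout = timeout
        self.last_reset = 0.0

    def admit(self, unit_address, now):
        """Return True if the access should be counted, False if it is correlated."""
        if now - self.last_reset >= self.timeout:
            # Global timeout: reset every element to its default value.
            self.slots = [None] * len(self.slots)
            self.last_reset = now
        i = unit_address % len(self.slots)  # hypothetical hash: address modulo array size
        if self.slots[i] == unit_address:
            return False                    # same unit seen again within the window
        self.slots[i] = unit_address        # a colliding address simply overwrites the slot
        return True
```

The array stays a fixed size regardless of how many migration units exist; the cost is that two addresses hashing to the same slot evict each other, so some correlated accesses may pass through.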
Abstract:
Methods and apparatuses that maintain birth time for a file system to optimize file update operations are described. The file system can include a plurality of snapshots or clones of data stored in one or more extents of blocks allocated in a storage device. Each extent may be associated with a time stamp according to the birth time. A request may be received from an executable using the file system to update data in a particular extent associated with a particular time stamp. In response, the current birth time in the file system and the particular time stamp may be compared to determine if the particular extent is not shared by more than one of the snapshots. If the particular time stamp is equal to the current birth time, the particular extent may be updated directly without performing an expensive operation to check whether a reference count of the particular extent is equal to one.
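The fast path described above can be sketched as follows. The names `FileSystem`, `allocate`, `snapshot` and `update` are hypothetical, and two simplifications are assumed: taking a snapshot advances the birth time, and any extent whose time stamp differs from the current birth time is handled by writing a new extent rather than by an actual reference-count check.

```python
class FileSystem:
    def __init__(self):
        self.birth_time = 0   # current birth time of the file system
        self.extents = {}     # extent id -> {"stamp": int, "data": bytes}
        self.next_id = 0

    def allocate(self, data):
        """Create a new extent stamped with the current birth time."""
        extent_id = self.next_id
        self.next_id += 1
        self.extents[extent_id] = {"stamp": self.birth_time, "data": data}
        return extent_id

    def snapshot(self):
        # Assumption: taking a snapshot or clone advances the birth time.
        self.birth_time += 1

    def update(self, extent_id, data):
        """Update an extent and return the id of the extent that now holds the data."""
        extent = self.extents[extent_id]
        if extent["stamp"] == self.birth_time:
            # Created after the latest snapshot, so no snapshot can share it.
            extent["data"] = data
            return extent_id
        # Possibly shared: this sketch writes a new extent instead of checking
        # a reference count, leaving the old data intact for the snapshot.
        return self.allocate(data)
```

The comparison is a single integer test, which is the point of the technique: the common case of rewriting recently created data avoids the reference-count lookup altogether.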
Abstract:
A non-overwrite storage system, such as a log-structured file system, that includes a non-volatile storage having multiple storage segments, a volatile storage having an unsafe free segments list (UFSL), and a controller for managing storage resources of the non-volatile storage. The controller can be configured to copy page data from used segment(s) of the non-volatile storage, write the copied page data to free segment(s) of the non-volatile storage, index the UFSL with indications of the used segment(s), and thereafter prevent reuse of the used segment(s) while the indications of the used segment(s) remain indexed in the UFSL. In some implementations, the non-overwrite storage system may be associated with a flash storage system, and a flash controller can be configured to perform a flush track cache operation to clear the indications of the used segment(s) from the UFSL, to enable reuse of segment(s) that were previously indexed to the UFSL.
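A minimal sketch of how the UFSL gates segment reuse. `SegmentStore` and its methods are hypothetical names, segments are modelled as in-memory lists, and `flush` stands in for the flush track cache operation.

```python
class SegmentStore:
    def __init__(self, segment_count):
        self.segments = {i: [] for i in range(segment_count)}
        self.free = set(range(segment_count))
        self.ufsl = set()  # unsafe free segments list (held in volatile memory)

    def allocate(self):
        """Pick a free segment that is not listed in the UFSL."""
        candidates = self.free - self.ufsl
        if not candidates:
            raise RuntimeError("no safely reusable segment")
        segment = min(candidates)
        self.free.remove(segment)
        return segment

    def write(self, pages):
        segment = self.allocate()
        self.segments[segment] = list(pages)
        return segment

    def clean(self, used):
        """Copy the pages of a used segment into a free one."""
        dest = self.write(self.segments[used])
        # The old segment is free in principle, but its contents must survive
        # until the copy is known to be durable, so it is indexed in the UFSL.
        self.free.add(used)
        self.ufsl.add(used)
        return dest

    def flush(self):
        # Stand-in for the flush operation: the copies are now durable.
        self.ufsl.clear()
```

Until `flush` is called, a cleaned segment keeps its original pages and is never handed out by `allocate`, so a crash before the flush cannot lose the only durable copy.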
Abstract:
In one embodiment, a new file creation cache is reserved on a fast storage device that is part of a composite storage device that also includes a slow storage device; the composite storage device is treated as a single logical volume (or a plurality of logical volumes) by a file system which maintains a mapping table that is used to determine whether the write operation is for a new file. If the write operation is for a new file, the file system attempts to write the new file to the fast storage device. If the write operation is not for a new file, the mapping table specifies which device is used for the write operation.
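The routing decision can be sketched as follows. `CompositeVolume`, `cache_free` and the `"fast"`/`"slow"` labels are hypothetical, and falling back to the slow device when the reserved cache is full is an assumption about what "attempts to write" means.

```python
class CompositeVolume:
    def __init__(self, cache_blocks):
        self.cache_free = cache_blocks  # blocks left in the new file creation cache
        self.mapping = {}               # file id -> device that holds the file

    def write(self, file_id, blocks):
        """Return the device ("fast" or "slow") that receives this write."""
        if file_id in self.mapping:
            # Existing file: the mapping table decides.
            return self.mapping[file_id]
        # New file: try the reserved cache on the fast device first.
        if blocks <= self.cache_free:
            self.cache_free -= blocks
            device = "fast"
        else:
            device = "slow"  # assumption: fall back when the cache is full
        self.mapping[file_id] = device
        return device
```

Absence from the mapping table is what identifies a new file here, so the same lookup serves both the new-file test and the device choice for existing files.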
Abstract:
In one embodiment, a method for managing access to a fast non-volatile storage device, such as a solid state device, and a slower non-volatile storage device, such as a magnetic hard drive, can include a method of managing a sparse logical volume in which unmapped blocks of the logical volume are not allocated until use. In one embodiment, a method of sparse hole filling operates in which range locks are dynamically adjusted to perform allocations for sparse hole filling, and then re-adjusted to perform standard operations using a byte range lock. In one embodiment, a high level data structure can be used in the range lock service in the form of an ordered search tree, which could use any search tree algorithm, such as a red-black tree, AVL tree, splay tree, etc.
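The allocate-on-first-use behaviour of a sparse logical volume can be sketched as follows. `SparseVolume`, `BLOCK_SIZE` and the choice to read holes as zeros are assumptions for illustration; the range lock adjustment around hole filling is not modelled.

```python
BLOCK_SIZE = 4  # hypothetical block size in bytes, kept tiny for illustration


class SparseVolume:
    def __init__(self, physical_blocks):
        self.block_map = {}                        # logical block -> physical block
        self.free_physical = list(range(physical_blocks))
        self.storage = {}                          # physical block -> data

    def write(self, logical_block, data):
        if logical_block not in self.block_map:
            # Hole filling: allocate a physical block on first use.
            if not self.free_physical:
                raise RuntimeError("out of physical blocks")
            self.block_map[logical_block] = self.free_physical.pop(0)
        self.storage[self.block_map[logical_block]] = data

    def read(self, logical_block):
        physical = self.block_map.get(logical_block)
        if physical is None:
            return b"\x00" * BLOCK_SIZE  # assumption: an unmapped hole reads as zeros
        return self.storage[physical]
```

The logical address space can be far larger than the physical one, since a logical block consumes physical space only after its first write.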
Abstract:
Disclosed herein are techniques for encrypting data stored on a solid-state drive (SSD) managed by a system (e.g., a computing device). Specifically, the system is configured to track block units of a larger size on the SSD so that a mapping table associated with the SSD can be kept small. After running SSD encryption using the large size block units, the entire SSD can be fully encrypted without requiring clear text to be written onto the SSD subsequent to SSD encryption being activated. Thereafter, the entire SSD can be defragmented to produce a single physical extent of encrypted data.