Abstract:
A method for reducing write latency in a distributed file system. A write request that includes a volume identifier is received at a data management subsystem deployed on a node within a distributed storage system. The data management subsystem maps the volume identifier to a file system volume and maps the file system volume to a set of logical block addresses in a logical block device in a storage management subsystem deployed on the node. The storage management subsystem maps the logical block device to a metadata object for the logical block device on the node that is used to process the write request. The mapping of the file system volume to the set of logical block addresses in the logical block device enables co-locating the metadata object with the file system volume on the node, which reduces the write latency associated with processing the write request.
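The mapping chain can be illustrated with a short sketch. This is a minimal model, not the patented implementation: the class names (MetadataObject, LogicalBlockDevice, Node), the dictionary-based maps, and the 4 KiB block size are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class MetadataObject:
    """Per-device metadata kept on the same node as the file system volume."""
    node_id: str
    lba_map: dict = field(default_factory=dict)  # file offset -> LBA


@dataclass
class LogicalBlockDevice:
    device_id: str
    metadata: MetadataObject  # co-located with the volume on the owning node


class Node:
    def __init__(self, node_id: str):
        self.node_id = node_id
        self.volumes: dict[str, str] = {}  # volume identifier -> fs volume
        self.fs_to_lbd: dict[str, LogicalBlockDevice] = {}

    def write(self, volume_id: str, offset: int, data: bytes) -> int:
        # Data management subsystem: volume identifier -> file system volume.
        fs_volume = self.volumes[volume_id]
        # File system volume -> logical block device (its set of LBAs).
        lbd = self.fs_to_lbd[fs_volume]
        # Storage management subsystem: the metadata object is on this node,
        # so resolving the LBA requires no cross-node round trip.
        return lbd.metadata.lba_map.setdefault(offset, offset // 4096)


node = Node("node-1")
node.volumes["vol-42"] = "fsvol-42"
node.fs_to_lbd["fsvol-42"] = LogicalBlockDevice("lbd-42", MetadataObject("node-1"))
print(node.write("vol-42", offset=8192, data=b"payload"))  # -> 2, resolved locally
```

Because the metadata object lives in the same Node instance that receives the write, the LBA lookup in this model never leaves the node, which is the co-location property the abstract attributes the latency reduction to.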
Abstract:
Methods, non-transitory machine-readable media, and computing devices for transitioning tasks and interrupt service routines are provided. An example method includes processing, by a plurality of processor cores of a storage controller, tasks and interrupt service routines. A performance statistic is determined corresponding to the plurality of processor cores. Based on detecting that the performance statistic passes a threshold, the number of the plurality of processor cores that are assigned to the tasks and the interrupt service routines is reduced.
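A minimal sketch of the threshold check follows; the utilization statistic, the low-water-mark threshold, and the policy of releasing the least-utilized core are illustrative assumptions, not the claimed method.

```python
def rebalance_cores(assigned_cores: list[int],
                    utilization: dict[int, float],
                    low_water_mark: float = 0.20) -> list[int]:
    """Drop one core from the assignment when the average utilization of the
    assigned cores falls below the threshold, so the remaining cores carry
    the tasks and interrupt service routines."""
    avg = sum(utilization[c] for c in assigned_cores) / len(assigned_cores)
    if avg < low_water_mark and len(assigned_cores) > 1:
        # Release the least-utilized core; its tasks/ISRs migrate to the rest.
        idle = min(assigned_cores, key=lambda c: utilization[c])
        return [c for c in assigned_cores if c != idle]
    return assigned_cores


cores = rebalance_cores([0, 1, 2, 3], {0: 0.10, 1: 0.15, 2: 0.20, 3: 0.05})
print(cores)  # [0, 1, 2] -- average 0.125 is below 0.20, core 3 released
```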
Abstract:
A system for dynamically configuring and scheduling input/output (I/O) workloads among processing cores is disclosed. Resources for an application that are related to each other and/or are not multicore safe are grouped together into work nodes. When this grouped work needs to be executed, the work nodes are added to a global queue that is accessible by all of the processing cores. Any processing core that becomes available can pull the next available work node and process it through to completion, so that all of the work associated with that work node is completed by the same core, without requiring additional protections for resources that are not multicore safe. Indexes track both the next work node in the global queue to be processed and the next location in the queue at which new work nodes are added for subsequent processing.
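The queue discipline can be modeled in a few lines. This sketch assumes a fixed-capacity ring buffer guarded by a single lock; the names GlobalQueue, add, and pull are hypothetical, and overflow handling is omitted.

```python
import threading


class GlobalQueue:
    def __init__(self, capacity: int = 1024):
        self._slots = [None] * capacity
        self._head = 0  # index of the next work node to process
        self._tail = 0  # index where the next work node is added
        self._lock = threading.Lock()

    def add(self, work_node) -> None:
        with self._lock:
            self._slots[self._tail % len(self._slots)] = work_node
            self._tail += 1

    def pull(self):
        """Any available core pulls the next work node; the whole node
        (its grouped, non-multicore-safe resources) runs on that core."""
        with self._lock:
            if self._head == self._tail:
                return None
            node = self._slots[self._head % len(self._slots)]
            self._head += 1
            return node


q = GlobalQueue()
q.add(lambda: print("work node executed on the pulling core"))
job = q.pull()
if job:
    job()  # run to completion on the same core that pulled it
```

Because a work node is executed entirely by whichever core pulled it, the grouped resources inside it never run on two cores at once, which is why no further multicore protection is needed in this model.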
Abstract:
A method for migrating operations between CPU cores includes: processing, by a source core, one or more tasks and one or more interrupt service routines; accessing a mapping corresponding to a task of the one or more tasks and an interrupt service routine of the one or more interrupt service routines; identifying, based on the mapping, a target core that corresponds to the task and the interrupt service routine; blocking the task from being processed by the source core in response to identifying the target core; in response to identifying the target core, disabling an interrupt corresponding to the interrupt service routine; in response to identifying the target core, assigning the task and the interrupt to the target core; after assigning the interrupt to the target core, enabling the interrupt; and after assigning the task to the target core, processing the task by the target core.
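The claimed sequence maps naturally onto a short simulation. Interrupt masking is modeled with a boolean flag, since user-space code cannot disable hardware interrupts; the mapping table, task name, and IRQ number are made-up examples.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    core: int
    blocked: bool = False


@dataclass
class Interrupt:
    irq: int
    core: int
    enabled: bool = True


MAPPING = {("flush", 7): 2}  # (task name, irq) -> target core (assumed table)


def migrate(task: Task, intr: Interrupt) -> None:
    target = MAPPING[(task.name, intr.irq)]  # identify the target core
    task.blocked = True                      # block the task on the source core
    intr.enabled = False                     # disable the interrupt
    task.core = intr.core = target           # assign task and interrupt to target
    intr.enabled = True                      # re-enable after reassignment
    task.blocked = False                     # target core may now process the task


t, i = Task("flush", core=0), Interrupt(irq=7, core=0)
migrate(t, i)
print(t.core, i.core)  # 2 2
```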
Abstract:
Methods and systems for dynamic hashing in cache sub-systems are provided. The method includes analyzing a plurality of input/output (I/O) requests to determine a pattern indicating whether the I/O requests are random or sequential, and using the pattern to dynamically change a first input to a second input for computing a hash index value by a hashing function that is used to index into a hashing data structure to look up a cache block to cache an I/O request to read or write data. For random I/O requests, a segment size is the first input to the hashing function to compute a first hash index value; for sequential I/O requests, a stripe size is used as the second input to compute a second hash index value.
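A minimal sketch of the input switch, assuming a modular hash, made-up segment/stripe sizes, and a crude sequential-pattern detector:

```python
SEGMENT_SIZE = 64 * 1024    # assumed value for illustration
STRIPE_SIZE = 1024 * 1024   # assumed value for illustration
TABLE_SIZE = 4096


def hash_index(lba: int, pattern: str) -> int:
    """Switch the hash input: segment size for random I/O, stripe size
    for sequential I/O."""
    granularity = SEGMENT_SIZE if pattern == "random" else STRIPE_SIZE
    return (lba // granularity) % TABLE_SIZE


def detect_pattern(lbas: list[int]) -> str:
    """Crude detector: evenly spaced ascending addresses look sequential."""
    deltas = [b - a for a, b in zip(lbas, lbas[1:])]
    if deltas and all(d == deltas[0] and d > 0 for d in deltas):
        return "sequential"
    return "random"


recent = [0, 1024, 2048, 3072]
print(hash_index(2048, detect_pattern(recent)))  # stripe-sized bucket -> 0
```

Using the stripe size for sequential streams makes all addresses within one stripe hash to the same index, while the smaller segment size spreads random addresses across more buckets; both granularities are illustrative choices here.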
Abstract:
In various examples, data storage is managed using a resilient distributed storage management system. Data blocks of a logical block device may be distributed across multiple nodes in a cluster. The logical block device may correspond to a file system volume associated with a file system instance deployed on a selected node within a distributed block layer of a distributed file system. Each data block may have a location in the cluster identified by a block identifier associated with that data block. Each data block may be replicated on at least one other node in the cluster. A metadata object corresponding to a logical block device that maps to the file system volume may be replicated on at least one other node in the cluster. Each data block and the metadata object may be hosted on virtualized storage that is protected using a redundant array of independent disks (RAID).
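How a block identifier can determine placement and replication is sketched below; the SHA-256 identifier, the four-node cluster, and the neighbor-based replica policy are assumptions for illustration, not the system's actual scheme.

```python
import hashlib

NODES = ["node-a", "node-b", "node-c", "node-d"]  # assumed cluster


def block_id(data: bytes) -> bytes:
    """Derive a block identifier from the block's content."""
    return hashlib.sha256(data).digest()


def placement(bid: bytes, replicas: int = 2) -> list[str]:
    """Map a block identifier to `replicas` distinct nodes: the identifier
    locates the block in the cluster, and each extra node holds a replica."""
    start = int.from_bytes(bid[:8], "big") % len(NODES)
    return [NODES[(start + i) % len(NODES)] for i in range(replicas)]


bid = block_id(b"4KiB of user data")
print(placement(bid))  # primary node plus one replica node
```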
Abstract:
Systems, devices, and methods are provided for sharing host resources in a multiprocessor storage array, the multiprocessor storage array running controller firmware designed for a uniprocessor environment. In some aspects, one or more virtual machines can be initialized by a virtual machine manager or a hypervisor in the storage array system. Each of the one or more virtual machines implements an instance of the controller firmware designed for a uniprocessor environment. The virtual machine manager or hypervisor can assign processing devices within the storage array system to each of the one or more virtual machines. The virtual machine manager or hypervisor can also assign virtual functions to each of the virtual machines. The virtual machines can concurrently access one or more I/O devices, such as physical storage devices, by writing to and reading from the respective virtual functions.
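The assignment step can be sketched as follows. The VM names, core ranges, and the nvme0-vf* virtual-function labels are placeholders; a real hypervisor would perform this through its own management interface (e.g., SR-IOV virtual functions for the shared devices).

```python
from dataclasses import dataclass, field


@dataclass
class VirtualMachine:
    name: str
    cores: list[int] = field(default_factory=list)
    virtual_functions: list[str] = field(default_factory=list)


def provision(vm_count: int, cores_per_vm: int) -> list[VirtualMachine]:
    """Each VM runs one instance of the uniprocessor controller firmware,
    with dedicated cores and a virtual function for the shared I/O device."""
    vms = []
    for i in range(vm_count):
        vm = VirtualMachine(f"fw-vm-{i}")
        vm.cores = list(range(i * cores_per_vm, (i + 1) * cores_per_vm))
        vm.virtual_functions = [f"nvme0-vf{i}"]  # placeholder VF name
        vms.append(vm)
    return vms


for vm in provision(vm_count=2, cores_per_vm=2):
    print(vm)
```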
Abstract:
Methods and systems for managing caching mechanisms in storage systems are provided, where a global cache management function manages multiple independent cache pools and a global cache pool. As an example, the method includes: splitting a cache storage into a plurality of independently operating cache pools, each cache pool comprising storage space for a plurality of cache blocks that store data related to an input/output (“I/O”) request, along with metadata associated with that cache pool; receiving the I/O request for writing data; operating a hash function on the I/O request to assign the I/O request to one of the plurality of cache pools; and writing the data of the I/O request to one or more of the cache blocks associated with the assigned cache pool. In an aspect, this allows efficient I/O processing across multiple processors simultaneously.
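A toy model of the pool split and hash-based routing, with an assumed pool count and Python's built-in hash standing in for the real hash function:

```python
class CachePool:
    def __init__(self, pool_id: int):
        self.pool_id = pool_id
        self.blocks: dict[int, bytes] = {}         # cache blocks
        self.metadata = {"hits": 0, "writes": 0}   # per-pool metadata

    def write(self, lba: int, data: bytes) -> None:
        self.blocks[lba] = data
        self.metadata["writes"] += 1


POOLS = [CachePool(i) for i in range(8)]  # cache split into 8 pools (assumed)


def route(lba: int) -> CachePool:
    """Hash the I/O request to exactly one pool, so each pool can be served
    by a different processor without shared locking."""
    return POOLS[hash(lba) % len(POOLS)]


route(123456).write(123456, b"payload")
```

Since a given block address always routes to the same pool, each pool's blocks and metadata can be owned by one processor at a time without cross-pool locking, which is the simultaneous multi-processor operation the abstract describes.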
Abstract:
Systems and methods for scaling application and/or storage system functions of a distributed storage system based on a heterogeneous resource pool are provided. According to one embodiment, the distributed storage system has a composable, service-based architecture that provides scalability, resiliency, and load balancing. The distributed storage system includes a cluster of nodes, each potentially having differing capabilities in terms of processing, memory, and/or storage. The distributed storage system takes advantage of the different types of nodes by selectively instantiating appropriate services (e.g., file and volume services and/or block and storage management services) on the nodes based on their respective capabilities. Furthermore, disaggregation of these services, facilitated by interposing a frictionless layer (e.g., in the form of one or more globally accessible logical disks) therebetween, enables independent and on-demand scaling of either or both of application and storage system functions within the cluster while making use of the heterogeneous resource pool.
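One way to picture capability-based service placement is sketched below; the capability fields, thresholds, and service names are invented for illustration and do not reflect the actual placement policy.

```python
def place_services(nodes: list[dict]) -> dict[str, list[str]]:
    """Instantiate services on each node according to its capabilities."""
    plan: dict[str, list[str]] = {}
    for node in nodes:
        services = []
        # CPU/memory-rich nodes take the file and volume services;
        # storage-rich nodes take the block and storage management services.
        if node["cpus"] >= 16 and node["mem_gb"] >= 64:
            services.append("file-and-volume")
        if node["disks"] >= 4:
            services.append("block-and-storage-mgmt")
        plan[node["name"]] = services
    return plan


cluster = [
    {"name": "n1", "cpus": 32, "mem_gb": 128, "disks": 0},
    {"name": "n2", "cpus": 8,  "mem_gb": 32,  "disks": 12},
]
print(place_services(cluster))
# {'n1': ['file-and-volume'], 'n2': ['block-and-storage-mgmt']}
```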