Abstract:
An example method of handling, at a hypervisor on a host in a virtualized computing system, a write input/output (IO) operation to a file on a storage device having a virtual machine file system (VMFS) is described. The method includes: sorting, at the hypervisor, a scatter-gather array for the write IO operation into sets of scatter-gather elements, each of the sets including at least one scatter-gather element targeting a common file block address; resolving offsets of the sets of scatter-gather elements to identify a first scatter-gather array of transaction-dependent scatter-gather elements; generating logical transactions for the first scatter-gather array having updates to metadata of the VMFS for the file; batching the logical transactions into a physical transaction; and executing the physical transaction to commit the updates to the metadata of the VMFS on the storage device for the file.
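The sorting step can be pictured with a minimal sketch. It assumes a scatter-gather element is an (offset, length) pair, that no element crosses a file block boundary, and that file blocks have a fixed size; `SGElement`, `FILE_BLOCK_SIZE`, and `group_by_file_block` are hypothetical names, not taken from the abstract.

```python
from collections import defaultdict
from typing import NamedTuple

FILE_BLOCK_SIZE = 1 << 20  # assumed 1 MiB file block size, for illustration only


class SGElement(NamedTuple):
    offset: int  # byte offset into the file
    length: int  # number of bytes to write


def group_by_file_block(sg_array):
    """Group scatter-gather elements into sets that target the same file block."""
    sets = defaultdict(list)
    for elem in sg_array:
        # Elements with the same block index share a file block address.
        sets[elem.offset // FILE_BLOCK_SIZE].append(elem)
    # Return the sets ordered by file block index.
    return [sets[block] for block in sorted(sets)]
```

Each returned set could then be checked against the file's metadata to decide whether it needs a transaction; that resolution step is not sketched here.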
Abstract:
The disclosure herein describes managing a rate of processing unmap requests for a data storage volume. Unmap requests are received from a cluster of active hosts that are associated with the data storage volume. Latency data values of each active host are then accessed. A long-term cluster latency average value is calculated based on the accessed latency data values of all active hosts over a long-term time period and a short-term cluster latency average value is calculated based on the accessed latency data values of all active hosts over a short-term time period. An unmap rate adjustment value is calculated based on a difference between the long-term cluster latency average value and the short-term cluster latency average value. The rate of processing unmap requests for the data storage volume is adjusted based on the unmap rate adjustment value and the unmap requests are performed based on the adjusted rate.
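As a rough illustration of the rate calculation, the sketch below averages latency samples over two windows and turns the difference into an adjustment. The sign convention (raise the rate when short-term latency is below the long-term average), the `gain` factor, and the rate bounds are assumptions for illustration; the abstract only states that the adjustment is based on the difference.

```python
def average(values):
    """Arithmetic mean of a non-empty list of latency samples."""
    return sum(values) / len(values)


def unmap_rate_adjustment(long_term_latencies, short_term_latencies, gain=1.0):
    """Positive when recent latency is below the long-term average (assumed convention)."""
    long_avg = average(long_term_latencies)
    short_avg = average(short_term_latencies)
    return gain * (long_avg - short_avg)


def adjust_rate(current_rate, adjustment, min_rate=1.0, max_rate=1000.0):
    """Apply the adjustment and clamp to assumed bounds."""
    return max(min_rate, min(max_rate, current_rate + adjustment))
```

With these assumptions, a cluster whose recent latency has dropped gets a higher unmap rate, and one whose latency has spiked is throttled toward `min_rate`.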
Abstract:
Techniques for decoupling the commit and replay of file system metadata updates in a clustered file system (CFS) are provided. In one embodiment, a CFS layer of a computer system can receive a file I/O operation from a client application, where the file I/O operation involves an update to a file system metadata resource maintained on persistent storage. In response, a journaling component of the CFS layer can execute a commit phase for committing the update to a journal on the persistent storage. The CFS layer can then return an acknowledgment to the client application indicating that the file I/O operation is complete, where the acknowledgment is returned prior to completion of a replay phase configured to propagate the update from the journal to one or more locations on the persistent storage where the file system metadata resource is actually stored.
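The separation of the two phases can be sketched with an in-memory model. `JournaledStore` and its methods are hypothetical names; the journal and metadata dictionaries stand in for on-disk structures, and a real implementation would run replay in the background rather than on demand.

```python
from collections import deque


class JournaledStore:
    def __init__(self):
        self.journal = deque()  # committed updates awaiting replay
        self.metadata = {}      # stands in for the final on-disk metadata locations

    def write(self, key, value):
        # Commit phase: the update is recorded in the journal only.
        self.journal.append((key, value))
        # The acknowledgment is returned before any replay has happened.
        return "ack"

    def replay(self):
        # Replay phase: propagate journaled updates to their final locations.
        while self.journal:
            key, value = self.journal.popleft()
            self.metadata[key] = value
```

The caller sees `"ack"` as soon as the journal entry exists, so replay latency is kept off the I/O path.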
Abstract:
An example method of handling, at a hypervisor on a host in a virtualized computing system, a write input/output (IO) operation to a file on a storage device having a virtual machine file system (VMFS) is described. The method includes: generating logical transactions for the write IO operation having updates to metadata of the VMFS for the file; estimating, for the logical transactions, common space reservations for those of the updates to common fields in the metadata for the file; estimating, for the logical transactions, exclusive space reservations for those of the updates to exclusive fields in the metadata for the file; and batching the logical transactions into a physical transaction, which includes a single reservation of space in a journal of the VMFS based on the common space reservations and a reservation of space in the journal for each of the exclusive space reservations, respectively.
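One way to picture the batching of reservations is sketched below. How the single common reservation is derived from the per-transaction estimates is not stated in the abstract; taking the largest estimate is an assumption made here, on the reasoning that updates to common fields overlap. `LogicalTxn` and `plan_reservations` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class LogicalTxn:
    common_bytes: int     # estimated journal space for updates to common fields
    exclusive_bytes: int  # estimated journal space for updates to exclusive fields


def plan_reservations(txns):
    """Return (single common reservation, list of per-transaction exclusive reservations)."""
    # Assumption: one reservation sized to the largest common estimate covers all of them.
    common = max((t.common_bytes for t in txns), default=0)
    # Exclusive fields do not overlap, so each transaction keeps its own reservation.
    exclusive = [t.exclusive_bytes for t in txns]
    return common, exclusive
```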
Abstract:
A method of synchronously executing input/output operations (IOs) for a plurality of applications using a storage device with a file system includes the steps of: receiving a first write IO including an instruction to write first data at a first address of the file system; determining that, within a first range of the file system comprising the first address, there are no pending unmap IOs for deallocating storage space of the storage device from files of the plurality of applications; after determining that there are no pending unmap IOs within the first range, locking the first range to prevent incoming unmap IOs from deallocating storage space within the first range from the files of the plurality of applications; after locking the first range, writing the first data to the storage device at the first address; and after writing the first data, unlocking the first range.
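The ordering of the steps (check for pending unmaps, lock, write, unlock) can be sketched in a single-threaded model. `Volume`, `submit_unmap`, and the half-open `(start, end)` range tuples are hypothetical; returning `False` when a pending unmap overlaps the range is also an assumption, since the abstract does not say what happens in that case.

```python
def overlaps(a, b):
    """True if half-open ranges a=(start, end) and b=(start, end) intersect."""
    return a[0] < b[1] and b[0] < a[1]


class Volume:
    def __init__(self):
        self.storage = {}         # address -> data
        self.pending_unmaps = []  # ranges with unmap IOs in flight
        self.locked_ranges = []   # ranges currently locked by writers

    def submit_unmap(self, rng):
        # Incoming unmaps are refused while a writer holds an overlapping lock.
        if any(overlaps(rng, locked) for locked in self.locked_ranges):
            return False
        self.pending_unmaps.append(rng)
        return True

    def write(self, address, data, rng):
        # Step 1: confirm there are no pending unmap IOs within the range.
        if any(overlaps(rng, unmap) for unmap in self.pending_unmaps):
            return False
        # Step 2: lock the range so new unmaps cannot deallocate space in it.
        self.locked_ranges.append(rng)
        try:
            # Step 3: write the data at the target address.
            self.storage[address] = data
        finally:
            # Step 4: unlock the range.
            self.locked_ranges.remove(rng)
        return True
```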
Abstract:
A distributed file system may be configured with file blocks of a first type and file blocks of a second type, from allocation units that comprise a logical volume containing the file system. File blocks of the second type may be defined from one or more file blocks of the first type. A thick file may be instantiated with a number of allocation units totaling a size greater than or equal to a specified file size of the thick file. The allocation units may be allocated to the thick file in units of file blocks of the first type or file blocks of the second type, depending on the specified file size of the thick file.
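A minimal sketch of size-dependent allocation follows. The block sizes, the rule that the larger block type is used once the file reaches one large block, and the names `SMALL_BLOCK`, `LARGE_BLOCK`, and `plan_thick_allocation` are all assumptions for illustration; the abstract only says the choice depends on the specified file size.

```python
SMALL_BLOCK = 1 << 20            # assumed size of a first-type file block (1 MiB)
LARGE_BLOCK = 512 * SMALL_BLOCK  # assumed: a second-type block built from 512 first-type blocks


def plan_thick_allocation(file_size):
    """Return (block size used, number of blocks) covering at least file_size bytes."""
    # Assumed policy: large files are allocated in second-type blocks.
    unit = LARGE_BLOCK if file_size >= LARGE_BLOCK else SMALL_BLOCK
    count = -(-file_size // unit)  # ceiling division
    return unit, count
```

Because the count is rounded up, the allocated total is always greater than or equal to the requested size.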
Abstract:
The systems described herein are configured to enhance the efficiency of memory in a host file system with respect to hosted virtual file systems in situations where the hosted virtual file systems use smaller file block sizes than the host file system. During storage of a file, a file block is assigned a block address and unmapping bits. The block address and unmapping bits are stored in a pointer block or other similar data structure associated with the file. Particularly, the block address is stored in a first address block and the unmapping bits are stored in at least one additional address block located in proximity to the block address, such that the unmap granularity of the file is not limited by the fixed size of address blocks in the system.
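The layout can be pictured as a flat list of address blocks in which each file block takes two adjacent slots. This is a sketch under assumptions: the second slot is treated as a bitmask with one bit per sub-block, and `SUB_BLOCKS_PER_FILE_BLOCK` and the `PointerBlock` methods are hypothetical.

```python
SUB_BLOCKS_PER_FILE_BLOCK = 16  # assumed unmap granularity within one file block


class PointerBlock:
    def __init__(self):
        self.slots = []  # flat list of fixed-size address blocks

    def add_file_block(self, block_address):
        """Store the address in one slot and its unmapping bits in the next slot."""
        index = len(self.slots)
        self.slots.append(block_address)  # first address block: the block address
        self.slots.append(0)              # adjacent address block: unmapping bits
        return index

    def unmap_sub_block(self, index, sub_block):
        """Mark one sub-block of the file block at `index` as unmapped."""
        if not 0 <= sub_block < SUB_BLOCKS_PER_FILE_BLOCK:
            raise ValueError("sub-block out of range")
        self.slots[index + 1] |= 1 << sub_block

    def fully_unmapped(self, index):
        """True once every sub-block of the file block has been unmapped."""
        return self.slots[index + 1] == (1 << SUB_BLOCKS_PER_FILE_BLOCK) - 1
```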
Abstract:
A method for coalescing IO requests includes maintaining a queue in a layer of an IO stack of a hypervisor, wherein (i) the queue holds IO requests received from an upper layer of the IO stack without forwarding the IO requests down the IO stack, and (ii) the layer of the IO stack resides above a file system layer of the IO stack. The method further includes receiving, at the layer, an IO request from the upper layer or a notification of a completion of certain IO requests previously transmitted by the layer down the IO stack. The method further includes determining whether any IO requests currently held in the queue should be transmitted down the IO stack based upon a condition; and combining any IO requests in the queue into at least one combined IO request to transmit down the IO stack if the condition is satisfied.
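The queueing behavior can be sketched as follows. The flush condition used here (nothing in flight, or the queue has reached `max_held`) and the merge rule (contiguous `(offset, length)` requests are joined) are assumptions; the abstract leaves the condition open. `CoalescingQueue` and `combine` are hypothetical names, and `sent` stands in for the lower layers of the IO stack.

```python
def combine(requests):
    """Merge contiguous (offset, length) requests into larger ones."""
    merged = []
    for offset, length in sorted(requests):
        if merged and merged[-1][0] + merged[-1][1] == offset:
            merged[-1] = (merged[-1][0], merged[-1][1] + length)
        else:
            merged.append((offset, length))
    return merged


class CoalescingQueue:
    def __init__(self, max_held=4):
        self.held = []      # requests held back from the lower layers
        self.in_flight = 0  # combined requests sent down and not yet completed
        self.max_held = max_held
        self.sent = []      # stands in for the layers below

    def on_request(self, request):
        self.held.append(request)
        self._maybe_flush()

    def on_completion(self):
        self.in_flight -= 1
        self._maybe_flush()

    def _maybe_flush(self):
        # Assumed condition: flush when idle below or when the queue is full.
        if self.held and (self.in_flight == 0 or len(self.held) >= self.max_held):
            combined = combine(self.held)
            self.held = []
            self.in_flight += len(combined)
            self.sent.extend(combined)
```

While one request is outstanding, later arrivals accumulate in `held` and are merged when its completion notification arrives.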
Abstract:
In a computer system having virtual machines running therein, a hypervisor that supports execution of the virtual machines allocates blocks of storage to the virtual machines from a thinly provisioned logical block device. When the hypervisor deletes a file or receives commands to delete a file, the hypervisor moves the file into a delete directory. An unmap thread running in the background issues unmap commands to the storage device to release one or more blocks of the logical block device that are allocated to the files in the delete directory, so that the unmap operation can be executed asynchronously with respect to the file delete event.
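The decoupling of file deletion from block release can be sketched with a small model. `ThinVolume` and its methods are hypothetical, `unmapped` stands in for unmap commands issued to the storage device, and `unmap_pass` represents one iteration of the background thread rather than an actual thread.

```python
class ThinVolume:
    def __init__(self):
        self.files = {}       # file name -> set of allocated block numbers
        self.delete_dir = {}  # files awaiting unmap
        self.unmapped = []    # blocks for which an unmap command was issued

    def delete_file(self, name):
        # Deletion only moves the file into the delete directory; no unmap yet.
        self.delete_dir[name] = self.files.pop(name)

    def unmap_pass(self):
        # Background work: release blocks of every file in the delete directory.
        while self.delete_dir:
            _name, blocks = self.delete_dir.popitem()
            for block in sorted(blocks):
                self.unmapped.append(block)  # stands in for an unmap command
```

`delete_file` returns immediately, so the cost of issuing unmap commands is not paid at delete time.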
Abstract:
A method of deleting a first pointer block of a plurality of pointer blocks of a file system from a storage device used by a plurality of applications, wherein the plurality of pointer blocks are each subdivided into sub-blocks, includes the steps of: determining that a first sub-block of the first pointer block is marked as being empty of any addresses of the file system at which storage space is allocated to files of the applications; determining that a second sub-block of the first pointer block has not been marked as being empty; in response to the determining that the second sub-block has not been marked as being empty, determining that the second sub-block does not contain any addresses of the file system at which storage space is allocated to the files of the applications; and deleting the first pointer block from the storage device.
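The check can be sketched as a scan that skips sub-blocks already marked empty and inspects only the unmarked ones. `SubBlock` and `pointer_block_is_deletable` are hypothetical names, and representing an unused address slot as `None` is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class SubBlock:
    addresses: list = field(default_factory=list)  # None means an unused slot
    marked_empty: bool = False


def pointer_block_is_deletable(sub_blocks):
    """True if no sub-block holds an address with allocated storage space."""
    for sub_block in sub_blocks:
        if sub_block.marked_empty:
            # The mark is trusted, so this sub-block is not scanned.
            continue
        # Unmarked sub-blocks must be scanned for remaining addresses.
        if any(addr is not None for addr in sub_block.addresses):
            return False
    return True
```

The marks let the scan skip most of the pointer block; only sub-blocks without a mark cost a full read.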