摘要:
A control module is introduced to communicate with an application workload scheduler of a distributed computing application, such as a Job Tracker node of a Hadoop cluster, and with the virtualized computing environment underlying the application. The control module periodically queries for resource consumption data, such as CPU utilization, and uses the data to calculate how MapReduce task slots should be allocated on each task node of the Hadoop cluster. The control module passes the task slot allocation to the application workload scheduler, which honors the allocation by adjusting task assignments to task nodes accordingly. The task nodes may also activate and deactivate task slots according to the changed slot allocation. As a result, the distributed computing application is able to scale up and down when other workloads sharing the virtualized computing environment change.
摘要:
Techniques for performing I/O load balancing are provided. In one embodiment, a computer system can receive an I/O request destined for a storage array, where the computer system is communicatively coupled with the storage array via a plurality of paths, and where the plurality of paths include a set of optimized paths and a set of unoptimized paths. The computer system can further determine whether the I/O request can be transmitted to the storage array via either an optimized path or an unoptimized path, or solely via an optimized path. The computer system can then select a path in the plurality of paths based on the determination and transmit the I/O request to the storage array via the selected path.
摘要:
Methods are presented for caching I/O data in a solid state drive (SSD) locally attached to a host computer supporting the running of a virtual machine (VM). Portions of the SSD are allocated as cache storage for VMs running on the host computer. A mapping relationship is maintained between unique identifiers for VMs running on the host computer and one or more process identifiers (PIDs) associated with processes running in the host computer that correspond to each of the VM's execution on the host computer. When an I/O request is received, a PID associated with I/O request is determined and a unique identifier for the VM is extracted from the mapping relationship based on the determined PID. A portion of the SSD corresponding to the unique identifier of the VM that is used as a cache for the VM can then be accessed in order to handle the I/O request.
摘要:
Techniques for utilizing flash storage as an extension of hard disk (HDD) storage are provided. In one embodiment, a computer system stores a subset of blocks of a logical file in a first physical file, associated with a first data structure that represents a filesystem object, on flash storage and a subset of blocks, associated with a second data structure that represents a filesystem object comprising tiering configuration information that includes an identifier of the first physical file, in a second physical file on HDD storage. The computer system processes an I/O request directed to the logical file by directing it to either the physical file on the flash storage or the HDD storage by verifying that the tiering configuration information exists in the data structure and determining whether the one or more blocks are part of the first subset of blocks or the second subset of blocks.
摘要:
Techniques for automatically allocating space in a flash storage-based cache are provided. In one embodiment, a computer system collects I/O trace logs for a plurality of virtual machines or a plurality of virtual disks and determines cache utility models for the plurality of virtual machines or the plurality of virtual disks based on the I/O trace logs. The cache utility model for each virtual machine or each virtual disk defines an expected utility of allocating space in the flash storage-based cache to the virtual machine or the virtual disk over a range of different cache allocation sizes. The computer system then calculates target cache allocation sizes for the plurality of virtual machines or the plurality of virtual disks based on the cache utility models and allocates space in the flash storage-based cache based on the target cache allocation sizes.
摘要:
Systems and techniques are described for thread cache allocation. A described technique includes monitoring input and output accesses for a plurality of threads executing on a computing device that includes a cache comprising a quantity of memory blocks, determining a respective reuse intensity for each of the threads, determining a respective read ratio for each of the threads, determining a respective quantity of memory blocks for each of the partitions by optimizing a combination of cache utilities, each cache utility being based on the respective reuse intensity, the respective read ratio, and a respective hit ratio for a particular partition, and resizing one or more of the partitions to be equal to the respective quantity of the memory blocks for the partition.
摘要:
Techniques for utilizing flash storage as an extension of hard disk (HDD) based storage are provided. In one embodiment, a computer system can store a first subset of blocks of a logical file in a first physical file residing on a flash storage tier, and a second subset of blocks of the logical file in a second physical file residing on an HDD storage tier. The computer system can then receive an I/O request directed to one or more blocks of the logical file and process the I/O request by accessing the flash storage tier or the HDD storage tier, the accessing being based on whether the one or more blocks are part of the first subset of blocks stored in the first physical file.
摘要:
Techniques for dynamically managing the placement of blocks of a logical file between a flash storage tier and an HDD storage tier are provided. In one embodiment, a computer system can collect I/O statistics pertaining to the logical file, where a first subset of blocks of the logical file are stored on the flash storage tier and where a second subset of blocks of the logical file are stored on the HDD storage tier. The computer system can further generate a heat map for the logical file based on the I/O statistics, where the heat map indicates, for each block of the logical file, the number of times the block has been accessed. The computer system can then identify, using the heat map, one or more blocks of the logical file as being performance-critical blocks, and can move data between the flash and HDD storage tiers such that the performance-critical blocks are placed on the flash storage tier.
摘要:
The instant disclosure describes embodiments of a system and method for migrating virtual machine (VM)-specific content cached in a solid state drive (SSD) attached to an original host. During operation, the original host receives event indicating an upcoming migration of a VM to a destination host. In response, the original host transmits a set of metadata associated with the SSD cache to the destination host. The metadata indicates a number of data blocks stored in the SSD cache, thereby allowing the destination host to pre-fetch data blocks specified in the metadata from a storage shared by the original host and the destination host. Subsequently, the original host receives a power-off event for the VM, and transmits a dirty block list to the destination. The dirty block list specifies one or more data blocks that have changed since the transmission of the metadata.
摘要:
A distributed shared log storage system employs an adapter that translates Application Programming Interfaces (APIs) for a big data application to APIs of the distributed shared log storage system. An instance of an adapter is configured for different big data applications in accordance with a profile thereof, so that the big data applications can take on a variety of added characteristics to enhance the application and/or to improve the performance of the application. Included in the added characteristics are global or local ordering of operations, replication of operations according to different replication models, making the operations atomic and caching.