Adaptive task scheduling of Hadoop in a virtualized environment
    1.
    Granted invention patent
    Status: in force

    Publication No.: US09183016B2

    Publication date: 2015-11-10

    Application No.: US13778441

    Filing date: 2013-02-27

    Applicant: VMware, Inc.

    IPC classification: G06F9/455 G06F9/50

    Abstract: A control module is introduced to communicate with an application workload scheduler of a distributed computing application, such as a Job Tracker node of a Hadoop cluster, and with the virtualized computing environment underlying the application. The control module periodically queries for resource consumption data, such as CPU utilization, and uses the data to calculate how MapReduce task slots should be allocated on each task node of the Hadoop cluster. The control module passes the task slot allocation to the application workload scheduler, which honors the allocation by adjusting task assignments to task nodes accordingly. The task nodes may also activate and deactivate task slots according to the changed slot allocation. As a result, the distributed computing application is able to scale up and down when other workloads sharing the virtualized computing environment change.
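
    The slot calculation described in this abstract can be illustrated with a minimal Python sketch. The abstract does not state the actual formula, so the thresholds, the one-slot step, and the names NodeStats, next_slot_count, and plan_allocation are all assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass
class NodeStats:
    """Hypothetical per-task-node sample gathered by the control module."""
    name: str
    cpu_utilization: float  # fraction of CPU in use, 0.0 to 1.0
    slots: int              # task slots currently active on the node


def next_slot_count(stats, max_slots, high=0.85, low=0.50):
    """Assumed rule: drop one slot under contention, add one when idle."""
    if stats.cpu_utilization > high:
        return max(0, stats.slots - 1)
    if stats.cpu_utilization < low:
        return min(max_slots, stats.slots + 1)
    return stats.slots


def plan_allocation(samples, max_slots):
    """Build the node -> slot-count map that would be passed to the scheduler."""
    return {s.name: next_slot_count(s, max_slots) for s in samples}


samples = [
    NodeStats("node-a", 0.92, 4),  # contended: shrink
    NodeStats("node-b", 0.30, 2),  # idle: grow
    NodeStats("node-c", 0.70, 3),  # in band: unchanged
]
allocation = plan_allocation(samples, max_slots=4)
print(allocation)
```

    Changing the slot count by one per polling period is only one possible damping choice; it keeps the allocation from oscillating when other workloads on the host are bursty.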

    MULTIPATH LOAD BALANCING OPTIMIZATIONS FOR ALUA STORAGE SYSTEMS
    2.
    Invention patent application
    Status: in force

    Publication No.: US20140229638A1

    Publication date: 2014-08-14

    Application No.: US13766605

    Filing date: 2013-02-13

    Applicant: VMWARE, INC.

    IPC classification: G06F3/06

    Abstract: Techniques for performing I/O load balancing are provided. In one embodiment, a computer system can receive an I/O request destined for a storage array, where the computer system is communicatively coupled with the storage array via a plurality of paths, and where the plurality of paths include a set of optimized paths and a set of unoptimized paths. The computer system can further determine whether the I/O request can be transmitted to the storage array via either an optimized path or an unoptimized path, or solely via an optimized path. The computer system can then select a path in the plurality of paths based on the determination and transmit the I/O request to the storage array via the selected path.
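
    The two-step path choice in this abstract (first decide which set of paths is eligible, then pick one) can be sketched as follows. The abstract does not say how the determination is made or how a path is picked within the eligible set, so the allow_unoptimized flag, the round-robin policy, and the names Path and PathSelector are assumptions for illustration.

```python
from dataclasses import dataclass
from itertools import count


@dataclass(frozen=True)
class Path:
    name: str
    optimized: bool  # True for an optimized path to the array


class PathSelector:
    """Round-robin over whichever set of paths is eligible (illustrative)."""

    def __init__(self, paths):
        self.all_paths = list(paths)
        self.optimized = [p for p in self.all_paths if p.optimized]
        if not self.optimized:
            raise ValueError("at least one optimized path is assumed")
        self._turn = count()

    def select(self, allow_unoptimized):
        # Step 1: the determination, supplied here by the caller as a flag.
        candidates = self.all_paths if allow_unoptimized else self.optimized
        # Step 2: spread requests across the eligible set.
        return candidates[next(self._turn) % len(candidates)]


selector = PathSelector([Path("A1", True), Path("A2", True), Path("B1", False)])
picks = [selector.select(flag).name for flag in (False, True, True, False)]
print(picks)
```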

    METHOD AND SYSTEM FOR VM-GRANULAR I/O CACHING
    3.
    Invention patent application
    Status: in force

    Publication No.: US20140115228A1

    Publication date: 2014-04-24

    Application No.: US13658567

    Filing date: 2012-10-23

    Applicant: VMWARE, INC.

    IPC classification: G06F12/02

    Abstract: Methods are presented for caching I/O data in a solid state drive (SSD) locally attached to a host computer supporting the running of a virtual machine (VM). Portions of the SSD are allocated as cache storage for VMs running on the host computer. A mapping relationship is maintained between unique identifiers for VMs running on the host computer and one or more process identifiers (PIDs) associated with processes running in the host computer that correspond to each VM's execution on the host computer. When an I/O request is received, a PID associated with the I/O request is determined and a unique identifier for the VM is extracted from the mapping relationship based on the determined PID. A portion of the SSD corresponding to the unique identifier of the VM that is used as a cache for the VM can then be accessed in order to handle the I/O request.
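
    The PID-to-VM lookup described in this abstract can be sketched with two dictionaries. This is a toy model under stated assumptions: each per-VM SSD portion is represented by an in-memory dict, and the names VmCacheDirectory, register_vm, cache_for_pid, and backing_read are hypothetical.

```python
class VmCacheDirectory:
    """Maps PIDs to VM identifiers and VM identifiers to cache portions."""

    def __init__(self):
        self._pid_to_vm = {}    # PID -> unique VM identifier
        self._vm_to_cache = {}  # unique VM identifier -> that VM's cache

    def register_vm(self, vm_id, pids):
        """Record which host processes belong to a VM."""
        for pid in pids:
            self._pid_to_vm[pid] = vm_id
        self._vm_to_cache.setdefault(vm_id, {})

    def cache_for_pid(self, pid):
        """Resolve the issuing PID to its VM and that VM's cache portion."""
        vm_id = self._pid_to_vm[pid]
        return vm_id, self._vm_to_cache[vm_id]

    def read(self, pid, block, backing_read):
        """Serve a read from the VM's cache, filling it on a miss."""
        _, cache = self.cache_for_pid(pid)
        if block not in cache:
            cache[block] = backing_read(block)
        return cache[block]


directory = VmCacheDirectory()
directory.register_vm("vm-1", [101, 102])
directory.register_vm("vm-2", [201])

backing_reads = []

def backing_read(block):
    backing_reads.append(block)
    return f"data-{block}"

directory.read(101, 7, backing_read)  # miss: fills vm-1's portion
directory.read(102, 7, backing_read)  # hit: other PID of the same VM
directory.read(201, 7, backing_read)  # miss: vm-2 has its own portion
print(backing_reads)
```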

    THREAD CACHE ALLOCATION
    6.
    Invention patent application
    Status: in force

    Publication No.: US20150067262A1

    Publication date: 2015-03-05

    Application No.: US14015784

    Filing date: 2013-08-30

    Applicant: VMware, Inc.

    IPC classification: G06F12/08

    Abstract: Systems and techniques are described for thread cache allocation. A described technique includes monitoring input and output accesses for a plurality of threads executing on a computing device that includes a cache comprising a quantity of memory blocks, determining a respective reuse intensity for each of the threads, determining a respective read ratio for each of the threads, determining a respective quantity of memory blocks for each of the partitions by optimizing a combination of cache utilities, each cache utility being based on the respective reuse intensity, the respective read ratio, and a respective hit ratio for a particular partition, and resizing one or more of the partitions to be equal to the respective quantity of the memory blocks for the partition.
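
    One way to picture "optimizing a combination of cache utilities" is a greedy allocation that hands out blocks one at a time to the partition with the largest marginal gain. The abstract gives neither the utility formula nor the optimizer, so the product form of the utility, the saturating hit-ratio curve, the greedy method, and all names below are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class ThreadProfile:
    name: str
    reuse_intensity: float  # measured per thread
    read_ratio: float       # fraction of accesses that are reads
    working_set: int        # blocks; parameter of the assumed hit-ratio curve


def hit_ratio(profile, blocks):
    """Assumed concave curve: more blocks help, with diminishing returns."""
    return blocks / (blocks + profile.working_set)


def utility(profile, blocks):
    """Assumed utility: product of the three quantities named in the abstract."""
    return profile.reuse_intensity * profile.read_ratio * hit_ratio(profile, blocks)


def allocate(profiles, total_blocks):
    """Give each block to the partition whose utility rises the most."""
    sizes = {p.name: 0 for p in profiles}
    for _ in range(total_blocks):
        best = max(
            profiles,
            key=lambda p: utility(p, sizes[p.name] + 1) - utility(p, sizes[p.name]),
        )
        sizes[best.name] += 1
    return sizes


profiles = [
    ThreadProfile("hot", reuse_intensity=4.0, read_ratio=0.9, working_set=10),
    ThreadProfile("cold", reuse_intensity=1.0, read_ratio=0.5, working_set=10),
]
sizes = allocate(profiles, total_blocks=20)
print(sizes)
```

    The greedy step maximizes the summed utility only because the assumed hit-ratio curve is concave; a different curve would need a different optimizer.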

    Techniques for Implementing Hybrid Flash/HDD-based Virtual Disk Files
    7.
    Invention patent application
    Status: in force

    Publication No.: US20150006788A1

    Publication date: 2015-01-01

    Application No.: US13931409

    Filing date: 2013-06-28

    Applicant: VMware, Inc.

    IPC classification: G06F3/06

    Abstract: Techniques for utilizing flash storage as an extension of hard disk (HDD) based storage are provided. In one embodiment, a computer system can store a first subset of blocks of a logical file in a first physical file residing on a flash storage tier, and a second subset of blocks of the logical file in a second physical file residing on an HDD storage tier. The computer system can then receive an I/O request directed to one or more blocks of the logical file and process the I/O request by accessing the flash storage tier or the HDD storage tier, the accessing being based on whether the one or more blocks are part of the first subset of blocks stored in the first physical file.
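
    The routing rule in this abstract (serve a block from flash if it belongs to the first subset, otherwise from HDD) can be sketched as below. The two physical files are modeled as dicts keyed by block number, and the names HybridFile, tier_of, and read are hypothetical.

```python
class HybridFile:
    """One logical file whose blocks are split across two physical files."""

    def __init__(self, flash_blocks, hdd_blocks):
        self.flash = dict(flash_blocks)  # first subset, on the flash tier
        self.hdd = dict(hdd_blocks)      # second subset, on the HDD tier

    def tier_of(self, block):
        """Route by membership in the subset stored in the flash file."""
        return "flash" if block in self.flash else "hdd"

    def read(self, block):
        """Return the tier that served the request and the block data."""
        tier = self.tier_of(block)
        store = self.flash if tier == "flash" else self.hdd
        return tier, store[block]


vdisk = HybridFile(flash_blocks={0: b"boot", 3: b"index"},
                   hdd_blocks={1: b"bulk-1", 2: b"bulk-2"})
results = [vdisk.read(b) for b in (0, 1, 3)]
print(results)
```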

    TECHNIQUES FOR DYNAMICALLY RELOCATING VIRTUAL DISK FILE BLOCKS BETWEEN FLASH STORAGE AND HDD-BASED STORAGE
    8.
    Invention patent application
    Status: in force

    Publication No.: US20150006787A1

    Publication date: 2015-01-01

    Application No.: US13931309

    Filing date: 2013-06-28

    Applicant: VMware, Inc.

    IPC classification: G06F3/06

    Abstract: Techniques for dynamically managing the placement of blocks of a logical file between a flash storage tier and an HDD storage tier are provided. In one embodiment, a computer system can collect I/O statistics pertaining to the logical file, where a first subset of blocks of the logical file are stored on the flash storage tier and where a second subset of blocks of the logical file are stored on the HDD storage tier. The computer system can further generate a heat map for the logical file based on the I/O statistics, where the heat map indicates, for each block of the logical file, the number of times the block has been accessed. The computer system can then identify, using the heat map, one or more blocks of the logical file as being performance-critical blocks, and can move data between the flash and HDD storage tiers such that the performance-critical blocks are placed on the flash storage tier.
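
    The heat-map step can be sketched with a per-block access counter. The abstract does not define "performance-critical", so treating the most-accessed blocks that fit in flash as critical is an assumption, as are the names build_heat_map and plan_moves.

```python
from collections import Counter


def build_heat_map(access_log):
    """Count how many times each block number appears in the I/O log."""
    return Counter(access_log)


def plan_moves(heat_map, on_flash, flash_capacity):
    """Assumed policy: the flash_capacity most-accessed blocks belong on flash."""
    hot = {block for block, _ in heat_map.most_common(flash_capacity)}
    promote = sorted(hot - on_flash)  # hot blocks currently on HDD
    demote = sorted(on_flash - hot)   # blocks on flash that are no longer hot
    return promote, demote


access_log = [1, 1, 1, 2, 5, 5, 5, 5, 7, 7, 2, 9]
heat_map = build_heat_map(access_log)
promote, demote = plan_moves(heat_map, on_flash={2, 7}, flash_capacity=2)
print(heat_map[5], promote, demote)
```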

    METHOD AND SYSTEM FOR VM-GRANULAR SSD/FLASH CACHE LIVE MIGRATION
    9.
    Invention patent application
    Status: in force

    Publication No.: US20140297780A1

    Publication date: 2014-10-02

    Application No.: US13850985

    Filing date: 2013-03-26

    Applicant: VMware, Inc.

    IPC classification: H04L29/08

    CPC classification: H04L67/2847

    Abstract: The instant disclosure describes embodiments of a system and method for migrating virtual machine (VM)-specific content cached in a solid state drive (SSD) attached to an original host. During operation, the original host receives an event indicating an upcoming migration of a VM to a destination host. In response, the original host transmits a set of metadata associated with the SSD cache to the destination host. The metadata indicates a number of data blocks stored in the SSD cache, thereby allowing the destination host to pre-fetch data blocks specified in the metadata from a storage shared by the original host and the destination host. Subsequently, the original host receives a power-off event for the VM, and transmits a dirty block list to the destination. The dirty block list specifies one or more data blocks that have changed since the transmission of the metadata.
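
    The message sequence in this abstract (metadata first, dirty block list at power-off) can be sketched as two small classes. Several details are assumptions not stated in the abstract: the cache is treated as write-through so shared storage always holds current data, the destination handles the dirty list by re-fetching those blocks, and the class and method names are hypothetical.

```python
class SourceHost:
    """Original host: tracks which blocks change after metadata is sent."""

    def __init__(self, cache):
        self.cache = dict(cache)
        self.dirty = set()
        self.tracking = False

    def on_migration_event(self):
        """Start dirty tracking and return metadata: IDs of cached blocks."""
        self.tracking = True
        self.dirty.clear()
        return sorted(self.cache)

    def write(self, block, data, shared):
        self.cache[block] = data
        shared[block] = data  # write-through to shared storage (assumed)
        if self.tracking:
            self.dirty.add(block)

    def on_power_off(self):
        """Return the dirty block list for the destination."""
        self.tracking = False
        return sorted(self.dirty)


class DestinationHost:
    def __init__(self):
        self.cache = {}

    def prefetch(self, block_ids, shared):
        """Warm the local cache from storage shared by both hosts."""
        for block in block_ids:
            self.cache[block] = shared[block]


shared = {1: "a", 2: "b", 3: "c"}
source = SourceHost({1: "a", 2: "b"})
dest = DestinationHost()

dest.prefetch(source.on_migration_event(), shared)  # metadata-driven warm-up
source.write(2, "B", shared)                        # VM keeps running
source.write(3, "C", shared)
dirty = source.on_power_off()
dest.prefetch(dirty, shared)                        # refresh changed blocks
print(dirty, dest.cache)
```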

    Distributed shared log storage system having an adapter for heterogenous big data workloads

    Publication No.: US10540119B2

    Publication date: 2020-01-21

    Application No.: US15403015

    Filing date: 2017-01-10

    Applicant: VMware, Inc.

    IPC classification: G06F3/06

    Abstract: A distributed shared log storage system employs an adapter that translates Application Programming Interfaces (APIs) for a big data application to APIs of the distributed shared log storage system. An instance of an adapter is configured for different big data applications in accordance with a profile thereof, so that the big data applications can take on a variety of added characteristics to enhance the application and/or to improve the performance of the application. Included in the added characteristics are global or local ordering of operations, replication of operations according to different replication models, making the operations atomic, and caching.
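
    The adapter idea can be pictured with a toy key-value API layered on an append-only log. Nothing below comes from the patent itself: SharedLog is an in-memory stand-in for the log system, KeyValueAdapter and its put, put_many, and get methods are hypothetical, and replication is left out entirely.

```python
class SharedLog:
    """Toy stand-in for a shared log: append returns a global position."""

    def __init__(self):
        self.entries = []

    def append(self, record):
        self.entries.append(record)
        return len(self.entries) - 1

    def read_from(self, position):
        return self.entries[position:]


class KeyValueAdapter:
    """Translates put/get calls into log appends and log replay."""

    def __init__(self, log):
        self.log = log
        self._view = {}    # cached state rebuilt from the log
        self._applied = 0  # next log position to replay

    def put(self, key, value):
        return self.log.append(("put", key, value))

    def put_many(self, items):
        # One log record for the whole batch makes the update atomic.
        return self.log.append(("batch", dict(items)))

    def get(self, key):
        self._replay()
        return self._view.get(key)

    def _replay(self):
        """Apply unseen records in log order, which is the global order."""
        new_records = self.log.read_from(self._applied)
        for record in new_records:
            if record[0] == "put":
                self._view[record[1]] = record[2]
            else:
                self._view.update(record[1])
        self._applied += len(new_records)


log = SharedLog()
first, second = KeyValueAdapter(log), KeyValueAdapter(log)
first.put("x", 1)
second.put("x", 2)
first.put_many({"y": 3, "z": 4})
print(first.get("x"), second.get("x"), second.get("z"))
```

    Because both adapter instances replay the same log, they agree on the final value of "x" without coordinating with each other directly.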