DISTRIBUTED SHARED LOG STORAGE SYSTEM HAVING AN ADAPTER FOR HETEROGENOUS BIG DATA WORKLOADS

    Publication No.: US20180060143A1

    Publication date: 2018-03-01

    Application No.: US15403015

    Application date: 2017-01-10

    Applicant: VMware, Inc.

    CPC classification number: G06F3/067 G06F16/00 G06F16/27

    Abstract: A distributed shared log storage system employs an adapter that translates APIs for a big data application to APIs of the distributed shared log storage system. An instance of an adapter is configured for different big data applications in accordance with a profile thereof, so that the big data applications can take on a variety of added characteristics to enhance the application and/or to improve the performance of the application. Included in the added characteristics are global or local ordering of operations, replication of operations according to different replication models, making the operations atomic and caching.
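    Illustrative sketch (not part of the patent record): the adapter idea in the abstract can be pictured as a thin layer that maps an application's calls onto log appends and reads, with a per-application profile switching features on. Every name here (SharedLog, Profile, KeyValueAdapter, put, get) is hypothetical, the key-value API is an assumed stand-in for a big data application's API, and only the caching characteristic is modeled.

```python
from dataclasses import dataclass, field


@dataclass
class SharedLog:
    """Hypothetical in-memory stand-in for a distributed shared log."""
    entries: list = field(default_factory=list)

    def append(self, record):
        # The returned position reflects append order, giving a total order.
        self.entries.append(record)
        return len(self.entries) - 1

    def read(self, position):
        return self.entries[position]


@dataclass
class Profile:
    """Hypothetical per-application profile; only caching is modeled."""
    caching: bool = False


class KeyValueAdapter:
    """Translates an assumed put/get API into log appends and reads."""

    def __init__(self, log, profile):
        self.log = log
        self.profile = profile
        self.index = {}  # key -> log position of the latest write
        self.cache = {}  # populated only when the profile enables caching

    def put(self, key, value):
        position = self.log.append((key, value))
        self.index[key] = position
        if self.profile.caching:
            self.cache[key] = value
        return position

    def get(self, key):
        if self.profile.caching and key in self.cache:
            return self.cache[key]
        _, value = self.log.read(self.index[key])
        return value
```

    In this sketch, two applications sharing one log would each get their own adapter instance and profile, so one can enable caching while the other reads from the log every time.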

Adaptive Task Scheduling of Hadoop in a Virtualized Environment

    Patent application (status: granted)

    Publication No.: US20140245298A1

    Publication date: 2014-08-28

    Application No.: US13778441

    Application date: 2013-02-27

    Applicant: VMware, Inc.

    Abstract: A control module is introduced to communicate with an application workload scheduler of a distributed computing application, such as a Job Tracker node of a Hadoop cluster, and with the virtualized computing environment underlying the application. The control module periodically queries for resource consumption data, such as CPU utilization, and uses the data to calculate how MapReduce task slots should be allocated on each task node of the Hadoop cluster. The control module passes the task slot allocation to the application workload scheduler, which honors the allocation by adjusting task assignments to task nodes accordingly. The task nodes may also activate and deactivate task slots according to the changed slot allocation. As a result, the distributed computing application is able to scale up and down when other workloads sharing the virtualized computing environment change.
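    Illustrative sketch (not part of the patent record): one polling cycle of such a control module might look like the following. The abstract does not state how slot counts are derived from resource data, so the linear scaling rule is purely an assumption, as are the function names and the callbacks standing in for the virtualization layer and the Job Tracker.

```python
import math


def allocate_task_slots(cpu_utilization, max_slots_per_node):
    """Return a task slot count for each task node.

    cpu_utilization maps a node name to the fraction of CPU (0.0 to 1.0)
    consumed by other workloads on that node's host. Scaling slots linearly
    with the free fraction is an assumed policy, not the patented method.
    """
    allocation = {}
    for node, used in cpu_utilization.items():
        free = min(1.0, max(0.0, 1.0 - used))  # clamp to a valid fraction
        allocation[node] = math.floor(max_slots_per_node * free)
    return allocation


def control_cycle(query_utilization, push_allocation, max_slots_per_node=4):
    """Run one cycle: query usage, compute slots, pass them to the scheduler.

    query_utilization and push_allocation are hypothetical callbacks for the
    virtualized environment and the application workload scheduler.
    """
    allocation = allocate_task_slots(query_utilization(), max_slots_per_node)
    push_allocation(allocation)
    return allocation
```

    Calling control_cycle on a timer would give the periodic behavior the abstract describes: as other workloads consume more CPU on a host, the node there is assigned fewer slots, and more again when the load drops.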
