MULTI-STAGE PIPELINING FOR DISTRIBUTED GRAPH PROCESSING

    公开(公告)号:US20190342372A1

    公开(公告)日:2019-11-07

    申请号:US15968637

    申请日:2018-05-01

    Abstract: Techniques are described herein for evaluating graph processing tasks using a multi-stage pipelining communication mechanism. In a multi-node system comprising a plurality of nodes, each node of said plurality of nodes executing a respective communication agent object, wherein said respective communication agent object comprises: a sender lambda function is configured to: perform one or more sending operations, generate source messages based on the one or more sender operations, each source message of said source messages being marked for a particular node of said plurality of nodes. An intermediate lambda function is configured to: read source messages marked for said each node and sent to said each node, perform one or more intermediate operations based on the one or more source messages, generate intermediate messages based on the one or more intermediate operations, each intermediate message of said intermediate messages being marked for a particular node of said plurality of nodes. A final receiver lambda function configured to: read intermediate messages marked for said each node and sent to said each node, perform one or more final operations based on the one or more intermediate messages, generate a final result based on the one or more final operations. On each node of said plurality of nodes, the communication agent object is executed, wherein the communication agent object comprises executing said sender lambda function, said intermediate lambda function, and said final receiver lambda function.

    Distributed graph processing system that support remote data read with proactive bulk data transfer

    公开(公告)号:US10459978B2

    公开(公告)日:2019-10-29

    申请号:US14678358

    申请日:2015-04-03

    Abstract: Techniques for generating and transferring bulk messages from one computing device to another computing device in a cluster are provided. Each computing device in a cluster is assigned a different set of nodes of a graph. A first computing device may be assigned a particular node that is neighbors with multiple other nodes that are assigned to one or more other computing devices in the cluster. When processing graph-related code at the first computing device, information about the neighbors may be required. The first computing device receives a bulk message from one of the other computing devices. The bulk message contains information about at least a subset of the neighbors. Therefore, the first computing device is not required to send multiple messages for information about the subset of neighbors. In fact, the first computing device is not required to send any message for the information.

    Concurrent distributed graph processing system with self-balance

    公开(公告)号:US10275287B2

    公开(公告)日:2019-04-30

    申请号:US15175920

    申请日:2016-06-07

    Abstract: Techniques are provided for dynamically self-balancing communication and computation. In an embodiment, each partition of application data is stored on a respective computer of a cluster. The application is divided into distributed jobs, each of which corresponds to a partition. Each distributed job is hosted on the computer that hosts the corresponding data partition. Each computer divides its distributed job into computation tasks. Each computer has a pool of threads that execute the computation tasks. During execution, one computer receives a data access request from another computer. The data access request is executed by a thread of the pool. Threads of the pool are bimodal and may be repurposed between communication and computation, depending on workload. Each computer individually detects completion of its computation tasks. Each computer informs a central computer that its distributed job has finished. The central computer detects when all distributed jobs of the application have terminated.

    DISTRIBUTED GRAPH PROCESSING SYSTEM FEATURING INTERACTIVE REMOTE CONTROL MECHANISM INCLUDING TASK CANCELLATION

    公开(公告)号:US20180210761A1

    公开(公告)日:2018-07-26

    申请号:US15413811

    申请日:2017-01-24

    CPC classification number: G06F9/5066

    Abstract: Techniques herein provide job control and synchronization of distributed graph-processing jobs. In an embodiment, a computer system maintains an input queue of graph processing jobs. In response to de-queuing a graph processing job, a master thread partitions the graph processing job into distributed jobs. Each distributed job has a sequence of processing phases. The master thread sends each distributed job to a distributed processor. Each distributed job executes a first processing phase of its sequence of processing phases. To the master thread, the distributed job announces completion of its first processing phase. The master thread detects that all distributed jobs have announced finishing their first processing phase. The master thread broadcasts a notification to the distributed jobs that indicates that all distributed jobs have finished their first processing phase. Receiving that notification causes the distributed jobs to execute their second processing phase. Queues and barriers provide for faults and cancellation.

    Distributed graph processing system featuring interactive remote control mechanism including task cancellation

    公开(公告)号:US10318355B2

    公开(公告)日:2019-06-11

    申请号:US15413811

    申请日:2017-01-24

    Abstract: Techniques herein provide job control and synchronization of distributed graph-processing jobs. In an embodiment, a computer system maintains an input queue of graph processing jobs. In response to de-queuing a graph processing job, a master thread partitions the graph processing job into distributed jobs. Each distributed job has a sequence of processing phases. The master thread sends each distributed job to a distributed processor. Each distributed job executes a first processing phase of its sequence of processing phases. To the master thread, the distributed job announces completion of its first processing phase. The master thread detects that all distributed jobs have announced finishing their first processing phase. The master thread broadcasts a notification to the distributed jobs that indicates that all distributed jobs have finished their first processing phase. Receiving that notification causes the distributed jobs to execute their second processing phase. Queues and barriers provide for faults and cancellation.

    Graph-data partitioning for workload-balanced distributed computation with cost estimation functions
    20.
    发明授权
    Graph-data partitioning for workload-balanced distributed computation with cost estimation functions 有权
    用于具有成本估算功能的工作负载均衡分布式计算的图形数据分区

    公开(公告)号:US09477532B1

    公开(公告)日:2016-10-25

    申请号:US14876075

    申请日:2015-10-06

    CPC classification number: G06F9/5083 G06F9/4881 G06F2209/5022

    Abstract: Techniques herein perform workload-balanced graph partitioning. Each graph partition is distributed to a respective computer. Each computer applies a workload-estimation function to its partition to calculate a numeric workload-value that indicates how much computation the partition needs. Each computer sends its numeric workload-value to a master computer. The master compares the highest and lowest numeric workload-values. If the difference exceeds a threshold, the master detects how much work should overloaded-computers offload to under-utilized computers. To each overloaded-computer, the master sends a directive with a balancing numeric workload-value that indicates how much computation to offload and an identifier of an under-utilized computer to receive the offload. Based on this directive and the workload-estimation function, an overloaded-computer selects a portion of its partition that corresponds to the balancing numeric workload-value, removes that portion from its partition, and transfers the portion to the under-utilized computer, which adds the portion to its partition.

    Abstract translation: 这里的技术执行工作负载平衡图分割。 每个图形分区都分配给相应的计算机。 每个计算机将其工作负载估计功能应用于其分区,以计算数字工作负载值,该值指示分区需要多少计算。 每个计算机将其数字工作负载值发送到主计算机。 主人比较最高和最低数值工作负载值。 如果差异超过阈值,则主机检测到应该超载多少工作 - 计算机卸载到未充分利用的计算机。 对于每台重载计算机,主机发送一个指令,其中包含一个平衡数字工作负载值,指示卸载多少计算和一个未充分利用的计算机的标识符来接收卸载。 基于该指令和工作负载估计功能,重载计算机选择其对应于平衡数字工作负载值的分区的一部分,从其分区中移除该部分,并将该部分传送到未充分利用的计算机,其中 将该部分添加到其分区。

Patent Agency Ranking