-
Publication number: US09596295B2
Publication date: 2017-03-14
Application number: US14143894
Filing date: 2013-12-30
Applicant: GOOGLE INC.
Inventor: Seyed Vahab Mirrokni Banadaki, Raimondas Kiveris, Vibhor Rastogi, Silvio Lattanzi, Sergei Vassilvitskii
CPC classification number: H04L67/10, G06F9/5066, G06F9/546, G06Q10/06
Abstract: Systems and methods for improving the time and cost to calculate connected components in a distributed graph are disclosed. One method includes reducing a quantity of map-reduce rounds used to determine a cluster assignment for a node in a large distributed graph by alternating between two hashing functions in the map stage of a map-reduce round and storing the cluster assignment for the node in a memory. Another method includes reducing a quantity of messages sent during map-reduce rounds by performing a predetermined quantity of rounds to generate, for each node, a set of potential cluster assignments, generating a data structure in memory to store a mapping between each node and its potential cluster assignment, and using the data structure during remaining map-reduce rounds, wherein the remaining map-reduce rounds do not send messages between nodes. The method can also include storing the cluster assignment for the node in a memory.
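The first method in the abstract (alternating between two hashing functions across map-reduce rounds) can be illustrated with a small single-process sketch. The abstract does not name the two functions; this sketch assumes they behave like the "large-star" and "small-star" operations known from the MapReduce connected-components literature. The function names `large_star`, `small_star`, and `connected_components` are hypothetical, and the code simulates the rounds in one process rather than on a distributed system.

```python
from collections import defaultdict


def _neighbors(edges):
    """Build an undirected adjacency map from (u, v) pairs."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def large_star(edges):
    """Assumed first function: link each strictly larger neighbor of u
    to the minimum node in u's closed neighborhood."""
    out = set()
    for u, nbrs in _neighbors(edges).items():
        m = min(nbrs | {u})
        for v in nbrs:
            if v > u:
                out.add((v, m))
    return out


def small_star(edges):
    """Assumed second function: link u and its smaller neighbors
    to the minimum of those smaller neighbors."""
    smaller = defaultdict(set)
    for u, v in edges:
        hi, lo = (u, v) if u > v else (v, u)
        smaller[hi].add(lo)
    out = set()
    for u, nbrs in smaller.items():
        m = min(nbrs)
        for v in nbrs | {u}:
            if v != m:
                out.add((v, m))
    return out


def connected_components(edges):
    """Alternate the two operations until the edge set stops changing,
    then read each node's cluster as the smallest node it touches."""
    current = {(max(u, v), min(u, v)) for u, v in edges if u != v}
    while True:
        updated = small_star(large_star(current))  # one alternating round
        if updated == current:
            break
        current = updated
    return {u: min(nbrs | {u}) for u, nbrs in _neighbors(current).items()}


if __name__ == "__main__":
    print(connected_components([(1, 2), (2, 3), (3, 4), (5, 6)]))
```

In this sketch every cluster is identified by its smallest node ID, and nodes that appear in no edge are not reported. A real deployment would run each operation as a map-reduce job keyed by node ID.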
-
Publication number: US20150006619A1
Publication date: 2015-01-01
Application number: US14143894
Filing date: 2013-12-30
Applicant: GOOGLE INC.
Inventor: Seyed Vahab Mirrokni Banadaki, Raimondas Kiveris, Vibhor Rastogi, Silvio Lattanzi, Sergei Vassilvitskii
IPC: H04L29/08
CPC classification number: H04L67/10, G06F9/5066, G06F9/546, G06Q10/06
Abstract: Systems and methods for improving the time and cost to calculate connected components in a distributed graph are disclosed. One method includes reducing a quantity of map-reduce rounds used to determine a cluster assignment for a node in a large distributed graph by alternating between two hashing functions in the map stage of a map-reduce round and storing the cluster assignment for the node in a memory. Another method includes reducing a quantity of messages sent during map-reduce rounds by performing a predetermined quantity of rounds to generate, for each node, a set of potential cluster assignments, generating a data structure in memory to store a mapping between each node and its potential cluster assignment, and using the data structure during remaining map-reduce rounds, wherein the remaining map-reduce rounds do not send messages between nodes. The method can also include storing the cluster assignment for the node in a memory.
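The second method in the abstract (a fixed number of message-passing rounds, followed by rounds that only consult an in-memory mapping) can be sketched in a similar single-process way. The abstract does not specify the data structure or the per-round rule, so this sketch makes two assumptions: the initial rounds use plain minimum-label propagation, and the in-memory structure is a union-find table keyed by candidate labels. The names `candidate_labels` and `finish_in_memory` are hypothetical.

```python
from collections import defaultdict


def candidate_labels(edges, rounds):
    """Run a predetermined number of propagation rounds. Each round, every
    node adopts the smallest label among itself and its neighbors."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    label = {u: u for u in adj}
    for _ in range(rounds):
        label = {u: min([label[u]] + [label[v] for v in adj[u]]) for u in adj}
    return label


def finish_in_memory(edges, label):
    """Resolve final clusters using only a local table (assumed union-find)
    over the candidate labels; no further node-to-node messages."""
    parent = {c: c for c in set(label.values())}

    def find(c):
        while parent[c] != c:
            parent[c] = parent[parent[c]]  # path halving
            c = parent[c]
        return c

    for u, v in edges:
        a, b = find(label[u]), find(label[v])
        if a != b:
            parent[max(a, b)] = min(a, b)  # keep the smaller label as root
    return {u: find(c) for u, c in label.items()}


if __name__ == "__main__":
    edges = [(1, 5), (5, 6), (6, 2), (7, 8)]
    labels = candidate_labels(edges, rounds=1)
    print(finish_in_memory(edges, labels))
```

The table only needs one entry per distinct candidate label, so the more propagation rounds run first, the smaller the structure that has to fit in memory.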
-