-
Publication No.: US20190342372A1
Publication date: 2019-11-07
Application No.: US15968637
Filing date: 2018-05-01
Applicant: Oracle International Corporation
Inventor: Jinsu Lee, Thomas Manhardt, Sungpack Hong, Petr Koupy, Hassan Chafi, Vasileios Trigonakis
Abstract: Techniques are described herein for evaluating graph processing tasks using a multi-stage pipelining communication mechanism. In a multi-node system comprising a plurality of nodes, each node of said plurality of nodes executes a respective communication agent object that comprises three lambda functions. A sender lambda function is configured to: perform one or more sending operations, and generate source messages based on the one or more sending operations, each source message of said source messages being marked for a particular node of said plurality of nodes. An intermediate lambda function is configured to: read source messages marked for said each node and sent to said each node, perform one or more intermediate operations based on the one or more source messages, and generate intermediate messages based on the one or more intermediate operations, each intermediate message of said intermediate messages being marked for a particular node of said plurality of nodes. A final receiver lambda function is configured to: read intermediate messages marked for said each node and sent to said each node, perform one or more final operations based on the one or more intermediate messages, and generate a final result based on the one or more final operations. On each node of said plurality of nodes, the communication agent object is executed, wherein executing the communication agent object comprises executing said sender lambda function, said intermediate lambda function, and said final receiver lambda function.
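The three-stage message flow in this abstract can be illustrated with a small single-process simulation. This is only a sketch under stated assumptions, not the patented implementation: the function name `run_pipeline`, the `(destination, payload)` message shape, and the three example lambdas are all hypothetical, and real nodes would exchange messages over a network rather than through in-memory dictionaries.

```python
from collections import defaultdict

def run_pipeline(num_nodes, sender, intermediate, final_receiver):
    """Simulate sender -> intermediate -> final receiver on every node.

    Each stage function returns (destination_node, payload) pairs, which
    stands in for "messages marked for a particular node" (an assumption).
    """
    # Stage 1: every node runs its sender and emits source messages.
    source_inbox = defaultdict(list)
    for node in range(num_nodes):
        for dest, payload in sender(node):
            source_inbox[dest].append(payload)

    # Stage 2: every node reads the source messages addressed to it
    # and emits intermediate messages.
    intermediate_inbox = defaultdict(list)
    for node in range(num_nodes):
        for dest, payload in intermediate(node, source_inbox[node]):
            intermediate_inbox[dest].append(payload)

    # Stage 3: every node reads its intermediate messages and
    # produces a final result.
    return {node: final_receiver(node, intermediate_inbox[node])
            for node in range(num_nodes)}

# Hypothetical example lambdas: pass a value around a ring, scale it,
# then gather everything on node 0.
sender = lambda node: [((node + 1) % 3, node + 1)]
intermediate = lambda node, msgs: [(0, sum(msgs) * 10)]
final_receiver = lambda node, msgs: sum(msgs)

result = run_pipeline(3, sender, intermediate, final_receiver)
print(result)
```

In this toy run, node 0 ends up with the sum of all scaled values while the other nodes receive no intermediate messages.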
-
Publication No.: US10275287B2
Publication date: 2019-04-30
Application No.: US15175920
Filing date: 2016-06-07
Applicant: Oracle International Corporation
Inventor: Thomas Manhardt, Sungpack Hong, Siegfried Depner, Jinsu Lee, Nicholas Roth, Hassan Chafi
Abstract: Techniques are provided for dynamically self-balancing communication and computation. In an embodiment, each partition of application data is stored on a respective computer of a cluster. The application is divided into distributed jobs, each of which corresponds to a partition. Each distributed job is hosted on the computer that hosts the corresponding data partition. Each computer divides its distributed job into computation tasks. Each computer has a pool of threads that execute the computation tasks. During execution, one computer receives a data access request from another computer. The data access request is executed by a thread of the pool. Threads of the pool are bimodal and may be repurposed between communication and computation, depending on workload. Each computer individually detects completion of its computation tasks. Each computer informs a central computer that its distributed job has finished. The central computer detects when all distributed jobs of the application have terminated.
-
13.
Publication No.: US20180210761A1
Publication date: 2018-07-26
Application No.: US15413811
Filing date: 2017-01-24
Applicant: ORACLE INTERNATIONAL CORPORATION
Inventor: Jinsu Lee, Sungpack Hong, Siegfried Depner, Nicholas Roth, Thomas Manhardt, Hassan Chafi
CPC classification number: G06F9/5066
Abstract: Techniques herein provide job control and synchronization of distributed graph-processing jobs. In an embodiment, a computer system maintains an input queue of graph processing jobs. In response to de-queuing a graph processing job, a master thread partitions the graph processing job into distributed jobs. Each distributed job has a sequence of processing phases. The master thread sends each distributed job to a distributed processor. Each distributed job executes a first processing phase of its sequence of processing phases. To the master thread, the distributed job announces completion of its first processing phase. The master thread detects that all distributed jobs have announced finishing their first processing phase. The master thread broadcasts a notification to the distributed jobs that indicates that all distributed jobs have finished their first processing phase. Receiving that notification causes the distributed jobs to execute their second processing phase. Queues and barriers provide for faults and cancellation.
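The phase synchronization described in this abstract (workers announce completion, the master waits for all of them, then broadcasts permission to start the next phase) can be sketched with threads standing in for distributed processors. The names `run_job`, `done`, and `go` are hypothetical, and the fault handling and cancellation mentioned in the abstract are deliberately left out.

```python
import queue
import threading

def run_job(num_workers, phases):
    """Run each phase on every worker, with a master-driven barrier between phases."""
    done = queue.Queue()                       # workers -> master: "phase finished"
    go = [threading.Event() for _ in phases]   # master -> workers: one broadcast per phase
    log = []                                   # (phase_index, worker_id) in completion order
    log_lock = threading.Lock()

    def worker(worker_id):
        for index, phase in enumerate(phases):
            go[index].wait()                   # wait for the master's notification
            phase(worker_id)
            with log_lock:
                log.append((index, worker_id))
            done.put((index, worker_id))       # announce completion to the master

    threads = [threading.Thread(target=worker, args=(w,)) for w in range(num_workers)]
    for thread in threads:
        thread.start()

    # Master loop: release a phase, then collect one announcement per worker
    # before releasing the next phase.
    for index in range(len(phases)):
        go[index].set()
        for _ in range(num_workers):
            done.get()

    for thread in threads:
        thread.join()
    return log

log = run_job(3, [lambda w: None, lambda w: None])
print(log)
```

Because the master releases phase 2 only after receiving every phase 1 announcement, all phase 1 entries in `log` precede all phase 2 entries, whatever the thread scheduling.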
-
14.
Publication No.: US11907255B2
Publication date: 2024-02-20
Application No.: US17686938
Filing date: 2022-03-04
Applicant: Oracle International Corporation
Inventor: Jinsu Lee, Petr Koupy, Vasileios Trigonakis, Sungpack Hong, Hassan Chafi
CPC classification number: G06F16/27, G06F16/2282, G06F16/284
Abstract: In an embodiment, multiple computers cooperate to retrieve content from tables in a relational database. Each table contains respective rows. Each row contains a vertex of a graph. Many high-degree vertices are identified. Each high-degree vertex is connected to respective edges in the graph. A count of the edges of each high-degree vertex exceeds a degree threshold. A central computer detects that all vertices in a high-degree subset of tables are high-degree vertices. Based on detecting the high-degree subset of tables, multiple vertices of the graph that are not in the high-degree subset of tables are replicated. Within local storage capacity limits of the computers, this degree-based replication may be supplemented with other vertex replication strategies that are schema based, content based, or workload based. This intelligent selective replication maximizes system throughput by minimizing graph data access latency based on data locality.
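The detection step in this abstract (finding the subset of tables whose vertices all exceed the degree threshold) can be sketched as follows. This covers only that one step, under assumptions: the function name `high_degree_tables`, the in-memory dictionary layout, and the sample table names are hypothetical, and the replication that follows detection is not modeled.

```python
def high_degree_tables(tables, degree_threshold):
    """Return the names of tables in which every vertex is high-degree.

    tables maps a table name to {vertex_id: edge_count} (an assumed layout).
    A vertex is high-degree when its edge count exceeds the threshold.
    Empty tables are excluded, which is a choice made for this sketch.
    """
    return {
        name
        for name, degrees in tables.items()
        if degrees and all(count > degree_threshold for count in degrees.values())
    }

# Hypothetical sample data: every vertex in "hubs" exceeds the threshold,
# while "users" contains one low-degree vertex.
tables = {
    "hubs": {"a": 50, "b": 70},
    "users": {"u1": 2, "u2": 90},
}
subset = high_degree_tables(tables, degree_threshold=10)
print(subset)
```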
-
15.
Publication No.: US20230281219A1
Publication date: 2023-09-07
Application No.: US17686938
Filing date: 2022-03-04
Applicant: Oracle International Corporation
Inventor: Jinsu Lee, Petr Koupy, Vasileios Trigonakis, Sungpack Hong, Hassan Chafi
CPC classification number: G06F16/27, G06F16/284, G06F16/2282
Abstract: In an embodiment, multiple computers cooperate to retrieve content from tables in a relational database. Each table contains respective rows. Each row contains a vertex of a graph. Many high-degree vertices are identified. Each high-degree vertex is connected to respective edges in the graph. A count of the edges of each high-degree vertex exceeds a degree threshold. A central computer detects that all vertices in a high-degree subset of tables are high-degree vertices. Based on detecting the high-degree subset of tables, multiple vertices of the graph that are not in the high-degree subset of tables are replicated. Within local storage capacity limits of the computers, this degree-based replication may be supplemented with other vertex replication strategies that are schema based, content based, or workload based. This intelligent selective replication maximizes system throughput by minimizing graph data access latency based on data locality.
-
Publication No.: US20230237047A1
Publication date: 2023-07-27
Application No.: US17585117
Filing date: 2022-01-26
Applicant: Oracle International Corporation
Inventor: Vasileios Trigonakis, Paul Renauld, Jinsu Lee, Petr Koupy, Sungpack Hong, Hassan Chafi
IPC: G06F16/23, G06F16/901
CPC classification number: G06F16/2379, G06F16/9024
Abstract: Data structures and methods are described for applying mutations on a distributed graph in a fast and memory-efficient manner. Nodes in a distributed graph processing system may store graph information such as vertices, edges, properties, vertex keys, vertex degree counts, and other information in graph arrays, which are divided into shared arrays and delta logs. The shared arrays on a local node remain immutable and are the starting point of a graph, on top of which mutations build new snapshots. Mutations may be supported at both the entity and table levels. Periodic delta log consolidation may occur at multiple levels to prevent excessive delta log buildup. Consolidation at the table level may also trigger rebalancing of vertices across the nodes.
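The split between an immutable shared array and a delta log can be illustrated with a toy snapshot structure. This is a simplified sketch, not the data structure claimed in the patent: the class name `GraphArray` is hypothetical, the delta log is modeled as a plain dictionary of overwritten slots, and only in-place value changes (no insertions, deletions, or rebalancing) are supported.

```python
class GraphArray:
    """An immutable base array plus a delta log of overwritten slots."""

    def __init__(self, shared):
        self.shared = tuple(shared)  # never modified after construction
        self.delta = {}              # index -> new value (assumed log shape)

    def mutate(self, index, value):
        """Return a new snapshot; this snapshot and the base stay unchanged."""
        snapshot = GraphArray.__new__(GraphArray)
        snapshot.shared = self.shared            # the base is shared, not copied
        snapshot.delta = {**self.delta, index: value}
        return snapshot

    def get(self, index):
        # The delta log takes precedence over the shared array.
        return self.delta.get(index, self.shared[index])

    def consolidate(self):
        """Fold the delta log into a fresh base array with an empty log."""
        return GraphArray(self.get(i) for i in range(len(self.shared)))

base = GraphArray([1, 2, 3])
snap = base.mutate(1, 20)
merged = snap.consolidate()
print(base.get(1), snap.get(1), merged.shared)
```

Reads on the older snapshot keep returning the original values, which is what lets several snapshots share one base array.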
-
Publication No.: US10534657B2
Publication date: 2020-01-14
Application No.: US15607985
Filing date: 2017-05-30
Applicant: Oracle International Corporation
Inventor: Siegfried Depner, Sungpack Hong, Thomas Manhardt, Jinsu Lee, Nicholas Roth, Hassan Chafi
Abstract: Techniques minimize communication while loading a graph. In a distributed embodiment, each computer loads some edges of the graph. Each edge connects a source vertex (SV) to a destination vertex. For each SV of the edges, the computer hashes the SV to detect a tracking computer (TrC) that tracks which computer the SV resides on. Each computer informs the TrC that the SV originates an edge that resides on that computer. For each SV, the TrC detects that the SV originates edges that reside on multiple providing computers (PCs). The TrC selects a target computer (TaC) from the multiple PCs to host the SV. The TrC instructs each PC, excluding the TaC, to transfer the SV and related edges that are connected to the SV to the TaC. A vertex's internal identifier indicates which computer hosts the vertex. The TrC maintains a mapping between external and internal identifiers.
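The hashing and transfer-planning steps in this abstract can be sketched in a single process. Several details are assumptions made for illustration: the function names `tracking_computer` and `plan_transfers` are hypothetical, CRC32 stands in for whatever hash the system uses, and the target computer is chosen as the lowest-numbered provider because the abstract does not state a selection policy. The internal-identifier mapping is not modeled.

```python
import zlib
from collections import defaultdict

def tracking_computer(source_vertex, num_computers):
    """Hash a source vertex to the computer that tracks it (CRC32 is an assumption)."""
    return zlib.crc32(str(source_vertex).encode()) % num_computers

def plan_transfers(loaded_edges, num_computers):
    """Decide a host for each source vertex and list the transfers needed.

    loaded_edges maps a computer id to the (source, destination) edges it loaded.
    Returns (hosts, transfers), where transfers holds
    (providing_computer, target_computer, source_vertex) triples.
    """
    # Each computer reports its source vertices to the responsible tracker.
    reports = defaultdict(lambda: defaultdict(set))
    for computer, edges in loaded_edges.items():
        for source, _destination in edges:
            tracker = tracking_computer(source, num_computers)
            reports[tracker][source].add(computer)

    hosts, transfers = {}, []
    for tracker in reports:
        for source, providers in reports[tracker].items():
            target = min(providers)  # assumed policy: lowest-numbered provider
            hosts[source] = target
            transfers.extend(
                (provider, target, source)
                for provider in sorted(providers)
                if provider != target
            )
    return hosts, transfers

loaded = {
    0: [("a", "b"), ("c", "a")],
    1: [("a", "c")],
    2: [("c", "b"), ("a", "d")],
}
hosts, transfers = plan_transfers(loaded, num_computers=3)
print(hosts, sorted(transfers))
```

Only the non-target providers appear in `transfers`, so a source vertex whose edges were all loaded by one computer causes no communication at all.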
-
Publication No.: US10318355B2
Publication date: 2019-06-11
Application No.: US15413811
Filing date: 2017-01-24
Applicant: ORACLE INTERNATIONAL CORPORATION
Inventor: Jinsu Lee, Sungpack Hong, Siegfried Depner, Nicholas Roth, Thomas Manhardt, Hassan Chafi
Abstract: Techniques herein provide job control and synchronization of distributed graph-processing jobs. In an embodiment, a computer system maintains an input queue of graph processing jobs. In response to de-queuing a graph processing job, a master thread partitions the graph processing job into distributed jobs. Each distributed job has a sequence of processing phases. The master thread sends each distributed job to a distributed processor. Each distributed job executes a first processing phase of its sequence of processing phases. To the master thread, the distributed job announces completion of its first processing phase. The master thread detects that all distributed jobs have announced finishing their first processing phase. The master thread broadcasts a notification to the distributed jobs that indicates that all distributed jobs have finished their first processing phase. Receiving that notification causes the distributed jobs to execute their second processing phase. Queues and barriers provide for faults and cancellation.
-
19.
Publication No.: US20180352026A1
Publication date: 2018-12-06
Application No.: US15607985
Filing date: 2017-05-30
Applicant: Oracle International Corporation
Inventor: Siegfried Depner, Sungpack Hong, Thomas Manhardt, Jinsu Lee, Nicholas Roth, Hassan Chafi
Abstract: Techniques minimize communication while loading a graph. In a distributed embodiment, each computer loads some edges of the graph. Each edge connects a source vertex (SV) to a destination vertex. For each SV of the edges, the computer hashes the SV to detect a tracking computer (TrC) that tracks which computer the SV resides on. Each computer informs the TrC that the SV originates an edge that resides on that computer. For each SV, the TrC detects that the SV originates edges that reside on multiple providing computers (PCs). The TrC selects a target computer (TaC) from the multiple PCs to host the SV. The TrC instructs each PC, excluding the TaC, to transfer the SV and related edges that are connected to the SV to the TaC. A vertex's internal identifier indicates which computer hosts the vertex. The TrC maintains a mapping between external and internal identifiers.
-