Access-frequency-based entity replication techniques for distributed property graphs with schema

    Publication number: US11907255B2

    Publication date: 2024-02-20

    Application number: US17686938

    Filing date: 2022-03-04

    CPC classification number: G06F16/27 G06F16/2282 G06F16/284

    Abstract: In an embodiment, multiple computers cooperate to retrieve content from tables in a relational database. Each table contains respective rows. Each row contains a vertex of a graph. Many high-degree vertices are identified. Each high-degree vertex is connected to respective edges in the graph. A count of the edges of each high-degree vertex exceeds a degree threshold. A central computer detects that all vertices in a high-degree subset of tables are high-degree vertices. Based on detecting the high-degree subset of tables, multiple vertices of the graph that are not in the high-degree subset of tables are replicated. Within local storage capacity limits of the computers, this degree-based replication may be supplemented with other vertex replication strategies that are schema based, content based, or workload based. This intelligent selective replication maximizes system throughput by minimizing graph data access latency based on data locality.
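
The detection step in this abstract can be illustrated with a minimal sketch. It is not the patented implementation: the function names, the in-memory dictionaries standing in for relational tables, the sample data, and the choice to skip empty tables are all assumptions made for illustration. The abstract also does not say how vertices are chosen from the remaining tables, so the sketch only lists that pool.

```python
from typing import Dict, List, Set


def high_degree_tables(
    tables: Dict[str, List[str]],
    degree: Dict[str, int],
    threshold: int,
) -> Set[str]:
    """Return names of tables whose vertices ALL exceed the degree threshold.

    Assumption: an empty table is never treated as high-degree.
    """
    return {
        name
        for name, vertices in tables.items()
        if vertices and all(degree.get(v, 0) > threshold for v in vertices)
    }


def vertices_outside(tables: Dict[str, List[str]], selected: Set[str]) -> List[str]:
    """List vertices stored in tables that are not in the selected subset."""
    return [v for name, vs in tables.items() if name not in selected for v in vs]


# Hypothetical data: table name -> vertex ids, and vertex id -> edge count.
tables = {"airports": ["a1", "a2"], "passengers": ["p1", "p2", "p3"]}
degree = {"a1": 500, "a2": 320, "p1": 2, "p2": 150, "p3": 1}

hot = high_degree_tables(tables, degree, threshold=100)
pool = vertices_outside(tables, hot)
```

Here `passengers` is not a high-degree table even though `p2` exceeds the threshold, because the abstract requires every vertex in the table to be high-degree.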

    ACCESS-FREQUENCY-BASED ENTITY REPLICATION TECHNIQUES FOR DISTRIBUTED PROPERTY GRAPHS WITH SCHEMA

    Publication number: US20230281219A1

    Publication date: 2023-09-07

    Application number: US17686938

    Filing date: 2022-03-04

    CPC classification number: G06F16/27 G06F16/284 G06F16/2282

    Abstract: In an embodiment, multiple computers cooperate to retrieve content from tables in a relational database. Each table contains respective rows. Each row contains a vertex of a graph. Many high-degree vertices are identified. Each high-degree vertex is connected to respective edges in the graph. A count of the edges of each high-degree vertex exceeds a degree threshold. A central computer detects that all vertices in a high-degree subset of tables are high-degree vertices. Based on detecting the high-degree subset of tables, multiple vertices of the graph that are not in the high-degree subset of tables are replicated. Within local storage capacity limits of the computers, this degree-based replication may be supplemented with other vertex replication strategies that are schema based, content based, or workload based. This intelligent selective replication maximizes system throughput by minimizing graph data access latency based on data locality.

    FAST AND MEMORY-EFFICIENT DISTRIBUTED GRAPH MUTATIONS

    Publication number: US20230237047A1

    Publication date: 2023-07-27

    Application number: US17585117

    Filing date: 2022-01-26

    CPC classification number: G06F16/2379 G06F16/9024

    Abstract: Data structures and methods are described for applying mutations on a distributed graph in a fast and memory-efficient manner. Nodes in a distributed graph processing system may store graph information such as vertices, edges, properties, vertex keys, vertex degree counts, and other information in graph arrays, which are divided into shared arrays and delta logs. The shared arrays on a local node remain immutable and are the starting point of a graph, on top of which mutations build new snapshots. Mutations may be supported at both the entity and table levels. Periodic delta log consolidation may occur at multiple levels to prevent excessive delta log buildup. Consolidation at the table level may also trigger rebalancing of vertices across the nodes.
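
The split between an immutable shared array and delta logs can be sketched as follows. This is a simplified illustration, not the data structure from the application: the `GraphArray` class, its method names, and the restriction to in-place value updates (no insertions or deletions) are assumptions. The array could hold, for example, per-vertex degree counts.

```python
from typing import Dict, Tuple


class GraphArray:
    """An immutable base array plus an ordered log of deltas (hypothetical)."""

    def __init__(
        self,
        shared: Tuple[int, ...],
        deltas: Tuple[Dict[int, int], ...] = (),
    ) -> None:
        self.shared = shared  # never modified after creation
        self.deltas = deltas  # oldest delta first

    def mutate(self, changes: Dict[int, int]) -> "GraphArray":
        """Return a new snapshot; the shared array is reused, not copied."""
        return GraphArray(self.shared, self.deltas + (dict(changes),))

    def get(self, index: int) -> int:
        """Read a value, preferring the newest delta that touches the index."""
        for delta in reversed(self.deltas):
            if index in delta:
                return delta[index]
        return self.shared[index]

    def consolidate(self) -> "GraphArray":
        """Fold all deltas into a fresh base array with an empty log."""
        merged = list(self.shared)
        for delta in self.deltas:
            for index, value in delta.items():
                merged[index] = value
        return GraphArray(tuple(merged))


base = GraphArray((3, 1, 4))
snap1 = base.mutate({1: 7})
snap2 = snap1.mutate({1: 9, 2: 0})
compact = snap2.consolidate()
```

Older snapshots stay readable after later mutations because each snapshot only adds to the log, and consolidation keeps reads from walking an ever longer chain of deltas.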
