Fast in-memory technique to build a reverse CSR graph index in an RDBMS

    Publication No.: US11537579B2

    Publication Date: 2022-12-27

    Application No.: US16816686

    Application Date: 2020-03-12

    Abstract: In an embodiment, a computer obtains a mapping of a relational schema of a database to a graph data model. The relational schema identifies vertex table(s) that correspond to vertex type(s) in the graph data model and edge table(s) that correspond to edge type(s) in the graph data model. Each edge type is associated with a source vertex type and a target vertex type. Based on that mapping, a forward compressed sparse row (CSR) representation is populated for forward traversal of edges of a same edge type. Each edge originates at a source vertex and terminates at a target vertex. Based on the forward CSR representation, a reverse CSR representation of the edge type is populated for reverse traversal of the edges of the edge type. Acceleration occurs in two ways. Values calculated for the forward CSR are reused for the reverse CSR. Elastic and inelastic scaling may occur.
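The abstract describes populating a reverse CSR from an already built forward CSR. The following is only a minimal single-threaded sketch of that general idea, not the patented method: it assumes the forward CSR is held as two plain lists, and the names `build_reverse_csr`, `src_offsets`, `dst`, and `num_targets` are hypothetical. It uses an ordinary counting pass plus a prefix sum, so the forward arrays are the only input and the edge table is never read again.

```python
from typing import List, Tuple


def build_reverse_csr(
    src_offsets: List[int], dst: List[int], num_targets: int
) -> Tuple[List[int], List[int]]:
    """Derive a reverse CSR (target -> sources) from a forward CSR.

    src_offsets has one entry per source vertex plus a final sentinel;
    dst[src_offsets[s]:src_offsets[s + 1]] lists the targets of source s.
    All names here are illustrative assumptions.
    """
    # Pass 1: count incoming edges per target vertex.
    rev_offsets = [0] * (num_targets + 1)
    for t in dst:
        rev_offsets[t + 1] += 1

    # Prefix sum turns the counts into start offsets.
    for i in range(num_targets):
        rev_offsets[i + 1] += rev_offsets[i]

    # Pass 2: scatter each source into the slot range of its target.
    cursor = rev_offsets[:-1].copy()  # next free slot per target
    rev_src = [0] * len(dst)
    for s in range(len(src_offsets) - 1):
        for e in range(src_offsets[s], src_offsets[s + 1]):
            t = dst[e]
            rev_src[cursor[t]] = s
            cursor[t] += 1
    return rev_offsets, rev_src


# Edges 0->1, 0->2, 1->2, 2->0 stored as a forward CSR.
rev_offsets, rev_src = build_reverse_csr([0, 2, 3, 4], [1, 2, 2, 0], 3)
print(rev_offsets, rev_src)
```

In this sketch the sources of target `t` are found in `rev_src[rev_offsets[t]:rev_offsets[t + 1]]`, which is what makes reverse traversal as cheap as forward traversal.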

    PARALLEL AND EFFICIENT TECHNIQUE FOR BUILDING AND MAINTAINING A MAIN MEMORY CSR BASED GRAPH INDEX IN A RDBMS

    Publication No.: US20210334249A1

    Publication Date: 2021-10-28

    Application No.: US17370418

    Application Date: 2021-07-08

    Abstract: Herein are techniques that concurrently populate entries in a compressed sparse row (CSR) encoding of a type of edge of a heterogeneous graph. In an embodiment, a computer obtains a mapping of a relational schema to a graph data model. The relational schema defines vertex tables that correspond to vertex types in the graph data model, and edge tables that correspond to edge types in the graph data model. Each edge type is associated with a source vertex type and a target vertex type. For each vertex type, a sequence of persistent identifiers of vertices is obtained. Based on the mapping and for a CSR representation of each edge type, a source array is populated that, for a same vertex ordering as the sequence of persistent identifiers for the source vertex type, is based on counts of edges of the edge type that originate from vertices of the source vertex type. For the CSR, the computer populates, in parallel and based on said mapping, a destination array that contains canonical offsets as sequence positions within the sequence of persistent identifiers of the vertices.
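The source and destination arrays named in the abstract can be illustrated with a small sketch. It is an assumption-laden toy, not the claimed technique: edge rows are given as `(source_id, target_id)` pairs of persistent identifiers, the function name `build_forward_csr` and its parameters are hypothetical, and Python threads stand in for whatever parallel workers a real RDBMS would use. Each worker writes a disjoint slice of the destination array, so no locking is needed.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import accumulate
from typing import Hashable, List, Sequence, Tuple


def build_forward_csr(
    src_ids: Sequence[Hashable],
    dst_ids: Sequence[Hashable],
    edges: Sequence[Tuple[Hashable, Hashable]],
    workers: int = 2,
) -> Tuple[List[int], List[int]]:
    """Build (src_offsets, dst) for one edge type; names are illustrative."""
    # Position of each persistent identifier within its vertex sequence.
    src_pos = {pid: i for i, pid in enumerate(src_ids)}
    dst_pos = {pid: i for i, pid in enumerate(dst_ids)}

    # Group target positions by source position.
    per_source: List[List[int]] = [[] for _ in src_ids]
    for s, t in edges:
        per_source[src_pos[s]].append(dst_pos[t])

    # Source array: running total of per-source edge counts.
    src_offsets = [0] + list(accumulate(len(n) for n in per_source))

    # Destination array: each task fills only its own slice.
    dst = [0] * len(edges)

    def fill(i: int) -> None:
        start = src_offsets[i]
        dst[start:start + len(per_source[i])] = per_source[i]

    with ThreadPoolExecutor(max_workers=workers) as pool:
        list(pool.map(fill, range(len(src_ids))))
    return src_offsets, dst


offsets, dst = build_forward_csr(
    [10, 20, 30], [7, 8], [(20, 8), (10, 7), (10, 8), (30, 7)]
)
print(offsets, dst)
```

The destination entries are positions within `dst_ids` rather than the identifiers themselves, which mirrors the abstract's "sequence positions within the sequence of persistent identifiers".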

    FAST IN-MEMORY TECHNIQUE TO BUILD A REVERSE CSR GRAPH INDEX IN AN RDBMS

    Publication No.: US20210286790A1

    Publication Date: 2021-09-16

    Application No.: US16816686

    Application Date: 2020-03-12

    Abstract: In an embodiment, a computer obtains a mapping of a relational schema of a database to a graph data model. The relational schema identifies vertex table(s) that correspond to vertex type(s) in the graph data model and edge table(s) that correspond to edge type(s) in the graph data model. Each edge type is associated with a source vertex type and a target vertex type. Based on that mapping, a forward compressed sparse row (CSR) representation is populated for forward traversal of edges of a same edge type. Each edge originates at a source vertex and terminates at a target vertex. Based on the forward CSR representation, a reverse CSR representation of the edge type is populated for reverse traversal of the edges of the edge type. Acceleration occurs in two ways. Values calculated for the forward CSR are reused for the reverse CSR. Elastic and inelastic scaling may occur.

    PARALLEL AND EFFICIENT TECHNIQUE FOR BUILDING AND MAINTAINING A MAIN MEMORY CSR BASED GRAPH INDEX IN A RDBMS

    Publication No.: US20210224235A1

    Publication Date: 2021-07-22

    Application No.: US16747827

    Application Date: 2020-01-21

    Abstract: Herein are techniques that concurrently populate entries in a compressed sparse row (CSR) encoding of a type of edge of a heterogeneous graph. In an embodiment, a computer obtains a mapping of a relational schema to a graph data model. The relational schema defines vertex tables that correspond to vertex types in the graph data model, and edge tables that correspond to edge types in the graph data model. Each edge type is associated with a source vertex type and a target vertex type. For each vertex type, a sequence of persistent identifiers of vertices is obtained. Based on the mapping and for a CSR representation of each edge type, a source array is populated that, for a same vertex ordering as the sequence of persistent identifiers for the source vertex type, is based on counts of edges of the edge type that originate from vertices of the source vertex type. For the CSR, the computer populates, in parallel and based on said mapping, a destination array that contains canonical offsets as sequence positions within the sequence of persistent identifiers of the vertices.

    Exploiting Intra-Process And Inter-Process Parallelism To Accelerate In-Memory Processing Of Graph Queries In Relational Databases

    Publication No.: US20250138874A1

    Publication Date: 2025-05-01

    Application No.: US18384743

    Application Date: 2023-10-27

    Abstract: The illustrative embodiments provide techniques that utilize graph topology information to partition work according to ranges of vertices so that each unit of work can be computed independently by different worker processes (inter-process parallelism). The illustrative embodiments also provide an approach for decomposing the graph neighbor matching operations and the property projection operation into fine-grained, configurable-size tasks that can be processed independently by threads (intra-process parallelism) without the need for expensive synchronization primitives. For graph neighbor matching operations, a given set of source vertices is split into smaller tasks that are assigned to dedicated threads for processing. Each thread is responsible for computing a number of matching source vertices and propagating them to the next graph match operator for further processing. For property projection operations, the computed graph paths are organized into rows that contain the requested properties for each element of the path (vertices and/or edges).
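The task decomposition for neighbor matching might look roughly like the sketch below. It is a simplified illustration under stated assumptions, not the embodiment itself: the graph is a forward CSR held in two lists, only the intra-process (thread) level is shown, and `split_into_tasks`, `match_neighbors`, `task_size`, and `predicate` are hypothetical names. Each task covers a contiguous range of source vertices and returns its own result list, so threads never share mutable state.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Tuple


def split_into_tasks(num_vertices: int, task_size: int) -> List[Tuple[int, int]]:
    """Cut [0, num_vertices) into half-open ranges of at most task_size."""
    return [
        (lo, min(lo + task_size, num_vertices))
        for lo in range(0, num_vertices, task_size)
    ]


def match_neighbors(
    src_offsets: List[int],
    dst: List[int],
    predicate: Callable[[int, int], bool],
    task_size: int = 2,
    workers: int = 2,
) -> List[Tuple[int, int]]:
    """Return (source, target) pairs whose edge satisfies predicate."""

    def run(task: Tuple[int, int]) -> List[Tuple[int, int]]:
        lo, hi = task
        # Each task scans only the edges of its own source range.
        return [
            (s, dst[e])
            for s in range(lo, hi)
            for e in range(src_offsets[s], src_offsets[s + 1])
            if predicate(s, dst[e])
        ]

    tasks = split_into_tasks(len(src_offsets) - 1, task_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = list(pool.map(run, tasks))  # results keep task order
    # Concatenated matches would feed the next match operator.
    return [pair for part in parts for pair in part]


# Edges 0->1, 0->2, 1->2, 2->0; keep edges that end at vertex 2.
matches = match_neighbors([0, 2, 3, 4], [1, 2, 2, 0], lambda s, t: t == 2)
print(matches)
```

Because `pool.map` yields results in task order, the output is deterministic regardless of which thread finishes first; `task_size` plays the role of the configurable task granularity.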
