SPACE-EFFICIENT METHODOLOGY FOR REPRESENTING LABEL INFORMATION IN LARGE GRAPH DATA FOR FAST DISTRIBUTED GRAPH QUERY

    Publication No.: US20200073868A1

    Publication date: 2020-03-05

    Application No.: US16378424

    Filing date: 2019-04-08

    Abstract: Techniques are described herein for space-efficient encoding of label information of property graphs. In an embodiment, an input graph is received. The input graph comprises a plurality of entities and a plurality of label sets. Each entity of said plurality of entities is associated with a label set of the plurality of label sets and each label set of the plurality of label sets comprises zero or more labels of a plurality of labels. A first mapping is generated that maps each label of the plurality of labels to a label code. A second mapping is generated that maps each label integer set of a plurality of label integer sets to a label code. Each label integer set of the plurality of label integer sets corresponds to a label set of the plurality of label sets, wherein each label integer set of the plurality of label integer sets comprises label codes from the first mapping that are mapped to each label included in the corresponding label set. A compressed label set is generated for each entity of the plurality of entities. Each compressed label set comprises a plurality of bits that indicate a zeroth state, a first state, a second state, or a third state. The compressed label sets and the first and second mappings are used to efficiently evaluate graph label queries.
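The two mappings in this abstract can be illustrated with a minimal Python sketch. The names `build_label_mappings`, `entities_with_label`, and the input shape are hypothetical. The abstract does not say how the four bit states are laid out, so the sketch stops at giving each entity the integer code of its label set. It then shows how a label query might be evaluated against the distinct label sets first.

```python
def build_label_mappings(entity_labels):
    """entity_labels: dict of entity -> iterable of label strings (assumed input shape)."""
    label_to_code = {}   # first mapping: label -> integer code
    set_to_code = {}     # second mapping: set of label codes -> integer code
    entity_code = {}     # per entity: code of its label set
    for entity, labels in entity_labels.items():
        codes = frozenset(
            label_to_code.setdefault(label, len(label_to_code))
            for label in sorted(labels)
        )
        entity_code[entity] = set_to_code.setdefault(codes, len(set_to_code))
    return label_to_code, set_to_code, entity_code


def entities_with_label(label, label_to_code, set_to_code, entity_code):
    """Return the entities whose label set contains `label`."""
    code = label_to_code.get(label)
    if code is None:
        return []
    # Test the label once per distinct label set, not once per entity.
    matching = {sc for codes, sc in set_to_code.items() if code in codes}
    return [e for e, sc in entity_code.items() if sc in matching]
```

Because many entities usually share the same label set, each entity only needs to carry a small integer, and the set-membership test is done once per distinct set.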

    Efficient, in-memory, relational representation for heterogeneous graphs

    Publication No.: US11120082B2

    Publication date: 2021-09-14

    Application No.: US15956115

    Filing date: 2018-04-18

    Abstract: Techniques are provided herein for efficient representation of heterogeneous graphs in memory. In an embodiment, vertices and edges of the graph are segregated by type. Each property of a type of vertex or edge has values stored in a respective vector. Directed or undirected edges of a same type are stored in compressed sparse row (CSR) format. The CSR format is more or less repeated for edge traversal in either forward or reverse direction. An edge map translates edge offsets obtained from traversal in the reverse direction for use with data structures that expect edge offsets in the forward direction. Subsequent filtration and/or traversal by type or property of vertex or edge entails minimal data access and maximal data locality, thereby increasing efficient use of the graph.
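As an illustration of the forward CSR, the reverse CSR, and the edge map described in this abstract, here is a minimal Python sketch for a single edge type. The function names, the dictionary layout, and the `weight` property vector are assumptions made for the example, not the patented data structures.

```python
def build_csr(num_vertices, edges):
    """Build forward and reverse CSR arrays for one edge type.

    edges: list of (src, dst) pairs; forward edge offsets follow source order.
    """
    def offsets(keys):
        # begin[v]..begin[v + 1] delimits the edges grouped under vertex v.
        begin = [0] * (num_vertices + 1)
        for k in keys:
            begin[k + 1] += 1
        for v in range(num_vertices):
            begin[v + 1] += begin[v]
        return begin

    fwd = sorted(edges, key=lambda e: e[0])   # stable sort by source vertex
    fwd_src = [s for s, _ in fwd]
    fwd_dst = [d for _, d in fwd]
    # Reverse direction: forward offsets reordered by destination vertex.
    edge_map = sorted(range(len(fwd)), key=lambda f: fwd_dst[f])
    return {
        "fwd_begin": offsets(fwd_src),
        "fwd_dst": fwd_dst,
        "rev_begin": offsets(fwd_dst),
        "rev_src": [fwd_src[f] for f in edge_map],
        "edge_map": edge_map,   # reverse offset -> forward offset
    }


def in_edges(csr, v):
    """Yield (source vertex, forward edge offset) for each edge arriving at v."""
    for r in range(csr["rev_begin"][v], csr["rev_begin"][v + 1]):
        yield csr["rev_src"][r], csr["edge_map"][r]


csr = build_csr(3, [(0, 1), (0, 2), (1, 2), (2, 0)])
weight = [10, 20, 30, 40]   # edge property vector, indexed by forward offset
incoming = [(src, weight[f]) for src, f in in_edges(csr, 2)]
```

The edge property vector is stored once, in forward order. A reverse traversal reaches it through `edge_map`, so no second copy of the properties is needed.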

    FAST PROCESSING OF PATH-FINDING QUERIES IN LARGE GRAPH DATABASES

    Type: Invention application (under examination, published)

    Publication No.: US20170060958A1

    Publication date: 2017-03-02

    Application No.: US14837696

    Filing date: 2015-08-27

    Abstract: Techniques herein are for fast processing of path-finding queries in large graph databases. A computer system receives a graph search request to find a set of result paths between one or more source vertices of a graph and one or more target vertices of the graph. The graph comprises vertices connected by edges. During a first pass, the computer system performs one or more breadth-first searches to identify a subset of edges of the graph. The one or more breadth-first searches originate at the one or more source vertices. After the first pass and during a second pass, the computer system performs one or more depth-first searches to identify the set of result paths. The one or more depth-first searches originate at the one or more target vertices. The one or more depth-first searches traverse at most the subset of edges of the graph.
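The two-pass scheme in this abstract can be sketched in Python as follows. This is an illustration under assumptions, not the claimed method: `find_paths` is a hypothetical name, result paths are restricted to simple paths, and the first pass keeps every edge that leaves a vertex reached by the breadth-first search.

```python
from collections import defaultdict, deque


def find_paths(edges, sources, targets):
    """Return simple paths (as vertex lists) from any source to any target."""
    out = defaultdict(list)
    for u, v in edges:
        out[u].append(v)

    # Pass 1: breadth-first search from the sources. Record each traversed
    # edge in reverse (dst -> src) so that pass 2 can walk backward over it.
    reached = set(sources)
    queue = deque(sources)
    kept_rev = defaultdict(list)
    while queue:
        u = queue.popleft()
        for v in out[u]:
            kept_rev[v].append(u)
            if v not in reached:
                reached.add(v)
                queue.append(v)

    # Pass 2: depth-first search backward from the targets, using only the
    # edges kept by pass 1.
    source_set = set(sources)
    paths = []

    def dfs(v, suffix):
        path = [v] + suffix
        if v in source_set:
            paths.append(path)
        for u in kept_rev.get(v, ()):
            if u not in path:   # keep paths simple (an assumption of this sketch)
                dfs(u, path)

    for t in targets:
        if t in reached:
            dfs(t, [])
    return paths
```

Edges that cannot be reached from any source never enter `kept_rev`, so the depth-first pass does not spend time on branches that cannot lead back to a source.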


    Efficient method for subgraph pattern matching

    Publication No.: US10896223B2

    Publication date: 2021-01-19

    Application No.: US16223805

    Filing date: 2018-12-18

    Abstract: Techniques herein optimize subgraph pattern matching. A computer receives a graph vertex array and a graph edge array. Each vertex and each edge has labels. The computer stores an array of index entries and an array of edge label sets. Each index entry corresponds to a respective vertex originating an edge and associates an offset of the edge with an offset of the respective vertex. Each edge label set contains labels of a respective edge. The computer selects a candidate subset of edges originating at a current vertex. The edge labels of each candidate edge of the candidate subset include a same particular query edge labels. The computer selects the candidate subset based on the index array and afterwards selects a result subset of vertices from among the terminating vertices of the candidate edges. The labels of each vertex of the result subset include a same particular query vertex labels.
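A single expansion step of the kind this abstract describes might look like the Python sketch below. It reads the "array of index entries" as CSR-style offsets, where `index[v]` is the offset of the first edge originating at vertex `v`. That reading, along with the function name `expand` and the array names, is an assumption of the example.

```python
def expand(current, index, edge_dst, edge_labels, vertex_labels,
           query_edge_labels, query_vertex_labels):
    """Match one query edge and its terminating query vertex from `current`.

    index:         index[v]..index[v + 1] is the edge range originating at v
    edge_dst:      terminating vertex of each edge
    edge_labels:   set of labels per edge
    vertex_labels: set of labels per vertex
    """
    begin, end = index[current], index[current + 1]
    # Candidate edges: those whose labels include every query edge label.
    candidates = [e for e in range(begin, end)
                  if query_edge_labels <= edge_labels[e]]
    # Result vertices: terminating vertices carrying every query vertex label.
    return [edge_dst[e] for e in candidates
            if query_vertex_labels <= vertex_labels[edge_dst[e]]]
```

Only the edges in the range `index[current]` to `index[current + 1]` are examined, so the cost of the step depends on the degree of the current vertex and not on the size of the graph.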

    EFFICIENT METHOD FOR SUBGRAPH PATTERN MATCHING

    Publication No.: US20170169133A1

    Publication date: 2017-06-15

    Application No.: US14969789

    Filing date: 2015-12-15

    CPC classification number: G06F17/30958 G06F17/30324

    Abstract: Techniques herein optimize subgraph pattern matching. A computer receives a graph vertex array and a graph edge array. Each vertex and each edge has labels. The computer stores an array of index entries and an array of edge label sets. Each index entry corresponds to a respective vertex originating an edge and associates an offset of the edge with an offset of the respective vertex. Each edge label set contains labels of a respective edge. The computer selects a candidate subset of edges originating at a current vertex. The edge labels of each candidate edge of the candidate subset include a same particular query edge labels. The computer selects the candidate subset based on the index array and afterwards selects a result subset of vertices from among the terminating vertices of the candidate edges. The labels of each vertex of the result subset include a same particular query vertex labels.

    DETERMINISTIC SEMANTIC FOR GRAPH PROPERTY UPDATE QUERIES AND ITS EFFICIENT IMPLEMENTATION

    Publication No.: US20230095703A1

    Publication date: 2023-03-30

    Application No.: US17479006

    Filing date: 2021-09-20

    Abstract: Efficiently implemented herein is a deterministic semantic for property updates by graph queries. Mechanisms of determinism herein ensure data consistency for graph mutation. These mechanisms facilitate optimistic execution of graph access despite a potential data access conflict. This approach may include various combinations of special activities such as detecting potential conflicts during query compile time, applying query transformations to eliminate those conflicts during code generation where possible, and executing updates in an optimistic way that safely fails if determinism cannot be guaranteed. In an embodiment, a computer receives a request to modify a graph. The request to modify the graph is optimistically executed after preparation and according to safety precautions as presented herein. Based on optimistically executing the request, a data access conflict actually occurs and is automatically detected. Based on the data access conflict, optimistically executing the request is prematurely and automatically halted without finishing executing the request.

    Space-efficient methodology for representing label information in large graph data for fast distributed graph query

    Publication No.: US11074260B2

    Publication date: 2021-07-27

    Application No.: US16378424

    Filing date: 2019-04-08

    Abstract: Techniques are described herein for space-efficient encoding of label information of property graphs. In an embodiment, an input graph is received. The input graph comprises a plurality of entities and a plurality of label sets. Each entity of said plurality of entities is associated with a label set of the plurality of label sets and each label set of the plurality of label sets comprises zero or more labels of a plurality of labels. A first mapping is generated that maps each label of the plurality of labels to a label code. A second mapping is generated that maps each label integer set of a plurality of label integer sets to a label code. Each label integer set of the plurality of label integer sets corresponds to a label set of the plurality of label sets, wherein each label integer set of the plurality of label integer sets comprises label codes from the first mapping that are mapped to each label included in the corresponding label set. A compressed label set is generated for each entity of the plurality of entities. Each compressed label set comprises a plurality of bits that indicate a zeroth state, a first state, a second state, or a third state. The compressed label sets and the first and second mappings are used to efficiently evaluate graph label queries.

    Reachability graph index for query processing

    Publication No.: US10942970B2

    Publication date: 2021-03-09

    Application No.: US16159384

    Filing date: 2018-10-12

    Abstract: Techniques are described for generating and re-using reachability graphs for efficient execution of queries. In an embodiment, a query is received for execution on a data graph. Such a query may include one or more expressions for edges in the data graph, which when executed select one or more paths in the data graph to generate results for the query. The system uses a repository to store reachability graphs and may determine whether a reachability graph for an expression of the query for the data graph is stored in a repository. Such a reachability graph is generated by applying the expression on the data graph to qualify or disqualify the edges in the data graph to be included as part of edges of the reachability graph. For example, an edge in a reachability graph exists between two vertices when at least one edge of the data graph has qualified between two vertices of the data graph that correspond to the two vertices of the reachability graph. Based on determining that the reachability graph for the expression is stored in the repository, the system executes the query on the reachability graph without re-applying the expression on the data graph and generates the results for the query.
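The reuse of reachability graphs can be sketched in Python as a small repository keyed by data graph and expression. The class `ReachabilityRepository`, its `get` method, and the use of a Python predicate to stand in for an edge expression are all hypothetical choices made for this example.

```python
class ReachabilityRepository:
    """Stores one reachability graph per (data graph, expression) pair."""

    def __init__(self):
        self._graphs = {}
        self.builds = 0   # counts how often an expression was actually applied

    def get(self, graph_id, expression, predicate, data_edges):
        """data_edges: iterable of (src, dst, properties) triples."""
        key = (graph_id, expression)
        if key not in self._graphs:
            self.builds += 1
            # An edge (u, v) exists in the reachability graph when at least
            # one data edge between u and v qualifies under the expression.
            self._graphs[key] = {
                (u, v) for u, v, props in data_edges if predicate(props)
            }
        return self._graphs[key]
```

A second query that uses the same expression on the same data graph gets the stored edge set back without the expression being applied to the data edges again.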

    FAST GRAPH QUERY ENGINE OPTIMIZED FOR TYPICAL REAL-WORLD GRAPH INSTANCES WHOSE SMALL PORTION OF VERTICES HAVE EXTREMELY LARGE DEGREE

    Publication No.: US20180203897A1

    Publication date: 2018-07-19

    Application No.: US15409091

    Filing date: 2017-01-18

    Abstract: Techniques herein accelerate graph querying by caching neighbor vertices (NVs) of super-node vertices. In an embodiment, a computer receives a graph query (GQ) to extract result paths from a graph in a database. The GQ has a sequence of query vertices (QVs) and a sequence of query edges (QEs). The computer successively traverses each QE and QV to detect paths of the graph that match the GQ. Traversing each QE and QV entails retrieving NVs of a current graph vertex (CGV) of a current traversal path. If the CGV is a key in a cache whose keys are graph vertices having an excessive degree, then the computer retrieves NVs from the cache. Otherwise, the computer retrieves NVs from the database. If the degree is excessive, and the CGV is not a key in the cache, then the computer stores, into the cache, the CGV as a key for the NVs.
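The caching rule in this abstract can be shown with a short Python sketch. The class name, the `fetch_neighbors` callback standing in for the database lookup, and the fixed `degree_threshold` that defines an "excessive" degree are assumptions made for illustration.

```python
class SuperNodeNeighborCache:
    """Caches neighbor lists only for vertices with an excessive degree."""

    def __init__(self, fetch_neighbors, degree_threshold):
        self._fetch = fetch_neighbors        # hypothetical database lookup
        self._threshold = degree_threshold   # assumed cut-off for "excessive"
        self._cache = {}

    def neighbors(self, vertex):
        # Cached super node: answer without touching the database.
        if vertex in self._cache:
            return self._cache[vertex]
        result = self._fetch(vertex)
        # Store the neighbors only when the degree is excessive.
        if len(result) > self._threshold:
            self._cache[vertex] = result
        return result
```

Low-degree vertices are always read from the database, so the cache holds only the few vertices whose neighbor lists are expensive to fetch repeatedly.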
