MEMORY-TRACKING RESOURCE MANAGER FOR ELASTIC DISTRIBUTED GRAPH-PROCESSING SYSTEM

    公开(公告)号:US20250094224A1

    公开(公告)日:2025-03-20

    申请号:US18369254

    申请日:2023-09-18

    Abstract: A resource manager tracks the amount of available memory for a cluster of machines and for each machine in the cluster. The resource manager receives a reservation request from a job for a graph processing operation. The reservation request specifies an identification of the job, a type of reservation, and an amount of memory requested. The resource manager determines whether to grant the reservation request based on the type of reservation, the amount of memory requested, and the amount of available memory in the cluster or in one or more machines in the cluster. In response to determining to grant the reservation request, the resource manager sends a response to the job indicating an amount of memory reserved and adjusts the amount of available cluster memory and the amount of available machine memory for at least one machine in the cluster based on the amount of memory reserved.

    DETERMINISTIC SEMANTIC FOR GRAPH PROPERTY UPDATE QUERIES AND ITS EFFICIENT IMPLEMENTATION

    公开(公告)号:US20230095703A1

    公开(公告)日:2023-03-30

    申请号:US17479006

    申请日:2021-09-20

    Abstract: Efficiently implemented herein is a deterministic semantic for property updates by graph queries. Mechanisms of determinism herein ensure data consistency for graph mutation. These mechanisms facilitate optimistic execution of graph access despite a potential data access conflict. This approach may include various combinations of special activities such as detecting potential conflicts during query compile time, applying query transformations to eliminate those conflicts during code generation where possible, and executing updates in an optimistic way that safely fails if determinism cannot be guaranteed. In an embodiment, a computer receives a request to modify a graph. The request to modify the graph is optimistically executed after preparation and according to safety precautions as presented herein. Based on optimistically executing the request, a data access conflict actually occurs and is automatically detected. Based on the data access conflict, optimistically executing the request is prematurely and automatically halted without finishing executing the request.

    DYNAMIC ASYNCHRONOUS TRAVERSALS FOR DISTRIBUTED GRAPH QUERIES

    公开(公告)号:US20210240705A1

    公开(公告)日:2021-08-05

    申请号:US16778668

    申请日:2020-01-31

    Abstract: Techniques are described for enabling in-memory execution of any-sized graph data query by utilizing both depth first search (DFS) principles and breadth first search (BFS) principles to control the amount of memory used during query execution. Specifically, threads implementing a graph DBMS switch between a BFS mode of data traversal and a DFS mode of data traversal. For example, when a thread detects that there are less than a configurable threshold number of intermediate results in memory, the thread enters BFS-based traversal techniques to increase the number of intermediate results in memory. When the thread detects that there are at least the configurable threshold number of intermediate results in memory, the thread enters DFS mode to produce final results, which generally works to move the intermediate results that are currently available in memory to final query results, thereby reducing the number of intermediate results in memory.

    Space-efficient methodology for representing label information in large graph data for fast distributed graph query

    公开(公告)号:US11074260B2

    公开(公告)日:2021-07-27

    申请号:US16378424

    申请日:2019-04-08

    Abstract: Techniques are described herein for space-efficient encoding of label information of property graphs. In an embodiment, an input graph is received. The input graph comprises a plurality of entities and a plurality of label sets. Each entity of said plurality of entities is associated with a label set of the plurality of label sets and each label set of the plurality of label sets comprises zero or more labels of a plurality of labels. A first mapping is generated that maps each label of the plurality of labels to a label code. A second mapping is generated that maps each label integer set of a plurality of label integer sets to a label code. Each label integer set of the plurality of label integer sets corresponds to a label set of the plurality of label sets, wherein each label integer set of the plurality of label integer sets comprises label codes from the first mapping that are mapped to each label included in the corresponding label set. A compressed label set is generated for each entity of the plurality of entities. Each compressed label set comprises a plurality of bits that indicate a zeroth state, a first state, a second state, or a third state. The compressed label sets and the first and second mappings are used to efficiently evaluate graph label queries.

    SUBQUERIES IN DISTRIBUTED ASYNCHRONOUS GRAPH QUERIES

    公开(公告)号:US20240220495A1

    公开(公告)日:2024-07-04

    申请号:US18091242

    申请日:2022-12-29

    CPC classification number: G06F16/24535 G06F16/24537 G06F16/9024

    Abstract: A graph processing engine is provided for executing a graph query comprising a parent query and a subquery nested within the parent query. The subquery uses a reference to one or more correlated variables from the parent query. Executing the graph query comprises initiating execution of the parent query, pausing the execution of the parent query responsive to the parent query matching the one or more correlated variables in an intermediate result set, generating a subquery identifier for each match of the one or more correlated variables, modifying the subquery to include a subquery aggregate function and a clause to group results by subquery identifier, executing the modified subquery using the intermediate result set and collecting subquery results into a subquery results table responsive to pausing execution of the parent query, and resuming execution of the parent query using the subquery results table.

    Duplication elimination in depth based searches for distributed systems

    公开(公告)号:US12001425B2

    公开(公告)日:2024-06-04

    申请号:US17116831

    申请日:2020-12-09

    CPC classification number: G06F16/24526 G06F16/24556 G06F16/248

    Abstract: Systems and methods for improving evaluation of graph queries through depth first traversals are described herein. In an embodiment, a multi-node system evaluates against graph data a graph query that specifies a particular pattern to match by determining, at a first node of the multi-node system, in a particular instance of evaluating the graph query, that one or more first vertices on the first node match a first portion of the graph query and that a second vertex that is to be evaluated next is stored on a second node separate from the first node. In response to determining that the next vertex to be evaluated is stored on the second node separate from the first node, the first node generates a message to the second node comprising one or more results of the first portion of the graph query based on the one or more first vertices, an identifier of the next vertex, and a current stage of evaluating the graph query. In response to generating the message from the first node to the second node, the first node ceases the particular instance of evaluating the graph query.

    Dynamic asynchronous traversals for distributed graph queries

    公开(公告)号:US11675785B2

    公开(公告)日:2023-06-13

    申请号:US16778668

    申请日:2020-01-31

    CPC classification number: G06F16/24526 G06F16/2471

    Abstract: Techniques are described for enabling in-memory execution of any-sized graph data query by utilizing both depth first search (DFS) principles and breadth first search (BFS) principles to control the amount of memory used during query execution. Specifically, threads implementing a graph DBMS switch between a BFS mode of data traversal and a DFS mode of data traversal. For example, when a thread detects that there are less than a configurable threshold number of intermediate results in memory, the thread enters BFS-based traversal techniques to increase the number of intermediate results in memory. When the thread detects that there are at least the configurable threshold number of intermediate results in memory, the thread enters DFS mode to produce final results, which generally works to move the intermediate results that are currently available in memory to final query results, thereby reducing the number of intermediate results in memory.

Patent Agency Ranking