HASH-BASED EFFICIENT SECONDARY INDEXING FOR GRAPH DATA STORED IN NON-RELATIONAL DATA STORES

    Publication number: US20170308621A1

    Publication date: 2017-10-26

    Application number: US15494141

    Filing date: 2017-04-21

    CPC classification number: G06F16/9024 G06F16/2255 G06F16/24575

    Abstract: A method and apparatus for a graph database instance (GDI) maintaining a secondary index, that indexes data from a sparse data map storing graph application data, within a sparse data map dedicated to the secondary index. The GDI formulates row-keys, for the secondary index map, by hashing the values of key/value pairs stored in rows of a map storing application data. The GDI stores for each formulated row-key, in the row of the secondary index that is indexed by the formulated row-key, references to rows of the map storing application data that match the key/value pair on which formulation of the row-key was based. The row-keys into the secondary index map may incorporate bucket identifiers, which, for each key/value pair, allows the GDI to spread the references to graph elements that match the key/value pair among a set number of “buckets” for the key/value pair within the secondary index map.
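
The row-key scheme described in this abstract can be sketched roughly as follows. This is an illustrative sketch only, not the patented implementation: the names `SecondaryIndex`, `index_row_key`, and `NUM_BUCKETS`, the use of SHA-1, the choice to derive the bucket from the element identifier, and the in-memory dictionary standing in for the sparse data map are all assumptions.

```python
import hashlib
from collections import defaultdict

NUM_BUCKETS = 4  # assumed: fixed number of buckets per key/value pair


def kv_hash(key: str, value: str) -> str:
    """Hash a key/value pair (SHA-1 is an arbitrary choice for this sketch)."""
    return hashlib.sha1(f"{key}={value}".encode("utf-8")).hexdigest()


def index_row_key(key: str, value: str, element_id: str) -> str:
    """Formulate an index row-key: bucket identifier plus key/value hash."""
    # Assumption: the bucket is derived from the element id, so references
    # for one key/value pair spread deterministically across the buckets.
    digest = hashlib.sha1(element_id.encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % NUM_BUCKETS
    return f"{bucket:02d}:{kv_hash(key, value)}"


class SecondaryIndex:
    """In-memory stand-in for the sparse data map dedicated to the index."""

    def __init__(self):
        self.rows = defaultdict(set)  # row-key -> references to data rows

    def add(self, key: str, value: str, element_id: str) -> None:
        self.rows[index_row_key(key, value, element_id)].add(element_id)

    def lookup(self, key: str, value: str) -> set:
        # A lookup has to visit every bucket for the key/value pair.
        hashed = kv_hash(key, value)
        found = set()
        for bucket in range(NUM_BUCKETS):
            found |= self.rows.get(f"{bucket:02d}:{hashed}", set())
        return found


index = SecondaryIndex()
index.add("name", "alice", "v1")
index.add("name", "alice", "v2")
print(sorted(index.lookup("name", "alice")))
```

Spreading references over buckets keeps any single index row from growing without bound, at the cost of one read per bucket on lookup.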

    Using persistent data samples and query-time statistics for query optimization

    Publication number: US09798772B2

    Publication date: 2017-10-24

    Application number: US13893047

    Filing date: 2013-05-13

    CPC classification number: G06F17/30442 G06F17/30277 G06F17/30424

    Abstract: Techniques for storing and querying graph data in a key-value store are provided. A graph statement (e.g., an RDF graph statement) includes a plurality of values, at least two of which correspond to nodes in a graph. A key is generated based on the graph statement. The key may be generated based on concatenating hash values that are generated based on the plurality of values. The key-value store stores the key. The value that corresponds to the key may be a null or empty value. In response to a graph query (e.g., in SPARQL) that includes one or more query patterns, each of which includes one or more values, a key is generated based on the one or more values and sent to the key-value store, which returns one or more other keys, each of which is a superset of the generated key.
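
One way to read the key scheme in this abstract is sketched below. It is a hedged illustration, not the patented method: it assumes that "a superset of the generated key" means a stored key that begins with the generated key, and the names `statement_key`, `KeyOnlyStore`, and `HASH_LEN`, along with the truncated MD5 hash, are hypothetical.

```python
import hashlib
from bisect import bisect_left

HASH_LEN = 8  # assumed: fixed-width hash per value, so keys align by position


def term_hash(term: str) -> bytes:
    """Fixed-width hash of one value (truncated MD5, an arbitrary choice)."""
    return hashlib.md5(term.encode("utf-8")).digest()[:HASH_LEN]


def statement_key(*terms: str) -> bytes:
    """Concatenate the per-value hashes into a single key."""
    return b"".join(term_hash(t) for t in terms)


class KeyOnlyStore:
    """Sorted set of keys; the associated values are empty, as in the abstract."""

    def __init__(self):
        self._keys = []

    def put(self, key: bytes) -> None:
        i = bisect_left(self._keys, key)
        if i == len(self._keys) or self._keys[i] != key:
            self._keys.insert(i, key)

    def scan_prefix(self, prefix: bytes) -> list:
        # Keys sharing a prefix are contiguous in sorted order.
        i = bisect_left(self._keys, prefix)
        matches = []
        while i < len(self._keys) and self._keys[i].startswith(prefix):
            matches.append(self._keys[i])
            i += 1
        return matches


store = KeyOnlyStore()
store.put(statement_key(":alice", ":knows", ":bob"))
store.put(statement_key(":alice", ":knows", ":carol"))
store.put(statement_key(":bob", ":knows", ":carol"))

# Query pattern (:alice :knows ?x): build the key from the bound values only.
hits = store.scan_prefix(statement_key(":alice", ":knows"))
print(len(hits))
```

Because only hashes are stored in this sketch, mapping a returned key back to the original values would need a separate hash-to-value dictionary, which is omitted here.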

    USING PERSISTENT DATA SAMPLES AND QUERY-TIME STATISTICS FOR QUERY OPTIMIZATION

    Type: patent application; legal status: in force

    Publication number: US20140310260A1

    Publication date: 2014-10-16

    Application number: US13893047

    Filing date: 2013-05-13

    CPC classification number: G06F17/30442 G06F17/30277 G06F17/30424

    Abstract: Techniques for storing and querying graph data in a key-value store are provided. A graph statement (e.g., an RDF graph statement) includes a plurality of values, at least two of which correspond to nodes in a graph. A key is generated based on the graph statement. The key may be generated based on concatenating hash values that are generated based on the plurality of values. The key-value store stores the key. The value that corresponds to the key may be a null or empty value. In response to a graph query (e.g., in SPARQL) that includes one or more query patterns, each of which includes one or more values, a key is generated based on the one or more values and sent to the key-value store, which returns one or more other keys, each of which is a superset of the generated key.

    EFFICIENT SQL-BASED GRAPH RANDOM WALK

    Publication number: US20210049171A1

    Publication date: 2021-02-18

    Application number: US16543258

    Filing date: 2019-08-16

    Abstract: Embodiments generate random walks through a directed graph that is represented in a relational database table. Each row of the graph table represents a directed edge in the graph and includes a source vertex and a destination vertex. Each row is further augmented to (a) indicate the number of outbound edges starting from the destination vertex in the row and (b) include an identifier that distinguishes the edge from other outbound edges starting from the same source vertex. An SQL query may be executed on the augmented graph table. Starting from a source vertex (starting vertex or the destination vertex of the previously selected hop) the query randomly selects a row of the graph table representing one of the outbound edges from the source vertex and adds the selected outbound edge as a row in a random walk table that represents the next hop in the random walk.
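
The augmented edge table in this abstract lends itself to a small sketch using Python's standard `sqlite3` module. This is an assumption-laden illustration rather than the patented query: the table and column names (`graph`, `walk`, `edge_idx`, `dst_out_degree`) are hypothetical, and the hops are driven from a Python loop, whereas the abstract describes an SQL query doing the selection.

```python
import random
import sqlite3


def build_graph_table(conn, edges):
    """Create the augmented edge table: one row per directed edge."""
    conn.execute(
        "CREATE TABLE graph (src TEXT, dst TEXT, "
        "edge_idx INTEGER, dst_out_degree INTEGER)"
    )
    out = {}
    for src, dst in edges:
        out.setdefault(src, []).append(dst)
    rows = [
        # edge_idx distinguishes edges leaving the same source vertex;
        # dst_out_degree counts the outbound edges of the destination vertex.
        (src, dst, i, len(out.get(dst, [])))
        for src, dsts in out.items()
        for i, dst in enumerate(dsts)
    ]
    conn.executemany("INSERT INTO graph VALUES (?, ?, ?, ?)", rows)


def random_walk(conn, start, max_hops, rng):
    """Record up to max_hops randomly chosen edges in a walk table."""
    conn.execute("CREATE TABLE IF NOT EXISTS walk (hop INTEGER, src TEXT, dst TEXT)")
    conn.execute("DELETE FROM walk")
    current = start
    # Only the starting vertex needs a count; later hops reuse dst_out_degree.
    degree = conn.execute(
        "SELECT COUNT(*) FROM graph WHERE src = ?", (start,)
    ).fetchone()[0]
    for hop in range(max_hops):
        if degree == 0:  # dead end: no outbound edges to follow
            break
        idx = rng.randrange(degree)
        dst, degree = conn.execute(
            "SELECT dst, dst_out_degree FROM graph WHERE src = ? AND edge_idx = ?",
            (current, idx),
        ).fetchone()
        conn.execute("INSERT INTO walk VALUES (?, ?, ?)", (hop, current, dst))
        current = dst
    return conn.execute("SELECT src, dst FROM walk ORDER BY hop").fetchall()


conn = sqlite3.connect(":memory:")
build_graph_table(conn, [("a", "b"), ("a", "c"), ("b", "c"), ("c", "a")])
for src, dst in random_walk(conn, "a", 5, random.Random()):
    print(src, "->", dst)
```

Storing the destination's out-degree on each row means the next hop can pick a random edge index without first counting that vertex's edges.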

    Hybrid approach for equivalence reasoning

    Publication number: US10437873B2

    Publication date: 2019-10-08

    Application number: US14047318

    Filing date: 2013-10-07

    Abstract: Systems, methods, and other embodiments associated with equivalence reasoning are described. One example method includes iteratively inputting batches of unprocessed equivalence pairs from a semantic model to an operating memory. In the operating memory, one or more cliques for the input batches are built until no further batches remain. A clique designates a canonical representative resource for a group of equivalent resources as determined from the equivalence pairs. The one or more cliques are built for the input batches to a clique map in a remote access memory. The clique map is returned for use by the semantic model.
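
The batch-wise clique building in this abstract resembles a union-find pass over equivalence pairs, sketched below. This swaps in a standard disjoint-set structure as an illustration and is not claimed to be the patented hybrid approach. The function name `build_clique_map` and the rule of picking the lexicographically smallest resource as the canonical representative are assumptions, and the split between operating memory and the remote clique map is not modeled.

```python
def build_clique_map(batches):
    """Map every resource to the canonical representative of its clique."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        root = x
        while parent[root] != root:
            root = parent[root]
        # Path compression: point every node on the path at the root.
        while parent[x] != root:
            parent[x], x = root, parent[x]
        return root

    for batch in batches:  # process one batch of equivalence pairs at a time
        for a, b in batch:
            ra, rb = find(a), find(b)
            if ra != rb:
                # Assumed rule: the smaller name becomes the representative.
                low, high = sorted((ra, rb))
                parent[high] = low

    return {resource: find(resource) for resource in list(parent)}


batches = [
    [("ex:a", "ex:b"), ("ex:c", "ex:d")],
    [("ex:b", "ex:c"), ("ex:x", "ex:y")],
]
for resource, canonical in sorted(build_clique_map(batches).items()):
    print(resource, "->", canonical)
```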

    Constructing an in-memory representation of a graph

    Publication number: US10055509B2

    Publication date: 2018-08-21

    Application number: US14680150

    Filing date: 2015-04-07

    CPC classification number: G06F16/9024 G06F16/2246 G06F2201/80

    Abstract: Techniques for efficiently loading graph data into memory are provided. A plurality of node ID lists are retrieved from storage. Each node ID list is ordered based on one or more order criteria, such as node ID, and is read into memory. A new list of node IDs is created in memory and is initially empty. From among the plurality of node ID lists, a particular node ID is selected based on the one or more order criteria, removed from the node ID list where the particular node ID originates, and added to the new list. This process of selecting, removing, and adding continues until no more than one node ID list exists, other than the new list. In this way, the retrieval of the plurality of node ID lists from storage may be performed in parallel while the selecting and adding are performed sequentially.
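
The select, remove, and add loop in this abstract is essentially a k-way merge of sorted lists. Below is a minimal sketch that assumes node IDs are integers ordered ascending. The function name `merge_node_id_lists` is hypothetical, the parallel retrieval from storage is not modeled, and the loop runs until every input list is empty rather than stopping once a single list remains.

```python
import heapq
from collections import deque


def merge_node_id_lists(node_id_lists):
    """Merge already-sorted node ID lists into one new sorted list."""
    queues = [deque(ids) for ids in node_id_lists]
    # The heap holds the current head of each non-empty list.
    heap = [(q[0], i) for i, q in enumerate(queues) if q]
    heapq.heapify(heap)

    merged = []  # the new list starts out empty
    while heap:
        node_id, i = heapq.heappop(heap)   # select by the order criterion
        queues[i].popleft()                # remove from the originating list
        merged.append(node_id)             # add to the new list
        if queues[i]:
            heapq.heappush(heap, (queues[i][0], i))
    return merged


print(merge_node_id_lists([[1, 4, 9], [2, 3, 10], [5]]))
# prints [1, 2, 3, 4, 5, 9, 10]
```

The standard library's `heapq.merge` performs the same merge lazily; the explicit loop is shown only to mirror the steps named in the abstract.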
