Abstract:
Embodiments use successive refinement to allow a user to systematically explore the result set of an arbitrary query over RDF, such as a SPARQL query. A user inputs an arbitrary base query and modifies this query by replacing selected variables with values to which each selected variable is bound within the result set of the base query. Embodiments present, via a GUI, variable facets that may be substituted for query variables. Embodiments also present, via the GUI, a query history graph that represents the query versions that a user has created. A user may navigate this query history graph to return to previously created query versions. The GUI also provides information about the facets, including the number of results that would be included in the result set of the query version resulting from substitution of the facet for the associated variable.
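As a minimal sketch of the facet idea, assuming the result set of the base query is already available as a list of variable bindings: the function and variable names (`facet_counts`, `substitute`, `history`) and the sample data are hypothetical, and filtering the bindings in memory stands in for re-running the refined query.

```python
from collections import Counter

def facet_counts(bindings, variable):
    """For each value bound to `variable`, count the result rows that would
    remain if that value were substituted for the variable."""
    return Counter(row[variable] for row in bindings if variable in row)

def substitute(bindings, variable, value):
    """Return the result set of the query version in which `variable`
    has been replaced by `value` (simulated here by filtering)."""
    return [row for row in bindings if row.get(variable) == value]

# Hypothetical result set of a base query with variables ?person and ?city.
base = [
    {"person": "alice", "city": "Paris"},
    {"person": "bob", "city": "Paris"},
    {"person": "carol", "city": "Oslo"},
]

# Query history graph: version id -> (parent version id, substitution made).
history = {0: (None, None)}
refined = substitute(base, "city", "Paris")
history[1] = (0, ("city", "Paris"))
```

Following the parent links in `history` leads back to any earlier query version.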
Abstract:
Techniques for storing and processing graph data in a database system are provided. Graph data (or a portion thereof) that is stored in persistent storage is loaded into memory to generate an instance of a particular graph. The instance is consistent as of a particular point in time. Graph analysis operations are performed on the instance. The instance may be used by multiple users to perform graph analysis operations. Subsequent changes to the graph are stored separate from the instance. Later, the changes may be applied to the instance (or a copy thereof) to refresh the instance.
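The snapshot-plus-changes behavior can be sketched as follows. This is an illustration under stated assumptions, not the described system: `GraphStore`, `GraphInstance`, and the integer version counter are hypothetical names, and a Python set stands in for persistent storage.

```python
class GraphInstance:
    """In-memory copy of the graph, consistent as of version `as_of`."""
    def __init__(self, edges, as_of):
        self.edges = set(edges)
        self.as_of = as_of

    def out_degree(self, vertex):
        # Example analysis operation, run against the instance only.
        return sum(1 for src, _ in self.edges if src == vertex)

class GraphStore:
    """Stand-in for persistent storage plus a separately kept change log."""
    def __init__(self):
        self.edges = set()
        self.changes = []   # (version, op, edge), kept apart from instances
        self.version = 0

    def change(self, op, edge):
        self.version += 1
        if op == "add":
            self.edges.add(edge)
        else:
            self.edges.discard(edge)
        self.changes.append((self.version, op, edge))

    def load(self):
        return GraphInstance(self.edges, self.version)

    def refresh(self, instance):
        # Apply the later changes to a copy, leaving the old instance intact.
        fresh = GraphInstance(instance.edges, instance.as_of)
        for version, op, edge in self.changes:
            if version > instance.as_of:
                if op == "add":
                    fresh.edges.add(edge)
                else:
                    fresh.edges.discard(edge)
                fresh.as_of = version
        return fresh
```

Refreshing a copy rather than the instance itself lets users who are still reading the old instance keep a consistent view.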
Abstract:
A method and apparatus are provided for a graph database instance (GDI) to maintain a secondary index that indexes data from a sparse data map storing graph application data; the secondary index is itself stored within a sparse data map dedicated to it. The GDI formulates row-keys for the secondary index map by hashing the values of key/value pairs stored in rows of a map storing application data. The GDI stores, for each formulated row-key, in the row of the secondary index that is indexed by that row-key, references to rows of the application data map that match the key/value pair on which formulation of the row-key was based. The row-keys into the secondary index map may incorporate bucket identifiers, which, for each key/value pair, allow the GDI to spread the references to graph elements that match the key/value pair among a set number of “buckets” for that key/value pair within the secondary index map.
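One way to picture the bucketed row-keys is the sketch below. It assumes a plain dictionary in place of the sparse data map, SHA-1 as the hash, and a fixed bucket count of four; the names `index_row_key`, `bucket_for`, `index_element`, and `lookup` are hypothetical.

```python
import hashlib

NUM_BUCKETS = 4  # assumed fixed number of buckets per key/value pair

def index_row_key(key, value, bucket):
    """Row-key for the secondary index: hash of the key/value pair plus a bucket id."""
    digest = hashlib.sha1(f"{key}={value}".encode("utf-8")).hexdigest()
    return f"{digest}:{bucket}"

def bucket_for(element_id):
    """Choose a bucket deterministically from the graph element's id."""
    digest = hashlib.sha1(element_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % NUM_BUCKETS

def index_element(index, element_id, properties):
    """Add a reference to the element under every key/value pair it carries."""
    for key, value in properties.items():
        row_key = index_row_key(key, value, bucket_for(element_id))
        index.setdefault(row_key, set()).add(element_id)

def lookup(index, key, value):
    """Collect references from all buckets of the key/value pair."""
    found = set()
    for bucket in range(NUM_BUCKETS):
        found |= index.get(index_row_key(key, value, bucket), set())
    return found
```

Spreading references over several rows keeps any single index row from growing without bound when many elements share a popular key/value pair; the cost is that a lookup reads every bucket.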
Abstract:
Techniques for storing and querying graph data in a key-value store are provided. A graph statement (e.g., an RDF graph statement) includes a plurality of values, at least two of which correspond to nodes in a graph. A key is generated based on the graph statement. The key may be generated based on concatenating hash values that are generated based on the plurality of values. The key-value store stores the key. The value that corresponds to the key may be a null or empty value. In response to a graph query (e.g., in SPARQL) that includes one or more query patterns, each of which includes one or more values, a key is generated based on the one or more values and sent to the key-value store, which returns one or more other keys, each of which is a superset of the generated key.
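The key construction and lookup can be sketched as below, assuming a sorted in-memory key list in place of a real key-value store, truncated SHA-256 digests as the per-value hashes, and a single subject-predicate-object ordering. `make_key`, `SortedKeyStore`, and `prefix_scan` are hypothetical names.

```python
import bisect
import hashlib

def value_hash(value):
    """Fixed-width hash of one value of a graph statement (8 hex characters)."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()[:8]

def make_key(*values):
    """Concatenate the per-value hashes into a single key."""
    return "".join(value_hash(v) for v in values)

class SortedKeyStore:
    """Keys only; the stored value for each key is treated as empty."""
    def __init__(self):
        self.keys = []

    def put(self, key):
        i = bisect.bisect_left(self.keys, key)
        if i == len(self.keys) or self.keys[i] != key:
            self.keys.insert(i, key)

    def prefix_scan(self, prefix):
        """Return every stored key that begins with `prefix`."""
        i = bisect.bisect_left(self.keys, prefix)
        matches = []
        while i < len(self.keys) and self.keys[i].startswith(prefix):
            matches.append(self.keys[i])
            i += 1
        return matches
```

A query pattern with a known subject and predicate then becomes `store.prefix_scan(make_key(subject, predicate))`, and each returned key contains the generated key as its leading part.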
Abstract:
Embodiments generate random walks through a directed graph that is represented in a relational database table. Each row of the graph table represents a directed edge in the graph and includes a source vertex and a destination vertex. Each row is further augmented to (a) indicate the number of outbound edges starting from the destination vertex in the row and (b) include an identifier that distinguishes the edge from other outbound edges starting from the same source vertex. An SQL query may be executed on the augmented graph table. Starting from a source vertex (the starting vertex or the destination vertex of the previously selected hop), the query randomly selects a row of the graph table representing one of the outbound edges from the source vertex and adds the selected outbound edge as a row in a random walk table that represents the next hop in the random walk.
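A rough sketch of the augmented table follows, using SQLite from Python. It departs from the description in two labeled ways: the random choice is made in Python rather than inside a single SQL query, and the walk is returned as a list rather than written to a random walk table. The table and column names (`graph`, `edge_id`, `dst_out_degree`) are hypothetical.

```python
import random
import sqlite3

def build_graph_table(conn, edges):
    """Create the augmented edge table and return each vertex's out-degree."""
    out_degree = {}
    for src, _ in edges:
        out_degree[src] = out_degree.get(src, 0) + 1
    conn.execute(
        "CREATE TABLE graph (src TEXT, dst TEXT, edge_id INTEGER, dst_out_degree INTEGER)"
    )
    next_id = {}
    for src, dst in edges:
        edge_id = next_id.get(src, 0)   # 0, 1, 2, ... per source vertex
        next_id[src] = edge_id + 1
        conn.execute(
            "INSERT INTO graph VALUES (?, ?, ?, ?)",
            (src, dst, edge_id, out_degree.get(dst, 0)),
        )
    return out_degree

def random_walk(conn, start, start_out_degree, max_hops, rng):
    """Follow up to `max_hops` randomly chosen outbound edges from `start`."""
    walk = []
    vertex, degree = start, start_out_degree
    for _ in range(max_hops):
        if degree == 0:          # dead end: no outbound edges
            break
        pick = rng.randrange(degree)
        dst, degree = conn.execute(
            "SELECT dst, dst_out_degree FROM graph WHERE src = ? AND edge_id = ?",
            (vertex, pick),
        ).fetchone()
        walk.append((vertex, dst))
        vertex = dst
    return walk
```

Because each row already carries the out-degree of its destination, the next hop can be chosen with one indexed lookup and no counting query.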
Abstract:
Systems, methods, and other embodiments associated with equivalence reasoning are described. One example method includes iteratively inputting batches of unprocessed equivalence pairs from a semantic model to an operating memory. In the operating memory, one or more cliques are built for the input batches until no further batches remain. A clique designates a canonical representative resource for a group of equivalent resources as determined from the equivalence pairs. The cliques built for the input batches are written to a clique map in a remote access memory. The clique map is returned for use by the semantic model.
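Batch-wise clique building can be illustrated with a union-find structure. This is a swapped-in textbook technique, not necessarily the one the embodiments use; the choice of the lexicographically smallest resource as the canonical representative and the names `find`, `union`, and `build_clique_map` are assumptions.

```python
def find(parent, resource):
    """Return the canonical representative of `resource`, halving paths on the way."""
    parent.setdefault(resource, resource)
    while parent[resource] != resource:
        parent[resource] = parent[parent[resource]]
        resource = parent[resource]
    return resource

def union(parent, a, b):
    """Merge the cliques of a and b; the smaller root stays canonical."""
    root_a, root_b = find(parent, a), find(parent, b)
    if root_a != root_b:
        low, high = sorted((root_a, root_b))
        parent[high] = low

def build_clique_map(pairs, batch_size):
    """Process equivalence pairs batch by batch and return resource -> representative."""
    parent = {}
    for start in range(0, len(pairs), batch_size):
        for a, b in pairs[start:start + batch_size]:
            union(parent, a, b)
    return {resource: find(parent, resource) for resource in list(parent)}
```

For example, the pairs `("b", "a")`, `("c", "b")`, and `("d", "c")` all collapse into one clique whose representative is `"a"`, regardless of how the pairs are split into batches.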
Abstract:
Techniques for efficiently loading graph data into memory are provided. A plurality of node ID lists are retrieved from storage. Each node ID list is ordered based on one or more order criteria, such as node ID, and is read into memory. A new list of node IDs is created in memory and is initially empty. From among the plurality of node ID lists, a particular node ID is selected based on the one or more order criteria, removed from the node ID list where the particular node ID originates, and added to the new list. This process of selecting, removing, and adding continues until no more than one node ID list exists, other than the new list. In this way, the retrieval of the plurality of node ID lists from storage may be performed in parallel while the selecting and adding are performed sequentially.
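The select, remove, and add loop amounts to a k-way merge, sketched here with a heap. Two assumptions: a thread pool with a hypothetical `load_list` function stands in for parallel retrieval from storage, and a cursor position per list replaces physically removing IDs from the lists.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor

def load_list(stored):
    """Stand-in for reading one ordered node ID list from storage."""
    return list(stored)

def merge_node_ids(lists):
    """Merge ordered node ID lists into one new ordered list."""
    # Each heap entry is (next node ID, list index, position within that list).
    heap = [(lst[0], i, 0) for i, lst in enumerate(lists) if lst]
    heapq.heapify(heap)
    merged = []
    while len(heap) > 1:
        node_id, i, pos = heapq.heappop(heap)
        merged.append(node_id)
        if pos + 1 < len(lists[i]):
            heapq.heappush(heap, (lists[i][pos + 1], i, pos + 1))
    if heap:
        # Only one list still has IDs: append its remainder in a single step.
        _, i, pos = heap[0]
        merged.extend(lists[i][pos:])
    return merged

stored = [[1, 4, 9], [2, 3, 10], [5]]
with ThreadPoolExecutor() as pool:
    loaded = list(pool.map(load_list, stored))   # retrieval in parallel
merged = merge_node_ids(loaded)                  # merging is sequential
```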
Abstract:
Systems, methods, and other embodiments associated with data sources adapted for parallel inference on triples associated with a semantic model are described. One example method includes creating a source table that is partitioned on triple predicate and stores triples for entailment. The source table may store compact triple identifiers that have been mapped to triple identifiers from the semantic model.
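A small sketch of such a source table, assuming triples arrive as tuples of model identifiers and that compact identifiers are simply assigned in order of first appearance. `build_source_table` and the dictionary-of-lists layout are hypothetical stand-ins for a partitioned database table.

```python
from collections import defaultdict

def build_source_table(triples):
    """Map model IDs to compact IDs and group triples by predicate."""
    compact = {}

    def compact_id(model_id):
        # Assign the next unused small integer on first sight of a model ID.
        if model_id not in compact:
            compact[model_id] = len(compact)
        return compact[model_id]

    partitions = defaultdict(list)   # compact predicate ID -> rows
    for subject, predicate, obj in triples:
        s = compact_id(subject)
        p = compact_id(predicate)
        o = compact_id(obj)
        partitions[p].append((s, p, o))
    return partitions, compact
```

With one partition per predicate, a rule that only touches a given predicate can be evaluated against that partition alone, and independent partitions can be handed to separate workers.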