Abstract:
A method and apparatus for a graph database instance (GDI) to maintain a secondary index within a sparse data map dedicated to that index, where the secondary index indexes data from a sparse data map that stores graph application data. The GDI formulates row-keys for the secondary index map by hashing the values of key/value pairs stored in rows of the map storing application data. For each formulated row-key, the GDI stores, in the row of the secondary index map identified by that row-key, references to the rows of the application data map that match the key/value pair on which the row-key was based. The row-keys into the secondary index map may incorporate bucket identifiers, which allow the GDI to spread the references to graph elements that match a given key/value pair among a set number of “buckets” for that pair within the secondary index map.
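The bucketed row-key scheme can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration: the class `SecondaryIndex`, the SHA-1 digest, the `hash:bucket` row-key layout, the bucket count of 4, and the choice to derive the bucket from the referenced element's id. The abstract does not specify any of these.

```python
import hashlib
from collections import defaultdict

NUM_BUCKETS = 4  # assumed fixed number of buckets per key/value pair


def _digest(text: str) -> str:
    """Hex digest used for row-keys (SHA-1 is an arbitrary choice here)."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


def index_row_key(key: str, value: str, bucket: int) -> str:
    """Row-key for the index map: hash of the key/value pair plus a bucket id."""
    return f"{_digest(key + '=' + value)}:{bucket}"


def bucket_for(element_id: str) -> int:
    """Pick a bucket deterministically from the referenced element's id."""
    return int(_digest(element_id), 16) % NUM_BUCKETS


class SecondaryIndex:
    """Hypothetical in-memory stand-in for the sparse map holding the index."""

    def __init__(self):
        # row-key -> set of references to rows of the application data map
        self.rows = defaultdict(set)

    def add(self, element_id: str, key: str, value: str) -> None:
        row_key = index_row_key(key, value, bucket_for(element_id))
        self.rows[row_key].add(element_id)

    def lookup(self, key: str, value: str) -> set:
        # A lookup has to read every bucket for the key/value pair.
        found = set()
        for bucket in range(NUM_BUCKETS):
            found |= self.rows.get(index_row_key(key, value, bucket), set())
        return found
```

In this sketch a lookup costs one read per bucket, which is the price of keeping any single index row from growing without bound when many graph elements share the same key/value pair.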
Abstract:
A method, system, and computer program product for transforming RDF quads to relational views. The method begins by receiving a named graph that comprises at least one RDF quad and analyzing the named graph to produce analysis metadata. The method uses the analysis metadata to generate relational views, and further comprises publishing a relational view in the form of a SPARQL query. The quality of the results can be quantitatively measured and reported by calculating a goodness score based at least in part on aspects of the relational view definitions. Transformation variants include generating relational view definitions using a named-graph strict variant, a named-graph relaxed variant, or a named-graph agnostic variant. The transformations can form outputs responsive to characteristics or properties such as a number of classes, a number of single-valued properties, a number of nullability properties, or a number of type-uniformed ranges.
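As a rough illustration of the analysis step only, the Python sketch below derives per-class metadata (which properties are single-valued, which are nullable) from the quads of one named graph and picks view columns from it. The function names `analyze` and `view_columns`, the metadata layout, and the rule that only single-valued properties become columns are all assumptions; the sketch does not compute the goodness score, emit SPARQL, or model the strict, relaxed, and agnostic variants.

```python
from collections import defaultdict

RDF_TYPE = "rdf:type"


def analyze(quads, graph):
    """Per class, record whether each property is single-valued and nullable.

    `quads` is an iterable of (subject, predicate, object, graph) tuples;
    only quads belonging to `graph` are considered.
    """
    classes = defaultdict(set)                      # class -> subjects
    values = defaultdict(lambda: defaultdict(set))  # subject -> property -> objects
    for s, p, o, g in quads:
        if g != graph:
            continue
        if p == RDF_TYPE:
            classes[o].add(s)
        else:
            values[s][p].add(o)

    metadata = {}
    for cls, subjects in classes.items():
        props = sorted({p for s in subjects for p in values[s]})
        metadata[cls] = {
            p: {
                "single_valued": all(len(values[s][p]) <= 1 for s in subjects),
                "nullable": any(len(values[s][p]) == 0 for s in subjects),
            }
            for p in props
        }
    return metadata


def view_columns(metadata, cls):
    """Assumed rule: one column per single-valued property of the class."""
    return ["subject"] + [p for p, m in metadata[cls].items() if m["single_valued"]]
```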
Abstract:
Techniques for storing and processing graph data in a database system are provided. Graph data (or a portion thereof) that is stored in persistent storage is loaded into memory to generate an instance of a particular graph. The instance is consistent as of a particular point in time. Graph analysis operations are performed on the instance. The instance may be used by multiple users to perform graph analysis operations. Subsequent changes to the graph are stored separately from the instance. Later, the changes may be applied to the instance (or a copy thereof) to refresh the instance.
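The snapshot-plus-changes idea can be sketched in a few lines of Python. The class `GraphStore`, its method names, the edge-set representation, and the simplification that only one live instance exists at a time are assumptions made for the example, not details taken from the abstract.

```python
import copy


class GraphStore:
    """Hypothetical store: a set of edges stands in for persistent storage."""

    def __init__(self):
        self.persistent_edges = set()
        self.pending = []  # changes made since the last load or refresh

    def load_instance(self):
        """Build an in-memory adjacency map, consistent as of this call."""
        adjacency = {}
        for src, dst in self.persistent_edges:
            adjacency.setdefault(src, set()).add(dst)
        self.pending.clear()  # assumes a single live instance
        return adjacency

    def add_edge(self, src, dst):
        self.persistent_edges.add((src, dst))
        self.pending.append(("add", src, dst))  # kept apart from the instance

    def remove_edge(self, src, dst):
        self.persistent_edges.discard((src, dst))
        self.pending.append(("remove", src, dst))

    def refresh(self, instance):
        """Apply the recorded changes to a copy, leaving the original usable."""
        fresh = copy.deepcopy(instance)
        for op, src, dst in self.pending:
            if op == "add":
                fresh.setdefault(src, set()).add(dst)
            else:
                fresh.get(src, set()).discard(dst)
        self.pending.clear()
        return fresh
```

Refreshing a copy rather than the instance itself lets analyses that are already running keep reading the older, still consistent snapshot.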
Abstract:
Techniques for storing and querying graph data in a key-value store are provided. A graph statement (e.g., an RDF graph statement) includes a plurality of values, at least two of which correspond to nodes in a graph. A key is generated based on the graph statement. The key may be generated based on concatenating hash values that are generated based on the plurality of values. The key-value store stores the key. The value that corresponds to the key may be a null or empty value. In response to a graph query (e.g., in SPARQL) that includes one or more query patterns, each of which includes one or more values, a key is generated based on the one or more values and sent to the key-value store, which returns one or more other keys, each of which is a superset of the generated key.
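A minimal Python sketch of the key scheme follows. It assumes that "superset of the generated key" can be read as "stored key that begins with the generated key", so that a query becomes a prefix scan over sorted keys. The truncated SHA-256 hash, the 8-character width, and the `SortedKeyStore` class are illustrative assumptions, not the store or hash the abstract has in mind.

```python
import bisect
import hashlib

HASH_LEN = 8  # hex characters kept per value; an assumed fixed width


def h(value: str) -> str:
    """Fixed-width hash of one value of a graph statement."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()[:HASH_LEN]


def make_key(*values: str) -> str:
    """Concatenate the per-value hashes in statement order."""
    return "".join(h(v) for v in values)


class SortedKeyStore:
    """Hypothetical key-only store; the value for each key is left empty."""

    def __init__(self):
        self.keys = []  # kept sorted so prefix scans are contiguous

    def put(self, key: str) -> None:
        i = bisect.bisect_left(self.keys, key)
        if i == len(self.keys) or self.keys[i] != key:
            self.keys.insert(i, key)

    def scan_prefix(self, prefix: str) -> list:
        """Return every stored key that starts with `prefix`."""
        i = bisect.bisect_left(self.keys, prefix)
        matches = []
        while i < len(self.keys) and self.keys[i].startswith(prefix):
            matches.append(self.keys[i])
            i += 1
        return matches
```

Under these assumptions, a query pattern that fixes the subject and predicate is answered with `store.scan_prefix(make_key(subject, predicate))`. Patterns that fix other positions would need keys stored in additional value orderings, which this sketch leaves out.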
Abstract:
Techniques for efficiently loading graph data into memory are provided. A plurality of node ID lists are retrieved from storage. Each node ID list is ordered based on one or more order criteria, such as node ID, and is read into memory. A new list of node IDs is created in memory and is initially empty. From among the plurality of node ID lists, a particular node ID is selected based on the one or more order criteria, removed from the node ID list where the particular node ID originates, and added to the new list. This process of selecting, removing, and adding continues until no more than one node ID list exists, other than the new list. In this way, the retrieval of the plurality of node ID lists from storage may be performed in parallel while the selecting and adding are performed sequentially.
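The select, remove, and add loop described above is essentially a k-way merge of sorted lists. The sketch below is one way to write it in Python, assuming ascending node ID as the order criterion. The function names and the `fetch` callback that stands in for reading one list from storage are hypothetical.

```python
import heapq
from collections import deque
from concurrent.futures import ThreadPoolExecutor


def merge_node_id_lists(node_id_lists):
    """Merge already-sorted node ID lists into one new sorted list."""
    sources = [deque(ids) for ids in node_id_lists if ids]
    # The heap holds the current head of every non-empty source list.
    heap = [(src[0], i) for i, src in enumerate(sources)]
    heapq.heapify(heap)

    merged = []  # the new list starts out empty
    while len(heap) > 1:  # stop once at most one source list remains
        _, i = heapq.heappop(heap)           # select the smallest head
        merged.append(sources[i].popleft())  # remove it and add it to the new list
        if sources[i]:
            heapq.heappush(heap, (sources[i][0], i))
    if heap:
        # The last remaining list is already ordered; append what is left.
        merged.extend(sources[heap[0][1]])
    return merged


def load_and_merge(fetch, list_names):
    """Fetch the lists in parallel, then merge them sequentially."""
    with ThreadPoolExecutor() as pool:
        lists = list(pool.map(fetch, list_names))
    return merge_node_id_lists(lists)
```

The standard library's `heapq.merge` performs the same kind of merge lazily; the explicit loop is kept here so that each step of the description stays visible.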