Knowledge Graph Generation for Data Warehouse

    Publication number: US20240273092A1

    Publication date: 2024-08-15

    Application number: US18188059

    Application date: 2023-03-22

    Inventor: Amit Aggarwal

    CPC classification number: G06F16/24522 G06F16/2246 G06F16/24539

    Abstract: Query language statements are generated from natural language statements using a knowledge graph representing one or more databases. The knowledge graph is obtained by creating nodes representing tables and operations referenced by queries to the databases. The data of the databases is evaluated to identify entities and dimensions of entities from among the nodes. The entities are assigned human-understandable labels by an LLM. A natural language statement is converted to a knowledge graph language (KGL) statement and references in the KGL statement are replaced with references to entities in the knowledge graph. The KGL statement is then programmatically converted to a database language statement.
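The last two steps of the abstract (replacing references in a KGL statement with knowledge graph entities, then converting the result to a database language statement) can be sketched as below. The abstract does not define KGL syntax, so the dictionary form used here, the `Entity` class, the `GRAPH` contents, and the table and column names are all hypothetical stand-ins, not the patent's format.

```python
from dataclasses import dataclass

@dataclass
class Entity:
    label: str        # human-understandable label (assigned by an LLM in the abstract)
    table: str        # table the entity node represents
    dimensions: dict  # dimension label -> physical column name

# Hypothetical knowledge graph, keyed by entity label.
GRAPH = {
    "customer": Entity("customer", "dw.cust_tbl",
                       {"name": "cust_nm", "region": "rgn_cd"}),
}

def resolve(kgl, graph):
    """Replace label references in a KGL-like statement with graph entities."""
    entity = graph[kgl["entity"].lower()]
    return {
        "table": entity.table,
        "select": [entity.dimensions[d] for d in kgl["select"]],
        "where": [(entity.dimensions[d], v) for d, v in kgl.get("where", [])],
    }

def to_sql(resolved):
    """Programmatically convert the resolved statement to parameterized SQL."""
    sql = f"SELECT {', '.join(resolved['select'])} FROM {resolved['table']}"
    params = [value for _, value in resolved["where"]]
    if resolved["where"]:
        sql += " WHERE " + " AND ".join(f"{col} = ?" for col, _ in resolved["where"])
    return sql, params

# Assumed output of the natural-language-to-KGL step.
kgl = {"entity": "Customer", "select": ["name"], "where": [("region", "EU")]}
sql, params = to_sql(resolve(kgl, GRAPH))
```

Keeping values as bound parameters rather than interpolating them into the SQL text is a design choice of this sketch, not something the abstract specifies.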

    MULTI-CLUSTER QUERY RESULT CACHING
    Published application

    Publication number: US20240265010A1

    Publication date: 2024-08-08

    Application number: US18221735

    Application date: 2023-07-13

    CPC classification number: G06F16/24539 G06F16/24542 G06F16/256 G06F16/285

    Abstract: A multi-cluster computing system that includes a query result caching system is presented. The multi-cluster computing system may include a data processing service and client devices communicatively coupled over a network. The data processing service may include a control layer and a data layer. The control layer may be configured to receive and process requests from the client devices and manage resources in the data layer. The data layer may be configured to include instances of clusters of computing resources for executing jobs. The data layer may include a data storage system, which further includes a remote query result cache store. The query result cache store may include a cloud storage query result cache which stores data associated with results of previously executed requests. As such, when a cluster encounters a previously executed request, the cluster may efficiently retrieve the cached result of the request from the in-memory query result cache or the cloud storage query result cache.
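The two-level lookup described in this abstract can be sketched as follows: each cluster checks its own in-memory cache first, then a cache store shared across clusters, and only then executes the request. The class name, the use of a plain dict as a stand-in for cloud storage, and the whitespace-only key normalization are assumptions of this sketch, not details from the patent.

```python
import hashlib

class QueryResultCache:
    """Per-cluster cache backed by a store shared between clusters."""

    def __init__(self, remote_store):
        self.local = {}             # in-memory cache, private to this cluster
        self.remote = remote_store  # stand-in for the cloud storage cache

    @staticmethod
    def key(query):
        # Simplistic normalization: collapse whitespace only. A real system
        # would need a key that also reflects data versions and parameters.
        normalized = " ".join(query.split())
        return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

    def get_or_run(self, query, execute):
        k = self.key(query)
        if k in self.local:
            return self.local[k], "memory"
        if k in self.remote:
            self.local[k] = self.remote[k]  # promote to the in-memory cache
            return self.local[k], "remote"
        result = execute(query)
        self.local[k] = result
        self.remote[k] = result
        return result, "executed"

# Two clusters sharing one remote store.
shared = {}
cluster_a = QueryResultCache(shared)
cluster_b = QueryResultCache(shared)
calls = []

def run(query):
    calls.append(query)
    return [1, 2]

first = cluster_a.get_or_run("SELECT 1", run)
second = cluster_a.get_or_run("SELECT 1", run)
third = cluster_b.get_or_run("SELECT  1", run)
```

Here the second cluster never executes the query: it finds the result that the first cluster wrote to the shared store.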

    STATIC APPROACH TO LAZY MATERIALIZATION IN DATABASE SCANS USING PUSHED FILTERS

    Publication number: US20240256539A1

    Publication date: 2024-08-01

    Application number: US18160850

    Application date: 2023-01-27

    CPC classification number: G06F16/24539 G06F16/221

    Abstract: Disclosed herein is a method for determining whether to apply a lazy materialization technique to a query run. The method includes receiving a request to perform a new query in a columnar database containing a plurality of columns. A step in the method includes accessing a set of data in a column of the plurality of columns based on the query. The method includes generating an input to a machine-learned model comprising characteristics of the set of data in the column. From the machine-learned model, the method includes generating a likelihood value indicative of whether a filter of a first portion of the set of data in the column has greater efficiency than a download followed by a filter of the set of data in the column. The method further includes comparing the likelihood value to a threshold value. Based on the comparison, the method includes filtering the first portion of the set of data before downloading the set of data if the likelihood value is equal to or above the threshold value.
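The decision rule in this abstract (score the column data with a model, compare the likelihood to a threshold, and filter before downloading when the likelihood is equal to or above it) can be sketched as below. The abstract does not say what the machine-learned model is, so a logistic function with made-up weights stands in for it; the feature name `selectivity` and all numeric values are hypothetical.

```python
import math

def likelihood(features, weights, bias):
    """Stand-in model: logistic score over column characteristics."""
    z = bias + sum(weights[name] * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-z))

def choose_strategy(features, weights, bias, threshold=0.5):
    """Filter first when the likelihood is equal to or above the threshold."""
    p = likelihood(features, weights, bias)
    return "filter_then_download" if p >= threshold else "download_then_filter"

# Hypothetical weights: the more rows a filter keeps, the less lazy
# materialization is expected to pay off.
WEIGHTS = {"selectivity": -4.0}
BIAS = 2.0

selective = choose_strategy({"selectivity": 0.1}, WEIGHTS, BIAS)
unselective = choose_strategy({"selectivity": 0.9}, WEIGHTS, BIAS)
boundary = choose_strategy({"selectivity": 0.5}, WEIGHTS, BIAS)
```

The boundary case lands exactly on the threshold and therefore takes the filter-first path, matching the "equal to or above" wording of the abstract.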

    QUERY EXPRESSION RESULT CACHING USING DYNAMIC JOIN INDEX

    Publication number: US20240220501A1

    Publication date: 2024-07-04

    Application number: US18089833

    Application date: 2022-12-28

    CPC classification number: G06F16/24544 G06F11/3419 G06F16/2272 G06F16/24539

    Abstract: An apparatus, method and computer program product for query optimization in a Relational Database Management System (RDBMS), wherein an optimizer accesses a query expression repository (QER) storing planning and execution information for QEs from previous queries, wherein the QEs comprise table relations, intermediate results and/or final results of operations in the previous queries. Additionally, dynamic join indexes representing QE results are created for high-value QEs selected from the QER and maintained within a DJI repository. During query plan creation for a current or subsequent query, the optimizer searches the QER and DJI repository for DJIs created for high-value QEs corresponding to QEs contained in the current or subsequent query. DJIs corresponding to the matching QEs are used in the query planning phase to rewrite the current or subsequent user query so that stored QE results are used to answer QEs contained in the current or subsequent query.
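A rough sketch of the flow in this abstract: pick high-value query expressions (QEs) from the repository, then rewrite a new query so that any QE with a stored result reads that result instead of recomputing it. The abstract does not define "high-value", so the use-count and cost thresholds below are assumptions, as are the string representation of a QE and every name in the code.

```python
def select_high_value(qer, min_uses, min_cost):
    """Pick QEs worth materializing; the criteria here are hypothetical."""
    return {
        qe for qe, stats in qer.items()
        if stats["uses"] >= min_uses and stats["cost"] >= min_cost
    }

def rewrite(query_expressions, dji_repo):
    """Replace each QE that has a stored result with a lookup of that result."""
    return [
        ("dji_scan", dji_repo[qe]) if qe in dji_repo else ("compute", qe)
        for qe in query_expressions
    ]

# Hypothetical repository: canonical QE text -> statistics from past queries.
qer = {
    "orders JOIN customers ON cust_id": {"uses": 40, "cost": 900.0},
    "orders WHERE status = 'open'": {"uses": 2, "cost": 15.0},
}

# Build a repository entry for each selected QE (names are placeholders).
high_value = select_high_value(qer, min_uses=10, min_cost=100.0)
dji_repo = {qe: f"dji_{i}" for i, qe in enumerate(sorted(high_value))}

plan = rewrite(
    ["orders JOIN customers ON cust_id", "orders WHERE status = 'open'"],
    dji_repo,
)
```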

    COMPUTATIONAL LOAD MITIGATION WITH PREDICTIVE SEARCH REQUEST ENRICHMENT

    Publication number: US20240061838A1

    Publication date: 2024-02-22

    Application number: US18496012

    Application date: 2023-10-27

    Applicant: AMADEUS S.A.S.

    CPC classification number: G06F16/24539 G06F16/24556

    Abstract: A method at an aggregator includes: storing previous search results resulting from previous client search requests, and for each previous search result, a previous handling indicator, indicating a relevance of the previous search result to the client; receiving, from the client, a search request containing search parameters; in response to the search request, selecting a subset of previous search results based on correspondence between attributes of the previous search results and the search parameters, and on the previous handling indicators; providing, to a supplier, the search request and auxiliary search parameters corresponding to the selected previous search results and indicating characteristics of the selected previous search results, for generation of current search results at the supplier employing the auxiliary search parameters as inputs; receiving, from the supplier, the current search results generated at the supplier; and returning at least one of the current search results to the client.

Usage record aggregation
    Granted patent

    Publication number: US11899663B2

    Publication date: 2024-02-13

    Application number: US17476243

    Application date: 2021-09-15

    Applicant: Stripe, Inc.

    CPC classification number: G06F16/24539 G06F16/244

    Abstract: In an example embodiment, a solution is provided that aggregates records as they are submitted to a third party (on the write path) rather than performing a real-time aggregation when a request is processed that needs the aggregation (read path). More particularly, in an example embodiment, a caching layer is introduced that avoids having to read all usage events to compute an aggregation when a request is received for aggregated data. The caching layer maintains values for various metrics that require aggregation.
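The write-path idea in this abstract can be illustrated with a minimal sketch: a running total is updated when each usage record is submitted, so a later request for the aggregate is a single lookup rather than a scan over all events. The class, the sum metric, and the customer and metric names are hypothetical; the abstract does not describe the caching layer's actual interface.

```python
from collections import defaultdict

class UsageAggregator:
    """Keeps per-(customer, metric) totals current on the write path."""

    def __init__(self):
        self.events = []                # raw usage records, kept for audit
        self.totals = defaultdict(int)  # cached aggregates

    def record(self, customer, metric, quantity):
        # Write path: store the event and update the cached total together.
        self.events.append((customer, metric, quantity))
        self.totals[(customer, metric)] += quantity

    def total(self, customer, metric):
        # Read path: one lookup, no pass over self.events.
        return self.totals.get((customer, metric), 0)

agg = UsageAggregator()
agg.record("cust_1", "api_calls", 3)
agg.record("cust_1", "api_calls", 4)
agg.record("cust_2", "api_calls", 1)
```

The trade-off is a small extra cost on every write in exchange for constant-time reads.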

    DATA PROCESSING DEVICE, DATA PROCESSING METHOD, AND DATA PROCESSING PROGRAM

    Publication number: US20240020304A1

    Publication date: 2024-01-18

    Application number: US18031768

    Application date: 2020-10-15

    Inventor: Yuya WATARI

    CPC classification number: G06F16/24544 G06F16/24539

    Abstract: A data processing device includes: a recording unit (21) that records, as a history of a plan tree of each issued query, an execution result of the plan tree, a history of specific information for specifying each node of the plan tree, and an appearance frequency at which the plan tree has appeared in the past; and a cache reuse unit (25) that obtains specific information corresponding to an execution plan being executed, to refer to the history of the plan tree by using the obtained specific information as a key, and reuses the execution result of the plan tree of the obtained specific information when the obtained specific information exists.
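The recording and reuse steps in this abstract can be sketched as below: each plan tree is reduced to a key, the history maps that key to a stored execution result and an appearance count, and a plan whose key is already present reuses the stored result. Representing a plan node as an `(operator, argument, children)` tuple and using a SHA-256 hash as the "specific information" are assumptions of this sketch, not the patent's encoding.

```python
import hashlib

def fingerprint(node):
    """Derive a key for a plan node from its operator, argument, and children."""
    op, arg, children = node
    child_keys = ",".join(fingerprint(child) for child in children)
    raw = f"{op}|{arg}|[{child_keys}]"
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()

class PlanHistory:
    """History of plan trees: key -> stored result and appearance frequency."""

    def __init__(self):
        self.entries = {}

    def run(self, node, execute):
        key = fingerprint(node)
        entry = self.entries.get(key)
        if entry is not None:
            entry["count"] += 1           # plan tree has appeared before
            return entry["result"], True  # reuse the stored result
        result = execute(node)
        self.entries[key] = {"result": result, "count": 1}
        return result, False

scan = ("scan", "orders", ())
plan = ("filter", "amount > 10", (scan,))
other = ("filter", "amount > 99", (scan,))

history = PlanHistory()
executed = []

def execute(node):
    executed.append(node)
    return ["row_1", "row_2"]

first = history.run(plan, execute)
second = history.run(plan, execute)
third = history.run(other, execute)
```

The appearance count is recorded here but not acted on; the abstract does not say how the frequency is used.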
