Data ingestion using data file clustering with KD-epsilon trees

    Publication number: US12072863B1

    Publication date: 2024-08-27

    Application number: US18218400

    Filing date: 2023-07-05

    CPC classification number: G06F16/2246 G06F16/2358 G06F16/245 G06F16/285

    Abstract: A data tree for managing data files of a data table and performing one or more transaction operations to the data table is described. The data tree is configured as a KD-epsilon tree and includes a plurality of nodes and edges. A node of the data tree may represent a splitting condition with respect to key-values for a respective key. A leaf node of the data tree may correspond to a data file for a data table that includes a subset of records having key-values that satisfy the condition for the node and conditions associated with parent nodes of the node. A parent node may correspond to a file including a buffer that stores changes to data files reachable by this parent node, and also includes dedicated storage to pointers of the child nodes. By using the data tree, the data processing system may efficiently cluster the data in the data table while reducing the number of data files that are rewritten.
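The buffering behavior this abstract describes can be illustrated with a minimal in-memory sketch. It is not the patented implementation: the class names `Leaf` and `Inner`, the `capacity` parameter, the `rewrites` counter, and the flush-on-overflow policy are all assumptions made for illustration. Inner nodes hold a split condition, a change buffer, and child pointers; leaves stand in for data files.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Union

Record = Dict[str, float]

@dataclass
class Leaf:
    """Stands in for one data file of the table."""
    records: List[Record] = field(default_factory=list)
    rewrites: int = 0  # how many times this "file" was rewritten

@dataclass
class Inner:
    """Split condition on one key, plus a buffer of pending changes."""
    key: str
    threshold: float
    left: "Node"    # records with record[key] < threshold
    right: "Node"   # records with record[key] >= threshold
    buffer: List[Record] = field(default_factory=list)
    capacity: int = 2  # hypothetical buffer size before flushing

Node = Union[Leaf, Inner]

def apply_batch(node: Node, batch: List[Record]) -> None:
    """Push a batch of inserted records into the subtree rooted at node."""
    if not batch:
        return
    if isinstance(node, Leaf):
        # Reaching a leaf means rewriting that data file once per batch.
        node.records.extend(batch)
        node.rewrites += 1
        return
    node.buffer.extend(batch)
    if len(node.buffer) > node.capacity:
        # Buffer overflow: route pending changes to children by the split.
        pending, node.buffer = node.buffer, []
        apply_batch(node.left, [r for r in pending if r[node.key] < node.threshold])
        apply_batch(node.right, [r for r in pending if r[node.key] >= node.threshold])

root = Inner(key="x", threshold=10.0, left=Leaf(), right=Leaf())
for x in (1.0, 12.0, 3.0):
    apply_batch(root, [{"x": x}])
```

In this sketch, three single-record inserts cause each leaf to be rewritten once rather than once per insert, which is the kind of saving the abstract attributes to buffering changes in parent nodes.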

    Structured cluster execution for data streams

    Publication number: US12032573B2

    Publication date: 2024-07-09

    Application number: US17976361

    Filing date: 2022-10-28

    CPC classification number: G06F16/24542 G06F16/24568

    Abstract: A system for executing a streaming query includes an interface and a processor. The interface is configured to receive a logical query plan. The processor is configured to determine a physical query plan based at least in part on the logical query plan. The physical query plan comprises an ordered set of operators. Each operator of the ordered set of operators comprises an operator input mode and an operator output mode. The processor is further configured to execute the physical query plan using the operator input mode and the operator output mode for each operator of the query.
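One way to picture an ordered set of operators that each carry an input mode and an output mode is the sketch below. The mode names `"append"` and `"complete"`, the `CATALOG` lookup, and the rule that each operator's input mode must match the previous operator's output mode are assumptions for illustration; the abstract does not specify them.

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Operator:
    name: str
    input_mode: str   # hypothetical mode label, e.g. "append"
    output_mode: str  # hypothetical mode label, e.g. "complete"
    fn: Callable[[List[int]], List[int]]

# Hypothetical mapping from logical steps to physical operators.
CATALOG = {
    "filter_positive": Operator(
        "filter_positive", "append", "append",
        lambda rows: [r for r in rows if r > 0]),
    "running_total": Operator(
        "running_total", "append", "complete",
        lambda rows: [sum(rows)]),
}

def to_physical(logical_plan: List[str]) -> List[Operator]:
    """Derive an ordered list of operators from a logical plan."""
    return [CATALOG[step] for step in logical_plan]

def execute(plan: List[Operator], rows: List[int],
            source_mode: str = "append") -> List[int]:
    """Run operators in order, checking modes between neighbors."""
    mode = source_mode
    for op in plan:
        if op.input_mode != mode:
            raise ValueError(
                f"{op.name} expects {op.input_mode!r} input, got {mode!r}")
        rows = op.fn(rows)
        mode = op.output_mode
    return rows

result = execute(to_physical(["filter_positive", "running_total"]), [3, -1, 4])
```

Reversing the two steps would raise `ValueError` in this sketch, because `running_total` emits `"complete"` output while `filter_positive` expects `"append"` input.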

    Fetching query results through cloud object stores

    Publication number: US11960494B1

    Publication date: 2024-04-16

    Application number: US17841946

    Filing date: 2022-06-16

    CPC classification number: G06F16/2471 G06F11/3419 G06F16/244 G06F16/256

    Abstract: The system is configured to: 1) receive a client request; 2) determine executor(s) to generate a response to the user request; 3) provide each of the executor(s) with an indication; 4) receive for each indication a response including an output of either a cloud output or an in-line output to generate a group of in-line outputs and a group of cloud outputs; 5) determine whether the group of in-line outputs comprises all outputs; and 6) in response to the group of in-line outputs not comprising all the outputs for the client request: a) convert the group of in-line outputs to a converted group of cloud outputs; b) generate metadata for the converted group of cloud outputs and the group of cloud outputs; and c) provide response to the client request including the metadata for the converted group of cloud outputs and the group of cloud outputs.
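Steps 4 through 6 of this abstract can be sketched as follows. The `Output` type, the `upload` helper, the `mem://` URLs, and the dictionary standing in for a cloud object store are all hypothetical; only the control flow (return in-line results when every output is in-line, otherwise convert the in-line outputs and return metadata for everything) follows the abstract.

```python
from dataclasses import dataclass
from typing import Dict, List

@dataclass
class Output:
    kind: str            # "inline" or "cloud"
    payload: bytes = b""  # set for in-line outputs
    url: str = ""         # set for cloud outputs

def upload(payload: bytes, store: Dict[str, bytes]) -> str:
    """Hypothetical upload: put the payload in the store, return its URL."""
    url = f"mem://result-{len(store)}"
    store[url] = payload
    return url

def build_response(outputs: List[Output], store: Dict[str, bytes]) -> dict:
    inline = [o for o in outputs if o.kind == "inline"]
    cloud = [o for o in outputs if o.kind == "cloud"]
    if len(inline) == len(outputs):
        # Every executor answered in-line: return the data directly.
        return {"inline": [o.payload for o in inline]}
    # Mixed case: convert in-line outputs to cloud outputs, then return
    # metadata for the converted group and the original cloud group.
    converted = [Output("cloud", url=upload(o.payload, store)) for o in inline]
    return {"links": [{"url": o.url} for o in converted + cloud]}

store: Dict[str, bytes] = {}
all_inline = build_response([Output("inline", b"a"), Output("inline", b"b")], store)
mixed = build_response(
    [Output("inline", b"a"), Output("cloud", url="mem://big")], store)
```

In the mixed case the client receives only links, so it can fetch every part of the result from the object store in the same way.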

    Model ML registry and model serving
    Granted invention patent

    Publication number: US11853277B2

    Publication date: 2023-12-26

    Application number: US18162579

    Filing date: 2023-01-31

    CPC classification number: G06F16/219 G06F16/955 G06N5/022

    Abstract: A system includes an interface, a processor, and a memory. The interface is configured to receive a version of a model from a model registry. The processor is configured to store the version of the model, start a process running the version of the model, and update a proxy with version information associated with the version of the model, wherein the updated proxy indicates to redirect an indication to invoke the version of the model to the process. The memory is coupled to the processor and configured to provide the processor with instructions.
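The store, start, and update-proxy sequence in this abstract can be sketched in a few lines. Everything here is a stand-in: the registry and storage are plain dictionaries, a "process" is just a Python callable, and the `Proxy` class and `deploy` function are hypothetical names, not an API from the patent or from any library.

```python
from typing import Callable, Dict, Tuple

ModelKey = Tuple[str, int]  # (model name, version)

class Proxy:
    """Routes an invocation for a model version to its running process."""
    def __init__(self) -> None:
        self.routes: Dict[ModelKey, Callable[[float], float]] = {}

    def update(self, key: ModelKey, process: Callable[[float], float]) -> None:
        self.routes[key] = process

    def invoke(self, key: ModelKey, x: float) -> float:
        return self.routes[key](x)

def deploy(registry: Dict[ModelKey, Callable[[float], float]],
           key: ModelKey,
           storage: Dict[ModelKey, Callable[[float], float]],
           proxy: Proxy) -> None:
    model = registry[key]   # receive the version from the registry
    storage[key] = model    # store the version locally

    def process(x: float) -> float:
        # Stand-in for a separately started serving process.
        return model(x)

    proxy.update(key, process)  # redirect invocations to that process

registry = {("m", 1): lambda x: x + 1, ("m", 2): lambda x: x * 2}
storage: Dict[ModelKey, Callable[[float], float]] = {}
proxy = Proxy()
deploy(registry, ("m", 1), storage, proxy)
deploy(registry, ("m", 2), storage, proxy)
answer = proxy.invoke(("m", 2), 5)
```

Keeping the routing table in the proxy means a new version can be added without callers needing to know where its process runs.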

    OPTIMIZATION OF TUNING FOR MODELS USED FOR MULTIPLE PREDICTION GENERATION

    Publication number: US20230244982A1

    Publication date: 2023-08-03

    Application number: US17587793

    Filing date: 2022-01-28

    CPC classification number: G06N20/00

    Abstract: The present application discloses a method, system, and computer system for tuning a set of models. The method includes determining a set of one or more models to optimize, determining a plurality of optimizer modules with which to optimize the set of one or more models, causing the plurality of optimizer modules to respectively perform a respective optimizing process with respect to at least one model of the set of one or more models, and deploying an optimized model obtained based at least in part on optimizing metrics of the set of the one or more models.
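As a rough illustration of running several optimizer modules over a set of models and deploying the best result, consider the sketch below. The two optimizers (a fixed grid and a seeded random search), the single tunable parameter, the toy loss functions, and the rule of deploying the model with the lowest tuned loss are all assumptions; the abstract does not name specific optimizers or metrics.

```python
import random
from typing import Callable, Dict, List, Tuple

Objective = Callable[[float], float]   # maps a parameter to a loss
Optimizer = Callable[[Objective], float]  # returns the best parameter found

def grid_optimizer(objective: Objective) -> float:
    """Try a fixed, hypothetical grid of parameter values."""
    return min([0.01, 0.1, 1.0], key=objective)

def random_optimizer(objective: Objective) -> float:
    """Try 20 seeded random parameter values in [0, 1]."""
    rng = random.Random(0)
    return min([rng.uniform(0.0, 1.0) for _ in range(20)], key=objective)

def tune(models: Dict[str, Objective],
         optimizers: List[Optimizer]) -> Tuple[str, Dict[str, Tuple[float, float]]]:
    results: Dict[str, Tuple[float, float]] = {}
    for name, objective in models.items():
        # Each optimizer module proposes a parameter; keep the best one.
        best = min((opt(objective) for opt in optimizers), key=objective)
        results[name] = (best, objective(best))
    # "Deploy" the model whose tuned loss is lowest.
    deployed = min(results, key=lambda n: results[n][1])
    return deployed, results

models = {
    "a": lambda p: (p - 0.1) ** 2,        # toy loss, minimum at 0.1
    "b": lambda p: (p - 0.5) ** 2 + 1.0,  # toy loss, never below 1.0
}
deployed, results = tune(models, [grid_optimizer, random_optimizer])
```

Here model `"a"` is selected because the grid optimizer finds its exact minimum, while model `"b"` cannot go below a loss of 1.0 under any parameter.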

    ACCESS OF DATA AND MODELS ASSOCIATED WITH MULTIPLE PREDICTION GENERATION

    Publication number: US20230244720A1

    Publication date: 2023-08-03

    Application number: US17587820

    Filing date: 2022-01-28

    CPC classification number: G06F16/90335 G06N20/00

    Abstract: The present application discloses a method, system, and computer system for querying a model associated with a dataset. The method includes providing an input interface via which a first entity inputs a dataset, receiving the dataset, and providing a selection interface that exposes to a second entity the plurality of models determined for the dataset and/or the plurality of results corresponding to the plurality of models using the index entries. The dataset comprises a plurality of keys and a plurality of key-value relationships, and the dataset is formatted according to a predefined format, wherein index entries are generated for a plurality of models and a plurality of results corresponding to the plurality of models.

    Model ML registry and model serving
    Granted invention patent

    Publication number: US11693837B2

    Publication date: 2023-07-04

    Application number: US17324907

    Filing date: 2021-05-19

    CPC classification number: G06F16/219 G06F16/955 G06N5/022

    Abstract: A system includes an interface, a processor, and a memory. The interface is configured to receive a version of a model from a model registry. The processor is configured to store the version of the model, start a process running the version of the model, and update a proxy with version information associated with the version of the model, wherein the updated proxy indicates to redirect an indication to invoke the version of the model to the process. The memory is coupled to the processor and configured to provide the processor with instructions.
