EFFICIENT MERGE OF TABULAR DATA USING MIXING
    82.
    Invention publication

    Publication number: US20240070155A1

    Publication date: 2024-02-29

    Application number: US17895882

    Application date: 2022-08-25

    CPC classification number: G06F16/2456 G06F16/2282

    Abstract: A method, system, and computer system for performing an operation with respect to a target table are disclosed. The method includes performing first and second jobs, and obtaining other resulting files based at least in part on a second set of unmatched rows among the target table and the source table that results from the first set of unmatched rows having been processed in the second job, and obtaining a resulting table based on (i) second job resulting file(s), and (ii) other resulting files. Performing the first job includes determining a set of matching target table files and storing target table information indicating for each of the set of matching target table files, a particular set of rows having matching rows. Performing the second job includes performing a first matching action based on matched rows and a second matching action based on a subset of unmatched rows.

    EFFICIENT MERGE OF TABULAR DATA USING A PROCESSING FILTER

    Publication number: US20240069863A1

    Publication date: 2024-02-29

    Application number: US17895872

    Application date: 2022-08-25

    CPC classification number: G06F7/14 G06F16/148 G06F16/16

    Abstract: A method, system, and computer system for performing an operation with respect to a target table are disclosed. The method includes performing first, second, and third jobs, and obtaining a resulting table based at least in part on the second job resulting file(s) and third job resulting file(s). Performing the first job includes determining a set of matching target table files and storing target table information indicating, for each of the set of matching target table files, a particular set of rows having matching rows. Performing the second job includes performing a matching action based on matched rows and obtaining the second job resulting file(s). Performing the third job includes determining unmatched rows for target table files and storing the unmatched rows in third job resulting file(s).
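The three jobs described in this abstract can be illustrated with a minimal in-memory sketch. Everything here is an assumption made for illustration: files are modeled as lists of row dictionaries, the matching action is taken to be an update of the target row from the source row, and the names `merge_three_jobs`, `.matched`, and `.unmatched` are hypothetical rather than taken from the patent.

```python
def merge_three_jobs(target_files, source_rows, key):
    """Toy merge: target_files maps file name -> list of row dicts."""
    source_by_key = {row[key]: row for row in source_rows}

    # Job 1: find target files with matching rows and record which rows match.
    matches = {}
    for name, rows in target_files.items():
        matched = {i for i, row in enumerate(rows) if row[key] in source_by_key}
        if matched:
            matches[name] = matched

    # Job 2: apply the matching action (assumed here: update from source)
    # to the matched rows and write the second-job result files.
    job2 = {}
    for name, matched in matches.items():
        rows = target_files[name]
        job2[name + ".matched"] = [
            {**rows[i], **source_by_key[rows[i][key]]} for i in sorted(matched)
        ]

    # Job 3: copy the unmatched rows of the same files into third-job files.
    job3 = {}
    for name, matched in matches.items():
        rest = [r for i, r in enumerate(target_files[name]) if i not in matched]
        if rest:
            job3[name + ".unmatched"] = rest

    # Files without any match are carried over unchanged (an assumption).
    untouched = {n: r for n, r in target_files.items() if n not in matches}
    return {**untouched, **job2, **job3}
```

In this sketch only the files identified in the first job are rewritten, which is one plausible reading of why the matching file set is stored before the second and third jobs run.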

    Scan Parsing
    84.
    Invention publication
    Status: under examination (published)

    Publication number: US20240061840A1

    Publication date: 2024-02-22

    Application number: US18162366

    Application date: 2023-01-31

    CPC classification number: G06F16/24542 G06F16/285

    Abstract: The present application discloses a method, system, and computer system for parsing files. The method includes receiving an indication that a first file is to be processed; determining to begin processing the first file using a first processing engine based at least in part on one or more predefined heuristics; indicating to process the first file using the first processing engine; determining whether a particular error in processing the first file using the first processing engine has been detected; in response to determining that the particular error has been detected, indicating to stop processing the first file using the first processing engine and indicating to continue processing using a second processing engine; and storing in memory information obtained based on processing the first file by one or more of the first processing engine and the second processing engine.
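The engine-switching flow in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the "first processing engine" is a naive comma splitter, the "particular error" is a hypothetical `UnsupportedInput` exception raised on quoted fields, the heuristic is a made-up line-count limit, and the "second processing engine" is Python's standard `csv.reader` applied line by line (so quoted fields spanning several lines are not handled).

```python
import csv


class UnsupportedInput(Exception):
    """Hypothetical 'particular error' raised by the fast engine."""

    def __init__(self, line_no):
        super().__init__(f"fast engine cannot parse line {line_no}")
        self.line_no = line_no


def fast_engine(lines, start=0):
    """Naive splitter: cheap, but gives up on quoted fields."""
    for i in range(start, len(lines)):
        if '"' in lines[i]:
            raise UnsupportedInput(i)
        yield lines[i].split(",")


def robust_engine(lines, start=0):
    """Slower fallback built on the standard csv module."""
    yield from csv.reader(lines[start:])


def parse(text, fast_limit=1000):
    lines = text.splitlines()
    rows, engines = [], []
    start = 0
    # Assumed heuristic: try the fast engine first on small inputs.
    if len(lines) <= fast_limit:
        engines.append("fast")
        try:
            for row in fast_engine(lines):
                rows.append(row)
            return rows, engines
        except UnsupportedInput as err:
            # Stop the first engine; continue from the failing line.
            start = err.line_no
    engines.append("robust")
    rows.extend(robust_engine(lines, start))
    return rows, engines
```

Rows parsed before the error are kept, so the second engine continues the work rather than restarting the file.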

    SCAN PARSING
    85.
    Invention publication
    Status: under examination (published)

    Publication number: US20240061839A1

    Publication date: 2024-02-22

    Application number: US17892376

    Application date: 2022-08-22

    CPC classification number: G06F16/24542 G06F16/285

    Abstract: The present application discloses a method, system, and computer system for parsing files. The method includes receiving an indication that a first file is to be processed; determining to begin processing the first file using a first processing engine based at least in part on one or more predefined heuristics; indicating to process the first file using the first processing engine; determining whether a particular error in processing the first file using the first processing engine has been detected; in response to determining that the particular error has been detected, indicating to stop processing the first file using the first processing engine and indicating to continue processing using a second processing engine; and storing in memory information obtained based on processing the first file by one or more of the first processing engine and the second processing engine.

    K-D TREE BALANCED SPLITTING
    86.
    Invention publication

    Publication number: US20230359602A1

    Publication date: 2023-11-09

    Application number: US17738609

    Application date: 2022-05-06

    CPC classification number: G06F16/2246

    Abstract: A system for clustering data into corresponding files comprises one or more processors and a memory. The one or more processors are configured to: 1) determine to cluster a set of data into a set of files; 2) determine a set of split points in a corresponding set of dimensions of the set of data to determine the set of files, wherein each file of the set of files has an approximate target size; and 3) store one or more items of the set of data into a corresponding file of the set of files based at least in part on the set of split points. The memory is coupled to the one or more processors and configured to provide the one or more processors with instructions.
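One common way to obtain split points that yield files of roughly equal size is a k-d-tree-style median split that cycles through the dimensions. The sketch below illustrates that general technique and is not the patented method; it assumes points are tuples, that file size is measured in number of points, and that `target_size` is at least 1. The function name `split_into_files` is hypothetical.

```python
def split_into_files(points, target_size, depth=0):
    """Recursively split at the median until each group fits target_size."""
    if len(points) <= target_size:
        return [points]
    dim = depth % len(points[0])          # cycle through the dimensions
    ordered = sorted(points, key=lambda p: p[dim])
    mid = len(ordered) // 2               # median index acts as the split point
    return (
        split_into_files(ordered[:mid], target_size, depth + 1)
        + split_into_files(ordered[mid:], target_size, depth + 1)
    )
```

Because each split halves the group, the resulting files differ in size by at most a factor of about two, which matches the idea of an approximate rather than exact target size.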

    Update and query of a large collection of files that represent a single dataset stored on a blob store

    Publication number: US11775499B2

    Publication date: 2023-10-03

    Application number: US17695411

    Application date: 2022-03-15

    CPC classification number: G06F16/2358 G06F16/148 G06F16/2282

    Abstract: A system includes an interface and a processor. The interface is configured to receive a table indication of a data table and to receive a transaction indication to perform a transaction. The processor is configured to determine a current position N in a transaction log; determine a current state of the metadata; determine a read set associated with the transaction; attempt to write an update to the transaction log associated with a next position N+1; in response to a determination that a simultaneous transaction associated with the next position N+1 already exists, determine a set of updated files; and in response to a determination that there is not an overlap between the read set associated with the current transaction and the set of updated files associated with the simultaneous transaction, attempt to write the update to the transaction log associated with a further position N+2.
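The commit protocol in this abstract resembles optimistic concurrency control over an append-only log. The following is a minimal single-process sketch under assumptions: the log is a dictionary from position to the set of files that entry updated, `try_write` stands in for an atomic put-if-absent on the blob store, and the retry is written as a loop that generalizes the N+1 to N+2 step. The names `TransactionLog`, `ConflictError`, and `commit` are hypothetical.

```python
class ConflictError(Exception):
    """Raised when a concurrent commit touched files this transaction read."""


class TransactionLog:
    def __init__(self):
        self.entries = {}  # position -> set of files updated by that commit

    def try_write(self, position, updated_files):
        # Stand-in for an atomic put-if-absent; not atomic across processes.
        if position in self.entries:
            return False
        self.entries[position] = set(updated_files)
        return True


def commit(log, current_position, read_set, updated_files):
    """Try position N+1; on collision, check for overlap and move on."""
    position = current_position + 1
    while not log.try_write(position, updated_files):
        winner_files = log.entries[position]
        if set(read_set) & winner_files:
            raise ConflictError(f"conflict with commit at position {position}")
        position += 1  # no overlap: retry at the next position
    return position
```

A transaction that loses the race for a position is aborted only if the winner changed something it read; otherwise it simply lands one position later.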

    Hash based rollup with passthrough
    89.
    Granted invention patent

    Publication number: US11675767B1

    Publication date: 2023-06-13

    Application number: US17099467

    Application date: 2020-11-16

    Abstract: A system includes a plurality of computing units. A first computing unit of the plurality of computing units comprises: a communication interface configured to receive an indication to roll up data in a data table; and a processor coupled to the communication interface and configured to: build a preaggregation hash table based at least in part on a set of columns and the data table by aggregating input rows of the data table; for each preaggregation hash table entry of the preaggregation hash table, provide the entry to a second computing unit of the plurality of computing units based at least in part on a distribution hash value; receive a set of received entries from computing units of the plurality of computing units; and build an aggregation hash table based at least in part on the set of received entries by aggregating the set of received entries.
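The two-phase aggregation described here can be simulated in a single process. This sketch makes several assumptions: each computing unit is modeled as a list of row dictionaries, the aggregate is a sum, and the distribution hash is Python's built-in `hash` modulo the number of units. The passthrough behavior named in the title is not covered by the abstract and is not modeled. The function names are hypothetical.

```python
from collections import defaultdict


def preaggregate(rows, key_cols, value_col):
    """Local phase: collapse a unit's rows into one partial sum per key."""
    table = defaultdict(int)
    for row in rows:
        table[tuple(row[c] for c in key_cols)] += row[value_col]
    return dict(table)


def rollup(partitions, key_cols, value_col):
    """Simulate all units: preaggregate, shuffle by hash, aggregate."""
    n = len(partitions)
    inboxes = [[] for _ in range(n)]
    for rows in partitions:
        for key, partial in preaggregate(rows, key_cols, value_col).items():
            # The distribution hash decides which unit owns this key.
            inboxes[hash(key) % n].append((key, partial))
    results = []
    for inbox in inboxes:
        final = defaultdict(int)
        for key, partial in inbox:
            final[key] += partial
        results.append(dict(final))
    return results
```

Preaggregating before the shuffle means each unit sends at most one entry per distinct key instead of one per input row.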

    Model ML Registry and Model Serving
    90.
    Invention publication

    Publication number: US20230177031A1

    Publication date: 2023-06-08

    Application number: US18162579

    Application date: 2023-01-31

    CPC classification number: G06F16/219 G06F16/955 G06N5/022

    Abstract: A system includes an interface, a processor, and a memory. The interface is configured to receive a version of a model from a model registry. The processor is configured to store the version of the model, start a process running the version of the model, and update a proxy with version information associated with the version of the model, wherein the updated proxy indicates to redirect an indication to invoke the version of the model to the process. The memory is coupled to the processor and configured to provide the processor with instructions.
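The store, start, and update-proxy sequence in this abstract can be sketched in-process. All names here (`Proxy`, `serve_version`, `local_store`) are hypothetical, a plain callable stands in for the separate process that would run the model, and the registry is reduced to whatever passes the model function in.

```python
class Proxy:
    """Routes an invocation of (model, version) to the worker serving it."""

    def __init__(self):
        self.routes = {}

    def update(self, model_name, version, worker):
        self.routes[(model_name, version)] = worker

    def invoke(self, model_name, version, payload):
        return self.routes[(model_name, version)](payload)


def serve_version(proxy, local_store, model_name, version, model_fn):
    # 1) Store the model version received from the registry.
    local_store[(model_name, version)] = model_fn
    # 2) "Start a process" running it; a callable is used as a stand-in.
    worker = model_fn
    # 3) Update the proxy so invocations are redirected to that worker.
    proxy.update(model_name, version, worker)
```

Keeping the routing table in the proxy lets several versions of the same model be served side by side, each reachable by its version number.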
