Scaling delta table optimize command

    Publication number: US12079167B1

    Publication date: 2024-09-03

    Application number: US18093916

    Application date: 2023-01-06

    CPC classification number: G06F16/172 G06F16/2282

    Abstract: The interface is to receive an indication to execute an optimize command. The processor is to receive a file name; determine whether adding a file of the file name to a current bin causes the current bin to exceed a threshold; associate the file with the current bin in response to determining that adding the file does not cause the current bin to exceed the bin threshold; in response to determining that adding the file to the current bin causes the current bin to exceed the bin threshold: associate the file with a next bin, indicate that the current bin is closed, and add the current bin to a batch of bins; determine whether a measure of the batch of bins exceeds a batch threshold; and in response to determining that the measure exceeds the batch threshold, provide the batch of bins for processing.
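The binning and batching steps in this abstract can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the names `Bin` and `batch_bins` are hypothetical, files are given as `(name, size)` pairs, the bin measure is total file size, the batch measure is the number of closed bins, and an oversized file is allowed into an empty bin so that no empty bin is ever closed. Flushing the last open bin at the end of input is also an assumption; the abstract does not describe it.

```python
from dataclasses import dataclass, field
from typing import Iterable, Iterator, List, Tuple


@dataclass
class Bin:
    """Hypothetical container for the files associated with one bin."""
    files: List[str] = field(default_factory=list)
    size: int = 0
    closed: bool = False


def batch_bins(
    files: Iterable[Tuple[str, int]],
    bin_threshold: int,
    batch_threshold: int,
) -> Iterator[List[Bin]]:
    """Yield batches of closed bins, following the steps in the abstract."""
    current = Bin()
    batch: List[Bin] = []
    for name, size in files:
        # Would adding this file push the current bin over the bin threshold?
        # (Assumption: an empty bin always accepts the file.)
        if current.files and current.size + size > bin_threshold:
            # Close the current bin and add it to the batch of bins.
            current.closed = True
            batch.append(current)
            # The file will be associated with the next bin instead.
            current = Bin()
            # Assumption: the batch measure is the number of bins in it.
            if len(batch) > batch_threshold:
                yield batch  # provide the batch for processing
                batch = []
        current.files.append(name)
        current.size += size
    # Assumption: flush whatever remains once the input is exhausted.
    if current.files:
        current.closed = True
        batch.append(current)
    if batch:
        yield batch


if __name__ == "__main__":
    sample = [("a", 40), ("b", 50), ("c", 30), ("d", 80), ("e", 10)]
    for batch in batch_bins(sample, bin_threshold=100, batch_threshold=1):
        print([b.files for b in batch])
```

Emitting batches from a generator means downstream processing can start before all file names have been received, which seems to be the point of batching bins rather than collecting them all first.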

    STRUCTURED CLUSTER EXECUTION FOR DATA STREAMS

    Publication number: US20230141556A1

    Publication date: 2023-05-11

    Application number: US17976361

    Application date: 2022-10-28

    CPC classification number: G06F16/24542 G06F16/24568

    Abstract: A system for executing a streaming query includes an interface and a processor. The interface is configured to receive a logical query plan. The processor is configured to determine a physical query plan based at least in part on the logical query plan. The physical query plan comprises an ordered set of operators. Each operator of the ordered set of operators comprises an operator input mode and an operator output mode. The processor is further configured to execute the physical query plan using the operator input mode and the operator output mode for each operator of the query.
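As a rough illustration of the idea in this abstract, the sketch below turns a logical plan into an ordered list of operators, each tagged with an input mode and an output mode, and then runs a batch through them. Everything concrete here is an assumption made for the example: the mode names `DELTA` and `COMPLETE`, the operator catalog, the names `to_physical_plan` and `execute`, and the rule that an operator's output mode must match the next operator's input mode. The abstract does not specify any of these.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Dict, List


class Mode(Enum):
    # Hypothetical mode names; the abstract does not enumerate the modes.
    DELTA = "delta"        # operator sees or emits only new rows
    COMPLETE = "complete"  # operator sees or emits the full result so far


@dataclass
class Operator:
    name: str
    input_mode: Mode
    output_mode: Mode
    fn: Callable[[List[int]], List[int]]


def make_filter_positive() -> Operator:
    return Operator("filter_positive", Mode.DELTA, Mode.DELTA,
                    lambda rows: [r for r in rows if r > 0])


def make_running_sum() -> Operator:
    total = 0

    def step(rows: List[int]) -> List[int]:
        nonlocal total
        total += sum(rows)  # state carried across batches
        return [total]

    return Operator("running_sum", Mode.DELTA, Mode.COMPLETE, step)


# Hypothetical catalog mapping logical operator names to physical operators.
CATALOG: Dict[str, Callable[[], Operator]] = {
    "filter_positive": make_filter_positive,
    "running_sum": make_running_sum,
}


def to_physical_plan(logical_plan: List[str]) -> List[Operator]:
    """Build the ordered operator list and check adjacent modes (assumed rule)."""
    plan = [CATALOG[name]() for name in logical_plan]
    for upstream, downstream in zip(plan, plan[1:]):
        if upstream.output_mode != downstream.input_mode:
            raise ValueError(
                f"{upstream.name} emits {upstream.output_mode.value} but "
                f"{downstream.name} expects {downstream.input_mode.value}")
    return plan


def execute(plan: List[Operator], batch: List[int]) -> List[int]:
    """Run one batch of rows through every operator in order."""
    rows = batch
    for op in plan:
        rows = op.fn(rows)
    return rows


if __name__ == "__main__":
    plan = to_physical_plan(["filter_positive", "running_sum"])
    print(execute(plan, [1, -2, 3]))
    print(execute(plan, [5]))
```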

    Structured cluster execution for data streams

    Publication number: US11514045B2

    Publication date: 2022-11-29

    Application number: US16721402

    Application date: 2019-12-19

    Abstract: A system for executing a streaming query includes an interface and a processor. The interface is configured to receive a logical query plan. The processor is configured to determine a physical query plan based at least in part on the logical query plan. The physical query plan comprises an ordered set of operators. Each operator of the ordered set of operators comprises an operator input mode and an operator output mode. The processor is further configured to execute the physical query plan using the operator input mode and the operator output mode for each operator of the query.

    STRUCTURED CLUSTER EXECUTION FOR DATA STREAMS

    Publication number: US20200257689A1

    Publication date: 2020-08-13

    Application number: US16721402

    Application date: 2019-12-19

    Abstract: A system for executing a streaming query includes an interface and a processor. The interface is configured to receive a logical query plan. The processor is configured to determine a physical query plan based at least in part on the logical query plan. The physical query plan comprises an ordered set of operators. Each operator of the ordered set of operators comprises an operator input mode and an operator output mode. The processor is further configured to execute the physical query plan using the operator input mode and the operator output mode for each operator of the query.

    State rebalancing in structured streaming

    Publication number: US12099525B2

    Publication date: 2024-09-24

    Application number: US18219314

    Application date: 2023-07-07

    CPC classification number: G06F16/278 G06F16/24568

    Abstract: A data processing service performs a rebalancing process for rebalancing stateful tasks on a cluster computing system. In one instance, the method for rebalancing stateful tasks is performed such that the per-operator partitions are spread across available executors of a cluster of the cluster computing system with respect to one or more statistics of the tasks. In one instance, the method for rebalancing stateful tasks is also performed such that the total number of stateful tasks are balanced per executor as long as this rebalancing does not imbalance the per-operator placements. In this way, the processing of stateful tasks can be spread across multiple executors in a relatively uniform manner, even though there may be an upfront cost of breaking the local caching on an executor.
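One way to picture the two goals in this abstract is a greedy placement that first spreads each operator's partitions across executors and only then uses the total task count as a tie-breaker. The sketch below is an assumption-laden illustration, not the claimed method: the function name `rebalance` is hypothetical, the only statistic used is the task count (the abstract mentions "one or more statistics" without naming them), and every partition is treated as equally expensive.

```python
from typing import Dict, List, Tuple


def rebalance(
    partitions_per_operator: Dict[str, int],
    executors: List[str],
) -> Dict[Tuple[str, int], str]:
    """Assign each (operator, partition) pair to an executor.

    Primary key: fewest partitions of the same operator on the executor.
    Secondary key: fewest stateful tasks overall on the executor.
    Ordering the keys this way balances totals only where doing so does
    not worsen the per-operator spread.
    """
    total = {e: 0 for e in executors}
    placement: Dict[Tuple[str, int], str] = {}
    for op, n_partitions in partitions_per_operator.items():
        per_op = {e: 0 for e in executors}
        for p in range(n_partitions):
            target = min(executors, key=lambda e: (per_op[e], total[e]))
            placement[(op, p)] = target
            per_op[target] += 1
            total[target] += 1
    return placement


if __name__ == "__main__":
    result = rebalance({"agg": 3, "join": 3}, ["e1", "e2"])
    for key, executor in sorted(result.items()):
        print(key, "->", executor)
```

A real rebalancer would also weigh the cost the abstract mentions, namely that moving a stateful task away from its current executor breaks local caching; this sketch ignores existing placements entirely.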

    Structured cluster execution for data streams

    Publication number: US12032573B2

    Publication date: 2024-07-09

    Application number: US17976361

    Application date: 2022-10-28

    CPC classification number: G06F16/24542 G06F16/24568

    Abstract: A system for executing a streaming query includes an interface and a processor. The interface is configured to receive a logical query plan. The processor is configured to determine a physical query plan based at least in part on the logical query plan. The physical query plan comprises an ordered set of operators. Each operator of the ordered set of operators comprises an operator input mode and an operator output mode. The processor is further configured to execute the physical query plan using the operator input mode and the operator output mode for each operator of the query.

    Structured cluster execution for data streams

    Publication number: US10558664B2

    Publication date: 2020-02-11

    Application number: US15581647

    Application date: 2017-04-28

    Abstract: A system for executing a streaming query includes an interface and a processor. The interface is configured to receive a logical query plan. The processor is configured to determine a physical query plan based at least in part on the logical query plan. The physical query plan comprises an ordered set of operators. Each operator of the ordered set of operators comprises an operator input mode and an operator output mode. The processor is further configured to execute the physical query plan using the operator input mode and the operator output mode for each operator of the query.
