Patent search aee:"Databricks Inc." Page 14

131.

Granted invention patent
LIFO based spilling for grouping aggregation (In force)

Publication (announcement) number: US11481398B1

Publication (announcement) date: 2022-10-25

Application number: US17116230

Application date: 2020-12-09

Applicant: Databricks Inc.

Inventor: Alexander Behm, Ankur Dave, Ryan Deng, Shoumik Palkar

IPC: G06F16/2455, G06F16/22

Abstract: A system for spilling comprises an interface and a processor. The interface is configured to receive an indication to perform a GROUP BY operation, wherein the indication comprises an input table and a grouping column. The processor is configured to: for each input table entry of the input table, determine a key, wherein the key is based at least in part on the input table entry and the grouping column; add the key to a grouping hash table, wherein adding the key to the grouping hash table comprises last-in, first-out (LIFO) spilling when necessary; create an output table based at least in part on the grouping hash table; and provide the output table.
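
The abstract above can be illustrated with a minimal Python sketch. This is not the patented implementation: the function name `group_by_count`, the `max_keys_in_memory` limit, the use of a COUNT aggregate, and the in-memory `spilled` list (a stand-in for spill storage) are all assumptions made for illustration. The sketch relies on Python dictionaries preserving insertion order, so `dict.popitem()` evicts the most recently added key.

```python
from collections import defaultdict


def group_by_count(rows, key_col, max_keys_in_memory):
    """Count rows per group, spilling the newest key (LIFO) when the table is full."""
    table = {}    # grouping hash table; insertion-ordered, newest key last
    spilled = []  # hypothetical stand-in for partial results written to disk

    for row in rows:
        key = row[key_col]
        if key not in table and len(table) >= max_keys_in_memory:
            # LIFO spill: evict the most recently inserted key first.
            spilled.append(table.popitem())
        table[key] = table.get(key, 0) + 1

    # Build the output table by merging spilled partial counts with in-memory ones.
    result = defaultdict(int)
    for key, count in spilled:
        result[key] += count
    for key, count in table.items():
        result[key] += count
    return dict(result)
```

A key that is spilled and later reappears simply starts a new partial count, which the final merge adds back together.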

132.

Granted invention patent
Automated processing of multiple prediction generation including model tuning (In force)

Publication (announcement) number: US11468369B1

Publication (announcement) date: 2022-10-11

Application number: US17587806

Application date: 2022-01-28

Applicant: Databricks Inc.

Inventor: Benjamin Thomas Wilson, Corey Zumar

IPC: G06N20/00, G06K9/62

Abstract: The present application discloses a method, system, and computer system for building a model associated with a dataset. The method includes receiving a dataset, the dataset comprising a plurality of keys and a plurality of key-value relationships, determining a plurality of models to build based at least in part on the dataset, wherein determining the plurality of models to build comprises using the dataset format information to identify the plurality of models, building the plurality of models, and optimizing at least one of the plurality of models.

133.

Invention application
DATAFLOW GRAPH PROCESSING WITH EXPECTATIONS (In force)

Publication (announcement) number: US20220309104A1

Publication (announcement) date: 2022-09-29

Application number: US17362456

Application date: 2021-06-29

Applicant: Databricks Inc.

Inventor: Michael Paul Armbrust, Andreas Neumann, Mukul Murthy, Jonathan Mio

IPC: G06F16/901, G06F16/215, G06F16/245

Abstract: A system for dataflow graph processing comprises a communication interface and a processor. The communication interface is configured to receive an indication to generate a dataflow graph, wherein the indication includes a set of queries. The processor is coupled to the communication interface and is configured to: determine dependencies of each query in the set of queries on another query; determine a DAG of nodes based at least in part on the dependencies; insert a node in the DAG of nodes to generate an updated DAG to enforce an expectation; determine a dataflow graph based on the updated DAG; and provide the dataflow graph.
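
One way to picture the steps in this abstract is the following Python sketch, offered under stated assumptions rather than as the patented method. The function `build_dataflow`, the `deps` and `expectations` dictionaries, and the `expect:<rule>:<query>` node naming are hypothetical. The sketch uses the standard-library `graphlib.TopologicalSorter` (Python 3.9+) to order the updated DAG.

```python
from graphlib import TopologicalSorter


def build_dataflow(deps, expectations):
    """Insert expectation nodes into a query DAG and return an execution order.

    deps:         {query: set of queries it reads from}
    expectations: {query: rule name to enforce on that query's output}
    """
    # Copy the dependency map so the caller's data is not mutated.
    dag = {query: set(upstream) for query, upstream in deps.items()}

    for target, rule in expectations.items():
        check = f"expect:{rule}:{target}"  # hypothetical node name
        # Re-point every consumer of `target` at the new check node.
        for upstream in dag.values():
            if target in upstream:
                upstream.remove(target)
                upstream.add(check)
        # The check node itself reads from the query it validates.
        dag[check] = {target}

    order = list(TopologicalSorter(dag).static_order())
    return dag, order
```

Placing the check node between a query and its consumers means downstream queries only run after the expectation has been evaluated.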

134.

Granted invention patent
Autoscaling using file access or cache usage for cluster machines (In force)

Publication (announcement) number: US11379272B2

Publication (announcement) date: 2022-07-05

Application number: US17020573

Application date: 2020-09-14

Applicant: Databricks Inc.

Inventor: Srinath Shankar, Eric Keng-Hao Liang

IPC: G06F9/46, G06F9/50, G06F9/48, G06F9/38, H04L67/5682

Abstract: The allocation system comprises an interface and a processor. The interface is configured to receive an indication to deactivate idle cluster machines of a set of cluster machines. The processor is configured to determine a list of cluster machines storing one or more intermediate data files of a set of intermediate data files; determine a set of idle cluster machines of the set of cluster machines that are neither running one or more tasks of a set of tasks executing or pending on the set of cluster machines nor storing the one or more intermediate data files of the set of intermediate data files, where the set of intermediate data files is associated with the set of tasks executing or pending on the cluster machines; and deactivate each cluster machine of the set of idle cluster machines.
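
The idle-machine rule in this abstract reduces to a set difference, sketched below in Python. The function name `find_idle_machines` and the two lookup dictionaries are assumptions for illustration; the patent does not specify these data structures.

```python
def find_idle_machines(machines, task_to_machine, file_to_machine):
    """Return machines that neither run a task nor store an intermediate file.

    machines:        iterable of machine ids in the cluster
    task_to_machine: {task id: machine id} for executing or pending tasks
    file_to_machine: {intermediate file: machine id} for files those tasks need
    """
    running = set(task_to_machine.values())   # machines with executing/pending tasks
    storing = set(file_to_machine.values())   # machines holding intermediate data
    return set(machines) - running - storing
```

A machine that holds intermediate data is kept alive even when it runs no task, because deactivating it would force the data to be recomputed.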

135.

Invention application
INTEGRATED NATIVE VECTORIZED ENGINE FOR COMPUTATION (In force)

Publication (announcement) number: US20220100761A1

Publication (announcement) date: 2022-03-31

Application number: US17237979

Application date: 2021-04-22

Applicant: Databricks Inc.

Inventor: Shi Xin, Alexander Behm, Shoumik Palkar, Herman Rudolf Petrus Catharina van Hövell tot Westerflier

IPC: G06F16/2453, G06F16/25, G06F16/2458

Abstract: A system comprises an interface, a processor, and a memory. The interface is configured to receive a query. The processor is configured to: determine a set of nodes for the query; determine whether a node of the set of nodes comprises a first engine node type or a second engine node type, wherein determining whether the node of the set of nodes comprises the first engine node type or the second engine node type is based at least in part on determining whether the node is able to be executed in a second engine; and generate a plan based at least in part on the set of nodes. The memory is coupled to the processor and is configured to provide the processor with instructions.
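
The node-typing step described in this abstract might look like the Python sketch below. The operator names, the `SECOND_ENGINE_OPERATORS` set, and the `tag_nodes` function are hypothetical; the abstract only states that each node is typed according to whether it can be executed in the second engine.

```python
# Hypothetical set of operators the second engine is able to execute.
SECOND_ENGINE_OPERATORS = frozenset({"scan", "filter", "aggregate"})


def tag_nodes(nodes):
    """Label each plan node with the engine node type it is assigned."""
    plan = []
    for op in nodes:
        # A node gets the second engine type only if that engine can run it;
        # otherwise it stays with the first engine.
        if op in SECOND_ENGINE_OPERATORS:
            node_type = "second_engine"
        else:
            node_type = "first_engine"
        plan.append((op, node_type))
    return plan
```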

136.

Invention application
UPDATE AND QUERY OF A LARGE COLLECTION OF FILES THAT REPRESENT A SINGLE DATASET STORED ON A BLOB STORE (In force)

Publication (announcement) number: US20210011901A1

Publication (announcement) date: 2021-01-14

Application number: US16941227

Application date: 2020-07-28

Applicant: Databricks Inc.

Inventor: Michael Paul Armbrust, Shixiong Zhu, Burak Yavuz

IPC: G06F16/23, G06F16/14, G06F16/22

Abstract: A system includes an interface and a processor. The interface is configured to receive a table indication of a data table and to receive a transaction indication to perform a transaction. The processor is configured to determine a current position N in a transaction log; determine a current state of the metadata; determine a read set associated with a transaction; attempt to write an update to the transaction log associated with a next position N+1; in response to a transaction determination that a simultaneous transaction associated with the next position N+1 already exists, determine a set of updated files; and in response to a determination that there is not an overlap between the read set associated with the current transaction and the set of updated files associated with the simultaneous transaction, attempt to write the update to the transaction to the transaction log associated with a further position N+2.
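
The commit protocol in this abstract resembles optimistic concurrency control over an append-only log. The Python sketch below illustrates the idea under stated assumptions: the `TransactionLog` class, its `try_write` method, and the `commit` function are hypothetical names, the log is an in-memory dictionary rather than a blob store, and the retry is written as a loop that generalizes the abstract's N+1 then N+2 sequence.

```python
class TransactionLog:
    """In-memory stand-in for a transaction log keyed by position."""

    def __init__(self):
        self.entries = {}  # position -> set of files updated by that commit

    def try_write(self, position, updated_files):
        """Claim a log position; return False if another commit already holds it."""
        if position in self.entries:
            return False
        self.entries[position] = set(updated_files)
        return True


def commit(log, current_position, read_set, updated_files):
    """Try to commit at N+1; on collision, retry at the next position if safe."""
    position = current_position + 1
    while not log.try_write(position, updated_files):
        # A simultaneous transaction won this position: inspect what it changed.
        winner_files = log.entries[position]
        if winner_files & set(read_set):
            # The winner touched files this transaction read, so it cannot proceed.
            raise RuntimeError(f"conflict at log position {position}")
        position += 1  # no overlap: attempt the following position
    return position
```

The read set matters because a transaction that read a file later rewritten by the winner may have based its update on stale data.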

137.

Granted invention patent
Callable notebook for cluster execution (In force)

Publication (announcement) number: US10678536B2

Publication (announcement) date: 2020-06-09

Application number: US16378353

Application date: 2019-04-08

Applicant: Databricks Inc.

Inventor: Timothee Hunter, Ali Ghodsi, Ion Stoica

IPC: G06F8/71, G06F8/54, G06F9/445, G06F16/9535, G06F9/455, G06F9/50

Abstract: A system for processing a notebook includes an input interface and a processor. The input interface is to receive a first notebook. The notebook comprises code for interactively querying and viewing data. The processor is to load the first notebook into a shell. The shell receives one or more parameters associated with the first notebook. The shell executes the first notebook using a cluster.

138.

Granted invention patent
Structured cluster execution for data streams (In force)

Publication (announcement) number: US10558664B2

Publication (announcement) date: 2020-02-11

Application number: US15581647

Application date: 2017-04-28

Applicant: Databricks Inc.

Inventor: Michael Armbrust, Tathagata Das, Shi Xin, Matei Zaharia

IPC: G06F16/2453, G06F16/2455

Abstract: A system for executing a streaming query includes an interface and a processor. The interface is configured to receive a logical query plan. The processor is configured to determine a physical query plan based at least in part on the logical query plan. The physical query plan comprises an ordered set of operators. Each operator of the ordered set of operators comprises an operator input mode and an operator output mode. The processor is further configured to execute the physical query plan using the operator input mode and the operator output mode for each operator of the query.

139.

Granted invention patent
Multiple display views for a notebook (In force)

Publication (announcement) number: US10474736B1

Publication (announcement) date: 2019-11-12

Application number: US14979253

Application date: 2015-12-22

Applicant: Databricks Inc.

Inventor: Ion Stoica, Ali Ghodsi, Chaoyu Yang

IPC: G06F17/21, G06F17/24, G06F17/22, G06F3/0481

Abstract: A system for multiple views for a notebook includes an input interface and a processor. The input interface is to receive a notebook. The processor is to load the notebook into a shell, wherein the shell executes the notebook using a cluster, to receive an indication to view a dashboard associated with the notebook, and to provide dashboard display information. The dashboard includes a page layout display.

140.

Granted invention patent
Serverless execution of code using cluster resources (In force)

Publication (announcement) number: US10474501B2

Publication (announcement) date: 2019-11-12

Application number: US15581987

Application date: 2017-04-28

Applicant: Databricks Inc.

Inventor: Ali Ghodsi, Srinath Shankar, Sameer Paranjpye, Shi Xin, Matei Zaharia

IPC: G06F9/50

Abstract: A system for cluster resource allocation includes an interface and a processor. The interface is configured to receive a process and input data. The processor is configured to determine an estimate for resources required for the process to process the input data; determine existing available resources in a cluster for running the process; determine whether the existing available resources are sufficient for running the process; in the event it is determined that the existing available resources are not sufficient for running the process, indicate to add new resources; determine an allocated share of resources in the cluster for running the process; and cause execution of the process using the share of resources.
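
The decision sequence in this abstract can be sketched in Python as follows. Every name and formula here is an assumption for illustration: the abstract does not say how resources are estimated or how a share is computed, so the sketch uses a simple bytes-per-core estimate and a hypothetical per-process cap `max_share`.

```python
import math


def plan_allocation(input_bytes, bytes_per_core, free_cores, max_share):
    """Estimate required cores, decide whether to grow the cluster, pick a share."""
    # Step 1: estimate resources required to process the input (assumed linear).
    needed = math.ceil(input_bytes / bytes_per_core)
    # Steps 2-4: compare against available resources; request more if short.
    cores_to_add = max(0, needed - free_cores)
    # Step 5: the allocated share is capped by a hypothetical per-process limit.
    share = min(needed, max_share)
    return {"needed": needed, "cores_to_add": cores_to_add, "share": share}
```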
