Patent search case: "Databricks Inc." Page 10

91.

Invention application
DATAFLOW GRAPH PROCESSING (In force)

Publication (announcement) No.: US20220309103A1

Publication (announcement) date: 2022-09-29

Application No.: US17362450

Application date: 2021-06-29

Applicant: Databricks Inc.

Inventor: Michael Paul Armbrust, Andreas Neumann, Mukul Murthy, Jonathan Mio

IPC: G06F16/901, G06F16/22, G06F16/245

Abstract: A system for dataflow graph processing comprises a communication interface and a processor. The communication interface is configured to receive an indication to generate a dataflow graph, wherein the indication includes a set of queries and/or commands. The processor is coupled to the communication interface and configured to: determine dependencies of each query in the set of queries on another query; determine a DAG of nodes based at least in part on the dependencies; determine the dataflow graph by determining in-line expressions for tables of the dataflow graph aggregating calculations associated with a subset of dataflow graph nodes designated as view nodes; and provide the dataflow graph.
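
Illustrative sketch (not taken from the patent): the abstract describes deriving a DAG from query dependencies and folding view nodes into in-line expressions of the tables that read them. The Python below shows one way that idea could look, assuming each query is a SQL string keyed by name and that a dependency is any mention of another query's name. The function names `find_dependencies` and `build_dataflow_graph` are hypothetical.

```python
import re
from graphlib import TopologicalSorter


def find_dependencies(queries):
    """Map each query name to the other query names its SQL text mentions."""
    deps = {}
    for name, sql in queries.items():
        tokens = set(re.findall(r"[A-Za-z_][A-Za-z0-9_]*", sql))
        deps[name] = {other for other in queries if other != name and other in tokens}
    return deps


def build_dataflow_graph(queries, views):
    """Return the SQL of the non-view nodes with every view inlined as a subquery."""
    deps = find_dependencies(queries)
    expanded = {}
    # static_order() yields dependencies before dependents; it raises
    # graphlib.CycleError if the queries do not form a DAG.
    for name in TopologicalSorter(deps).static_order():
        sql = queries[name]
        view_deps = [d for d in deps[name] if d in views]
        if view_deps:
            pattern = r"\b(" + "|".join(re.escape(d) for d in view_deps) + r")\b"
            # Single pass over the original text, so inlined SQL is not rescanned.
            sql = re.sub(
                pattern,
                lambda m: f"({expanded[m.group(1)]}) AS {m.group(1)}",
                sql,
            )
        expanded[name] = sql
    return {name: sql for name, sql in expanded.items() if name not in views}


if __name__ == "__main__":
    example = {
        "raw": "SELECT * FROM source",
        "cleaned": "SELECT id, amount FROM raw WHERE amount > 0",
        "totals": "SELECT id, SUM(amount) AS total FROM cleaned GROUP BY id",
    }
    for name, sql in build_dataflow_graph(example, views={"cleaned"}).items():
        print(name, "->", sql)
```

Matching names by token is a deliberate simplification; a real system would presumably resolve references from a parsed query plan rather than from raw text.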

92.

Granted invention
Split front end for flexible back end cluster processing (In force)

Publication (announcement) No.: US11113043B2

Publication (announcement) date: 2021-09-07

Application No.: US16864074

Application date: 2020-04-30

Applicant: Databricks Inc.

Inventor: Srinath Shankar, Eric Keng-Hao Liang, Gregory George Owen

IPC: G06F8/41, G06F8/54, G06F8/70, G06F11/36, G06F11/07, G06F21/62, G06F16/23, G06F16/907

Abstract: A system for code development and execution includes a client interface and a client processor. The client interface is configured to receive user code for execution and receive an indication of a server that will perform the execution. The client processor is configured to parse the user code to identify one or more data items referred to during the execution. The client processor is also configured to provide the server with an inquiry for metadata regarding the one or more data items; receive the metadata regarding the one or more data items; determine a logical plan based at least in part on the metadata regarding the one or more data items; and provide the logical plan to the server for execution.

93.

Granted invention
Directory level atomic commit protocol (In force)

Publication (announcement) No.: US11068447B2

Publication (announcement) date: 2021-07-20

Application No.: US15487896

Application date: 2017-04-14

Applicant: Databricks Inc.

Inventor: Eric Keng-hao Liang, Srinath Shankar, Shi Xin

IPC: G06F16/18, G06F16/16, G06F16/182

Abstract: A system for directory level atomic commits includes an interface and a processor. The interface is configured to receive an indication to provide a set of files. The processor is configured to determine whether a file in a directory has been either 1) atomically committed or 2) written by a non-atomic process and not designated as deleted and provide the file as one file of the set of files in the event that the file in the directory has been either 1) atomically committed or 2) written by a non-atomic process and not designated as deleted.
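
Illustrative sketch (not taken from the patent): the visibility rule in the abstract can be written as a small filter. The `DirectoryState` structure and its field names are assumptions made for this example; the abstract does not say how commit or deletion markers are stored.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class DirectoryState:
    # File name -> id of the transaction that wrote it, or None when the
    # file was written by a non-atomic process (hypothetical layout).
    files: dict[str, Optional[str]]
    # Ids of transactions that committed atomically.
    committed: set[str] = field(default_factory=set)
    # Names of files designated as deleted.
    deleted: set[str] = field(default_factory=set)


def visible_files(state: DirectoryState) -> list[str]:
    """Return the files a reader should be given, in sorted order."""
    result = []
    for name, txn in sorted(state.files.items()):
        if txn is not None:
            # Case 1: written inside a transaction; visible only once committed.
            if txn in state.committed:
                result.append(name)
        elif name not in state.deleted:
            # Case 2: written non-atomically and not designated as deleted.
            result.append(name)
    return result


if __name__ == "__main__":
    state = DirectoryState(
        files={"a.parquet": "t1", "b.parquet": "t2", "c.csv": None, "d.csv": None},
        committed={"t1"},
        deleted={"d.csv"},
    )
    print(visible_files(state))
```

Here `b.parquet` is hidden because its transaction never committed, and `d.csv` is hidden because it is designated as deleted.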

94.

Granted invention
Autoscaling using file access or cache usage for cluster machines (In force)

Publication (announcement) No.: US10810051B1

Publication (announcement) date: 2020-10-20

Application No.: US16188989

Application date: 2018-11-13

Applicant: Databricks Inc.

Inventor: Srinath Shankar, Eric Keng-Hao Liang

IPC: G06F9/46, G06F9/50, G06F9/48, G06F9/38, H04L29/08

Abstract: The allocation system comprises an interface and a processor. The interface is configured to receive an indication to deactivate idle cluster machines of a set of cluster machines. The processor is configured to determine a set of tasks executing or pending on the set of cluster machines; determine a set of idle cluster machines of the set of cluster machines that are neither running one or more tasks of the set of tasks nor storing one or more intermediate data files of a set of intermediate data files, where the set of intermediate data files is associated with a set of tasks executing or pending on the cluster machines; and deactivate each cluster machine of the set of idle cluster machines.
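
Illustrative sketch (not taken from the patent): the idle test in the abstract amounts to a set difference. The input shapes below (a task-to-machine map and a file-to-(machine, task) map) and the name `find_idle_machines` are assumptions chosen to keep the example small.

```python
def find_idle_machines(machines, task_assignments, intermediate_files):
    """Return machines that run no task and hold no still-needed intermediate file.

    machines: set of machine ids.
    task_assignments: task id -> machine id running it, or None if the task is pending.
    intermediate_files: file id -> (machine id storing it, task id that needs it).
    """
    live_tasks = set(task_assignments)  # executing or pending
    running = {m for m in task_assignments.values() if m is not None}
    holding = {
        machine
        for machine, task in intermediate_files.values()
        if task in live_tasks  # files of finished tasks do not pin a machine
    }
    return machines - running - holding


if __name__ == "__main__":
    idle = find_idle_machines(
        machines={"m1", "m2", "m3", "m4"},
        task_assignments={"t1": "m1", "t2": None},
        intermediate_files={"f1": ("m2", "t2"), "f2": ("m3", "finished-task")},
    )
    print(sorted(idle))
```

In this example `m2` runs nothing but is kept because pending task `t2` still needs its intermediate file, which is the case the abstract singles out.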

95.

Invention application
CONCURRENT OPTIMISTIC TRANSACTIONS FOR TABLES WITH DELETION VECTORS (In force)

Publication (announcement) No.: US20250103580A1

Publication (announcement) date: 2025-03-27

Application No.: US18928982

Application date: 2024-10-28

Applicant: Databricks, Inc.

Inventor: Bart Samwel, Christos Stavrakakis

IPC: G06F16/23

Abstract: A disclosed configuration receives a first indication that a first transaction is committed to update a first subset of records in a data table at a first version to generate a second version of the data table, and receives a second indication to commit a second transaction to update a second subset of records in a data file of the data table at the first version. The configuration determines a logical prerequisite based on whether the first subset of records changes content of one or more records in the second subset of records, and determines a physical prerequisite based on whether the second subset of records corresponds to respective data records in data files of the second version of the data table. The configuration commits the second transaction to generate a third version of the data table by updating elements of a deletion vector if the prerequisites are satisfied.

96.

Granted invention
Reducing cluster start up time (In force)

Publication (announcement) No.: US12248818B1

Publication (announcement) date: 2025-03-11

Application No.: US17514988

Application date: 2021-10-29

Applicant: Databricks, Inc.

Inventor: Yandong Mao, Aaron Daniel Davidson

IPC: G06F9/50, G06F21/45

Abstract: The present application discloses a method, system, and computer system for starting up and maintaining a cluster in a warmed-up state, and/or allocating clusters from a warmed-up state. The method includes: instantiating a set of virtual machines, wherein instantiating the set of virtual machines includes setting a temporary security credential for each virtual machine of the set of virtual machines; receiving a virtual machine allocation request associated with a workspace, a customer, or a tenant; and, in response to the virtual machine allocation request, allocating a virtual machine, wherein allocating the virtual machine comprises replacing the temporary security credential with a security credential associated with the workspace, the customer, or the tenant.

97.

Granted invention
Dictionary filtering and evaluation in columnar databases (In force)

Publication (announcement) No.: US12242485B2

Publication (announcement) date: 2025-03-04

Application No.: US18162616

Application date: 2023-01-31

Applicant: Databricks, Inc.

Inventor: Utkarsh Agarwal, Shoumik Palkar, Alexander Behm, Sriram Krishnamurthy

IPC: G06F16/24, G06F11/34, G06F16/22, G06F16/2455

Abstract: Disclosed herein is a method, system, or non-transitory computer readable medium for evaluating a query on a columnar dataset comprising one or more dictionaries associated with columns in the dataset. The method includes receiving a request to perform a query comprising at least an operator and a request to return information about a value of interest in a columnar dataset stored on cloud storage. At least one column in the columnar dataset is based on a dictionary. The dictionary maps one or more values for a column to one or more respective identifiers. The method determines whether to perform dictionary filtering for the query by calculating a metric based on one or more factors. Responsive to the metric being below a threshold, which may be predetermined, the method performs the dictionary filtering.
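
Illustrative sketch (not taken from the patent): the abstract does not name the factors behind the metric, so the example below assumes a single one, the ratio of dictionary size to row count, and a hypothetical threshold of 0.1. When the metric falls below the threshold, the predicate is tested against the dictionary first, and a chunk with no matching dictionary entry is skipped without decoding any rows.

```python
def should_use_dictionary_filter(dict_size, row_count, threshold=0.1):
    """Assumed metric: distinct values per row. Lower means the dictionary is cheap to probe."""
    if row_count == 0:
        return False
    return dict_size / row_count < threshold


def scan_chunk(dictionary, codes, predicate, threshold=0.1):
    """Return decoded values of one dictionary-encoded chunk that satisfy the predicate.

    dictionary: list of distinct values; a value's identifier is its list index.
    codes: one identifier per row.
    """
    if should_use_dictionary_filter(len(dictionary), len(codes), threshold):
        matching = {i for i, value in enumerate(dictionary) if predicate(value)}
        if not matching:
            return []  # no dictionary entry matches, so no row can match
        return [dictionary[c] for c in codes if c in matching]
    # Fallback: evaluate the predicate row by row.
    return [dictionary[c] for c in codes if predicate(dictionary[c])]


if __name__ == "__main__":
    colours = ["red", "green"]
    rows = [0, 1] * 20  # 40 rows, 2 distinct values -> metric 0.05
    print(scan_chunk(colours, rows, lambda v: v == "blue"))
    print(len(scan_chunk(colours, rows, lambda v: v == "red")))
```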

98.

Granted invention
Data lineage tracking (In force)

Publication (announcement) No.: US12242441B1

Publication (announcement) date: 2025-03-04

Application No.: US18162562

Application date: 2023-01-31

Applicant: Databricks, Inc.

Inventor: Tao Feng, Menglei Sun, Zhuoying Wang

IPC: G06F16/28, G06F11/07, G06F16/215, G06F16/22, G06F16/23, G06F16/906, G06F17/18

Abstract: The present application discloses a method, system, and computer system for managing lineage data for data entities. The method includes generating lineage data, and storing and indexing, in a data structure, the lineage data in association with a selected data entity. Generating the lineage data includes selecting the selected data entity, obtaining a query tree that was used to generate the selected data entity, and determining lineage data for the selected data entity based at least in part on the query tree.
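
Illustrative sketch (not taken from the patent): one simple reading of "determining lineage data based on the query tree" is to collect the source tables at the leaves of the tree and index them under the entity the query produced. The `PlanNode` shape, the `"scan"` operator name, and the dictionary used as the index are all assumptions for this example.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class PlanNode:
    op: str                       # e.g. "scan", "filter", "join" (hypothetical names)
    table: Optional[str] = None   # set only on scan nodes
    children: list["PlanNode"] = field(default_factory=list)


def upstream_tables(node: PlanNode) -> set[str]:
    """Collect every table read by scan nodes in the tree rooted at `node`."""
    if node.op == "scan" and node.table is not None:
        return {node.table}
    tables: set[str] = set()
    for child in node.children:
        tables |= upstream_tables(child)
    return tables


def record_lineage(index: dict, entity: str, query_tree: PlanNode) -> None:
    """Store the entity's upstream tables in the lineage index, keyed by entity."""
    index[entity] = sorted(upstream_tables(query_tree))


if __name__ == "__main__":
    tree = PlanNode("join", children=[
        PlanNode("scan", table="orders"),
        PlanNode("filter", children=[PlanNode("scan", table="customers")]),
    ])
    lineage_index = {}
    record_lineage(lineage_index, "daily_revenue", tree)
    print(lineage_index)
```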

99.

Granted invention
Evaluating expressions over dictionary data (In force)

Publication (announcement) No.: US12210528B2

Publication (announcement) date: 2025-01-28

Application No.: US18162607

Application date: 2023-01-31

Applicant: Databricks, Inc.

Inventor: Utkarsh Agarwal, Shoumik Palkar, Alexander Behm, Sriram Krishnamurthy

IPC: G06F16/2455, G06F11/34, G06F16/22

Abstract: Disclosed herein is a method, system, or non-transitory computer readable medium for evaluating a query on a columnar dataset comprising one or more dictionaries associated with columns in the dataset. The method includes receiving a request to perform a query comprising at least an operator for a columnar dataset on cloud storage. At least one column in the dataset is based on a dictionary, and the dictionary maps one or more values for a column to one or more respective identifiers. The method evaluates the operator on one or more values of the dictionary to generate an updated dictionary comprising updated values. The method may decode the updated dictionary into an updated column comprising updated data values.
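
Illustrative sketch (not taken from the patent): applying an operator to the dictionary instead of to every row means the operator runs once per distinct value. The example assumes the dictionary is a list whose indices serve as the identifiers; `evaluate_on_dictionary` is a hypothetical name.

```python
def evaluate_on_dictionary(dictionary, codes, operator):
    """Apply `operator` once per dictionary entry, then decode the column.

    dictionary: list of distinct values; a value's identifier is its list index.
    codes: one identifier per row.
    Returns (updated_dictionary, decoded_column).
    """
    updated = [operator(value) for value in dictionary]
    decoded = [updated[code] for code in codes]
    return updated, decoded


if __name__ == "__main__":
    countries = ["us", "de", "fr"]
    column = [0, 0, 2, 1, 0]  # five rows, three distinct values
    updated, decoded = evaluate_on_dictionary(countries, column, str.upper)
    print(updated)
    print(decoded)
```

With five rows and three distinct values the operator is called three times rather than five; the saving grows with the ratio of rows to distinct values.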

100.

Granted invention
Short query prioritization for data processing service (In force)

Publication (announcement) No.: US12210521B2

Publication (announcement) date: 2025-01-28

Application No.: US18140323

Application date: 2023-04-27

Applicant: Databricks, Inc.

Inventor: Venkata Sai Akhil Gudesa, Herman Rudolf Petrus Catharina van Hövell tot Westerflier, Supun Chathuranga Nakandala

IPC: G06F16/24, G06F9/48, G06F11/34, G06F16/2453, G06F16/28

Abstract: A cluster computing system maintains a first set of queues for short queries and a second set for longer queries. The first set is allocated a majority of the cluster's processing resources and processes queries on a first in first out basis. The second set is allocated a minority of the cluster's processing resources which are shared among queries in the second set. Accordingly, the system assigns each query to the first set of queues for a fixed amount of resource time. While a query is processing, the system monitors the query's resource time and reassigns the query to the second set of queues if the query has not completed within the allotted amount of resource time. Thus, short queries receive the necessary resources to complete quickly without getting stuck behind longer queries while ensuring that longer queries continue to make progress.
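
Illustrative sketch (not taken from the patent): a toy, single-threaded simulation of the demotion rule. It assumes each query's total resource time is known up front and runs the two queue sets one after the other, which a real scheduler would not do; it only shows how a fixed budget separates short queries from long ones. The name `schedule` and the one-unit time slice are assumptions.

```python
from collections import deque


def schedule(queries, budget):
    """Return query names in completion order.

    queries: list of (name, resource_time_needed) in arrival order.
    budget: resource time a query may use in the first (FIFO) queue set.
    """
    short_queue = deque(queries)
    long_queue = deque()
    finished = []

    # First set: FIFO, each query gets at most `budget` units.
    while short_queue:
        name, needed = short_queue.popleft()
        if needed <= budget:
            finished.append(name)
        else:
            # Budget exhausted: reassign to the second set with the remainder.
            long_queue.append((name, needed - budget))

    # Second set: resources are shared round-robin, one unit per turn.
    while long_queue:
        name, remaining = long_queue.popleft()
        remaining -= 1
        if remaining <= 0:
            finished.append(name)
        else:
            long_queue.append((name, remaining))
    return finished


if __name__ == "__main__":
    print(schedule([("a", 1), ("b", 10), ("c", 2), ("d", 5)], budget=3))
```

Query `c` arrives after the long query `b` but finishes before it, which is the behaviour the abstract is after.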
