Dictionary Filtering and Evaluation in Columnar Databases

    Publication No.: US20240256550A1

    Publication Date: 2024-08-01

    Application No.: US18162616

    Filing Date: 2023-01-31

    Applicant: Databricks, Inc.

    Abstract: Disclosed herein is a method, system, or non-transitory computer readable medium for evaluating a query on a columnar dataset comprising one or more dictionaries associated with columns in the dataset. The method includes receiving a request to perform a query comprising at least an operator and a request to return information about a value of interest in a columnar dataset stored on cloud storage. At least one column in the columnar dataset is based on a dictionary. The dictionary maps one or more values for a column to one or more respective identifiers. The method determines whether to perform dictionary filtering for the query by calculating a metric based on one or more factors. Responsive to the metric being below a threshold, which may be predetermined, the method performs the dictionary filtering.
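
    The decision described in this abstract can be illustrated with a minimal sketch. The abstract does not say which factors feed the metric, so the ratio of dictionary entries to rows used here is an assumption, as are the names `filter_rows`, `dictionary_values`, `encoded_column`, and the default `threshold`. The dictionary is modeled as a list whose positions serve as identifiers.

```python
from typing import Callable, List


def filter_rows(
    dictionary_values: List[str],      # position in the list = identifier (assumed layout)
    encoded_column: List[int],         # column stored as identifiers
    predicate: Callable[[str], bool],  # the query's filter on a value of interest
    threshold: float = 0.5,            # hypothetical cutoff, not from the source
) -> List[int]:
    """Return the row indices whose decoded value satisfies the predicate."""
    if not encoded_column:
        return []
    # Assumed metric: dictionary entries per row. A small ratio means the
    # predicate runs far fewer times on the dictionary than on the rows.
    metric = len(dictionary_values) / len(encoded_column)
    if metric < threshold:
        # Dictionary filtering: evaluate the predicate once per distinct value,
        # then keep rows whose identifier is in the matching set.
        matching_ids = {
            ident for ident, value in enumerate(dictionary_values) if predicate(value)
        }
        return [row for row, ident in enumerate(encoded_column) if ident in matching_ids]
    # Fallback: decode every row and evaluate the predicate on it.
    return [
        row for row, ident in enumerate(encoded_column)
        if predicate(dictionary_values[ident])
    ]


fruits = ["apple", "banana", "cherry"]
column = [0, 1, 0, 2, 1, 0, 0, 2]
print(filter_rows(fruits, column, lambda v: v == "apple"))  # [0, 2, 5, 6]
```

    Both branches return the same rows; the metric only decides which evaluation strategy is used.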

    Evaluating Expressions Over Dictionary Data
    Type: Published application

    Publication No.: US20240256549A1

    Publication Date: 2024-08-01

    Application No.: US18162607

    Filing Date: 2023-01-31

    Applicant: Databricks, Inc.

    Abstract: Disclosed herein is a method, system, or non-transitory computer readable medium for evaluating a query on a columnar dataset comprising one or more dictionaries associated with columns in the dataset. The method includes receiving a request to perform a query comprising at least an operator for a columnar dataset on cloud storage. At least one column in the dataset is based on a dictionary, and the dictionary maps one or more values for a column to one or more respective identifiers. The method evaluates the operator on one or more values of the dictionary to generate an updated dictionary comprising updated values. The method may decode the updated dictionary into an updated column comprising updated data values.
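
    As a rough illustration of evaluating an operator on the dictionary rather than on every row, the sketch below applies a function to each dictionary value and then decodes the encoded column through the updated dictionary. The function names and the list-based dictionary layout (position = identifier) are assumptions made for this example, not details taken from the application.

```python
from typing import Callable, List


def evaluate_on_dictionary(
    dictionary_values: List[str], operator: Callable[[str], str]
) -> List[str]:
    """Apply the operator once per dictionary value, yielding an updated dictionary."""
    return [operator(value) for value in dictionary_values]


def decode(dictionary_values: List[str], encoded_column: List[int]) -> List[str]:
    """Expand identifiers back into data values using the given dictionary."""
    return [dictionary_values[ident] for ident in encoded_column]


values = ["apple", "banana"]
encoded = [0, 1, 1, 0]

updated = evaluate_on_dictionary(values, str.upper)
print(updated)                   # ['APPLE', 'BANANA']
print(decode(updated, encoded))  # ['APPLE', 'BANANA', 'BANANA', 'APPLE']
```

    Here the operator runs twice (once per distinct value) instead of four times (once per row); the identifiers in the column are left unchanged.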

    Fine-Grained Decision on Propagation of Revalidation

    Publication No.: US20240232165A9

    Publication Date: 2024-07-11

    Application No.: US17973440

    Filing Date: 2022-10-25

    Applicant: SAP SE

    IPC Classification: G06F16/23 G06F11/34 G06F16/21

    Abstract: Various systems and methods for selective revalidation of data objects are provided. In one example, a computer-implemented method includes updating a target data object of a database system according to a definition statement, and determining whether the definition statement changes one or more object properties of the target data object. In response to determining that the definition statement changes the one or more object properties of the target data object, the method includes revalidating data objects depending on the target data object. In response to determining that the definition statement does not change the one or more object properties of the target data object, the method includes not revalidating the data objects depending on the target data object. In this way, database management performance and speed may be improved while maintaining validity of data objects in a database.
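
    The selective revalidation described in this abstract can be sketched as follows. The `DataObject` class, the split between `properties` (what dependents rely on) and `comment` (what they do not), and the revalidation counter are all hypothetical stand-ins chosen for illustration; the abstract does not specify which object properties are compared.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class DataObject:
    name: str
    properties: Dict[str, str]       # assumed: properties that dependents rely on
    comment: str = ""                # assumed: metadata that dependents ignore
    dependents: List["DataObject"] = field(default_factory=list)
    revalidations: int = 0           # stand-in for real revalidation work


def apply_definition(target: DataObject, properties: Dict[str, str], comment: str) -> bool:
    """Update the target; revalidate dependents only if its properties changed."""
    changed = properties != target.properties
    target.properties = dict(properties)
    target.comment = comment
    if changed:
        for dependent in target.dependents:
            dependent.revalidations += 1
    return changed


table = DataObject("orders", {"amount": "INT"})
view = DataObject("orders_view", {})
table.dependents.append(view)

# Only the comment changes, so the dependent view is not revalidated.
print(apply_definition(table, {"amount": "INT"}, "order totals"), view.revalidations)
# A column type changes, so the dependent view is revalidated.
print(apply_definition(table, {"amount": "BIGINT"}, "order totals"), view.revalidations)
```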

    Resolving Capacity Recovery Across Multiple Components of a Storage System

    Publication No.: US20240232042A1

    Publication Date: 2024-07-11

    Application No.: US18152812

    Filing Date: 2023-01-11

    IPC Classification: G06F11/34 G06F11/30

    Abstract: Workload from a host or a set of hosts is directed to a set of storage volumes that are formed from storage resources that are grouped together in a storage group on a storage system. The workload on the storage group impacts many components of the storage system, including front-end ports and directors, shared global memory, back-end ports and directors, and back-end storage resources. The workload may also affect system applications such as remote data forwarding (RDF) applications that also consume storage system resources such as RDF ports and directors and shared global memory. A workload planner characterizes workloads on the storage groups and overall workloads on components of the storage system, and contains control logic configured to resolve capacity recovery across multiple components of a storage system in connection with simulated removal of a storage group from the storage system.
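
    One simple way to picture simulated removal of a storage group is to subtract that group's share of load from each component it touches. The sketch below does only that; the function name, the component labels, and the use of a single utilization percentage per component are assumptions for illustration and are not the planner's actual model.

```python
from typing import Dict, Tuple


def simulate_removal(
    component_load: Dict[str, float],  # overall load per component (assumed: percent)
    group_load: Dict[str, float],      # the storage group's share per component
) -> Dict[str, Tuple[float, float]]:
    """Return (recovered, remaining) load per component if the group were removed."""
    result = {}
    for component, total in component_load.items():
        # A group cannot free more load than the component currently carries.
        recovered = min(group_load.get(component, 0.0), total)
        result[component] = (recovered, total - recovered)
    return result


overall = {"front_end": 70.0, "global_memory": 40.0, "back_end": 55.0}
group = {"front_end": 20.0, "back_end": 15.0}

for component, (recovered, remaining) in simulate_removal(overall, group).items():
    print(component, recovered, remaining)
```

    Components the group does not use, such as `global_memory` in this example, report zero recovered capacity.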

    Application Execution Allocation Using Machine Learning

    Publication No.: US20240232039A1

    Publication Date: 2024-07-11

    Application No.: US18094028

    Filing Date: 2023-01-06

    Inventor: Ronald N. Isaac

    IPC Classification: G06F11/34 G06N20/00

    CPC Classification: G06F11/3409 G06N20/00

    Abstract: Apparatuses, systems, and techniques for assigning execution of applications to various processing units using machine learning are disclosed herein. Usage data for an application to be executed using a computing system including an integrated processing unit and a discrete processing unit is identified. At least a portion of operations of the application is assigned for execution using the integrated processing unit or the discrete processing unit based on the usage data and in view of at least one of one or more system performance metrics or one or more user experience metrics associated with executing the application using the integrated processing unit and the discrete processing unit.