System for exploring data in a database

    Publication No.: US09760602B1

    Publication date: 2017-09-12

    Application No.: US14621950

    Filing date: 2015-02-13

    CPC classification number: G06F17/30424 G06F17/30389

    Abstract: A system for exploring data in a database comprises a query parser, a parameter manager, a query submitter, and a result formatter. The query parser is to receive a base query and determine an input parameter from the base query. The parameter manager is to provide a first request for a value for the input parameter; receive the value for the input parameter; and provide a second request for the value for the input parameter. The query submitter is to determine a first query using the base query and the value for the input parameter; and provide an indication to execute the first query. The result formatter is to receive a result associated with the indication to execute the first query.
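
The flow in this abstract (find the input parameters in a base query, obtain values for them, build and run the concrete query, return formatted results) can be sketched as follows. This is only an illustration and not the patented implementation: the `:name` placeholder syntax, the function names, and the use of Python's built-in `sqlite3` module are all assumptions made for the example.

```python
import re
import sqlite3

# Assumption: input parameters are written as :name in the base query.
PARAM_PATTERN = re.compile(r":([A-Za-z_]\w*)")


def find_input_parameters(base_query):
    """Query-parser step: list parameter names in order of first appearance."""
    seen = []
    for name in PARAM_PATTERN.findall(base_query):
        if name not in seen:
            seen.append(name)
    return seen


def run_with_values(conn, base_query, values):
    """Query-submitter and result-formatter steps combined."""
    missing = [p for p in find_input_parameters(base_query) if p not in values]
    if missing:
        raise ValueError(f"missing values for parameters: {missing}")
    # sqlite3 binds :name placeholders from a dict, so no string splicing is needed.
    cursor = conn.execute(base_query, values)
    columns = [d[0] for d in cursor.description]
    return [dict(zip(columns, row)) for row in cursor.fetchall()]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, region TEXT, total REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "EU", 10.0), (2, "US", 25.0), (3, "EU", 40.0)],
)

base_query = (
    "SELECT id, total FROM orders "
    "WHERE region = :region AND total >= :min_total ORDER BY id"
)
params = find_input_parameters(base_query)

# First set of values, then a second set for the same base query.
rows_eu = run_with_values(conn, base_query, {"region": "EU", "min_total": 20})
rows_us = run_with_values(conn, base_query, {"region": "US", "min_total": 0})
```

Binding values through the driver rather than formatting them into the SQL text keeps the base query reusable across repeated requests for new values.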

    Independent data processing environments within a big data cluster system

    Publication No.: US09659081B1

    Publication date: 2017-05-23

    Application No.: US14824989

    Filing date: 2015-08-12

    CPC classification number: G06F17/30598 G06F9/5033 G06F9/5072 G06F2209/505

    Abstract: A cluster system includes an interface and a processor. The interface is to receive a request from a user associated with one of a plurality of shells. The processor is to determine a plurality of tasks to respond to the request; determine a local set of data and a shared set of data for a task of the plurality of tasks, wherein the local set of data is associated with the one of the plurality of shells; and provide the task, a local set indication, and a shared set indication to a worker associated with the task, wherein the local set indication refers to the local set of data and the shared set indication refers to the shared set of data.

    DATA SHARING FOR NETWORK CONNECTED SYSTEMS

    Publication No.: US20250131118A1

    Publication date: 2025-04-24

    Application No.: US18958728

    Filing date: 2024-11-25

    Abstract: The present application discloses a method, system, and computer system for providing access to data. The method includes receiving, by a data manager service from a data requesting service, a request using an identifier for a high-level data object to access a set of data associated with the high-level data object, determining, by the data manager service, low-level data object(s) corresponding to the set of data based on the identifier for the high-level data object, determining whether a user associated with the request has permission to access at least a subset of the low-level data object(s), and in response to determining that the user associated with the request has permission to access the at least the subset of the low-level data object(s), generating, by the data manager service, a uniform resource locator (URL) via which the at least the subset of the one or more low-level data objects is accessible by the user.
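
A minimal sketch of the resolve, check, and URL-generation steps described above. Everything concrete here is hypothetical and chosen only for illustration: the in-memory `CATALOG` and `GRANTS` tables, the `storage.example.com` host, the HMAC-signed query string, and the choice to return one URL per permitted low-level object.

```python
import hashlib
import hmac
from urllib.parse import urlencode

SECRET = b"demo-secret"  # hypothetical signing key

# Hypothetical mapping from a high-level object (a table) to low-level objects (files).
CATALOG = {"sales_table": ["part-000.parquet", "part-001.parquet"]}

# Hypothetical per-user grants on low-level objects.
GRANTS = {"alice": {"part-000.parquet"}}


def request_access(user, high_level_id):
    """Return signed URLs for the low-level objects the user may read."""
    low_level = CATALOG.get(high_level_id, [])
    allowed = [obj for obj in low_level if obj in GRANTS.get(user, set())]
    urls = []
    for obj in allowed:
        # Sign the (user, object) pair so the storage side could verify the link.
        sig = hmac.new(SECRET, f"{user}:{obj}".encode(), hashlib.sha256).hexdigest()
        query = urlencode({"user": user, "sig": sig})
        urls.append(f"https://storage.example.com/{obj}?{query}")
    return urls


alice_urls = request_access("alice", "sales_table")
bob_urls = request_access("bob", "sales_table")
```

The requesting service only ever handles the high-level identifier; the mapping to files and the permission check stay inside the data manager.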

    USING LLM FUNCTIONS TO EVALUATE AND COMPARE LARGE TEXT OUTPUTS OF LLMS

    Publication No.: US20250124236A1

    Publication date: 2025-04-17

    Application No.: US18518155

    Filing date: 2023-11-22

    Abstract: A method for evaluating textual output of one or more machine-learned language models is presented. The method includes receiving, from a user of a client device, a first prompt for input to one or more machine-learned language models, providing the first prompt to the one or more models for execution, and receiving a set of generated responses to the first prompt from the one or more models. The method further includes generating a user interface (UI) on the client device displaying the first prompt and generated responses as a table user interface element. The method applies a selected evaluation function to the generated response to evaluate the response with respect to an evaluation objective and identifies words that influence the evaluation. The method generates one or more UI elements on the UI to display the results of the evaluation for the generated responses.

    K-D Tree Balanced Splitting
    Type: Invention application

    Publication No.: US20250086155A1

    Publication date: 2025-03-13

    Application No.: US18772758

    Filing date: 2024-07-15

    Abstract: A system for clustering data into corresponding files comprises one or more processors and a memory. The one or more processors is/are configured to: 1) determine to cluster a set of data into a set of files; 2) determine a set of split points in a corresponding set of dimensions of the set of data to determine the set of files, wherein each file of the set of files has an approximate target size; and 3) store one or more items of the set of data into a corresponding file of the set of files based at least in part on the set of split points. The memory is coupled to the one or more processors and configured to provide the processor with instructions.
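
One way to picture balanced splitting is a k-d-tree-style recursion that cuts the data at the median of one dimension at a time until every partition fits the target size. The sketch below is an assumption-laden illustration, not the claimed method: it measures file size in rows, cycles through dimensions in order, and uses the hypothetical function name `split_into_files`.

```python
def split_into_files(rows, target_rows, depth=0):
    """Recursively split rows at the median of alternating dimensions.

    Each returned list stands in for one output file holding at most
    target_rows rows.
    """
    if target_rows < 1:
        raise ValueError("target_rows must be at least 1")
    if len(rows) <= target_rows:
        return [rows]
    dim = depth % len(rows[0])  # cycle through the dimensions
    ordered = sorted(rows, key=lambda r: r[dim])
    mid = len(ordered) // 2  # the median index is the split point
    return (
        split_into_files(ordered[:mid], target_rows, depth + 1)
        + split_into_files(ordered[mid:], target_rows, depth + 1)
    )


points = [(1, 5), (2, 1), (3, 7), (4, 3), (5, 8), (6, 2), (7, 6), (8, 4)]
files = split_into_files(points, target_rows=2)
```

Because each cut is made at the median, sibling partitions differ by at most one row, which keeps the resulting files close to the same size.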

    Clustering key selection based on machine-learned key selection models for data processing service

    Publication No.: US12229169B1

    Publication date: 2025-02-18

    Application No.: US18501830

    Filing date: 2023-11-03

    Abstract: The disclosed configurations provide a method (and/or a computer-readable medium or system) for determining, from a table schema describing keys of a data table, one or more clustering keys that can be used to cluster data files of a data table. The method includes generating features for the data table, generating tokens from the features, generating a prediction for each token by applying to the token a machine-learned transformer model trained to predict a likelihood that the key associated with the token is a clustering key for the data table, determining clustering keys based on the predictions, and clustering data records of the data table into data files based on key-values for the clustering keys.

    Checkpoint and restore based startup of executor nodes of a distributed computing engine for processing queries

    Publication No.: US12229137B1

    Publication date: 2025-02-18

    Application No.: US18412438

    Filing date: 2024-01-12

    Abstract: A system performs efficient startup of executors of a distributed computing engine used for processing queries, for example, database queries. The system starts an executor node and processes a set of queries using the executor node to warm up the executor node. The system performs a checkpoint of the warmed-up executor node to create an image. The image is restored in the target executor nodes. The system may store a checkpoint image for each configuration of an executor node. The configuration is determined based on various factors including the hardware of the executor node, memory allocation of the processes, and so on. The use of restore based on checkpoint images improves the efficiency of executor node startup.

    Multi-cluster query result caching
    Type: Invention grant

    Publication No.: US12189625B2

    Publication date: 2025-01-07

    Application No.: US18222343

    Filing date: 2023-07-14

    Abstract: A multi-cluster computing system which includes a query result caching system is presented. The multi-cluster computing system may include a data processing service and client devices communicatively coupled over a network. The data processing service may include a control layer and a data layer. The control layer may be configured to receive and process requests from the client devices and manage resources in the data layer. The data layer may be configured to include instances of clusters of computing resources for executing jobs. The data layer may include a data storage system, which further includes a remote query result cache store. The query result cache store may include a cloud storage query result cache which stores data associated with results of previously executed requests. As such, when a cluster encounters a previously executed request, the cluster may efficiently retrieve the cached result of the request from the in-memory query result cache or the cloud storage query result cache.
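
The lookup order described above (a cluster's in-memory cache first, then a cache shared across clusters, then actual execution) can be sketched as below. This is a simplified illustration under stated assumptions: a plain dict stands in for the cloud storage cache, the class and method names are hypothetical, the cache key is a hash of the whitespace-normalized query text, and invalidation when the underlying data changes is not modeled.

```python
import hashlib


class QueryResultCache:
    """Per-cluster cache backed by a store shared across clusters."""

    def __init__(self, remote_store):
        self.local = {}  # in-memory cache private to this cluster
        self.remote = remote_store  # stand-in for the cloud storage cache

    @staticmethod
    def key(query):
        # Collapse whitespace so trivially reformatted queries share a key.
        normalized = " ".join(query.split())
        return hashlib.sha256(normalized.encode()).hexdigest()

    def get_or_execute(self, query, execute):
        """Return (result, source), where source says which tier answered."""
        k = self.key(query)
        if k in self.local:
            return self.local[k], "memory"
        if k in self.remote:
            self.local[k] = self.remote[k]  # promote to the local tier
            return self.local[k], "remote"
        result = execute(query)
        self.local[k] = result
        self.remote[k] = result
        return result, "executed"


shared_store = {}
cluster_a = QueryResultCache(shared_store)
cluster_b = QueryResultCache(shared_store)

executions = []


def fake_execute(query):
    # Hypothetical executor that records each real execution.
    executions.append(query)
    return [("row", 1)]


first = cluster_a.get_or_execute("SELECT 1", fake_execute)
second = cluster_a.get_or_execute("SELECT   1", fake_execute)
third = cluster_b.get_or_execute("SELECT 1", fake_execute)
```

In this run the query executes once on the first cluster; the repeat on the same cluster is served from memory, and the other cluster is served from the shared store.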
