Systems and methods for spilling data for hash joins

    Publication No.: US11550793B1

    Publication date: 2023-01-10

    Application No.: US17721599

    Filing date: 2022-04-15

    Applicant: Snowflake Inc.

    Abstract: Systems and methods for spilling data for hash joins are described. An example method includes determining an amount of available space in a first memory used by a set of relational queries is insufficient for a first relational join query. The first relational join query comprises a join operation. The method also includes determining a set of build memory sizes and a set of probe memory sizes for a set of partitions for the set of relational queries. The method further includes identifying a first partition of the set of partitions based on the set of probe memory sizes and the set of build memory sizes. The method further includes copying the first partition from the first memory to a second memory, wherein the first partition comprises a first build portion and a first probe portion.
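
The abstract does not say how the partition to spill is chosen from the build and probe memory sizes. A minimal sketch, assuming a policy of spilling the partition with the largest combined footprint first; the `Partition` class, the `pick_partitions_to_spill` function, and the policy itself are illustrative assumptions, not the patented method:

```python
from dataclasses import dataclass


@dataclass
class Partition:
    pid: int
    build_bytes: int  # memory held by the build-side rows of this partition
    probe_bytes: int  # memory held by the buffered probe-side rows


def pick_partitions_to_spill(partitions, available_bytes, needed_bytes):
    """Return ids of partitions to copy from the first memory to the second.

    Assumed policy (not stated in the abstract): spill the partition with
    the largest combined build + probe size until enough memory is free.
    """
    spilled = []
    free = available_bytes
    by_size = sorted(
        partitions,
        key=lambda p: p.build_bytes + p.probe_bytes,
        reverse=True,
    )
    for p in by_size:
        if free >= needed_bytes:
            break  # enough space for the join query, stop spilling
        spilled.append(p.pid)
        # Build and probe portions of the partition are spilled together.
        free += p.build_bytes + p.probe_bytes
    return spilled
```

Spilling the build and probe portions of one partition together means that partition can later be joined entirely from the second memory.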

CONFIGURING PARALLELISM PARAMETERS FOR INVOCATION OF EXTERNAL TABLE FUNCTIONS

    Publication No.: US20220414094A1

    Publication date: 2022-12-29

    Application No.: US17823132

    Filing date: 2022-08-30

    Applicant: Snowflake Inc.

    Abstract: A query referencing an external table function provided by a remote software component is received. Requests to execute the external table function on input data are sent to a proxy service. A first request includes a batch of input rows from the input data. A first response to the first request received from the proxy service includes a first portion of result data and a pagination token. The pagination token indicates that at least a second portion of the result data corresponding to the first batch of input rows is to be obtained from the remote software component. Based on the pagination token, a second request is sent to obtain the second portion of the result data. One or more responses are received from the proxy service that comprise at least the second portion of the result data. The result data is processed according to the query.
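
The request and response exchange in this abstract follows a token-driven pagination pattern, which might be sketched as below. `send_request` is a hypothetical callable standing in for the proxy service, and `fake_proxy` is a toy stand-in used only to exercise the loop. Whether follow-up requests resend the input batch is not stated in the abstract; the sketch passes it along as an assumption.

```python
def fetch_all_results(send_request, batch):
    """Collect every portion of the result data for one batch of input rows.

    `send_request(batch, token)` returns (rows, next_token); next_token is
    None when no further portion of the result data remains.
    """
    results = []
    rows, token = send_request(batch, None)  # first request carries the input rows
    results.extend(rows)
    while token is not None:  # a token signals that more result data is pending
        rows, token = send_request(batch, token)
        results.extend(rows)
    return results


def fake_proxy(batch, token):
    """Toy proxy: returns results two at a time, using an offset as the token."""
    start = 0 if token is None else token
    out = [x * 10 for x in batch]
    page = out[start:start + 2]
    next_token = start + 2 if start + 2 < len(out) else None
    return page, next_token
```

Calling `fetch_all_results(fake_proxy, [1, 2, 3, 4, 5])` issues three requests and returns the five results in order.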

CUTOFFS FOR PRUNING OF DATABASE QUERIES

    Publication No.: US20220405285A1

    Publication date: 2022-12-22

    Application No.: US17822264

    Filing date: 2022-08-25

    Applicant: Snowflake Inc.

    Abstract: The subject technology receives, during a query compilation process, a query directed to a set of source tables. The subject technology performs, during the query compilation process, a modification of the query for adjusting at least one pruning operation. The subject technology determines, during a pruning process of a second query, the second query directed to a set of files in a database system and including a set of pruning operations on the set of files, whether to perform a pruning cutoff on the set of pruning operations, the pruning process performing a depth first search of a pruner tree structure, the set of files comprising a set of micro-partitions. The subject technology performs the pruning cutoff based on the determining, the pruning cutoff ceasing at least one pruning operation from the set of pruning operations.

CONCURRENCY CONTROL FOR TRANSACTIONS IN DATABASE SYSTEMS

    Publication No.: US20220405266A1

    Publication date: 2022-12-22

    Application No.: US17821581

    Filing date: 2022-08-23

    Applicant: Snowflake Inc.

    Abstract: The subject technology inserts, by a first transaction, a new version of an object, the first transaction including a first statement to perform an update operation to a row in a first table, the object corresponding to data in the row to be updated, the first statement including information comprising an object key associated with the object. The subject technology performs, by a second transaction, a range read, the range read including information indicating the object key. The subject technology receives a set of conflicting transactions from the range read. The subject technology determines that a conflict occurred between the first transaction and a third transaction from the set of conflicting transactions. The subject technology performs a restart of the first transaction in response to determining that the conflict occurred.

    Accessing listings in a data exchange

    Publication No.: US11531681B2

    Publication date: 2022-12-20

    Application No.: US17704783

    Filing date: 2022-03-25

    Applicant: Snowflake Inc.

    Abstract: A method for accessing listings in a data exchange includes creating a first listing in a data exchange, the first listing referencing a first database of a plurality of databases and specifying identity-based sharing of the first database, creating a second listing in the data exchange, the second listing referencing a second database of the plurality of databases and data of the first database shared according to the identity-based sharing of the first database, and receiving an instruction from a user of the data exchange, the instruction referencing the second listing and instructing the addition of the second listing to a set of consumed data shares accessible by the user.

STAGE REPLICATION IN A CLOUD DATA LAKE

    Publication No.: US20220391408A1

    Publication date: 2022-12-08

    Application No.: US17396576

    Filing date: 2021-08-06

    Applicant: Snowflake Inc.

    Abstract: The embodiments described herein provide means for replicating external stages between deployments of e.g., a cloud data lake using a modified storage integration. The modified storage integration may be defined with multiple storage locations that it can point to, as well as a designation of an active storage location. The storage integration may also be defined with base file paths for each storage location as well as a relative file path which together may serve to synchronize data loading operations between deployments when e.g., a fail-over occurs from one deployment to another. The storage integration may be replicated from a first deployment to a second deployment, and when database replication occurs, an external stage may be replicated to the second deployment and bound to the replicated storage integration. Thus, a fail-over to the second deployment may result in a seamless transition of data loading processes to the second deployment.
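
The idea of a storage integration with several storage locations, one active location, and a shared relative file path can be illustrated with a small sketch. The `StorageIntegration` class, its field names, and the bucket paths in the usage note are hypothetical and are not Snowflake's actual object model or syntax:

```python
from dataclasses import dataclass


@dataclass
class StorageIntegration:
    base_paths: dict  # storage location name -> base file path
    active: str       # name of the currently active storage location

    def resolve(self, relative_path):
        """Join the active location's base path with the shared relative path."""
        base = self.base_paths[self.active].rstrip("/")
        return base + "/" + relative_path.lstrip("/")

    def fail_over(self, location):
        """Designate another configured storage location as active."""
        if location not in self.base_paths:
            raise KeyError(location)
        self.active = location
```

For example, with base paths `s3://bucket-a/data/` and `s3://bucket-b/data` (made-up names), the relative path `loads/file1.csv` resolves under `bucket-a` before `fail_over("secondary")` and under `bucket-b` afterwards, so a data loading process can keep using the same relative path across deployments.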

    Share object discovery techniques

    Publication No.: US11520920B1

    Publication date: 2022-12-06

    Application No.: US17580341

    Filing date: 2022-01-20

    Applicant: Snowflake Inc.

    Abstract: Embodiments of the present disclosure provide an enhanced method of discovering shared objects that utilizes share authorization in addition to role authorization when a role is attempting to discover shared objects. A consumer account may invoke an operation referencing shared objects within a provider account using an imported database as a current session database. In response, a call context of the operation may be updated to save the imported database as a current session database and the imported database may be mapped to a first share and to a shared database. A first authorization based on whether the role has access privileges to the shared objects may be performed. The shared database may be used to identify schemas and the schemas may be used to identify shares associated with the imported database. A secondary authorization may be performed based on permissions that the shares associated with the imported database have on the shared objects.
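
The two authorization stages described in this abstract can be sketched as a single check. All names here (`can_discover`, `role_privileges`, `share_grants`, `shares_for_db`) are hypothetical, and the data structures are simplified to sets and dictionaries for illustration:

```python
def can_discover(obj, role_privileges, share_grants, shares_for_db):
    """Return True only if both authorization stages pass for `obj`.

    role_privileges: set of objects the role has access privileges on
    share_grants:    dict mapping share name -> set of objects it may expose
    shares_for_db:   shares associated with the imported database
    """
    # First authorization: the role must have access privileges on the object.
    if obj not in role_privileges:
        return False
    # Secondary authorization: at least one share associated with the
    # imported database must have permission on the object.
    return any(obj in share_grants.get(share, set()) for share in shares_for_db)
```

Under this sketch an object is discoverable only when the role check and the share check both succeed, which is narrower than role authorization alone.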
