Abstract:
Techniques are provided for managing cached data objects in a mixed workload environment. In an embodiment, a database system receives a request to access a target data object. The database system determines whether the request to access the target data object is associated with a first type of workload or a second type of workload. In response to determining that the request is associated with the first type of workload, the target data object replaces a least recently used data object in a cache. In response to determining that the request is associated with the second type of workload, the target data object is cached based on an associated access-level value.
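The two caching paths described above can be illustrated with a minimal Python sketch. It is not the claimed implementation: the class name `MixedWorkloadCache`, the workload labels `"first"` and `"second"`, and in particular the admission rule for the second workload type (admit only when the new object's access-level value exceeds the lowest access-level value already cached) are assumptions made for illustration, since the abstract does not say how the access-level value is used.

```python
from collections import OrderedDict


class MixedWorkloadCache:
    """Hypothetical cache with a different admission path per workload type."""

    def __init__(self, capacity):
        self.capacity = capacity
        # key -> (value, access_level); insertion order tracks recency,
        # with the least recently used entry first.
        self.entries = OrderedDict()

    def get(self, key):
        """Return the cached value, or None on a miss; a hit refreshes recency."""
        if key not in self.entries:
            return None
        self.entries.move_to_end(key)
        return self.entries[key][0]

    def put(self, key, value, workload, access_level=0):
        """Try to cache an object; return True if it ends up in the cache."""
        if key in self.entries or len(self.entries) < self.capacity:
            self.entries[key] = (value, access_level)
            self.entries.move_to_end(key)
            return True
        if workload == "first":
            # First workload type: replace the least recently used object.
            self.entries.popitem(last=False)
            self.entries[key] = (value, access_level)
            return True
        # Second workload type (assumed policy): admit the object only if its
        # access level is higher than the lowest access level in the cache.
        victim = min(self.entries, key=lambda k: self.entries[k][1])
        if access_level > self.entries[victim][1]:
            del self.entries[victim]
            self.entries[key] = (value, access_level)
            return True
        return False
```

With this assumed rule, requests from the second workload type cannot displace cached objects unless their access-level value is higher than that of some cached object, while requests from the first workload type always evict the least recently used entry.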
Abstract:
A storage system communicatively coupled to a database management system (DBMS) performs storage-side scanning of data sources that are not stored in the native database storage format of the DBMS. Data sources for external tables are accessible in a storage system referred to as a distributed data access system (DDAS), e.g., a Hadoop Distributed File System. To execute a query that references an external table, a DBMS first generates an execution plan. The DDAS supplies the DBMS with information that specifies each portion of the data source, and specifies which data node to use to access the portion. The DBMS sends a request for each portion to the respective data node, requesting that the data node generate rows from data in the portion. The request may specify scanning criteria, specifying one or more columns to project and/or filter on, and code modules for the data node to execute to generate records.
Abstract:
A storage system communicatively coupled to a DBMS performs storage-side scanning of data sources that are not stored in the native database storage format of the DBMS. Data sources for external tables are accessible in a storage system referred to herein as a distributed data access system, e.g. a Hadoop Distributed File System. To execute a query that references an external table, a DBMS first generates an execution plan. The distributed data access system supplies the DBMS with information that specifies each portion of the data source, and specifies which data node to use to access the portion. The DBMS sends a request for each portion to the respective data node, the request requesting that the data node generate rows from data in the portion. The request may specify scanning criteria, specifying one or more columns to project and/or filter on. The request may also specify code modules for the data node to execute to generate rows or records and columns.
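The request flow described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the claimed system: `ScanRequest`, `data_node_scan`, and `run_external_scan` are hypothetical names, the in-memory dictionaries stand in for data nodes and their portions, and the `parse` callable stands in for the code module that the data node executes to generate rows.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterator, List, Tuple

Row = Dict[str, str]


@dataclass
class ScanRequest:
    """Hypothetical per-portion request sent from the DBMS to a data node."""
    portion_id: str
    project: List[str]                 # columns to project
    parse: Callable[[str], Row]        # stand-in for the supplied code module
    filters: Dict[str, Callable[[str], bool]] = field(default_factory=dict)


def data_node_scan(portions: Dict[str, List[str]],
                   request: ScanRequest) -> Iterator[Tuple[str, ...]]:
    """Runs on the data node: turn raw records into filtered, projected rows."""
    for raw in portions[request.portion_id]:
        row = request.parse(raw)
        if all(pred(row[col]) for col, pred in request.filters.items()):
            yield tuple(row[col] for col in request.project)


def run_external_scan(portion_map, nodes, project, filters, parse):
    """Runs on the DBMS side: one request per portion, sent to its data node.

    portion_map is the (portion_id, node_name) list that the distributed
    data access system is assumed to supply.
    """
    rows = []
    for portion_id, node_name in portion_map:
        request = ScanRequest(portion_id, project, parse, filters)
        rows.extend(data_node_scan(nodes[node_name], request))
    return rows
```

Because filtering and projection happen inside `data_node_scan`, only the rows and columns the query needs are returned to the DBMS side, which is the point of scanning on the storage side.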
Abstract:
Consistent External Table Access maintains transactional consistency for queries that access external tables stored in a DBFS. This ability is achieved by bypassing the OS. One or more database processes executing a query that accesses an external table stored in a DBFS access the database-file table in the same way as other database tables in the DBMS that can be accessed to execute a query. Based on metadata stored in the DBMS regarding how an external table is stored in a DBFS, a DBMS is able to marshal database processes that access database-file tables directly to execute a query.