Abstract:
In accordance with an embodiment, described herein are systems and methods for providing direct access to a sharded database. A shard director provides access by software client applications to database shards. A connection pool (e.g., a Universal Connection Pool, UCP) and database driver (e.g., a Java Database Connectivity, JDBC, component) can be configured to allow a client application to provide a shard key, either during connection checkout or at a later time; recognize shard keys specified by the client application; and enable connection by the client application to a particular shard or chunk. The approach enables efficient re-use of connection resources, and faster access to appropriate shards.
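The re-use idea can be illustrated with a minimal sketch. It is not the UCP or JDBC API: the class `ShardAwarePool`, its `Conn` stand-in, and the hash-modulo routing rule are all hypothetical, chosen only to show how a shard key supplied at checkout can select an idle connection to the right shard.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: a pool that groups idle connections by shard, so a
// checkout that supplies a shard key can re-use a connection to that shard.
class ShardAwarePool {
    // Stand-in for a real database connection; it only records its shard.
    static final class Conn {
        final String shard;
        Conn(String shard) { this.shard = shard; }
    }

    private final Map<String, Deque<Conn>> idleByShard = new HashMap<>();
    private final int shardCount;
    int created = 0; // number of stand-in connections opened so far

    ShardAwarePool(int shardCount) { this.shardCount = shardCount; }

    // Assumed routing rule (hash modulo shard count). In the described
    // system this decision belongs to the shard director, not the client.
    String shardFor(String shardKey) {
        return "shard-" + Math.floorMod(shardKey.hashCode(), shardCount);
    }

    // Check out a connection to the shard that owns the given key.
    Conn checkout(String shardKey) {
        String shard = shardFor(shardKey);
        Deque<Conn> idle = idleByShard.get(shard);
        if (idle != null && !idle.isEmpty()) {
            return idle.pop(); // re-use an idle connection to this shard
        }
        created++;
        return new Conn(shard); // otherwise open a new one
    }

    // Return a connection to the pool for later re-use.
    void release(Conn c) {
        idleByShard.computeIfAbsent(c.shard, s -> new ArrayDeque<>()).push(c);
    }
}
```

In this sketch, checking out, releasing, and checking out again with the same key returns the same `Conn` object without opening a second connection.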
Abstract:
In accordance with an embodiment, the system enables access to a sharded database using a cache and a shard topology. A shard-aware client application connecting to a sharded database can use a connection pool (e.g., a Universal Connection Pool, UCP) to store or access connections to different shards or chunks of the sharded database within a shared pool. As new connections are created, a shard topology layer can be built at the database driver layer, which learns and caches the mapping of shard key ranges to shard locations. The shard topology layer enables subsequent connection requests from a client application to use fast key-path access to the appropriate shard or chunk.
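A cache of key ranges to shard locations might be sketched as follows. The class `ShardTopologyCache`, the use of numeric keys, and the half-open range convention are assumptions made for illustration; the abstract does not specify the data structure.

```java
import java.util.Map;
import java.util.Optional;
import java.util.TreeMap;

// Hypothetical sketch of a shard topology cache: it records which shard
// serves each learned key range and answers later lookups locally.
class ShardTopologyCache {
    private static final class Range {
        final long highExclusive;
        final String shard;
        Range(long highExclusive, String shard) {
            this.highExclusive = highExclusive;
            this.shard = shard;
        }
    }

    // Ranges indexed by their inclusive lower bound.
    private final TreeMap<Long, Range> byLow = new TreeMap<>();

    // Record a range [lowInclusive, highExclusive) learned from a new connection.
    void learn(long lowInclusive, long highExclusive, String shard) {
        byLow.put(lowInclusive, new Range(highExclusive, shard));
    }

    // Fast path: find the shard for a key, or empty if the range is unknown.
    Optional<String> lookup(long key) {
        Map.Entry<Long, Range> e = byLow.floorEntry(key);
        if (e == null || key >= e.getValue().highExclusive) {
            return Optional.empty(); // cache miss: the caller must ask elsewhere
        }
        return Optional.of(e.getValue().shard);
    }
}
```

A sorted map keeps each lookup logarithmic in the number of cached ranges. A miss simply means the range has not been learned yet.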
Abstract:
A system and method are described for database split generation in a massively parallel or distributed database environment including a plurality of databases and a data warehouse layer providing data summarization and querying functionality. A database table accessor of the system obtains, from an associated client application, a query for data in a table of the data warehouse layer, wherein the query includes a user preference. The system obtains table data representative of properties of the table, and determines a splits generator in accordance with one or more of the user preference or the properties of the table. The system generates, by the selected splits generator, table splits dividing the user query into a plurality of query splits, and outputs the query splits to an associated plurality of mappers, each of which executes its query split against the table.
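The selection step can be sketched roughly as below. The class `SplitPlanner`, the two strategies (`byCount`, `bySize`), and the default of 1000 rows per split are hypothetical; the abstract says only that a generator is chosen from the user preference or the table properties.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: choose a split strategy from a user preference or,
// failing that, from a table property (here, just the row count).
class SplitPlanner {
    static final long DEFAULT_ROWS_PER_SPLIT = 1000; // assumed default

    // Strategy 1: divide the rows into a fixed number of [start, end) splits.
    static List<long[]> byCount(long rowCount, int splits) {
        List<long[]> out = new ArrayList<>();
        for (int i = 0; i < splits; i++) {
            long start = rowCount * i / splits;
            long end = rowCount * (i + 1) / splits;
            if (end > start) out.add(new long[] {start, end}); // skip empty splits
        }
        return out;
    }

    // Strategy 2: divide the rows into splits of at most rowsPerSplit rows.
    static List<long[]> bySize(long rowCount, long rowsPerSplit) {
        List<long[]> out = new ArrayList<>();
        for (long s = 0; s < rowCount; s += rowsPerSplit) {
            out.add(new long[] {s, Math.min(s + rowsPerSplit, rowCount)});
        }
        return out;
    }

    // The user preference wins when present; otherwise fall back to table size.
    static List<long[]> plan(Integer preferredSplits, long rowCount) {
        if (preferredSplits != null && preferredSplits > 0) {
            return byCount(rowCount, preferredSplits);
        }
        return bySize(rowCount, DEFAULT_ROWS_PER_SPLIT);
    }
}
```

Each returned range would then be handed to one mapper, which runs the query restricted to that range.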
Abstract:
A system and method for transparent multi key-value weighted attributed connection using uni-tag connection pools. In accordance with an embodiment, a connection pool enables labeling of connections that software applications can use to access a database. A connection pool associated with a database enables tagging of connection pools at the database and allows applications to selectively obtain connections based on tags. A request is received from an application to query data from the database using a labeled connection or low-cost alternative. If a low-cost connection is found, but requires configuration, the system returns unmatched labels for use by the application in configuring its environment to use the connection. The system can also generate a tag for the connection. Upon subsequent release of the database session, the tag can be made available for subsequent use of the tag, or a tagged connection, by the same or by other applications.
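One way to picture the "low-cost alternative" is a weighted comparison of requested labels against each pooled connection's labels. The sketch below is an assumption-laden illustration, not the patented method: `LabelMatcher`, the weight table, and the default weight of 1 are hypothetical.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: pick the pooled connection that is cheapest to
// reconfigure, and report which requested labels it does not yet satisfy.
class LabelMatcher {
    // Cost is the summed weight of requested labels the connection lacks
    // or carries with a different value. Unweighted labels cost 1 (assumed).
    static int cost(Map<String, String> requested, Map<String, String> current,
                    Map<String, Integer> weights) {
        int total = 0;
        for (Map.Entry<String, String> e : requested.entrySet()) {
            if (!e.getValue().equals(current.get(e.getKey()))) {
                total += weights.getOrDefault(e.getKey(), 1);
            }
        }
        return total;
    }

    // Labels the application must still apply after taking the connection.
    static Map<String, String> unmatched(Map<String, String> requested,
                                         Map<String, String> current) {
        Map<String, String> out = new HashMap<>();
        for (Map.Entry<String, String> e : requested.entrySet()) {
            if (!e.getValue().equals(current.get(e.getKey()))) {
                out.put(e.getKey(), e.getValue());
            }
        }
        return out;
    }

    // Index of the lowest-cost connection in the pool, or -1 if it is empty.
    static int cheapest(Map<String, String> requested,
                        List<Map<String, String>> pool,
                        Map<String, Integer> weights) {
        int best = -1;
        int bestCost = Integer.MAX_VALUE;
        for (int i = 0; i < pool.size(); i++) {
            int c = cost(requested, pool.get(i), weights);
            if (c < bestCost) {
                bestCost = c;
                best = i;
            }
        }
        return best;
    }
}
```

A cost of zero means an exact match. Any other result comes with a non-empty `unmatched` map that the application can use to configure its session.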
Abstract:
A system and method is described for database split generation in a massively parallel or other distributed database environment including a plurality of databases and a data warehouse layer providing data summarization and querying functionality. A database table accessor of the system obtains, from an associated client application, a query for data in a table of the data warehouse layer, wherein the query includes a user preference. The system obtains table data representative of properties of the table, and determines a splits generator in accordance with one or more of the user preference or the properties of the table. The system generates, by the selected splits generator, table splits dividing the user query into a plurality of query splits, and outputs the plurality of query splits to an associated plurality of mappers for execution by the associated plurality of mappers of each of the plurality of query splits against the table.
Abstract:
A system and method for marshaling database data from a native interface layer to a Java layer, using a linear array. In accordance with an embodiment, a request is received from a software application to query or access data stored at the database. At a database driver native interface layer, the system obtains cell data from the database, determines cell coordinates and cell metadata, and linearizes the cell data if required. The linearized data is then flushed to a linear byte array in the database driver presentation layer, and the cell coordinates and cell metadata are provided for use by a compact data handler and the application in accessing the data.
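The linear-array idea can be sketched in plain Java. This is an illustration only: `LinearCellBuffer`, the UTF-8 string cells, and the (offset, length) form of the coordinates are assumptions, and no native interface layer is involved.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: cell values are appended to one linear byte array,
// and each cell is later located by its recorded (offset, length) pair.
class LinearCellBuffer {
    private final ByteArrayOutputStream data = new ByteArrayOutputStream();
    private final List<int[]> coords = new ArrayList<>(); // {offset, length} per cell
    private final int columns;

    LinearCellBuffer(int columns) { this.columns = columns; }

    // Linearize one cell, in row-major order, and remember where it landed.
    void append(String cellValue) {
        byte[] bytes = cellValue.getBytes(StandardCharsets.UTF_8);
        coords.add(new int[] {data.size(), bytes.length});
        data.write(bytes, 0, bytes.length);
    }

    // Flush everything written so far into a single linear byte array.
    byte[] flush() {
        return data.toByteArray();
    }

    // Read one cell back from the linear array using its coordinates.
    String read(byte[] linear, int row, int col) {
        int[] c = coords.get(row * columns + col);
        return new String(linear, c[0], c[1], StandardCharsets.UTF_8);
    }
}
```

Passing a single array plus a small coordinate table avoids creating one object per cell when data crosses a layer boundary, which is the motivation suggested by the abstract.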