Abstract:
A system and method for executing a query plan are disclosed. In the system and method, a join graph is generated to represent the query plan. The join graph includes a set of plan operations that are to be executed to implement the join graph. The query plan is received by a distributed network of a logical index server and one or more selected physical index servers. Each physical index server receives a portion of the plan operations and determines what plan data is needed to execute that portion. The system and method include a process for determining what plan data is needed from other physical index servers, or what plan data is needed by other physical index servers.
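As an illustration only, the following Python sketch shows one way a server's data needs could be derived from a join graph. The operation names, the `server` and `inputs` fields, and the function `plan_data_exchange` are hypothetical and are not taken from the disclosure.

```python
from collections import defaultdict


def plan_data_exchange(operations):
    """Work out which plan data crosses server boundaries.

    `operations` maps an operation name to a dict with:
      "server": the physical index server assigned to the operation
      "inputs": names of the operations whose output it consumes
    (This layout is an assumption made for the sketch.)
    """
    needs = defaultdict(set)  # consumer server -> {(producer server, operation)}
    sends = defaultdict(set)  # producer server -> {(consumer server, operation)}
    for op in operations.values():
        for source in op["inputs"]:
            producer = operations[source]["server"]
            if producer != op["server"]:
                # The input is produced elsewhere, so it must be shipped.
                needs[op["server"]].add((producer, source))
                sends[producer].add((op["server"], source))
    return dict(needs), dict(sends)


join_graph = {
    "scan_orders": {"server": "s1", "inputs": []},
    "scan_customers": {"server": "s2", "inputs": []},
    "join": {"server": "s1", "inputs": ["scan_orders", "scan_customers"]},
}
needs, sends = plan_data_exchange(join_graph)
```

Here `needs` records what each server must fetch from other servers, and `sends` records what other servers require from it, mirroring the two directions named in the abstract.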
Abstract:
Methods and apparatus, including computer program products, for compression of tables based on occurrence of values. In general, a number representing the count of occurrences of a frequently occurring value in a group of adjacent rows of a column is generated, a vector representing whether the frequently occurring value exists in a row of the column is generated, and the number and the vector are stored to enable searches of the data represented by the number and the vector. The vector may omit a portion representing the group of adjacent rows. The values may be dictionary-based compression values representing business data such as business objects. The compression may be performed in-memory, in parallel, to improve memory utilization, network bandwidth consumption, and processing performance.
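A minimal Python sketch of the idea follows, assuming the group of adjacent rows is a leading run of the most frequent value and that the column is non-empty. The function names and the dictionary layout are hypothetical, chosen for illustration.

```python
from collections import Counter


def compress_column(values):
    """Compress a non-empty column around its most frequent value."""
    frequent, _ = Counter(values).most_common(1)[0]
    # Number: length of the leading run of adjacent rows holding the value.
    prefix_len = 0
    for v in values:
        if v != frequent:
            break
        prefix_len += 1
    rest = values[prefix_len:]
    # Vector: one bit per remaining row; the leading run is omitted from it.
    bits = [1 if v == frequent else 0 for v in rest]
    # Rows that do not hold the frequent value are kept as-is.
    others = [v for v in rest if v != frequent]
    return {"value": frequent, "prefix_len": prefix_len,
            "bits": bits, "others": others}


def decompress_column(c):
    """Rebuild the original column from the number, vector, and other values."""
    out = [c["value"]] * c["prefix_len"]
    remaining = iter(c["others"])
    for bit in c["bits"]:
        out.append(c["value"] if bit else next(remaining))
    return out


column = [0, 0, 0, 0, 3, 0, 5, 0]
packed = compress_column(column)
```

In this example the bit vector covers only the four rows after the leading run, so a search for the frequent value can be answered from `prefix_len` and `bits` without decompressing the column.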
Abstract:
A method and system for executing an information retrieval query in a multiserver computing environment are disclosed. The method and system employ a technique in which the query is distributed among each of a plurality of partial index servers in the multiserver environment, and a subset of results is calculated for each of the plurality of partial index servers. The subsets of results are then merged in one logical index server to generate a merged result.
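The distribute-then-merge pattern can be sketched in Python as follows. The in-memory dictionaries standing in for partial index servers, the relevance scores, and the function names are assumptions made for this sketch; a real deployment would issue the partial searches over a network.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor
from itertools import islice


def search_partial(index, term):
    """Return (score, doc_id) hits from one partial index, best first.

    `index` maps doc_id -> (text, score); this layout is hypothetical.
    """
    hits = [(score, doc) for doc, (text, score) in index.items() if term in text]
    return sorted(hits, reverse=True)


def search_logical(partial_indexes, term, limit=10):
    """Send the query to every partial index and merge the sorted subsets."""
    with ThreadPoolExecutor() as pool:
        subsets = list(pool.map(lambda idx: search_partial(idx, term),
                                partial_indexes))
    # Each subset is already sorted descending, so a k-way merge suffices.
    merged = heapq.merge(*subsets, reverse=True)
    return [doc for _, doc in islice(merged, limit)]


index_a = {"d1": ("red apple", 0.9), "d2": ("green pear", 0.4)}
index_b = {"d3": ("apple pie", 0.7), "d4": ("apple tart", 0.95)}
result = search_logical([index_a, index_b], "apple")
```

Because each partial result arrives sorted, the merging step never has to re-sort the full result set.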
Abstract:
In business systems, one or more methods can be used to reduce an amount of redundant data. In one implementation, a method to reduce redundancy within a data model in a database, in which the data model is represented by at least one table, includes determining a number of distinct values of partial keys in a table. Each partial key represents at least one row in the table. The method includes reordering one or more columns of the table by cardinality of partial keys, in which the cardinality of a partial key represents a number of distinct values of the partial key. The method further includes determining whether pairs of partial keys are functionally dependent and eliminating one or more columns having functional dependencies from the table.
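The following Python sketch illustrates the steps under a simplifying assumption: each partial key is a single column, and column A is treated as determining column B when the number of distinct (A, B) pairs equals the number of distinct A values. The function names and the lookup-table layout are hypothetical.

```python
def cardinality(rows, cols):
    """Number of distinct value combinations for the given columns."""
    return len({tuple(row[c] for c in cols) for row in rows})


def drop_dependent_columns(rows, columns):
    """Remove columns that are functionally determined by a kept column."""
    # Reorder columns by cardinality, highest first.
    ordered = sorted(columns, key=lambda c: cardinality(rows, [c]), reverse=True)
    kept, lookups = [], {}
    for col in ordered:
        determinant = next(
            (k for k in kept
             if cardinality(rows, [k, col]) == cardinality(rows, [k])),
            None,
        )
        if determinant is None:
            kept.append(col)
        else:
            # Keep a small mapping so the dropped column can be rebuilt.
            mapping = {row[determinant]: row[col] for row in rows}
            lookups[col] = (determinant, mapping)
    reduced = [{c: row[c] for c in kept} for row in rows]
    return reduced, lookups


rows = [
    {"city": "Berlin", "country": "DE", "product": "A"},
    {"city": "Munich", "country": "DE", "product": "A"},
    {"city": "Paris", "country": "FR", "product": "B"},
    {"city": "Berlin", "country": "DE", "product": "B"},
]
reduced, lookups = drop_dependent_columns(rows, ["city", "country", "product"])
```

In this data `country` depends on `city`, so it is removed from the table and stored once per city instead of once per row.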
Abstract:
Methods and apparatus, including computer systems and program products, for executing a query on a subset of data, for example, to facilitate a fast search with a very large result set. In one general aspect, a method of executing a query includes receiving a query for execution on data in the data repository; generating an estimate of a number of results of the query; defining a subset of data in the data repository; determining whether to execute the query on the subset of the data; executing the query on the subset of the data to generate a partial set of results if the query is to be executed on the subset of the data, otherwise executing the query on the data repository to generate a complete set of results; and providing query results.
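One possible reading of these steps is sketched below in Python. The estimate is taken from a small probe of leading rows, which is a stand-in chosen for this sketch and may be biased on ordered data; the parameter names and default values are likewise assumptions.

```python
def execute_query(data, predicate, subset_size=1000,
                  max_results=10000, probe_size=100):
    """Run `predicate` on a subset when the estimated result set is too large.

    Returns (results, complete), where `complete` is False for partial results.
    """
    # Estimate the number of results from a small probe of the data.
    probe = data[:probe_size]
    hit_rate = sum(1 for row in probe if predicate(row)) / len(probe) if probe else 0
    estimate = int(hit_rate * len(data))

    if estimate > max_results:
        # Very large result set expected: answer from the subset only.
        subset = data[:subset_size]
        return [row for row in subset if predicate(row)], False
    # Otherwise execute on the whole repository.
    return [row for row in data if predicate(row)], True


data = list(range(100_000))
partial, partial_complete = execute_query(data, lambda x: x % 2 == 0)
full, full_complete = execute_query(data, lambda x: x % 1000 == 0)
```

The first query is estimated to match about half of the rows and is answered from the subset; the second is estimated to be small and runs on all of the data.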
Abstract:
In a business system, one or more methods can be used to reduce an amount of redundancy in the storage of data. One implementation includes a method of reducing a memory footprint of a database table having multiple rows and one or more columns, in which each of the one or more columns has a cardinality, and the cardinality is a total number of different values in the rows of each column. The method includes comparing the cardinality with a total number of possible values in the rows of at least one column based on a width of the column. The method also includes reducing the width of the column if the cardinality is less than a threshold based on the total number of possible values in the rows of the column.
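The comparison can be sketched in Python as below, assuming column widths are measured in bits and the threshold is a fraction of the number of values the current width can represent. The function name and the default fraction of 0.5 are hypothetical.

```python
def reduced_width_bits(values, current_width_bits, threshold_fraction=0.5):
    """Return a narrower bit width for a column when its cardinality allows it."""
    cardinality = len(set(values))
    possible = 2 ** current_width_bits  # values representable at this width
    if cardinality >= threshold_fraction * possible:
        return current_width_bits  # column is well used; leave it alone
    # Smallest width able to encode `cardinality` distinct codes (at least 1 bit).
    return max(1, (cardinality - 1).bit_length())


sparse = [i % 200 for i in range(10_000)]   # 200 distinct values
narrow = reduced_width_bits(sparse, 32)
unchanged = reduced_width_bits(list(range(200)), 8)
```

A column with 200 distinct values stored in 32 bits can be re-encoded in 8 bits, for example through a dictionary of value codes, while the same cardinality in an 8-bit column is left unchanged.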
Abstract:
Methods and apparatus, including computer systems and program products, for processing queries for which a solution requires that an information management system perform logical operations on a data repository. In general, in one aspect, the techniques feature a method of executing queries on a data repository. That method includes receiving a query adapted for execution on a data set in the data repository; defining a sample of the data set, where the sample is a subset of the data set; executing the query on the sample; generating an estimate of a result of the execution of the query on the sample; and providing the estimate to a user interface. The method may further include defining an Nth sample, such that the Nth sample is larger than an (N−1)th sample, and generating an Nth estimate of the result based on the execution of the query on the Nth sample.
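A minimal Python sketch of successive, growing samples follows. It assumes the query is a count of matching rows and that samples are prefixes of a seeded random shuffle; the function name, sample sizes, and seed are hypothetical.

```python
import random


def progressive_estimates(data, predicate, sample_sizes=(100, 1000, 10000), seed=0):
    """Yield (sample_size, estimated_match_count) for successively larger samples."""
    shuffled = list(data)
    random.Random(seed).shuffle(shuffled)  # seeded so runs are repeatable
    for n in sample_sizes:
        # Each sample extends the previous one, so the Nth is larger
        # than the (N-1)th as long as data remains.
        sample = shuffled[:min(n, len(shuffled))]
        if not sample:
            return
        matches = sum(1 for row in sample if predicate(row))
        # Scale the sample's match rate up to the full data set.
        yield len(sample), round(matches / len(sample) * len(shuffled))


data = list(range(1000))
estimates = list(progressive_estimates(data, lambda x: x % 2 == 0,
                                       sample_sizes=(10, 100, 1000)))
```

Each yielded estimate could be shown to the user as soon as it is available, with later estimates refining earlier ones; when the final sample covers the whole data set, the estimate equals the exact count.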