Abstract:
An exact cardinality query optimization system and method for optimizing a query having a plurality of expressions to obtain a cardinality-optimal query execution plan for the query. Embodiments of the system and method use various techniques to shorten the time necessary to obtain the cardinality-optimal query execution plan, which contains the query execution plan when all cardinalities are exact. Embodiments of the system and method include a covering queries technique that leverages query execution feedback to obtain an unordered subset of relevant expressions for the query, an early termination technique that bounds the cardinality to determine whether the processing can be terminate before each of the expressions are executed, and an expressions ordering technique that finds an ordering of expressions that yields the greatest reduction in time to obtain the cardinality-optimal query execution plan.
Abstract:
A database server may be configured to compute distinct page counts of pages accessed to execute operands of respective queries. The queries may be executed against a table comprised of the pages and having an index managed by the database server. The distinct page counts may be obtained by counting, as a part of the executing of the queries, distinct pages accessed during the execution of the queries.
Abstract:
A proactive monitoring mechanism for correcting the choice of access methods (available query plans) for a given query, based on execution feedback from the same query. The mechanism exploits bypassing predicate short-circuiting inside the database server's predicate evaluation module to obtain expression cardinalities. The mechanism can also modify a plan to obtain expression cardinalities. These techniques are used judiciously by the query optimizer and/or a database administrator (DBA) so that the execution overheads are within acceptable limits.
Abstract:
Techniques for estimating the progress of database queries are described herein. In a first implementation, a respective lower-bound parameter is associated with each node in an operator tree that representing a given database query, and the progress of the database query at a given point is estimated based upon the lower-bound parameters. In a second implementation, the progress of the query is estimated by associating respective lower-bound and upper-bound parameters with each node in the operator tree. The progress of the query at the given point is then estimated based on the lower-bound and upper-bound parameters.
Abstract:
Database systems use a plan cache to avoid the overheads (e.g., time, money) of query recompilation. Query plans can become invalidated by updates to the statistics on data or changes to the physical database design. Once a plan is invalidated, it can be repaired utilizing one or more of the disclosed embodiments. Incremental repair of query plans includes reusing parts of the current plan rather than discarding the plan entirely when it is invalidated. Repair to an existing query plan is attempted before resorting to full recompilation.
Abstract:
A data service system is described herein which processes raw data assets from at least one network-accessible system (such as a search system), to produce processed data assets. Enterprise applications can then leverage the processed data assets to perform various environment-specific tasks. In one implementation, the data service system can generate any of: synonym resources for use by an enterprise application in providing synonyms for specified terms associated with entities; augmentation resources for use by an enterprise application in providing supplemental information for specified seed information; and spelling-correction resources for use by an enterprise application in providing spelling information for specified terms, and so on.
Abstract:
The subject disclosure pertains to web searches and more particularly toward influencing resultant content to increase relevancy. The resultant content can be influenced by reconfiguring a query and/or filtering results based on user location and/or context information (e.g., user characteristics/profile, prior interaction/usage temporal, current events, and third party state/context . . . ). Furthermore, the disclosure provides for query execution on at least a subset of designated web content, for example as specified by a user. Still further yet, a localized marketing system is disclosed that provides discount offers to users that match merchant criteria including proximity. A system for actively probing populations of users with different parameters and monitoring responses can be employed to collect data for identifying the best discounts and deadlines to offer to users to achieve desired results.
Abstract:
Various embodiments promote the discoverability of data that can be contained within a database. In one or more embodiments, data within a database is organized in a structure having a schema. The structure and data can be processed in a manner that renders one or more pseudo-documents each of which constitutes a sub-structure that can be indexed. Once produced and indexed, the pseudo-documents constitute a set of searchable objects each of which relationally points back to its associated structure within the database. Searches can now be performed against the pseudo-documents which, in turn, returns a set of search results. The set of search results can include multiple sub-sets of pseudo-documents, each sub-set of which is associated with a different structure.
Abstract:
A fuzzy joins system that is integrated in a database system generates fuzzy joins between records from two datasets. The fuzzy joins system includes a tokenizer to generate tokens for data records and a transformer to find transforms for the tokens. The fuzzy joins system invokes a signature generator, running within a runtime layer of the database system, to generate signatures for data records based on the tokens and their transforms. Subsequently, an equi-join operation joins the records from the two datasets with at least one equal signature. A similarity calculator, running within a runtime layer of the database system, computes a similarity measure using the token information of the joined records. If the similarity measure for any two records is above a threshold, the fuzzy joins system generates a fuzzy join between such two records.
Abstract:
Technology is described for transformation rule profiling for a query optimizer. The method can include obtaining a database query configured to be optimized by the query optimizer of a database system. An optimized query plan for the database query can be found using a host set of transformation rules. One transformation rule can be removed and checked at a time. Each transformation rule can be checked to determine whether the transformation rule affects an optimal query plan output. A test query plan can be generated after each transformation rule has been removed. The query optimizer can determine whether the test query plan is different than the optimized query plan in the absence of the removed transformation rule. An equivalent set of transformation rules can be created that includes transformation rules where the test query plan generated from the equivalent set of transformation rules is equivalent to the optimized plan.