Abstract:
A localized marketing system is disclosed that provides discount offers to users that match merchant criteria including proximity. Further, a system for actively probing populations of users with different parameters and monitoring responses can be employed to collect data for identifying the best discounts and deadlines to offer to users to achieve desired results.Another aspect of the disclosure pertains to web searches and more particularly toward influencing resultant content to increase relevancy. The resultant content can be influenced by reconfiguring a query and/or filtering results based on user location and/or context information (e.g., user characteristics/profile, prior interaction/usage temporal, current events, and third party state/context . . . ). Furthermore, the disclosure provides for query execution on at least a subset of designated web content, for example as specified by a user.
Abstract:
A method of estimating cardinality of a join of tables using multi-column density values and additionally using coarser density values of a subset of the multi-column density attributes. In one embodiment, the subset of attributes for the coarser densities is a prefix of the set of multi-column density attributes. A number of tuples from each table that participate in the join may be estimated using densities of the subsets. The cardinality of the join can be estimated using the multi-column density for each table and the estimated number of tuples that participate in the join from each table.
Abstract:
A database object summarization tool is provided that selects a subset of database objects subject to filtering constraints such as a partial order or optimization of some attribute. A dominance primitive filters out tuples that are dominated according to a partial order constraint by another tuple. A representation primitive selects a representative subset of tuples such than an optimization criteria is met.
Abstract:
A process for finding a similar data records from a set of data records. A database table or tables provide a number of data records from which one or more canonical data records are identified. Tokens are identified within the data records and classified according to attribute field. A similarity score is assigned to data records in relation to other data records based on a similarity between tokens of the data records. Data records whose similarity score with respect to each other is greater than a threshold form one or more groups of data records. The records or tuples form nodes of a graph wherein edges between nodes represent a similarity score between records of a group. Within each group a canonical record is identified based on the similarity of data records to each other within the group.
Abstract:
A system that facilitates estimating functional relationships associated with one or more columns in a database comprises a sampling component that receives a random sample of records within the database. An estimate generator component calculates an estimate of strength of functional relationships based at least in part upon the received samples. For example, the estimate generator component can calculate an estimate of strength of a column as a key column based at least in part upon the received samples.
Abstract:
Techniques for estimating the progress of database queries are described herein. In a first implementation, a respective lower-bound parameter is associated with each node in an operator tree that representing a given database query, and the progress of the database query at a given point is estimated based upon the lower-bound parameters. In a second implementation, the progress of the query is estimated by associating respective lower-bound and upper-bound parameters with each node in the operator tree. The progress of the query at the given point is then estimated based on the lower-bound and upper-bound parameters. The progress estimate is computed by dividing the work done so far by the sums of the above averages for each node in the tree.
Abstract:
A method of estimating selectivity of a given string predicate in a database query. In the method selectivities of substrings of various substring lengths are estimated. For example, the selectivity of substrings between length l (or some constant q) to the length of the given string predicate may be estimated. The method then selects a candidate sub string for each sub string length based on estimated selectivities of the substrings. The estimated selectivities of the candidate substrings are combined. The combined estimated selectivity of the candidate substrings is returned as the estimated selectivity of the given string predicate.
Abstract:
A system that facilitates automatic selection of a physical configuration of a database comprises an optimizer component that determines simulated physical structures and creates a hypothetical configuration based thereon. A reduction component progressively reduces size of the configuration until the hypothetical configuration is associated with a size below a threshold. For example, the simulated physical structures can be based at least in part upon a workload.
Abstract:
An automated physical database design tool may provide an integrated physical design recommendation for horizontal partitioning, indexes and indexed views, all three features being tuned together (in concert). Manageability requirements may be specified when optimizing for performance. User-specified configuration may enable the specification of a partial physical design without materialization of the physical design. The tuning process may be performed for a production server but may be conducted substantially on a test server. Secondary indexes may be suggested for XML columns. Tuning of a database may be invoked by any owner of a database. Usage of objects may be evaluated and a recommendation for dropping unused objects may be issued. Reports may be provided concerning the count and percentage of queries in the workload that reference a particular database, and/or the count and percentage of queries in the workload that reference a particular table or column. A feature may be provided whereby a weight may be associated with each statement in the workload, enabling relative importance of particular statements to be specified. An in-row length for a column may be specified. If a value for the column exceeds the specified in-row length for that column, the portion of the value not exceeding the specified in-row length may be stored in the row while the portion of the value exceeding the specified in-row length may be stored in an overflow area. Rebuild and reorganization recommendations may be generated.
Abstract:
A method of estimating the Results of a database query are estimated by performing a sampling of weighted tuples in a database based on a probability of usage of tuples required in executing a workload. A probability is associated with each tuple sampled. And, can aggregate is computed over values in each sampled tuple while multiplying by the inverses of the probabilities associated with each tuple sampled.