Abstract:
A table-level histogram is maintained incrementally, without rescanning the entire table when new data values are added to the table. A table has multiple partitions of data values. A histogram is generated for the data values of each partition. When a new partition of data values is added to the table, a histogram is generated for the new partition only. To generate a histogram for the entire table, the histograms for the previously generated and newly added partitions are used without needing to refer to the underlying data. A similar approach is applicable when modifying data values in a partition.
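The incremental idea can be sketched as follows. This is a minimal illustration, assuming simple frequency histograms (value to row count) that merge by addition; the abstract does not state which histogram type is used, and the function names and partition labels are hypothetical.

```python
from collections import Counter

def build_partition_histogram(values):
    """Scan one partition once and return its frequency histogram."""
    return Counter(values)

def merge_histograms(partition_histograms):
    """Combine per-partition histograms into a table-level histogram
    without touching the underlying rows."""
    table_histogram = Counter()
    for hist in partition_histograms:
        table_histogram.update(hist)  # adds counts value by value
    return table_histogram

# Existing partitions were scanned earlier; only their histograms are kept.
stored = {
    "p1": build_partition_histogram([1, 1, 2, 3]),
    "p2": build_partition_histogram([2, 2, 4]),
}

# A new partition arrives: scan only its rows, then merge stored histograms.
stored["p3"] = build_partition_histogram([1, 4, 4, 5])
table_histogram = merge_histograms(stored.values())
```

When values in one partition are modified, the same sketch applies: only that partition's entry in `stored` is rebuilt before merging again.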
Abstract:
Techniques for proactively and reactively running diagnostic functions. These diagnostic functions help to improve diagnostics of conditions detected in a monitored system and to limit or quarantine the damage caused by the detected conditions. In one embodiment, a health monitor infrastructure is provided that is configured to perform one or more health checks in a monitored system for diagnosing and/or gathering information related to the system. The one or more health checks may be invoked proactively on a scheduled basis, reactively in response to a condition detected in the system, or manually by a user such as a system administrator.
Abstract:
One or more usage models are provided for a database. Each usage model includes a set of rules that are used to analyze database performance. A usage model in one or more usage models is determined. Database information is determined based on the usage model. The database information is then analyzed based on rules associated with the usage model. One or more performance problems are determined based on the analysis.
Abstract:
Techniques for self-diagnosing performance problems in a database are provided. The techniques include classifying one or more performance problems in a database system. One or more values for quantifying an impact of the one or more performance problems on the database system are then determined. The quantified values are determined based on the performance of operations in the database system. A performance problem based on the one or more quantified values is then determined. A solution for the performance problem is generated and may be outputted.
Abstract:
A diagnosability system for automatically collecting, storing, communicating, and analyzing diagnostic data for one or more monitored systems. The diagnosability system comprises several components configured for the collection, storage, communication, and analysis of diagnostic data for a condition detected in a monitored system. The diagnosability system enables targeted dumping of diagnostic data so that only diagnostic data that is relevant for diagnosing the condition detected in the monitored system is collected and stored. This in turn enables first failure analysis, thereby reducing the time needed to resolve the condition detected in the monitored system.
Abstract:
A method and apparatus for approximating a database statistic, such as the number of distinct values (NDV), is provided. To approximate the NDV for a portion of a table, a synopsis of distinct values is constructed. Each value in the portion is mapped to a domain of values. The mapping function is implemented with a uniform hash function, in one embodiment. If the resultant domain value does not exist in the synopsis, the domain value is added to the synopsis. If the synopsis reaches its capacity, a portion of the domain values is discarded from the synopsis. The statistic is approximated based on the number (N) of domain values in the synopsis and the portion of the domain that is represented in the synopsis relative to the size of the domain.
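A minimal sketch of such a synopsis is shown below. Several details are assumptions, not taken from the abstract: the hash is the first 64 bits of SHA-256, the synopsis halves its kept domain each time it overflows (keeping only hashes whose lowest `splits` bits are zero), and the estimate is N multiplied by 2 to the power of the number of halvings. The class and method names are hypothetical.

```python
import hashlib

def hash_value(value):
    """Map a value to a 64-bit domain value (assumed uniform hash)."""
    digest = hashlib.sha256(repr(value).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")

class Synopsis:
    def __init__(self, capacity=1024):
        self.capacity = capacity
        self.splits = 0      # how many times the kept domain was halved
        self.values = set()  # distinct domain values inside the kept domain

    def _in_kept_domain(self, h):
        # A hash stays in the kept domain if its lowest `splits` bits are zero.
        return h & ((1 << self.splits) - 1) == 0

    def add(self, value):
        h = hash_value(value)
        if not self._in_kept_domain(h):
            return
        self.values.add(h)  # a set ignores domain values already present
        while len(self.values) > self.capacity:
            # Over capacity: halve the kept domain and discard the rest.
            self.splits += 1
            self.values = {v for v in self.values if self._in_kept_domain(v)}

    def estimate(self):
        # N values observed in a 1 / 2**splits fraction of the domain.
        return len(self.values) * (1 << self.splits)

synopsis = Synopsis(capacity=64)
for row_value in range(10_000):
    synopsis.add(row_value)
approx_ndv = synopsis.estimate()
```

While the synopsis has never overflowed, `splits` is zero and the estimate is simply the count of distinct hashes seen; after overflow the estimate becomes approximate, with accuracy depending on the chosen capacity.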
Abstract:
A method and apparatus for merging synopses to determine a database statistic, e.g., a number of distinct values (NDV), is disclosed. The merging can be used to determine an initial database statistic or to perform incremental statistics maintenance. For example, each synopsis can pertain to a different partition, such that merging the synopses generates a global statistic. When performing incremental maintenance, only those synopses whose partitions have changed need to be updated. Each synopsis contains domain values that summarize the statistic. However, the synopses may initially contain domain values that are not compatible with each other. Prior to merging the synopses, the domain values in each synopsis are made compatible with the domain values in the other synopses. The adjustment is made such that each synopsis represents the same range of domain values, in one embodiment. After "compatible synopses" are formed, the synopses are merged by taking the union of the compatible synopses.
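The merge step can be sketched as follows. The representation is an assumption made for illustration: each synopsis records how many times its kept domain has been halved (`splits`) and holds only domain values whose lowest `splits` bits are zero. Making synopses compatible then means restricting all of them to the narrowest kept range before taking the union. All names are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Synopsis:
    splits: int        # kept domain is 1 / 2**splits of the full domain
    values: frozenset  # hashed domain values inside the kept domain

def restrict(synopsis, splits):
    """Narrow a synopsis to a smaller kept domain (splits >= its own)."""
    if splits < synopsis.splits:
        raise ValueError("cannot widen a synopsis; discarded values are gone")
    mask = (1 << splits) - 1
    kept = frozenset(v for v in synopsis.values if v & mask == 0)
    return Synopsis(splits, kept)

def merge(synopses):
    """Make all synopses cover the same domain range, then take the union."""
    target = max(s.splits for s in synopses)
    merged = frozenset().union(*(restrict(s, target).values for s in synopses))
    return Synopsis(target, merged)

def estimate_ndv(synopsis):
    return len(synopsis.values) * (1 << synopsis.splits)

# Two per-partition synopses with different kept ranges.
part_a = Synopsis(splits=0, values=frozenset({1, 2, 3, 4, 6, 8}))
part_b = Synopsis(splits=1, values=frozenset({4, 10, 12}))

global_synopsis = merge([part_a, part_b])
global_ndv = estimate_ndv(global_synopsis)
```

For incremental maintenance under this sketch, only the synopsis of a changed partition is rebuilt; the stored synopses of unchanged partitions are passed to `merge` as they are.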
Abstract:
A method for determining k nearest-neighbors to a query point in a database in which an ordering is defined for a data set P of a database, the ordering being based on l one-dimensional codes $C_1, \ldots, C_l$. A single relation R is created in which R has the attributes of index-id, point-id and value. An entry $(j, i, C_{\epsilon j}(p_i))$ is included in relation R for each data point $p_i \in P$, where index-id equals j, point-id equals i, and value equals $C_{\epsilon j}(p_i)$. A B-tree index is created based on a combination of the index-id attribute and the value attribute. A query point is received and a relation Q is created for the query point having the attributes of index-id and value. One tuple is generated in the relation Q for each j, $j = 1, \ldots, l$, where index-id equals j and value equals $C_{\epsilon j}(q)$. A distance d is selected. The index-id attribute for the relation R of each data point $p_i$ is compared to the index-id attribute for the relation Q of the query point. A candidate data point $p_i$ is selected when the comparison of the relation R of a data point $p_i$ to the index-id attribute for the relation Q of the query point is less than the distance d. Lower bounds are calculated for each cube of the plurality of cubes that represent a minimum distance between any point in a cube and the query point. Lastly, k candidate data points $p_i$ are selected as k nearest-neighbors to the query point.