Abstract:
Techniques for mastering resources in a cluster of nodes are provided. A global backup lock manager (GBLM) is maintained for a cluster of nodes that implement distributed lock management. Before a server instance is taken down (for example, for maintenance such as installing a new version of the server instance code), the mastership information that the server instance stores is reflected in the mastership information maintained by the GBLM. Thus, shutting down the server instance does not involve remastering the resources mastered by the server instance. As a result, shutting down the server instance may take minimal time.
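The shutdown path described above can be illustrated with a minimal sketch. The class names, the `reflect` and `lookup` methods, and the dictionary layout of the mastership information are all hypothetical; the abstract does not specify how the GBLM stores or receives this information.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass
class GlobalBackupLockManager:
    """Hypothetical GBLM holding a cluster-wide copy of mastership information."""
    # resource id -> (master node id, lock state); this layout is an assumption
    records: Dict[str, Tuple[str, str]] = field(default_factory=dict)

    def reflect(self, node_id: str, local: Dict[str, str]) -> None:
        # Copy the instance's mastership information into the GBLM's own records.
        for resource, state in local.items():
            self.records[resource] = (node_id, state)

    def lookup(self, resource: str) -> Optional[Tuple[str, str]]:
        return self.records.get(resource)


@dataclass
class ServerInstance:
    node_id: str
    # resource id -> lock state tracked by this instance as master (assumed layout)
    mastered: Dict[str, str] = field(default_factory=dict)
    running: bool = True

    def shutdown(self, gblm: GlobalBackupLockManager) -> None:
        # Reflect mastership in the GBLM first, then stop. No resource is
        # remastered to another node as part of the shutdown.
        gblm.reflect(self.node_id, self.mastered)
        self.running = False
```

In this sketch the cost of `shutdown` is one copy into the GBLM rather than a redistribution of every mastered resource across the surviving nodes.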
Abstract:
Systems, methods, and other embodiments associated with avoiding resource blockages and hang states are described. One example computer-implemented method for a computing system includes determining that a first process is waiting for a resource and is in a blocked state. The resource that the first process is waiting for is identified. A blocking process that is holding the resource is then identified. A priority of the blocking process is compared with a priority of the first process. If the priority of the blocking process is lower than the priority of the first process, the priority of the blocking process is increased. In this manner, the blocking process can be scheduled for execution sooner and thus release the resource.
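The comparison-and-boost step can be sketched as follows. The `Process` type, the `holders` map, and the function name are hypothetical. The abstract says only that the blocker's priority is increased; raising it to the waiter's priority, and treating larger numbers as more urgent, are assumptions made for this sketch.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Process:
    pid: int
    priority: int                      # assumption: larger value = more urgent
    waiting_for: Optional[str] = None  # resource id the process is blocked on


def boost_blocker(waiter: Process,
                  holders: Dict[str, int],
                  processes: Dict[int, Process]) -> Optional[Process]:
    """Find the process blocking `waiter` and raise its priority if it is lower."""
    resource = waiter.waiting_for
    if resource is None:
        return None                    # the waiter is not blocked
    holder_pid = holders.get(resource)
    if holder_pid is None:
        return None                    # nobody holds the resource
    blocker = processes[holder_pid]
    if blocker.priority < waiter.priority:
        # Assumed policy: lift the blocker to the waiter's priority.
        blocker.priority = waiter.priority
    return blocker
```

A blocker that already has equal or higher priority is returned unchanged, so repeated calls are harmless.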
Abstract:
Described is an approach for performing context-aware prognoses in machine learning systems. The approach harnesses streams of detailed data collected from a monitored target to create a context, in parallel to ongoing model operations, for the model outcomes. The context is then probed to identify the particular elements associated with the model findings.
Abstract:
A method, system, and computer program product for analyzing performance of a database cluster. Disclosed are techniques for analyzing performance of components of a database cluster by transforming many discrete event measurements into a time series to identify dominant signals. The method embodiment commences by sampling the database cluster to produce a set of timestamped events, then pre-processing the timestamped events by tagging at least some of the timestamped events with a semantic tag drawn from a semantic dictionary and formatting the set of timestamped events into a time series, where a time series entry comprises a time indication and a plurality of values corresponding to signal state values. Further techniques are disclosed for identifying certain signals from the time series, to which various statistical measurement criteria are applied in order to isolate a set of candidate signals, which are then used to identify indicative causes of database cluster behavior.
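The pre-processing pipeline can be sketched in a few functions. Everything named here is hypothetical: the dictionary entries, the fixed-width time buckets, the choice to drop untagged events, and the use of population variance as a stand-in for the unspecified statistical measurement criteria.

```python
import statistics
from collections import defaultdict

# Hypothetical semantic dictionary: raw event name -> semantic tag
SEMANTIC_DICTIONARY = {
    "log file sync": "io_wait",
    "gc buffer busy": "interconnect_wait",
}
SIGNALS = sorted(set(SEMANTIC_DICTIONARY.values()))


def tag_events(events):
    """Tag (timestamp, name, value) events; untagged events are dropped here."""
    tagged = []
    for ts, name, value in events:
        tag = SEMANTIC_DICTIONARY.get(name)
        if tag is not None:
            tagged.append((ts, tag, value))
    return tagged


def to_time_series(tagged, bucket_seconds, signals=SIGNALS):
    """Build entries of (bucket start, [one value per signal])."""
    buckets = defaultdict(lambda: dict.fromkeys(signals, 0.0))
    for ts, tag, value in tagged:
        start = int(ts // bucket_seconds) * bucket_seconds
        buckets[start][tag] += value
    return [(start, [buckets[start][s] for s in signals])
            for start in sorted(buckets)]


def candidate_signals(series, min_variance, signals=SIGNALS):
    """Keep signals whose values vary enough across the series (assumed criterion)."""
    candidates = []
    for i, signal in enumerate(signals):
        column = [values[i] for _, values in series]
        if len(column) > 1 and statistics.pvariance(column) >= min_variance:
            candidates.append(signal)
    return candidates
```

Each time series entry pairs a time indication with one value per signal, which matches the entry shape the abstract describes; a flat signal is filtered out because it cannot explain a change in cluster behavior under this criterion.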