Abstract:
A method for troubleshooting abnormal behavior of an application hosted on a networked computer system. The method may be implemented by a root cause analyzer. The method includes tracking a single application performance metric across all the clients of an application hosted on a networked computer system and analyzing an aggregated application based on the single application metric. The method involves determining outlier client attributes associated with an abnormal transaction of the application and ranking the outlier client attributes based on comparisons of historical and current abnormal transactions. The method associates one or more of the ranked outlier client attributes with the root cause of the current abnormal transaction. Association rule learning is used to associate one or more of the ranked outlier client attributes with the root cause.
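The ranking step can be illustrated with a minimal sketch. The abstract does not say which association-rule measure is used, so this example assumes a lift-style ratio: how often a client attribute value appears in the current abnormal transactions, divided by how often it appears in historical ones. The function name, the dictionary-based transaction format, and the add-one smoothing are all hypothetical choices made for illustration.

```python
from collections import Counter


def rank_outlier_attributes(current_abnormal, historical):
    """Rank (attribute, value) pairs by a lift-style ratio.

    Each transaction is a dict of client attributes, e.g.
    {"browser": "A", "region": "eu"}. This format is an assumption.
    """
    cur = Counter(item for tx in current_abnormal for item in tx.items())
    hist = Counter(item for tx in historical for item in tx.items())
    n_cur, n_hist = len(current_abnormal), len(historical)

    scores = {}
    for item, count in cur.items():
        support_cur = count / n_cur
        # Add-one smoothing (assumed) so values unseen in history
        # do not cause a division by zero.
        support_hist = (hist[item] + 1) / (n_hist + 1)
        scores[item] = support_cur / support_hist

    # Highest ratio first: the most over-represented attribute values
    # are the leading root-cause candidates.
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


current = [
    {"browser": "A", "region": "eu"},
    {"browser": "A", "region": "us"},
]
history = [
    {"browser": "A", "region": "eu"},
    {"browser": "B", "region": "eu"},
    {"browser": "B", "region": "us"},
    {"browser": "B", "region": "us"},
]
ranked = rank_outlier_attributes(current, history)
```

In this toy data, browser "A" appears in every current abnormal transaction but only once in the history, so it ranks first as a candidate for association with the root cause.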
Abstract:
A resource monitoring method may include receiving a request to perform a read operation on an object at a first monitoring node of a plurality of monitoring nodes, and determining whether or not a copy of the object is present in a namespace associated with the first monitoring node. The namespace may include an overlay namespace and a local namespace. The local namespace may identify objects being monitored by the first monitoring node. The overlay namespace may include local viewpoints for other monitoring nodes of the plurality of monitoring nodes. Each local viewpoint may identify one or more objects that are monitored by a respective other monitoring node. The method may further include performing, by the first monitoring node, the read operation on the object if the copy of the object is determined as present in the namespace associated with the first monitoring node.
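A minimal sketch of the namespace lookup described above, assuming each namespace is a plain dictionary keyed by object identifier. The class name, the field names, and the behavior when no copy is found (returning a not-found flag) are hypothetical; the abstract only states that the read is performed when a copy is present.

```python
from dataclasses import dataclass, field


@dataclass
class MonitoringNode:
    name: str
    # Local namespace: objects monitored by this node (id -> copy).
    local: dict = field(default_factory=dict)
    # Overlay namespace: one local viewpoint per other node
    # (node name -> {id -> copy}).
    overlay: dict = field(default_factory=dict)

    def read(self, object_id):
        """Return (True, copy) if a copy is in this node's namespace,
        otherwise (False, None). The fallback is an assumption."""
        if object_id in self.local:
            return True, self.local[object_id]
        for viewpoint in self.overlay.values():
            if object_id in viewpoint:
                return True, viewpoint[object_id]
        return False, None


node = MonitoringNode(
    name="n1",
    local={"cpu-1": {"state": "ok"}},
    overlay={"n2": {"disk-7": {"state": "degraded"}}},
)
found_local = node.read("cpu-1")
found_overlay = node.read("disk-7")
missing = node.read("net-3")
```

Checking the local namespace before the overlay is a design choice of this sketch, not something the abstract specifies.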
Abstract:
Systems and techniques for identifying a common change window for one or more services implemented on one or more hosts include querying time series performance data for each host of a service to identify time slots of low resource consumption on the host, annotating the time slots with service tags, where the service tags identify host information and service information, creating groups of time slots using the service tags, using dynamic clustering to create clusters of hosts using the groups of time slots, and generating at least one common change window by eliminating duplicate hosts from the clusters of the hosts.
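The pipeline above can be sketched roughly as follows. This is not the dynamic clustering the abstract refers to: as a stand-in, the sketch groups tagged hosts by shared low-usage time slot and then assigns each host greedily to the largest group, which removes duplicates. The usage threshold, the data layout, and all function names are assumptions made for illustration.

```python
from collections import defaultdict


def low_usage_slots(series, threshold):
    """Return the time slots whose resource usage is below threshold."""
    return {slot for slot, usage in series.items() if usage < threshold}


def common_change_windows(perf, threshold=20.0):
    """perf maps a (service, host) tag to {time_slot: usage_percent}."""
    # Annotate each low-usage slot with its service tag and group by slot.
    groups = defaultdict(set)
    for tag, series in perf.items():
        for slot in low_usage_slots(series, threshold):
            groups[slot].add(tag)

    # Greedy stand-in for clustering: take the largest groups first
    # (ties broken by slot) and drop hosts that are already assigned.
    windows, assigned = {}, set()
    for slot in sorted(groups, key=lambda s: (-len(groups[s]), s)):
        new_hosts = groups[slot] - assigned
        if new_hosts:
            windows[slot] = sorted(new_hosts)
            assigned |= new_hosts
    return windows


perf = {
    ("web", "h1"): {0: 10, 1: 50, 2: 15},
    ("web", "h2"): {0: 12, 1: 60, 2: 70},
    ("db", "h3"): {0: 80, 1: 90, 2: 5},
}
windows = common_change_windows(perf)
```

Here host h1 is idle in both slot 0 and slot 2, but it is placed only in slot 0 alongside h2, leaving slot 2 as the window for h3.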
Abstract:
A resource monitoring method may include determining, at a first monitoring node, a load level for each monitoring node of a plurality of monitoring nodes including the first monitoring node and a second monitoring node based on a namespace associated with the first monitoring node. The namespace may include an overlay namespace and a local namespace. The local namespace may identify objects being monitored by the first monitoring node. The overlay namespace may include local viewpoints for other monitoring nodes of the plurality of monitoring nodes including the second monitoring node. Each local viewpoint may identify one or more objects that are monitored by a respective other monitoring node. The method may further include prohibiting the first monitoring node from instituting a new object creation request if the load level of the second monitoring node is lower than the load level of the first monitoring node.
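A minimal sketch of the load check, assuming a node's load level is simply the number of objects it monitors; the abstract does not define how load is measured. The function names and the set-based namespaces are hypothetical.

```python
def load_levels(first_name, local_namespace, overlay_namespace):
    """Derive a load level per node from the first node's namespace.

    local_namespace: set of object ids monitored by the first node.
    overlay_namespace: {other node name: set of object ids it monitors}.
    Load is assumed to be the count of monitored objects.
    """
    levels = {first_name: len(local_namespace)}
    for node, viewpoint in overlay_namespace.items():
        levels[node] = len(viewpoint)
    return levels


def may_create_object(first_name, local_namespace, overlay_namespace):
    """Return False (creation prohibited) if any other node has a lower
    load level than the first node."""
    levels = load_levels(first_name, local_namespace, overlay_namespace)
    own = levels[first_name]
    return not any(
        level < own for node, level in levels.items() if node != first_name
    )


busy = may_create_object("n1", {"a", "b", "c"}, {"n2": {"x", "y"}})
balanced = may_create_object("n1", {"a", "b", "c"}, {"n2": {"x", "y", "z"}})
```

In the first call the second node monitors fewer objects than the first, so the first node is prohibited from instituting the creation request; in the second call the loads are equal and creation is allowed.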