Abstract:
A root cause analysis engine uses event durations and gradual deletion of events to improve analysis accuracy and reduce the number of required calculations. Matching ratios of relevant rules are recalculated each time notification of an event is received, and the calculation results are held in a rule memory in the analysis engine. Each event has a valid duration; when the duration expires, that event is deleted from the rule memory without affecting other events held there. The analysis engine can then re-calculate the matching ratios by performing the re-calculation only for the rules related to the deleted event. The calculation cost is reduced because the analysis engine processes events incrementally or decrementally, and the analysis engine can determine the most probable conclusion even if one or more condition elements are not true.
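The mechanism described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the rule format (a rule as a set of required event types), the matching ratio (fraction of a rule's conditions currently satisfied), and all names are assumptions.

```python
import time

class AnalysisEngine:
    """Sketch of incremental rule matching with event durations.

    Rules map a rule name to the set of event types (condition
    elements) it requires. Events expire after a valid duration.
    """

    def __init__(self, rules):
        self.rules = rules                      # rule name -> set of event types
        self.events = {}                        # event type -> expiry timestamp
        self.ratios = {r: 0.0 for r in rules}   # rule memory: cached ratios

    def _recalc(self, affected):
        # Recalculate matching ratios only for the affected rules.
        now = time.time()
        live = {e for e, exp in self.events.items() if exp > now}
        for r in affected:
            conds = self.rules[r]
            self.ratios[r] = len(conds & live) / len(conds)

    def on_event(self, event_type, duration):
        # Incremental step: store the event with its valid duration and
        # update only the rules that reference this event type.
        self.events[event_type] = time.time() + duration
        self._recalc([r for r, c in self.rules.items() if event_type in c])

    def expire(self):
        # Decremental step: delete expired events, recalculating only
        # the rules related to the deleted events.
        now = time.time()
        expired = [e for e, exp in self.events.items() if exp <= now]
        affected = {r for r, c in self.rules.items()
                    for e in expired if e in c}
        for e in expired:
            del self.events[e]
        self._recalc(affected)

    def conclusion(self):
        # Most probable conclusion: highest matching ratio, even when
        # not every condition element is true.
        return max(self.ratios, key=self.ratios.get)
```

A rule with only one of two conditions satisfied still yields a ratio of 0.5, so a partial match can be selected as the most probable conclusion.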
Abstract:
Provided is an electric machine health monitoring system (100) that includes an electric machine (102), a data acquisition component (206a), a local transmitter (204), a communications network (208), and a remote diagnostic unit (210) that is configured to receive time sequenced operational information, asset performance, and health status indicators from the local transmitter (204). The remote diagnostic unit (210) comprises software that is configured to perform diagnostic analysis of the time sequenced operational information to determine the asset performance and health status of the electric machine (102).
Abstract:
The present invention provides a fault management method, which can implement fault reporting and processing in an NFV environment. The method includes: acquiring first fault information, including a faulty entity identifier and a fault type, of a network functions virtualization infrastructure (NFVI) entity, where the first fault information is used to indicate that a fault occurs in a first NFVI entity having the faulty entity identifier; generating first comprehensive fault information according to the first fault information, where the first comprehensive fault information comprises the first fault information and correlated fault information of the first fault information; and performing fault repair or reporting processing according to the first comprehensive fault information. In embodiments of the present invention, fault information of a hardware and/or software entity is acquired, to perform comprehensive processing on correlated pieces of fault information, which can implement fault reporting and processing in an NFV environment.
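The three-step method (acquire fault information, generate comprehensive fault information from correlated faults, then repair or report) can be sketched as below. The correlation map, the repair-versus-report decision rule, and all names are illustrative assumptions, not part of the claimed method.

```python
from dataclasses import dataclass

@dataclass
class FaultInfo:
    entity_id: str    # faulty entity identifier
    fault_type: str   # fault type

@dataclass
class ComprehensiveFault:
    primary: FaultInfo
    correlated: list  # correlated FaultInfo records

class FaultManager:
    """Sketch of NFVI fault acquisition and comprehensive processing."""

    def __init__(self, correlation_map):
        # correlation_map: entity_id -> entity_ids whose faults correlate
        self.correlation_map = correlation_map
        self.known_faults = {}  # entity_id -> FaultInfo

    def acquire(self, fault):
        # Step 1: acquire first fault information for an NFVI entity.
        self.known_faults[fault.entity_id] = fault

    def comprehensive(self, fault):
        # Step 2: attach correlated fault information already acquired.
        related = [self.known_faults[e]
                   for e in self.correlation_map.get(fault.entity_id, [])
                   if e in self.known_faults]
        return ComprehensiveFault(fault, related)

    def handle(self, fault):
        # Step 3: repair or report according to the comprehensive info.
        # Assumed rule: repair in isolation, report when faults correlate.
        self.acquire(fault)
        comp = self.comprehensive(fault)
        return "repair" if not comp.correlated else "report"
```

A VM fault that correlates with a known host fault is reported for comprehensive processing rather than repaired in isolation.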
Abstract:
Usability of a cloud based service is recovered from a system failure. A customer transaction is executed to simulate the customer experience in the cloud based service. A failure associated with a subsystem of the cloud based service is detected from an output of the customer transaction. A recovery action associated with the failure is determined. The recovery action is executed on the subsystem and monitored to determine a success status.
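The detect-recover-verify loop above can be sketched as a single function. The transaction result shape, the subsystem-to-action mapping, and the callable interfaces are assumptions for illustration only.

```python
def monitor_and_recover(run_transaction, recovery_actions, verify):
    """Sketch of the recovery flow: simulate a customer transaction,
    detect the failing subsystem, run a mapped recovery action, and
    verify success. All callables are assumed interfaces."""
    result = run_transaction()                # simulate customer experience
    if result["ok"]:
        return "healthy"
    subsystem = result["failed_subsystem"]    # detect failing subsystem
    action = recovery_actions.get(subsystem)  # map failure -> recovery action
    if action is None:
        return "escalate"                     # no known recovery action
    action()                                  # execute recovery on subsystem
    return "recovered" if verify() else "failed"  # monitor success status
```

Re-running the synthetic transaction is one natural choice for `verify`, since it exercises the same customer experience that exposed the failure.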
Abstract:
A system and method for implementing geographic disaster tolerance switching on a service delivery platform are disclosed. The system includes: a bidirectional monitoring module, an intelligent recognition module, and an automatic switching module, wherein: the bidirectional monitoring module is configured to: monitor an active site and a standby site of the service delivery platform, and when detecting that an abnormality occurs on the active site or the standby site and an alarm reporting condition is satisfied, report alarm information to the intelligent recognition module; the intelligent recognition module is configured to: receive the alarm information, and judge whether a preset switching rule is satisfied, and if yes, send a disaster tolerance switching instruction to the automatic switching module; and the automatic switching module is configured to: start an active-standby switching between the active site and the standby site after receiving the disaster tolerance switching instruction. With the system and the method of the embodiments of the present invention, remote disaster tolerance switching can be implemented automatically and promptly.
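The three modules (monitoring, recognition, switching) can be sketched as one class. The alarm-count threshold as the preset switching rule, the site names, and the reset behavior are illustrative assumptions.

```python
class DisasterToleranceSwitcher:
    """Sketch of the three-module design: alarms feed a preset rule,
    and a satisfied rule triggers an active-standby switchover."""

    def __init__(self, alarm_threshold=3):
        self.alarm_threshold = alarm_threshold   # assumed preset rule
        self.alarm_counts = {"active": 0, "standby": 0}
        self.active_site = "site_a"
        self.standby_site = "site_b"

    def report_alarm(self, role):
        # Bidirectional monitoring module: report an abnormality on the
        # active or standby site to the recognition logic.
        self.alarm_counts[role] += 1
        if self.rule_satisfied(role):
            self.switch()

    def rule_satisfied(self, role):
        # Intelligent recognition module: judge the preset switching rule
        # (here: enough alarms accumulated against the active site).
        return (role == "active"
                and self.alarm_counts["active"] >= self.alarm_threshold)

    def switch(self):
        # Automatic switching module: perform the active-standby switch
        # and reset the alarm state for the new configuration.
        self.active_site, self.standby_site = (self.standby_site,
                                               self.active_site)
        self.alarm_counts = {"active": 0, "standby": 0}
```

Alarms against the standby site are counted but never trigger a switch under this assumed rule, which keeps a flapping standby from displacing a healthy active site.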
Abstract:
A system and method for controlling access to digital streaming data (502). The media server generates an authorization ticket and compares (522) it to one generated by the web server (518) to determine whether to grant access (530).
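The ticket comparison can be sketched as below. The abstract does not specify how tickets are generated; HMAC over the request attributes with a shared secret is one plausible scheme, and the key, field names, and functions are all assumptions.

```python
import hashlib
import hmac

SHARED_SECRET = b"example-shared-secret"  # assumed key shared by both servers

def make_ticket(user, resource, secret=SHARED_SECRET):
    # Both the web server and the media server derive a ticket from the
    # same request attributes, so legitimate requests produce a match.
    msg = f"{user}:{resource}".encode()
    return hmac.new(secret, msg, hashlib.sha256).hexdigest()

def grant_access(media_ticket, web_ticket):
    # Access (530) is granted only if the independently generated
    # tickets compare equal; compare_digest avoids timing leaks.
    return hmac.compare_digest(media_ticket, web_ticket)
```

A ticket forged for a different user or resource fails the comparison, so the media server can reject the stream request without contacting the web server again.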
Abstract:
An anomaly detection and resolution system (ADRS) is disclosed for automatically detecting and resolving anomalies in computing environments. The ADRS may be implemented using an anomaly classification system defining different types of anomalies (e.g., a defined anomaly and an undefined anomaly). A defined anomaly may be based on bounds (fixed or seasonal) on any metric to be monitored. An anomaly detection and resolution component (ADRC) may be implemented in each component defining a service in a computing system. An ADRC may be configured to detect and attempt to resolve an anomaly locally. If the anomaly event for an anomaly cannot be resolved in the component, the ADRC may communicate the anomaly event to an ADRC of a parent component, if one exists. Each ADRC in a component may be configured to locally handle specific types of anomalies to reduce communication time and resource usage for resolving anomalies.
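The hierarchical handle-locally-or-escalate behavior can be sketched as follows. The bound format, the component names, and treating an unconfigured metric as requiring escalation are illustrative assumptions.

```python
class ADRC:
    """Sketch of an anomaly detection and resolution component: each
    ADRC handles the anomaly types it has bounds for, and escalates
    anything else to its parent ADRC."""

    def __init__(self, name, bounds, parent=None):
        self.name = name
        self.bounds = bounds    # metric -> (low, high): defined anomalies
        self.parent = parent    # parent component's ADRC, if one exists

    def handle(self, metric, value):
        if metric in self.bounds:
            # Defined anomaly type handled locally by this component.
            low, high = self.bounds[metric]
            if low <= value <= high:
                return f"{self.name}: no anomaly"
            return f"{self.name}: resolved {metric} locally"
        # Anomaly type not handled here: escalate to the parent ADRC.
        if self.parent is not None:
            return self.parent.handle(metric, value)
        return f"{self.name}: undefined anomaly {metric}"
```

Because each ADRC only escalates what it cannot handle, most anomaly events are resolved inside the component where they occur, reducing communication time and resource usage.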