Abstract:
We present a system that performs prognostic surveillance operations based on sensor signals from a power plant and critical assets in the transmission and distribution grid. The system obtains signals comprising time-series data gathered from sensors during operation of the power plant and associated transmission grid. The system uses an inferential model trained on previously received signals to generate estimated values for the signals. The system then performs a pairwise differencing operation between actual values and the estimated values for the signals to produce residuals. The system subsequently performs a sequential probability ratio test (SPRT) on the residuals to detect incipient anomalies that arise during operation of the power plant and associated transmission grid. While performing the SPRT, the system dynamically updates SPRT parameters to compensate for non-Gaussian artifacts that arise in the sensor data due to changing operating conditions. When an anomaly is detected, the system generates a notification.
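The differencing and SPRT steps described above can be illustrated with a minimal sketch. It assumes a classical Wald SPRT for a positive mean shift in Gaussian residuals with fixed parameters; the abstract's dynamic parameter updates are not modeled, and the names `pairwise_residuals`, `sprt_alarms`, `shift`, `sigma`, `alpha`, and `beta` are hypothetical, not taken from the disclosure.

```python
import math


def pairwise_residuals(actual, estimated):
    """Residual = actual value minus model estimate, sample by sample."""
    return [a - e for a, e in zip(actual, estimated)]


def sprt_alarms(residuals, shift, sigma, alpha=0.01, beta=0.01):
    """Wald SPRT for H0: mean 0 versus H1: mean `shift` (assumed Gaussian).

    alpha and beta are the assumed false-alarm and missed-alarm
    probabilities. Returns the sample indices at which H1 is accepted.
    """
    upper = math.log((1.0 - beta) / alpha)   # accept H1 at or above this
    lower = math.log(beta / (1.0 - alpha))   # accept H0 at or below this
    llr = 0.0
    alarms = []
    for i, r in enumerate(residuals):
        # Log-likelihood-ratio increment for a Gaussian mean shift.
        llr += (shift / sigma ** 2) * (r - shift / 2.0)
        if llr >= upper:
            alarms.append(i)
            llr = 0.0                        # restart the test after an alarm
        elif llr <= lower:
            llr = 0.0                        # restart after accepting H0
    return alarms
```

Each alarm index could then drive the notification step; a production system would also need the adaptive handling of non-Gaussian residuals that the abstract mentions.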
Abstract:
The disclosed embodiments provide a system that proactively resilvers a disk array when a disk drive in the array is determined to have an elevated risk of failure. The system receives time-series signals associated with the disk array during operation of the disk array. Next, the system analyzes the time-series signals to identify at-risk disk drives that have an elevated risk of failure. If one or more disk drives are identified as being at-risk, the system performs a proactive resilvering operation on the disk array using a background process while the disk array continues to operate using the at-risk disk drives.
Abstract:
The disclosed embodiments provide a system that intelligently migrates workload between servers in a data center to improve efficiency in associated power supplies. During operation, the system receives time-series signals associated with the servers during operation of the data center, wherein the servers include low-priority servers and high-priority servers. Next, the system analyzes the time-series signals to predict a load utilization for the servers. The system then migrates workload between the servers in the data center based on the predicted load utilization so that: the high-priority servers have sufficient workload to ensure that associated power supplies for the high-priority servers operate in a peak-efficiency range; and the low-priority servers operate with less workload or no workload.
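One way the migration step might look is sketched below. This is only an illustration under stated assumptions: utilization is expressed as an integer percentage, the peak-efficiency range is reduced to a single hypothetical `target` value, and a simple greedy policy drains the least-loaded low-priority servers first. The function name `migrate_workload` and all parameters are hypothetical.

```python
def migrate_workload(predicted, high_priority, target=60):
    """Greedy sketch: shift load from low- to high-priority servers.

    predicted     -- dict mapping server name to predicted utilization (%)
    high_priority -- set of server names treated as high priority
    target        -- assumed utilization (%) inside the peak-efficiency range
    Returns a new dict with the post-migration utilization per server.
    """
    load = dict(predicted)
    # Drain the least-loaded low-priority servers first so they can idle.
    donors = sorted((s for s in load if s not in high_priority),
                    key=lambda s: load[s])
    for hp in sorted(high_priority):
        for donor in donors:
            need = target - load[hp]
            if need <= 0:
                break                      # this server is already in range
            moved = min(need, load[donor])
            load[hp] += moved
            load[donor] -= moved
    return load
```

The sketch conserves total load and never pushes a high-priority server above `target`; real capacity limits and migration costs are left out.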
Abstract:
After sensors are placed at three or more non-collinear locations on a surface of the component, the system receives time-series signals from the sensors while the component operates on a representative workload. The system then defines one or more triangles on the surface of the component, wherein each triangle is defined by three vertices, which coincide with different sensor locations on the surface of the component. For each triangle, the system applies a barycentric coordinate technique (BCT) to time-series signals received from sensors located at the vertices of the triangle to determine a candidate location within the triangle to place an additional sensor. The system then compares the candidate locations for each of the one or more triangles to determine a globally optimal location for the additional sensor, and a new sensor is placed at this location. This process is repeated until the desired number of sensors has been placed.
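The per-triangle step can be pictured with a small sketch of barycentric coordinates. The abstract does not say how the signals are turned into weights, so the weighting below (each vertex weighted by the mean of its signal, which must be positive) is purely an assumption for illustration, as is the function name `barycentric_candidate`.

```python
from statistics import mean


def barycentric_candidate(vertices, signals):
    """Return a candidate point inside a triangle and its barycentric weights.

    vertices -- three (x, y) sensor locations
    signals  -- three time series, one per vertex
    Assumption: each weight is the mean of the vertex signal, normalized so
    the weights sum to 1. Positive means keep the point inside the triangle.
    """
    raw = [mean(s) for s in signals]
    total = sum(raw)
    weights = [w / total for w in raw]
    x = sum(w * v[0] for w, v in zip(weights, vertices))
    y = sum(w * v[1] for w, v in zip(weights, vertices))
    return (x, y), weights
```

Under this assumed weighting, the candidate is pulled toward the vertex with the largest mean reading; comparing candidates across triangles would need a separate scoring rule that the abstract does not specify.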
Abstract:
During a surveillance mode, the system receives present time-series signals gathered from sensors in the power transformer. Next, the system uses an inferential model to generate estimated values for the present time-series signals, and performs a pairwise differencing operation between actual values and the estimated values for the present time-series signals to produce residuals. The system then performs a sequential probability ratio test on the residuals to produce alarms having associated tripping frequencies (TFs). Next, the system uses a logistic-regression model to compute a risk index for the power transformer based on the TFs. If the risk index exceeds a threshold, the system generates a notification that the power transformer needs to be replaced. The system also periodically updates the logistic-regression model based on the results of periodic dissolved gas analyses for the transformer to more accurately compute the risk index for the power transformer.
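The risk-index computation can be sketched as a standard logistic function applied to a weighted sum of the TFs. The coefficients, intercept, and threshold below are placeholders; in the described system they would come from fitting the model against dissolved gas analysis results. The names `risk_index` and `needs_replacement` are hypothetical.

```python
import math


def risk_index(tripping_freqs, coeffs, intercept):
    """Logistic-regression risk index in (0, 1) from SPRT tripping frequencies."""
    z = intercept + sum(c * tf for c, tf in zip(coeffs, tripping_freqs))
    return 1.0 / (1.0 + math.exp(-z))


def needs_replacement(tripping_freqs, coeffs, intercept, threshold=0.9):
    """True when the risk index exceeds the (assumed) notification threshold."""
    return risk_index(tripping_freqs, coeffs, intercept) > threshold
```

Refitting `coeffs` and `intercept` whenever a new dissolved gas analysis arrives would correspond to the periodic model update described above.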
Abstract:
The disclosed embodiments relate to a system that certifies provenance of time-series data in a time-series database. During operation, the system retrieves time-series data from the time-series database, wherein the time-series data comprises a sequence of observations comprising sensor readings for each signal in a set of signals. The system also retrieves multivariate state estimation technique (MSET) estimates, which were computed for the time-series data, from the time-series database. Next, the system performs a reverse MSET computation to produce reconstituted time-series data from the MSET estimates. The system then compares the reconstituted time-series data with the time-series data. If the reconstituted time-series data matches the original time-series data, the system certifies provenance for the time-series data.
Abstract:
The disclosed embodiments relate to a system that detects degradation in one or more rotating components in a monitored system. During operation, the system receives one or more telemetry signals comprising vibration sensor readings from one or more vibration sensors in the monitored system. The system then performs a fast Fourier transform (FFT) on the vibration sensor readings to produce a power spectral density (PSD) distribution. Next, the system identifies a peak in the PSD distribution, wherein the peak is associated with a target rotating component in the monitored system. After identifying the peak, the system computes a full width at half maximum (FWHM) value for a curve associated with the peak. Finally, if the FWHM value exceeds a pre-specified threshold, the system generates a notification about degradation of the target rotating component in the monitored system.
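The FFT, peak-identification, and FWHM steps can be sketched as follows. This is a minimal illustration, assuming a power-of-two sample count, a one-sided PSD scaled by the sample count, a known band of frequency bins for the target component, and linear interpolation at the half-maximum crossings. The function names and the `bin_width` parameter are hypothetical.

```python
import cmath


def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even, odd = fft(x[0::2]), fft(x[1::2])
    tw = [cmath.exp(-2j * cmath.pi * k / n) * odd[k] for k in range(n // 2)]
    return ([even[k] + tw[k] for k in range(n // 2)] +
            [even[k] - tw[k] for k in range(n // 2)])


def psd(x):
    """One-sided power spectrum (assumed scaling: |X|^2 / N)."""
    n = len(x)
    return [abs(c) ** 2 / n for c in fft(x)[: n // 2]]


def find_peak(spectrum, lo_bin, hi_bin):
    """Index of the largest bin within the band assumed for the component."""
    return max(range(lo_bin, hi_bin + 1), key=lambda k: spectrum[k])


def fwhm(spectrum, peak, bin_width=1.0):
    """Full width at half maximum around `peak`, in units of bin_width."""
    half = spectrum[peak] / 2.0
    i = peak
    while i > 0 and spectrum[i] >= half:
        i -= 1
    j = peak
    while j < len(spectrum) - 1 and spectrum[j] >= half:
        j += 1
    # Interpolate linearly between the bins that straddle the half maximum.
    left = (i + (half - spectrum[i]) / (spectrum[i + 1] - spectrum[i])
            if spectrum[i] < half else float(i))
    right = (j - (half - spectrum[j]) / (spectrum[j - 1] - spectrum[j])
             if spectrum[j] < half else float(j))
    return (right - left) * bin_width
```

A degradation check would then compare `fwhm(...)` against the pre-specified threshold; a broadening peak yields a larger value.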
Abstract:
The disclosed embodiments provide a system that analyzes telemetry data from a computer system. During operation, the system obtains the telemetry data, which includes first information containing telemetric signals gathered using sensors in the computer system and second information that indicates one or more transaction latencies of software running on the computer system. Upon detecting an upward trend in the one or more transaction latencies, the system analyzes the telemetry data for a correlation between the one or more transaction latencies and one or more environmental factors represented by a subset of the telemetric signals. Upon identifying the correlation between the one or more transaction latencies and an environmental factor, the system stores an indication that the environmental factor may be contributing to the upward trend in the one or more transaction latencies.
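The trend-then-correlate flow can be sketched as below. The abstract does not name a trend test or a correlation measure, so this sketch assumes a least-squares slope for the upward trend and a Pearson coefficient with a hypothetical cutoff `min_r`; all function and parameter names are illustrative only.

```python
import math
from statistics import mean


def slope(y):
    """Least-squares slope of y against its sample index."""
    x_bar, y_bar = (len(y) - 1) / 2.0, mean(y)
    num = sum((i - x_bar) * (v - y_bar) for i, v in enumerate(y))
    den = sum((i - x_bar) ** 2 for i in range(len(y)))
    return num / den


def pearson(x, y):
    """Pearson correlation coefficient; 0.0 if either series is constant."""
    x_bar, y_bar = mean(x), mean(y)
    cov = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    var = (sum((a - x_bar) ** 2 for a in x) *
           sum((b - y_bar) ** 2 for b in y))
    return cov / math.sqrt(var) if var > 0 else 0.0


def contributing_factors(latency, env_signals, min_r=0.8):
    """Names of environmental signals correlated with a rising latency.

    env_signals maps a signal name to a series aligned with `latency`.
    Returns an empty list when the latency shows no upward trend.
    """
    if slope(latency) <= 0:
        return []
    return [name for name, series in env_signals.items()
            if abs(pearson(latency, series)) >= min_r]
```

The returned names correspond to the stored indication that an environmental factor may be contributing; correlation alone does not establish cause, which matches the hedged wording of the abstract.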