Abstract:
The disclosed embodiments provide a system that analyzes telemetry data from a computer system. During operation, the system obtains the telemetry data, which includes first information containing telemetric signals gathered using sensors in the computer system and second information that indicates one or more transaction latencies of software running on the computer system. Upon detecting an upward trend in the one or more transaction latencies, the system analyzes the telemetry data for a correlation between the one or more transaction latencies and one or more environmental factors represented by a subset of the telemetric signals. Upon identifying the correlation between the one or more transaction latencies and an environmental factor, the system stores an indication that the environmental factor may be contributing to the upward trend in the one or more transaction latencies.
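The trend-then-correlate flow above can be sketched roughly as follows. This is a minimal illustration, not the disclosed implementation: the least-squares slope test, the Pearson coefficient, the thresholds `slope_min` and `corr_min`, and the signal names are all assumptions introduced here.

```python
from math import sqrt

def slope(values):
    """Least-squares slope of the values against their sample index."""
    n = len(values)
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    num = sum((i - mean_x) * (v - mean_y) for i, v in enumerate(values))
    den = sum((i - mean_x) ** 2 for i in range(n))
    return num / den

def pearson(xs, ys):
    """Pearson correlation coefficient; 0.0 if either series is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        return 0.0
    return cov / (sx * sy)

def suspect_factors(latencies, env_signals, slope_min=0.0, corr_min=0.8):
    """Return names of environmental signals correlated with rising latency.

    Hypothetical rule: only look for correlations once the latency slope
    exceeds slope_min, then flag signals with |r| >= corr_min.
    """
    if slope(latencies) <= slope_min:
        return []
    return [name for name, signal in env_signals.items()
            if abs(pearson(latencies, signal)) >= corr_min]

latencies = [10, 11, 12, 14, 15, 17]
env_signals = {
    "inlet_temp": [20, 21, 22, 24, 25, 27],   # tracks latency
    "fan_rpm": [3000] * 6,                    # constant, uncorrelated
}
flagged = suspect_factors(latencies, env_signals)
```

A flagged name is only an indication that the factor may be contributing, which matches the hedged wording of the abstract; correlation alone does not establish cause.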
Abstract:
The disclosed embodiments provide a system that detects anomalous events. During operation, the system obtains machine-generated time-series performance data collected during execution of a software program in a computer system. Next, the system removes a subset of the machine-generated time-series performance data within an interval around one or more known anomalous events of the software program to generate filtered time-series performance data. The system uses the filtered time-series performance data to build a statistical model of normal behavior in the software program and obtains a number of unique patterns learned by the statistical model. When the number of unique patterns satisfies a complexity threshold, the system applies the statistical model to subsequent machine-generated time-series performance data from the software program to identify an anomaly in an activity of the software program and stores an indication of the anomaly for the software program upon identifying the anomaly.
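A toy sketch of the filter, train, and gate sequence described above. The "statistical model" here is a deliberately simple stand-in (the set of value bins seen in training), and `half_width`, `bin_width`, and `min_patterns` are hypothetical parameters; the abstract does not say which model or threshold direction is used, so "at least `min_patterns` unique patterns" is an assumption.

```python
def filter_known_anomalies(samples, anomaly_times, half_width):
    """Drop (time, value) samples within half_width of any known anomaly."""
    return [(t, v) for t, v in samples
            if all(abs(t - a) > half_width for a in anomaly_times)]

def build_model(filtered, bin_width):
    """Stand-in model: the set of value bins observed during normal behavior."""
    return {int(v // bin_width) for _, v in filtered}

def detect(model, samples, bin_width, min_patterns):
    """Apply the model only if it learned enough unique patterns.

    Returns None when the model is below the complexity threshold,
    otherwise the times of samples whose bin was never seen in training.
    """
    if len(model) < min_patterns:
        return None
    return [t for t, v in samples if int(v // bin_width) not in model]

history = [(0, 1.0), (1, 1.2), (2, 9.0), (3, 2.1), (4, 3.3)]
filtered = filter_known_anomalies(history, anomaly_times=[2], half_width=0)
model = build_model(filtered, bin_width=1.0)
anomalies = detect(model, [(5, 2.5), (6, 8.0)], bin_width=1.0, min_patterns=3)
```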
Abstract:
We disclose a system that executes an inferential model in VRAM that is embedded in a set of graphics-processing units (GPUs). The system obtains execution parameters for the inferential model specifying: a number of signals, a number of training vectors, a number of observations and a desired data precision. It also obtains one or more formulae for computing memory usage for the inferential model based on the execution parameters. Next, the system uses the one or more formulae and the execution parameters to compute an estimated memory footprint for the inferential model. The system uses the estimated memory footprint to determine a required number of GPUs to execute the inferential model, and generates code for executing the inferential model in parallel while efficiently using available memory in the required number of GPUs. Finally, the system uses the generated code to execute the inferential model in the set of GPUs.
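The footprint-to-GPU-count step can be illustrated with a short sketch. The formula below is hypothetical (the abstract does not give the actual formulae); it simply assumes three dense arrays whose sizes depend on the execution parameters. The function names and the per-GPU VRAM figure are likewise assumptions.

```python
def estimate_footprint_bytes(n_signals, n_train, n_obs, bytes_per_value):
    """Hypothetical memory formula, for illustration only.

    Assumes a training matrix (n_signals x n_train), a similarity matrix
    (n_train x n_train), and an observation matrix (n_signals x n_obs).
    """
    values = n_signals * n_train + n_train * n_train + n_signals * n_obs
    return values * bytes_per_value

def gpus_required(footprint_bytes, vram_bytes_per_gpu):
    """Smallest number of GPUs whose combined VRAM covers the footprint."""
    return -(-footprint_bytes // vram_bytes_per_gpu)  # integer ceiling

# 4 bytes per value corresponds to single precision.
footprint = estimate_footprint_bytes(
    n_signals=100, n_train=1000, n_obs=10000, bytes_per_value=4)
n_gpus = gpus_required(footprint, vram_bytes_per_gpu=4_000_000)
```

In practice the usable VRAM per GPU is less than the nominal capacity, so a real estimate would likely subtract some headroom before dividing.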
Abstract:
Systems and methods are described that estimate a remaining useful life (RUL) of an electronic device. A set of time-series signals gathered from sensors in the electronic device is received. Statistical changes that are deemed anomalous signal patterns are detected in the set of time-series signals. A set of anomaly alarms is generated, wherein an anomaly alarm is generated for each of the anomalous signal patterns. An irrelevance filter is applied to the set of anomaly alarms to produce filtered anomaly alarms, wherein the irrelevance filter removes anomaly alarms associated with anomalous signal patterns that are not correlated with previous failures of similar electronic devices. A logistic-regression model is used to compute an RUL-based risk index for the electronic device based on the filtered anomaly alarms. When the risk index exceeds a risk-index threshold, a notification is generated indicating that the electronic device has a limited remaining useful life.
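A minimal sketch of the irrelevance filter followed by a logistic risk index. The alarm labels, the use of the filtered alarm count as the single regression feature, and the coefficients `intercept` and `weight` are all assumptions made for illustration; a fitted model would learn its coefficients from failure histories.

```python
import math

def irrelevance_filter(alarms, failure_patterns):
    """Keep only alarms whose pattern was seen before earlier failures."""
    return [a for a in alarms if a in failure_patterns]

def risk_index(filtered_alarms, intercept, weight):
    """Logistic function of the filtered alarm count (hypothetical feature)."""
    z = intercept + weight * len(filtered_alarms)
    return 1.0 / (1.0 + math.exp(-z))

alarms = ["fan_drift", "voltage_sag", "clock_jitter"]
failure_patterns = {"fan_drift", "voltage_sag"}   # assumed failure history

filtered = irrelevance_filter(alarms, failure_patterns)
risk = risk_index(filtered, intercept=-2.0, weight=1.0)
notify = risk > 0.4   # hypothetical risk-index threshold
```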
Abstract:
Systems, methods, and other embodiments associated with automated calibration in electromagnetic scanners are described. In one embodiment, a method includes: detecting one or more peak frequency bands in electromagnetic signals collected by the electromagnetic scanner at a geographic location; comparing the one or more peak frequency bands to broadcast frequencies assigned to local radio stations of the geographic location; and indicating that the electromagnetic scanner is calibrated by finding at least one match between one peak frequency band of the peak frequency bands and one of the broadcast frequencies. An electromagnetic scanner may be recalibrated based on comparing the one or more peak frequency bands to broadcast frequencies.
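The matching step can be sketched as a tolerance comparison between detected peaks and the local station list. The frequencies, the `tolerance_mhz` parameter, and the function name are hypothetical; peak detection itself is assumed to have happened upstream.

```python
def is_calibrated(peak_bands_mhz, station_freqs_mhz, tolerance_mhz=0.1):
    """True if at least one detected peak matches a local broadcast frequency."""
    return any(abs(peak - station) <= tolerance_mhz
               for peak in peak_bands_mhz
               for station in station_freqs_mhz)

local_stations = [94.9, 101.1]   # assumed assignments for the location
calibrated = is_calibrated([88.5, 101.1], local_stations)
uncalibrated = is_calibrated([90.0], local_stations)
```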
Abstract:
Techniques are disclosed for using sensor data validated by a machine learning model to generate recommendations for remediating issues in a monitored system. A machine learning model is trained to identify correlations among sensors for a monitored system. Upon receiving current sensor data, the machine learning model identifies a subset of the current sensor data that cannot be validated. The system generates estimated values for the sensor data that cannot be validated based on the learned correlations among the sensor values. The system generates the recommendations for remediating the issues in the monitored system based on the validated sensor values and the estimated sensor values.
Abstract:
Systems, methods, and other embodiments associated with unified control of cooling in computers are described. In one embodiment, a method locks operation of first and second cooling mechanisms configured to cool one or more components in the computer. In response to a first condition, the method unlocks the operation of the first cooling mechanism to allow the first cooling mechanism to make cooling adjustments while the operation of the second cooling mechanism is locked. In response to a second condition, the method unlocks the operation of the second cooling mechanism to allow the second cooling mechanism to make cooling adjustments while the operation of the first cooling mechanism is locked. In the method, the first cooling mechanism and the second cooling mechanism are prevented from making the cooling adjustments simultaneously.
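The mutual-exclusion behavior above can be sketched as a small arbiter that tracks which mechanism, if any, is currently unlocked. The class name, the mechanism labels, and the way conditions are signaled are assumptions for illustration.

```python
class CoolingArbiter:
    """Allows at most one of two cooling mechanisms to adjust at a time."""

    def __init__(self):
        self.unlocked = None   # both mechanisms start locked

    def on_condition(self, condition):
        """Unlock the mechanism tied to the condition; the other stays locked."""
        if condition in ("first", "second"):
            self.unlocked = condition

    def may_adjust(self, mechanism):
        return self.unlocked == mechanism

arbiter = CoolingArbiter()
initial = (arbiter.may_adjust("first"), arbiter.may_adjust("second"))
arbiter.on_condition("first")
after_first = (arbiter.may_adjust("first"), arbiter.may_adjust("second"))
arbiter.on_condition("second")
after_second = (arbiter.may_adjust("first"), arbiter.may_adjust("second"))
```

Because a single field records the unlocked mechanism, the two can never be unlocked at the same time, which is the property the final sentence of the abstract requires.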
Abstract:
The disclosed system produces synthetic signals for testing machine-learning systems. During operation, the system generates a set of N composite sinusoidal signals, wherein each of the N composite sinusoidal signals is a combination of multiple constituent sinusoidal signals with different periodicities. Next, the system adds time-varying random noise values to each of the N composite sinusoidal signals, wherein a standard deviation of the time-varying random noise values varies over successive time periods. The system also multiplies each of the N composite sinusoidal signals by time-varying amplitude values, wherein the time-varying amplitude values vary over successive time periods. Finally, the system adds time-varying mean values to each of the N composite sinusoidal signals, wherein the time-varying mean values vary over successive time periods. The time-varying random noise values, amplitude values and mean values can be selected through a roll-of-the-die process from a library of values, which are learned from industry-specific signals.
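A rough sketch of the generation steps, applied in the order the abstract lists them: build the composite sinusoid, add noise, scale by an amplitude, then add a mean. The three-component composite, the cycle-length range, the segment layout, and the tiny value libraries are assumptions; a seeded `random.Random.choice` stands in for the roll-of-the-die selection.

```python
import math
import random

def synth_signals(n_signals, n_segments, segment_len,
                  noise_lib, amp_lib, mean_lib, seed=0):
    """Generate n_signals synthetic signals of n_segments * segment_len samples."""
    rng = random.Random(seed)
    signals = []
    for _ in range(n_signals):
        # Constituent sinusoids with different periodicities (in samples).
        cycle_lengths = [rng.uniform(10, 200) for _ in range(3)]
        signal = []
        for seg in range(n_segments):
            # New noise level, amplitude, and mean for each time period.
            sigma = rng.choice(noise_lib)
            amp = rng.choice(amp_lib)
            mean = rng.choice(mean_lib)
            for k in range(segment_len):
                t = seg * segment_len + k
                composite = sum(math.sin(2 * math.pi * t / c)
                                for c in cycle_lengths)
                noisy = composite + rng.gauss(0.0, sigma)
                signal.append(noisy * amp + mean)
        signals.append(signal)
    return signals

signals = synth_signals(n_signals=4, n_segments=3, segment_len=50,
                        noise_lib=[0.1, 0.5], amp_lib=[1.0, 2.0],
                        mean_lib=[0.0, 10.0], seed=42)
```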
Abstract:
During operation, the system obtains time-series sensor signals, which were gathered from sensors in a monitored system. Next, the system classifies the time-series sensor signals into stair-stepped signals and un-stair-stepped signals. The system then replaces stair-stepped values in the stair-stepped signals with interpolated values determined from un-stair-stepped values in the stair-stepped signals. Next, the system divides the time-series sensor signals into a training set and an estimation set. The system then trains an inferential model on the training set, and uses the trained inferential model to replace interpolated values in the estimation set with inferential estimates. Next, the system switches roles of the training and estimation sets to produce a new training set and a new estimation set. The system then trains the inferential model on the new training set, and uses the trained inferential model to replace interpolated values in the new estimation set with inferential estimates.
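Only the interpolation step is sketched here; the inferential model and the role-switching of the training and estimation sets are not shown. The sketch assumes that a stair-stepped value is a sample that repeats its predecessor and that linear interpolation between the surrounding fresh samples is acceptable; neither detail is stated in the abstract.

```python
def interpolate_stairs(values):
    """Replace repeated (stair-stepped) samples by linear interpolation.

    A sample is treated as fresh when it differs from its predecessor;
    repeated samples between two fresh ones are interpolated. Repeats
    after the last fresh sample are left unchanged.
    """
    fresh = [0] + [i for i in range(1, len(values))
                   if values[i] != values[i - 1]]
    out = list(values)
    for a, b in zip(fresh, fresh[1:]):
        for i in range(a + 1, b):
            frac = (i - a) / (b - a)
            out[i] = values[a] + frac * (values[b] - values[a])
    return out

smoothed = interpolate_stairs([0, 0, 0, 3, 3, 6])
```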
Abstract:
The disclosed embodiments provide a system that detects unwanted electronic components in a target computing system. During operation, the system obtains target electromagnetic interference (EMI) signals, which were gathered by monitoring EMI signals generated by the target computing system, using an insertable device, wherein when the insertable device is inserted into the target computing system, the insertable device gathers the target EMI signals from the target computing system. Next, the system generates a target EMI fingerprint from the target EMI signals. Finally, the system compares the target EMI fingerprint against a reference EMI fingerprint for the target computing system to determine whether the target computing system contains any unwanted electronic components.
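The final comparison can be illustrated with a deliberately simple sketch. It assumes a fingerprint is a list of amplitudes, one per frequency bin, and that any bin deviating from the reference by more than a hypothetical `threshold` signals a possible unwanted component; the abstract does not specify the fingerprint format or the comparison rule.

```python
def has_unwanted_components(target_fp, reference_fp, threshold):
    """True if any frequency bin deviates from the reference beyond threshold."""
    if len(target_fp) != len(reference_fp):
        raise ValueError("fingerprints must cover the same frequency bins")
    return any(abs(t - r) > threshold
               for t, r in zip(target_fp, reference_fp))

reference = [1.0, 2.0, 3.0]   # assumed reference fingerprint
clean = has_unwanted_components([1.0, 2.1, 3.0], reference, threshold=0.5)
suspect = has_unwanted_components([1.0, 2.0, 4.0], reference, threshold=0.5)
```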