Abstract:
A system that balances thermal variations within a set of computer systems in a datacenter. During operation, the system obtains a thermal flux map for the set of computer systems. The system then analyzes the thermal flux map to determine whether imbalances exist in the thermal flux across the set of computer systems. If so, the system can adjust: (1) the scheduling of loads across the set of computer systems, and/or (2) air conditioning within the datacenter, so that the thermal flux is more balanced across the set of computer systems.
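The imbalance check and load adjustment described in this abstract can be sketched as follows. This is a minimal illustration, not the claimed method: the z-score threshold, the rack names, the flux values, and the function names find_thermal_imbalances and suggest_load_moves are all assumptions introduced here.

```python
from statistics import mean, pstdev


def find_thermal_imbalances(flux_map, threshold=1.5):
    """Return {location: z_score} for locations whose thermal flux deviates
    from the mean by more than `threshold` standard deviations.

    The z-score criterion is an assumption; the abstract does not say how
    imbalances are detected.
    """
    values = list(flux_map.values())
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return {}  # perfectly uniform flux, nothing to balance
    scores = {loc: (v - mu) / sigma for loc, v in flux_map.items()}
    return {loc: z for loc, z in scores.items() if abs(z) > threshold}


def suggest_load_moves(flux_map, threshold=1.5):
    """Pair each hot location with one of the coolest remaining locations.

    Returns a list of (source, destination) tuples, hottest source first.
    """
    imbalances = find_thermal_imbalances(flux_map, threshold)
    hot = [loc for loc, z in imbalances.items() if z > 0]
    hot.sort(key=lambda loc: flux_map[loc], reverse=True)
    cool = sorted(
        (loc for loc in flux_map if loc not in hot),
        key=lambda loc: flux_map[loc],
    )
    return list(zip(hot, cool))


# Hypothetical thermal flux values (arbitrary units) per rack.
flux_map = {
    "rack-A": 100.0,
    "rack-B": 105.0,
    "rack-C": 98.0,
    "rack-D": 180.0,
    "rack-E": 102.0,
}
moves = suggest_load_moves(flux_map)
```

A real deployment would also have to weigh the second lever named in the abstract, adjusting air conditioning, which this sketch does not model.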
Abstract:
A system that monitors electromagnetic interference (EMI) signals to facilitate proactive fault monitoring in a computer system is presented. During operation, the system receives EMI signals from one or more antennas located in close proximity to the computer system. The system then analyzes the received signals to proactively detect anomalies during operation of the computer system.
Abstract:
A system that generates a dynamic power-flux map for a set of computer systems. During operation the system determines the locations of the computer systems. Next, the system receives dynamic traces of power consumption for the computer systems, wherein a dynamic trace of power consumption for a given computer system is generated based on dynamic traces of monitored inferential variables for the given computer system. The system then correlates the locations of the computer systems with the dynamic traces of power consumption for the computer systems, and generates the dynamic power-flux map for the set of computer systems based on the correlated locations and the dynamic traces for the computer systems.
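The correlation step in this abstract, joining system locations with power traces to form a map per time step, might look like the sketch below. The data layout (grid cells as (x, y) tuples, traces as lists of (timestamp, watts) pairs) and the name build_power_flux_map are assumptions made for illustration. The inference of power from other monitored variables is not modelled here; the traces are taken as given.

```python
def build_power_flux_map(locations, power_traces):
    """Combine system locations with power traces into a time-indexed map.

    locations:     {system_id: (x, y)} grid cell of each system (assumed layout)
    power_traces:  {system_id: [(timestamp, watts), ...]}
    returns:       {timestamp: {(x, y): total_watts_in_cell}}
    """
    flux = {}
    for system_id, trace in power_traces.items():
        if system_id not in locations:
            continue  # a system with unknown location cannot be placed on the map
        cell = locations[system_id]
        for timestamp, watts in trace:
            frame = flux.setdefault(timestamp, {})
            frame[cell] = frame.get(cell, 0.0) + watts
    return flux


# Hypothetical example: two systems share one cell, a third sits next to them.
locations = {"s1": (0, 0), "s2": (0, 0), "s3": (1, 0)}
power_traces = {
    "s1": [(0, 100.0), (1, 120.0)],
    "s2": [(0, 50.0), (1, 60.0)],
    "s3": [(0, 200.0), (1, 210.0)],
}
power_flux = build_power_flux_map(locations, power_traces)
```

Keying the result by timestamp keeps the map dynamic: each frame is a snapshot that can be rendered or compared with the previous one.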
Abstract:
Some embodiments of the present invention provide a system that characterizes a computer system using a pattern-recognition model. First, values for an environmental parameter are monitored from a set of sensors associated with the computer system. Then, a baseline for the environmental parameter is calculated based on the monitored values from a subset of the set of sensors. Next, the baseline is subtracted from the monitored values from sensors in the set of sensors to produce compensated values. Then, the compensated values are used as inputs to the pattern-recognition model, which produces estimates for the compensated values based on correlations between the compensated values learned during a training phase. Next, residuals are calculated by subtracting the estimates for the compensated values from the compensated values. Then, the residuals are analyzed to characterize the computer system.
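The pipeline in this abstract (baseline, compensation, estimation, residuals) can be sketched in a few lines. The pattern-recognition model is replaced here by a deliberately simple stand-in, MeanOfOthersModel, which estimates each sensor from the mean of the others plus a learned offset; it is a hypothetical placeholder, not the model the abstract refers to. Sensor names and temperatures are likewise invented for the example.

```python
from statistics import mean


def compensate(sample, baseline_sensors):
    """Subtract the baseline (mean of a subset of sensors) from every sensor."""
    baseline = mean(sample[s] for s in baseline_sensors)
    return {s: v - baseline for s, v in sample.items()}


class MeanOfOthersModel:
    """Toy stand-in for a pattern-recognition model (an assumption).

    Each sensor is estimated as the mean of the other sensors plus an
    offset learned from training data.
    """

    def fit(self, training):
        sensors = list(training[0])
        self.offsets = {
            s: mean(
                row[s] - mean(row[o] for o in sensors if o != s)
                for row in training
            )
            for s in sensors
        }
        return self

    def estimate(self, sample):
        return {
            s: self.offsets[s] + mean(v for o, v in sample.items() if o != s)
            for s in sample
        }


def residuals(sample, model):
    """Residual = compensated value minus the model's estimate of it."""
    estimates = model.estimate(sample)
    return {s: sample[s] - estimates[s] for s in sample}


# Hypothetical temperatures (deg C); "inlet" serves as the ambient baseline.
training_raw = [
    {"inlet": 20.0, "cpu": 50.0, "dimm": 40.0},
    {"inlet": 25.0, "cpu": 55.0, "dimm": 45.0},
]
model = MeanOfOthersModel().fit([compensate(r, ["inlet"]) for r in training_raw])

healthy = residuals(compensate({"inlet": 30.0, "cpu": 60.0, "dimm": 50.0}, ["inlet"]), model)
faulty = residuals(compensate({"inlet": 30.0, "cpu": 75.0, "dimm": 50.0}, ["inlet"]), model)
```

Because the baseline is removed first, a warmer room (inlet at 30 instead of 20) leaves the residuals near zero, while the overheating CPU in the second sample produces a clearly nonzero residual.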
Abstract:
A system that mitigates quantization effects in quantized telemetry signals. During operation, the system monitors a set of quantized telemetry signals. For a given quantized telemetry signal in the set of quantized telemetry signals, the system uses a set of models to generate a set of estimates for the given quantized telemetry signal from the other monitored quantized telemetry signals, wherein each model in the set of models was initialized using a different randomly selected subset of a training dataset. The system then averages the set of estimates to produce an estimated signal for the given quantized telemetry signal.
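The ensemble averaging described in this abstract resembles bootstrap aggregation and can be illustrated as below. The sketch makes several assumptions: each model is a one-variable least-squares line, the target is estimated from a single other signal rather than from all of them, and the names fit_line, train_ensemble, and estimate are invented for this example.

```python
import random


def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:
        return 0.0, my  # constant predictor: fall back to the mean
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx


def train_ensemble(xs, ys, n_models=20, subset_fraction=0.5, seed=0):
    """Fit each model on a different randomly selected subset of the training data."""
    rng = random.Random(seed)
    k = max(2, int(len(xs) * subset_fraction))
    models = []
    for _ in range(n_models):
        idx = rng.sample(range(len(xs)), k)
        models.append(fit_line([xs[i] for i in idx], [ys[i] for i in idx]))
    return models


def estimate(models, x):
    """Average the estimates of all models in the ensemble."""
    return sum(slope * x + intercept for slope, intercept in models) / len(models)


# Hypothetical data: the target follows 0.5 * x + 20 but is reported only
# to the nearest integer, so the raw telemetry shows staircase steps.
xs = list(range(100))
truth = [0.5 * x + 20 for x in xs]
quantized = [round(v) for v in truth]

models = train_ensemble(xs, quantized)
smoothed = estimate(models, 41)  # true value at x = 41 is 40.5
```

The averaged estimate can land between quantization levels, which is the point of the technique: the ensemble recovers resolution that the individual quantized readings lack.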
Abstract:
A computer system to predict a value of a signal from a sensor and schedule loads across a set of processor cores is described. During operation, the computer system generates N models to predict the value of the signal based on a set of quantized telemetry signals, where a given model produces a value of the signal using a subset of the set of quantized telemetry signals, and where the subset is selected from the set of quantized telemetry signals based on an objective criterion. Next, the computer system predicts the value of the signal by aggregating the values produced by the N models.
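One way to picture the N-model scheme is sketched below. The abstract does not name the objective criterion or the aggregation rule, so this example assumes both: signals are ranked by absolute Pearson correlation with the target, each of the N models uses a subset of size one (a single top-ranked signal), and the predictions are aggregated by a plain mean. All function and signal names are hypothetical.

```python
def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 for a constant series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / (sxx * syy) ** 0.5


def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:
        return 0.0, my
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx


def build_models(signals, target, n_models):
    """Select the n_models signals most correlated with the target (the
    assumed objective criterion) and fit one model per selected signal."""
    ranked = sorted(
        signals,
        key=lambda name: abs(pearson(signals[name], target)),
        reverse=True,
    )
    return {name: fit_line(signals[name], target) for name in ranked[:n_models]}


def predict(models, sample):
    """Aggregate the per-model predictions with a plain mean (an assumption)."""
    preds = [slope * sample[name] + intercept
             for name, (slope, intercept) in models.items()]
    return sum(preds) / len(preds)


# Hypothetical telemetry history and target sensor readings.
signals = {
    "fan_rpm": [1000, 2000, 3000, 4000],
    "cpu_temp": [40, 50, 60, 70],
    "noise": [3, 1, 4, 1],
}
target = [30, 40, 50, 60]

models = build_models(signals, target, n_models=2)
predicted = predict(models, {"fan_rpm": 2500, "cpu_temp": 55, "noise": 9})
```

The weakly correlated "noise" signal is excluded by the ranking, so its value in the new sample has no effect on the prediction.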
Abstract:
Embodiments of the present invention provide a system that optimizes a regression model which predicts a signal as a function of a set of available signals. These embodiments use a genetic technique to optimize the regression model. The technique generates a child regression model from a portion of the sample signals used to generate each parent in a pair of best-fit parent regression models. In addition, in embodiments of the present invention, the system introduces “mutations” to the set of sample signals used to create the child regression model in an attempt to create more robust child regression models during the optimization process.
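The crossover and mutation steps can be sketched as operations on sets of signal names. This is an illustrative skeleton under stated assumptions: the regression fit itself is abstracted into a caller-supplied fitness function (lower is better), the child takes about half of each parent's signals, a mutation swaps one signal for an unused one, and the child replaces the worst member of the population. None of these details come from the abstract, and the names crossover, mutate, and evolve are hypothetical.

```python
import random


def crossover(parent_a, parent_b, rng):
    """Build a child from roughly half of each parent's signals."""
    part_a = rng.sample(sorted(parent_a), max(1, len(parent_a) // 2))
    part_b = rng.sample(sorted(parent_b), max(1, len(parent_b) // 2))
    return set(part_a) | set(part_b)


def mutate(child, all_signals, rng, rate=0.1):
    """With probability `rate`, swap one signal for one the child lacks."""
    child = set(child)
    unused = sorted(set(all_signals) - child)
    if unused and rng.random() < rate:
        child.discard(rng.choice(sorted(child)))
        child.add(rng.choice(unused))
    return child


def evolve(population, all_signals, fitness, generations=20, seed=0):
    """Repeatedly breed the two best-fit parents and replace the worst member.

    `fitness` maps a set of signal names to an error score (lower is better);
    in practice it would fit and score a regression model on those signals.
    """
    rng = random.Random(seed)
    population = [set(p) for p in population]
    for _ in range(generations):
        ranked = sorted(population, key=fitness)
        child = mutate(crossover(ranked[0], ranked[1], rng), all_signals, rng)
        ranked[-1] = child  # the best member is always retained
        population = ranked
    return min(population, key=fitness)


# Hypothetical stand-in for regression error: penalize missing useful
# signals heavily and superfluous signals lightly.
all_signals = ["s1", "s2", "s3", "s4", "s5", "s6"]
useful = {"s1", "s2", "s3"}


def toy_fitness(selected):
    return len(useful - selected) + 0.1 * len(selected - useful)


initial = [{"s1", "s4"}, {"s2", "s5"}, {"s3", "s6"}, {"s4", "s5"}]
best = evolve(initial, all_signals, toy_fitness)
```

Because only the worst member is replaced each generation, the best score in the population can never get worse, which makes the loop easy to reason about even with random mutations.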