Abstract:
A system that identifies processes with a memory leak in a computer system. During operation, the system periodically samples memory usage for processes running on the computer system. The system then ranks the processes by memory usage and selects a specified number of processes with highest memory usage based on the ranking. For each selected process, the system computes a first-order difference of memory usage by taking a difference between the memory usage at a current sampling time and the memory usage at an immediately preceding sampling time. The system then generates a memory-leak index based on the first-order difference and a preceding memory-leak index computed at the immediately preceding sampling time.
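The sampling, ranking, and index-update steps can be sketched as follows. The abstract does not give the actual update rule, so `update_leak_index` uses an assumed one (add 1 when usage grew, subtract 1 when it shrank, floor at zero). All function names are hypothetical.

```python
from typing import Dict, List


def top_processes(usage: Dict[int, float], k: int) -> List[int]:
    """Return the pids of the k processes with the highest memory usage."""
    return sorted(usage, key=usage.get, reverse=True)[:k]


def update_leak_index(prev_index: float, prev_usage: float, curr_usage: float) -> float:
    """Assumed rule (not from the abstract): +1 on growth, -1 on shrinkage, floor 0."""
    diff = curr_usage - prev_usage  # first-order difference of memory usage
    if diff > 0:
        return prev_index + 1.0
    if diff < 0:
        return max(0.0, prev_index - 1.0)
    return prev_index


def scan(samples: List[Dict[int, float]], k: int) -> Dict[int, float]:
    """Walk successive samples (pid -> usage) and maintain a leak index per pid."""
    index: Dict[int, float] = {}
    prev = samples[0]
    for curr in samples[1:]:
        for pid in top_processes(curr, k):
            if pid in prev:  # a difference needs the preceding sample
                index[pid] = update_leak_index(index.get(pid, 0.0), prev[pid], curr[pid])
        prev = curr
    return index
```

A process whose usage grows at every sample accumulates a steadily rising index, while a process that fluctuates stays near zero.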
Abstract:
A system that detects a change point in a time series of telemetry signals from a computer system. During operation, the system receives the time series of telemetry signals from the computer system. For each element in the time series, the system (1) inserts the element into a data structure which keeps track of how many elements in the data structure have a value greater than, and how many have a value less than, the value of the inserted element; and (2) uses the information stored in the data structure to add a contribution by the inserted element to a trend statistic for the time series. The system then uses the trend statistic to select a hypothesis for the trend in the time series.
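One statistic that fits this description is the Mann-Kendall S statistic. The abstract does not name it, so treating it as the trend statistic is an assumption. The sketch below keeps a sorted list as the data structure and uses `bisect` to count smaller and larger elements. The normal-approximation test in `select_hypothesis` ignores ties, and its name and `z_crit` parameter are hypothetical.

```python
import bisect
import math
from typing import List, Sequence


def trend_statistic(series: Sequence[float]) -> int:
    """Mann-Kendall S: for each element, (#earlier smaller) - (#earlier larger)."""
    seen: List[float] = []  # earlier elements, kept sorted
    s = 0
    for x in series:
        less = bisect.bisect_left(seen, x)
        greater = len(seen) - bisect.bisect_right(seen, x)
        s += less - greater  # contribution of the element being inserted
        bisect.insort(seen, x)
    return s


def select_hypothesis(series: Sequence[float], z_crit: float = 1.96) -> str:
    """Choose among 'increasing', 'decreasing', 'no trend' (no tie correction)."""
    n = len(series)
    s = trend_statistic(series)
    var = n * (n - 1) * (2 * n + 5) / 18.0
    if s == 0 or var == 0:
        return "no trend"
    z = (s - 1) / math.sqrt(var) if s > 0 else (s + 1) / math.sqrt(var)
    if z > z_crit:
        return "increasing"
    if z < -z_crit:
        return "decreasing"
    return "no trend"
```

A sorted list makes each count O(log n) but each insertion O(n). A balanced tree or Fenwick tree over ranks would be the usual choice for long series.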
Abstract:
One embodiment of the present invention provides a system that resamples a quantized signal. During operation, the system receives the quantized signal. Next, the system smooths and resamples the quantized signal to produce a resampled signal. The system then quantizes the resampled signal to produce a quantized resampled signal. For a given time point, the system determines a probability distribution for the resampled signal across quantization levels at the given time point by using information about the values of the resampled signal at neighboring time points. Note that the probability distribution specifies the probability that the resampled signal would be sampled at specific quantization levels. The system then uses the probability distribution to probabilistically select a quantization level for the resampled signal for the given time point.
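The probabilistic selection step can be illustrated with a minimal sketch. The abstract does not say how the distribution is built from neighboring values. Here it is assumed to be a histogram of the nearest quantization levels in a small window around the time point. The function names and the `half_window` parameter are hypothetical.

```python
import random
from collections import Counter
from typing import Dict, Sequence


def nearest_level(value: float, levels: Sequence[float]) -> float:
    """Return the quantization level closest to value."""
    return min(levels, key=lambda q: abs(q - value))


def level_distribution(signal: Sequence[float], t: int, levels: Sequence[float],
                       half_window: int = 2) -> Dict[float, float]:
    """Assumed model: share of neighbors around t that fall on each level."""
    lo = max(0, t - half_window)
    hi = min(len(signal), t + half_window + 1)
    counts = Counter(nearest_level(v, levels) for v in signal[lo:hi])
    total = sum(counts.values())
    return {q: c / total for q, c in counts.items()}


def pick_level(dist: Dict[float, float], rng: random.Random) -> float:
    """Draw one quantization level according to the distribution."""
    qs = list(dist)
    return rng.choices(qs, weights=[dist[q] for q in qs], k=1)[0]
```

For example, `pick_level(level_distribution(resampled, t, levels), random.Random(0))` returns one level for time point `t`, where `resampled` and `levels` are caller-supplied.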
Abstract:
One embodiment of the present invention provides a system that optimizes support vector machine (SVM) kernel parameters. During operation, the system assigns a set of kernel parameter values to each node in a multiprocessor system. Next, the system performs a cross-validation operation at each node in the multiprocessor system based on a data set. This cross-validation operation computes an error cost value reflecting the number of misclassifications that arise while classifying the data set using the assigned set of kernel parameter values. The system then communicates the computed error cost values between nodes in the multiprocessor system, and eliminates nodes with relatively high error cost values. Next, the system performs a cross-over operation in which kernel parameter values are exchanged between remaining nodes to produce new sets of kernel parameter values. This process is repeated until a global winning set of kernel parameter values emerges.
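The eliminate-and-cross-over loop can be sketched on a single machine. Each "node" is simulated as one entry in a list, and the cross-validation step is replaced by a caller-supplied `cost` function. Both are simplifying assumptions, as are the two-parameter `(C, gamma)` layout, the fixed generation count, and the `keep_fraction` parameter.

```python
import random
from typing import Callable, List, Sequence, Tuple

Params = Tuple[float, float]  # assumed layout: (C, gamma)


def evolve(cost: Callable[[Params], float], population: Sequence[Params],
           generations: int, rng: random.Random,
           keep_fraction: float = 0.5) -> Params:
    """Keep low-cost parameter sets and cross them over to refill the population.

    Needs at least two parameter sets so that a cross-over pair exists.
    """
    pop: List[Params] = list(population)
    for _ in range(generations):
        ranked = sorted(pop, key=cost)  # stand-in for exchanging error costs
        survivors = ranked[:max(2, int(len(ranked) * keep_fraction))]
        children: List[Params] = []
        while len(survivors) + len(children) < len(pop):
            a, b = rng.sample(survivors, 2)
            children.append((a[0], b[1]))  # cross-over: C from a, gamma from b
        pop = survivors + children
    return min(pop, key=cost)
```

Because survivors are carried over unchanged, the best cost never gets worse from one generation to the next. A real run would evaluate `cost` by cross-validating an SVM on each node.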
Abstract:
Some embodiments provide a system that analyzes telemetry data from a computer system. During operation, the system obtains the telemetry data as a set of telemetric signals from the computer system and validates the telemetric signals using a nonlinear, nonparametric regression technique. Next, the system assesses the integrity of a power supply unit (PSU) in the computer system by comparing the telemetric signals to one or more reference telemetric signals associated with the computer system. If the assessed integrity falls below a threshold, the system performs a remedial action for the computer system.
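The validation step can be illustrated with Nadaraya-Watson kernel regression, which is one nonlinear, nonparametric technique. The abstract does not say which technique is used, so this choice is an assumption. The function names, the Gaussian kernel, and the `tolerance` check are hypothetical.

```python
import math
from typing import Sequence


def kernel_estimate(query: float, xs: Sequence[float], ys: Sequence[float],
                    bandwidth: float) -> float:
    """Nadaraya-Watson estimate of y at query using a Gaussian kernel."""
    weights = [math.exp(-0.5 * ((query - x) / bandwidth) ** 2) for x in xs]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("query is too far from all reference points")
    return sum(w * y for w, y in zip(weights, ys)) / total


def is_valid(observed: float, query: float, xs: Sequence[float],
             ys: Sequence[float], bandwidth: float, tolerance: float) -> bool:
    """Accept the observed value if it lies within tolerance of the estimate."""
    return abs(observed - kernel_estimate(query, xs, ys, bandwidth)) <= tolerance
```

Here `xs` and `ys` would hold reference pairs, for instance load level and PSU output reading, and `observed` is the live reading at load `query`.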
Abstract:
One embodiment of the present invention provides a system that estimates the relative humidity inside a computer system. During operation, the system monitors a set of performance parameters of the computer system and an external relative humidity outside of the computer system. The system then estimates the relative humidity inside the computer system based on the set of performance parameters, the external relative humidity, and a relative humidity model. Training the relative humidity model includes measuring an external training relative humidity outside of the computer system and a training relative humidity inside the computer system while monitoring the set of performance parameters of the computer system.
Abstract:
Embodiments of a computer system that includes a vibration-cancelling mode, and a related method and computer-program product (e.g., software) for use with the computer system, are described. During operation, a processor monitors operations in the computer system, and may select either the vibration-cancelling mode or an inactive mode based on the monitored operations. For example, the processor may select the vibration-cancelling mode when there are input/output (I/O)-intensive workloads to an array of one or more hard disk drives (HDDs) in the computer system. In this way, the processor may reduce the energy consumption associated with vibration-induced retries to the HDDs (and the accompanying reduction in I/O throughput) without increasing the energy consumption associated with active vibration damping at other times, such as when the computer system is idle or during processor-intensive workloads.
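The mode selection can be sketched as a simple threshold policy. The monitored quantity (HDD I/O operations per second) and the threshold value are illustrative assumptions, since the abstract does not specify them.

```python
from enum import Enum


class Mode(Enum):
    INACTIVE = "inactive"
    VIBRATION_CANCELLING = "vibration-cancelling"


def select_mode(hdd_io_ops_per_sec: float, io_threshold: float = 500.0) -> Mode:
    """Enable vibration cancelling only while the HDD array is I/O-intensive."""
    if hdd_io_ops_per_sec >= io_threshold:
        return Mode.VIBRATION_CANCELLING
    return Mode.INACTIVE  # idle or processor-bound: skip active damping
```

A production policy would probably add hysteresis so the mode does not flap when the load hovers near the threshold.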
Abstract:
A system for generating a power consumption model of at least one server includes one or more computers configured to obtain n time series telemetry signals indicative of operating parameters of the at least one server, obtain a time series power signal indicative of power consumed by the at least one server, and correlate each of the n time series telemetry signals with the time series power signal. The one or more computers are further configured to select a set of the n time series telemetry signals having an overall correlation with the time series power signal greater than a predetermined threshold, and generate a power consumption model of the at least one server based on at least the set of the n time series telemetry signals.
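The correlation and selection steps can be sketched as follows. The abstract refers to an "overall correlation" for the selected set. This sketch simplifies that to a per-signal test on the absolute Pearson correlation, which is an assumption, and the function names are hypothetical.

```python
import math
from typing import Dict, List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient; returns 0.0 for a constant series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def select_signals(signals: Dict[str, Sequence[float]], power: Sequence[float],
                   threshold: float) -> List[str]:
    """Names of telemetry signals whose |correlation| with power exceeds threshold."""
    return [name for name, values in signals.items()
            if abs(pearson(values, power)) > threshold]
```

The selected signals would then serve as inputs to a regression that predicts power. That fitting step is left out here.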
Abstract:
One embodiment provides a system that analyzes telemetry data from a computer system. During operation, the system periodically obtains the telemetry data from the computer system. Next, the system preprocesses the telemetry data using a sequential-analysis technique. If a statistical deviation is found in the telemetry data using the sequential-analysis technique, the system identifies a subset of the telemetry data associated with the statistical deviation and applies a root-cause-analysis technique to the subset of the telemetry data to determine a source of the statistical deviation. Finally, the system uses the source of the statistical deviation to perform a remedial action for the computer system, which involves correcting a fault in the computer system corresponding to the source of the statistical deviation.
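The preprocessing step can be illustrated with a sequential probability ratio test (SPRT), one common sequential-analysis technique. The abstract does not name the technique, so this is an assumption. The sketch tests for a mean shift in Gaussian residuals with known standard deviation, and the function name and parameters are hypothetical.

```python
import math
from typing import Optional, Sequence


def sprt_first_alarm(xs: Sequence[float], shift: float, sigma: float,
                     alpha: float = 0.01, beta: float = 0.01) -> Optional[int]:
    """Index of the first sample at which a mean shift is accepted, else None.

    H0: mean 0; H1: mean `shift`; both with standard deviation `sigma`.
    """
    upper = math.log((1 - beta) / alpha)  # accept H1 (deviation found)
    lower = math.log(beta / (1 - alpha))  # accept H0, then restart the test
    llr = 0.0
    for i, x in enumerate(xs):
        llr += (shift / sigma ** 2) * (x - shift / 2)  # log-likelihood ratio step
        if llr >= upper:
            return i
        if llr <= lower:
            llr = 0.0
    return None
```

The returned index marks where the deviation was detected, so the telemetry around it can be passed on to root-cause analysis.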
Abstract:
One embodiment of the present invention provides a system that mitigates the effects of multiple vibration sources on a set of hard disk drives (HDDs) within a computer system. During operation, the system identifies a target HDD in the set of HDDs, wherein the performance of the target HDD is affected by mechanical vibrations. The system also identifies one or more primary vibration sources from the multiple vibration sources that affect the performance of the target HDD. Next, for each of the primary vibration sources, the system measures a first time-domain signal associated with the operation of the primary vibration source using a first vibration transducer associated with the primary vibration source. The system also measures a second time-domain signal associated with the target HDD using a second vibration transducer associated with the target HDD. Then, for each of the primary vibration sources, the system computes a cross-power-spectral-density (CPSD) between the first and the second time-domain signals. The system then selectively mitigates the primary vibration sources based on the CPSDs between the primary vibration sources and the target HDD.
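The CPSD comparison can be sketched with a single-segment cross-periodogram computed from a plain discrete Fourier transform. This is a simplification chosen to keep the example dependency-free. In practice a Welch-averaged estimator such as `scipy.signal.csd` would likely be preferable. The `coupling_score` ranking criterion and all function names are assumptions.

```python
import cmath
import math
from typing import List, Sequence


def dft(x: Sequence[float]) -> List[complex]:
    """One-sided discrete Fourier transform (bins 0 .. n//2), O(n^2)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n // 2 + 1)]


def cross_power_spectrum(x: Sequence[float], y: Sequence[float]) -> List[complex]:
    """Unwindowed cross-periodogram conj(X[k]) * Y[k] / n for equal-length signals."""
    n = len(x)
    return [xk.conjugate() * yk / n for xk, yk in zip(dft(x), dft(y))]


def coupling_score(source: Sequence[float], hdd: Sequence[float]) -> float:
    """Largest cross-spectrum magnitude over non-DC bins (assumed criterion)."""
    return max(abs(p) for p in cross_power_spectrum(source, hdd)[1:])
```

Sources can then be ranked by `coupling_score` against the target HDD signal, with mitigation applied to the highest-scoring ones first.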