Abstract:
One embodiment of the present invention provides a system for predicting a remaining useful life (RUL) for a component in a set of components within a computer system. The system starts by collecting values of at least one degradation-related parameter associated with the operation of a monitored component within the computer system. Note that the degradation-related parameter is a direct measurement of a degree of degradation of the monitored component. The system additionally collects values of at least one stress-based parameter from the computer system. Note that the stress-based parameter measures an accumulative stress in the operating environment of the set of components which can cause degradation of the set of components. The system then uses the values of the at least one degradation-related parameter and the values of the at least one stress-based parameter to predict an RUL for a component in the set of components.
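The combination of a direct degradation measurement with a cumulative stress measurement can be illustrated with a minimal sketch. It assumes, purely for illustration, that degradation grows linearly with accumulated stress and that stress keeps accumulating at a constant rate; the abstract does not specify the prediction model. The function name `predict_rul` and the parameters `failure_threshold` and `stress_rate` are hypothetical.

```python
def predict_rul(degradation, stress, failure_threshold, stress_rate):
    """Estimate remaining useful life in the time units of stress_rate.

    degradation: direct degradation measurements of the monitored component
    stress: cumulative stress recorded at the same sample times
    failure_threshold: hypothetical degradation level treated as end of life
    stress_rate: assumed future stress accumulated per unit time
    """
    n = len(degradation)
    if n < 2 or n != len(stress):
        raise ValueError("need two or more paired samples")
    mean_s = sum(stress) / n
    mean_d = sum(degradation) / n
    sxx = sum((s - mean_s) ** 2 for s in stress)
    sxy = sum((s - mean_s) * (d - mean_d) for s, d in zip(stress, degradation))
    if sxx == 0:
        raise ValueError("stress values must vary")
    slope = sxy / sxx  # least-squares degradation per unit of stress
    if slope <= 0 or stress_rate <= 0:
        return float("inf")  # no measurable wear under this model
    intercept = mean_d - slope * mean_s
    stress_at_failure = (failure_threshold - intercept) / slope
    remaining_stress = max(0.0, stress_at_failure - stress[-1])
    return remaining_stress / stress_rate
```

For example, `predict_rul([0.0, 1.0, 2.0, 3.0], [0.0, 10.0, 20.0, 30.0], 10.0, 5.0)` fits 0.1 units of degradation per unit of stress, so the threshold is reached at a cumulative stress of 100, which is 70 units (14 time units) beyond the last sample.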
Abstract:
Some embodiments of the present invention provide a system for in-situ characterization of a solid-state light source. First, a voltage and a current of the solid-state light source are monitored. Then, the health of the solid-state light source is characterized based on an analysis of the monitored voltage and current.
Abstract:
One embodiment provides a technique for analyzing a target electromagnetic signal radiating from a monitored system. During the technique, the monitored system is positioned at a first locus of an ellipsoidal surface to amplify the target electromagnetic signal received at a second locus of the ellipsoidal surface. Next, the amplified target electromagnetic signal is monitored using an antenna positioned at the second locus of the ellipsoidal surface. Finally, the integrity of the monitored system is assessed by analyzing the amplified target electromagnetic signal monitored by the antenna.
Abstract:
Some embodiments of the present invention provide a system that controls a temperature variation in a computer system. First, a performance parameter of the computer system is monitored. Next, a future temperature of the computer system is predicted based on the performance parameter. Then, a pitch of one or more blades in a cooling device in the computer system is adjusted based on the future temperature to control the temperature variation in the computer system.
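The monitor, predict, adjust loop can be sketched as follows. This is only an illustration under stated assumptions: the performance parameter is taken to be a load fraction between 0 and 1, the prediction uses a first-order thermal model, and the pitch follows a clamped proportional rule. None of these choices, nor the function names or constants, come from the abstract.

```python
import math


def predict_temperature(current_temp, load, horizon_s,
                        ambient=25.0, degrees_per_load=40.0, tau_s=60.0):
    """Predict the temperature horizon_s seconds ahead from a load reading.

    Assumes the temperature relaxes toward a load-dependent steady state
    with time constant tau_s. All constants are hypothetical.
    """
    steady_state = ambient + degrees_per_load * load
    settled = 1.0 - math.exp(-horizon_s / tau_s)
    return current_temp + (steady_state - current_temp) * settled


def blade_pitch(predicted_temp, target_temp, base_pitch=20.0, gain=2.0,
                min_pitch=5.0, max_pitch=45.0):
    """Proportional pitch command in degrees, clamped to an assumed range."""
    pitch = base_pitch + gain * (predicted_temp - target_temp)
    return max(min_pitch, min(max_pitch, pitch))
```

Acting on the predicted temperature rather than the measured one lets the cooling device change pitch before a load change shows up as a temperature swing, which is the point of controlling the variation rather than the absolute value.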
Abstract:
One embodiment provides a system that analyzes a target electromagnetic signal radiating from a monitored system. During operation, the system monitors the target electromagnetic signal using a near-isotropic antenna that includes a set of receiving surfaces arranged in a regular polyhedron. Next, the system obtains a set of received target electromagnetic signals from the receiving surfaces. Finally, the system assesses the integrity of the monitored system by separately analyzing each of the received target electromagnetic signals.
Abstract:
One embodiment of the present invention provides a system that performs a real-time root-cause-analysis for a degradation event associated with a component under test. During operation, the system monitors a telemetry signal collected from the component, and while doing so, attempts to detect an anomaly in the telemetry signal. If an anomaly is detected in the telemetry signal, the system performs a failure analysis on the telemetry signal in real-time while the telemetry signal is degrading. Next, the system identifies a failure mechanism for the component based on the failure analysis.
Abstract:
Some embodiments of the present invention provide a system that characterizes a response of a component in a computer system to vibrations generated by the computer system. First, the system measures the response of the component to vibrations in a frequency range while the component is located outside of the computer system. The system also measures vibrations generated by the computer system in the frequency range during operation of the computer system, wherein the vibrations are measured at a location in the computer system which is configured to receive the component. The system then characterizes the response of the component to vibrations generated by the computer system based on the measured response of the component to vibrations in the frequency range and the measured vibrations in the frequency range at the location.
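One way to combine the two measurements is sketched below. It assumes the component responds linearly to vibration, so that a per-frequency sensitivity measured outside the system can be multiplied by the amplitude measured at the component's location. The abstract does not state how the measurements are combined, and the function name and data layout are hypothetical.

```python
def characterize_response(sensitivity, slot_vibration):
    """Combine the two measurements frequency by frequency.

    sensitivity: {frequency_hz: component response per unit vibration
        amplitude}, measured with the component outside the computer system
    slot_vibration: {frequency_hz: vibration amplitude}, measured at the
        location in the computer system that receives the component
    Returns the expected response per shared frequency and the frequency
    with the largest expected response (None if no frequency is shared).
    """
    common = sorted(set(sensitivity) & set(slot_vibration))
    expected = {f: sensitivity[f] * slot_vibration[f] for f in common}
    worst = max(expected, key=expected.get) if expected else None
    return expected, worst
```

Only frequencies present in both measurements contribute, so both sweeps should cover the same frequency range, as the abstract requires.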
Abstract:
Some embodiments of the present invention provide a system that determines the reliability of an interconnect. During operation, connectors in the interconnect are categorized into a set of predetermined groups. Next, the reliability for selected groups in the set of predetermined groups is determined. Then, a reliability model for the interconnect is generated based on the selected groups and the reliability of the selected groups to determine the overall reliability of the interconnect.
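A minimal sketch of the group-based model follows. It assumes a series model with independent connectors, meaning the interconnect works only if every connector in the selected groups works. The abstract does not specify the form of the reliability model, and the group names and per-connector reliabilities used here are hypothetical.

```python
from collections import Counter


def categorize(connectors):
    """Count connectors per group; each connector is a (name, group) pair."""
    return Counter(group for _name, group in connectors)


def interconnect_reliability(group_counts, group_reliability):
    """Series model over the selected groups only.

    group_reliability maps each selected group to the assumed reliability
    of a single connector in that group; unselected groups are ignored.
    """
    overall = 1.0
    for group, reliability in group_reliability.items():
        overall *= reliability ** group_counts[group]
    return overall
```

With two "power" connectors at 0.9 and three "signal" connectors at 0.5, the overall value is 0.9² × 0.5³ = 0.10125. Grouping keeps the model small, because reliability has to be determined once per group rather than once per connector.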
Abstract:
One embodiment of the present invention provides a system that dynamically adjusts data resolution during proactive-fault-monitoring in a computer system. During operation, the system temporarily stores high-resolution data for a telemetry signal from the computer system in a buffer. The system then generates low-resolution data for the telemetry signal from the high-resolution data. Next, the system monitors the low-resolution data, and while doing so, determines if an anomaly exists in the low-resolution data. If an anomaly exists in the low-resolution data, the system records the high-resolution data from the buffer on a storage device.
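The buffering scheme can be sketched as below. The sketch assumes the low-resolution data is a block average of the high-resolution samples and that an anomaly is a block average whose magnitude exceeds a fixed threshold; both are illustrative choices, not details from the abstract. The class name `AdaptiveRecorder` is hypothetical, and the `recorded` list stands in for the storage device.

```python
from collections import deque


class AdaptiveRecorder:
    def __init__(self, buffer_size, decimation, threshold):
        # Bounded buffer: the oldest high-resolution samples drop off.
        self.buffer = deque(maxlen=buffer_size)
        self.decimation = decimation  # high-res samples per low-res point
        self.threshold = threshold    # assumed anomaly criterion
        self.pending = []             # samples awaiting averaging
        self.recorded = []            # stand-in for the storage device

    def add_sample(self, value):
        """Ingest one high-resolution sample.

        Returns True when this sample completes a low-resolution point
        that is anomalous, in which case the buffer is recorded.
        """
        self.buffer.append(value)
        self.pending.append(value)
        if len(self.pending) < self.decimation:
            return False
        low_res = sum(self.pending) / len(self.pending)
        self.pending.clear()
        if abs(low_res) > self.threshold:
            # Persist the high-resolution context around the anomaly.
            self.recorded.append(list(self.buffer))
            return True
        return False
```

Because only the low-resolution stream is monitored continuously, the cost of storing high-resolution data is paid only when an anomaly makes that detail useful.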