Abstract:
Memory, used by a computer to store data, is generally prone to faults, including permanent faults (i.e. relating to a lifetime of the memory hardware), and also transient faults (i.e. relating to some external cause) which are otherwise known as soft errors. Since soft errors can change the state of the data in the memory and thus cause errors in applications reading and processing the data, there is a desire to characterize the degree of vulnerability of the memory to soft errors. In particular, once the vulnerability for a particular memory to soft errors has been characterized, cost/reliability trade-offs can be determined, or soft error detection mechanisms (e.g. parity) may be selectively employed for the memory. In some cases, memory faults can be diagnosed by redundant execution and a diagnostic coverage may be determined.
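As an illustration of the last sentence, the following minimal sketch injects single-bit faults into one copy of a small memory image, runs the same computation on the faulty copy and on a clean copy, and reports the fraction of injected faults that the comparison flags. The workload `compute`, the 8-bit word width, the sample data, and the simplified definition of diagnostic coverage as detected faults divided by injected faults are all assumptions made for this example, not details taken from the abstract.

```python
def compute(memory):
    """Hypothetical workload: sums only the first six words of memory."""
    return sum(memory[:6])


def redundant_check(primary, shadow):
    """Run the workload on both copies; a mismatch signals a fault."""
    return compute(primary) != compute(shadow)


def diagnostic_coverage(golden, bits=8):
    """Inject every single-bit flip and count how many the check detects.

    Coverage is taken here as detected / injected (an assumed, simplified
    definition for illustration).
    """
    injected = 0
    detected = 0
    for addr in range(len(golden)):
        for bit in range(bits):
            faulty = list(golden)
            faulty[addr] ^= 1 << bit  # emulate a soft error in one bit
            injected += 1
            if redundant_check(faulty, golden):
                detected += 1
    return detected / injected


golden = [3, 14, 15, 92, 65, 35, 89, 79]  # hypothetical memory contents
dc = diagnostic_coverage(golden)
print(f"diagnostic coverage: {dc:.2f}")
```

In this toy setup, faults in the two words that the workload never reads go unnoticed, so the measured coverage stays below one.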
Abstract:
In various examples, motifs, watermarks, and/or signature inputs are applied to a deep neural network (DNN) to detect faults in the underlying hardware and/or software executing the DNN. Information corresponding to the motifs, watermarks, and/or signatures may be compared to the outputs of the DNN generated using the motifs, watermarks, and/or signatures. When the accuracy of the predictions is below a threshold, or the predictions do not correspond to the expected predictions of the DNN, the hardware and/or software may be determined to have a fault, such as a transient, an intermittent, or a permanent fault. Where a fault is determined, portions of the system that rely on the computations of the DNN may be shut down, or redundant systems may be used in place of the primary system. Where no fault is determined, the computations of the DNN may be relied upon by the system.
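The comparison step can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: `tiny_dnn` is a hypothetical one-layer stand-in for a real DNN, the signature input and tolerance are made up, and the fault is emulated by flipping one bit in the float32 encoding of a weight.

```python
import struct


def flip_bit(x, bit):
    """Flip one bit of x's float32 encoding to emulate a hardware fault."""
    (as_int,) = struct.unpack("<I", struct.pack("<f", x))
    (out,) = struct.unpack("<f", struct.pack("<I", as_int ^ (1 << bit)))
    return out


def tiny_dnn(weights, inputs):
    """Hypothetical stand-in for a DNN: one linear layer followed by ReLU."""
    return [max(0.0, sum(w * x for w, x in zip(row, inputs))) for row in weights]


def check_signature(weights, signature_input, expected_output, tol=1e-6):
    """Return True if the network reproduces the stored expected output."""
    actual = tiny_dnn(weights, signature_input)
    return all(abs(a - e) <= tol for a, e in zip(actual, expected_output))


weights = [[0.5, -0.25], [1.0, 2.0]]
signature_input = [1.0, 2.0]
# Expected output, assumed to be recorded once on a known-good system.
expected_output = tiny_dnn(weights, signature_input)

healthy = check_signature(weights, signature_input, expected_output)

# Emulate a fault in one weight and repeat the check.
faulty_weights = [row[:] for row in weights]
faulty_weights[1][1] = flip_bit(faulty_weights[1][1], 30)
faulty = check_signature(faulty_weights, signature_input, expected_output)

print("healthy check passed:", healthy)
print("faulty check passed:", faulty)
```

A system built along these lines would treat a failed check as the trigger for shutting down dependent functions or switching to a redundant path.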
Abstract:
A method, computer readable medium, and system are disclosed for error coping. The method includes the steps of receiving, by a processing unit, a set of program instructions including a first program instruction that is responsive to error detection, detecting an error in a value of a first operand of the first program instruction, and determining that error coping execution is selectively enabled for the first instruction. The value for the first operand is replaced with a substitute value and the first program instruction is executed by the processing unit.
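The flow described above can be sketched in software as follows. This is an illustrative model only: the parity-based error detection, the `Operand` and `Instruction` classes, the `coping_enabled` flag, and the choice of zero as the substitute value are all assumptions introduced for the example, since the abstract does not specify them.

```python
from dataclasses import dataclass


def parity_of(value):
    """Even-parity bit: 1 if the value has an odd number of set bits."""
    return bin(value).count("1") & 1


@dataclass
class Operand:
    value: int
    parity: int  # parity bit assumed to be stored alongside the value


@dataclass
class Instruction:
    opcode: str
    coping_enabled: bool  # hypothetical per-instruction enable flag
    substitute: int = 0   # hypothetical substitute value


def has_error(op):
    """Detect an error by recomputing parity over the stored value."""
    return parity_of(op.value) != op.parity


def execute(instr, first, second):
    """Execute instr, replacing a corrupted first operand when coping is enabled."""
    if has_error(first):
        if not instr.coping_enabled:
            raise RuntimeError("uncorrectable operand error")
        a = instr.substitute  # replace the bad value and carry on
    else:
        a = first.value
    if instr.opcode == "add":
        return a + second.value
    raise ValueError(f"unsupported opcode: {instr.opcode}")


second = Operand(10, parity_of(10))
corrupted = Operand(6 ^ 1, parity_of(6))  # one bit flipped after parity was stored

coped = execute(Instruction("add", coping_enabled=True), corrupted, second)
print("result with error coping:", coped)
```

With coping disabled for the same instruction, this model raises an exception instead of substituting a value, which mirrors the selective nature of the mechanism.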
Abstract:
In various examples, motifs, watermarks, and/or signature inputs are applied to a deep neural network (DNN) to detect faults in the underlying hardware and/or software executing the DNN. Information corresponding to the motifs, watermarks, and/or signatures may be compared to the outputs of the DNN generated using the motifs, watermarks, and/or signatures. When the accuracy of the predictions is below a threshold, or the predictions do not correspond to the expected predictions of the DNN, the hardware and/or software may be determined to have a fault, such as a transient, an intermittent, or a permanent fault. Where a fault is determined, portions of the system that rely on the computations of the DNN may be shut down, or redundant systems may be used in place of the primary system. Where no fault is determined, the computations of the DNN may be relied upon by the system.
Abstract:
Memory, used by a computer to store data, is generally prone to faults, including permanent faults (i.e. relating to a lifetime of the memory hardware), and also transient faults (i.e. relating to some external cause) which are otherwise known as soft errors. Since soft errors can change the state of the data in the memory and thus cause errors in applications reading and processing the data, there is a desire to characterize the degree of vulnerability of the memory to soft errors. In particular, once the vulnerability for a particular memory to soft errors has been characterized, cost/reliability trade-offs can be determined, or soft error detection mechanisms (e.g. parity) may be selectively employed for the memory. A method, computer readable medium, and system are provided for using liveness as a factor to evaluate memory vulnerability to soft errors.
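One way to picture liveness as a vulnerability factor is the sketch below. It assumes a common simplification that is not spelled out in the abstract: a memory location is treated as live, and therefore exposed to a consequential soft error, only from a write until the last read that precedes the next write. The trace format, the function name `live_fraction`, and the sample trace are hypothetical.

```python
def live_fraction(trace, total_cycles):
    """Fraction of cycles in which one memory location holds live data.

    trace is a list of (cycle, op) pairs sorted by cycle, with op "W" or "R".
    Data is counted as live from a write until the last read before the
    next write; written data that is never read contributes nothing.
    """
    live = 0
    last_write = None
    last_read = None
    for cycle, op in trace:
        if op == "W":
            # Close the previous interval, if it was ever read.
            if last_write is not None and last_read is not None:
                live += last_read - last_write
            last_write = cycle
            last_read = None
        elif op == "R" and last_write is not None:
            last_read = cycle
    # Close the final interval at the end of the trace.
    if last_write is not None and last_read is not None:
        live += last_read - last_write
    return live / total_cycles


# Hypothetical access trace for a single location over 40 cycles.
trace = [(0, "W"), (4, "R"), (6, "R"), (10, "W"), (20, "W"), (25, "R")]
fraction = live_fraction(trace, total_cycles=40)
print(f"live fraction: {fraction:.3f}")
```

Under this assumption, a location with a low live fraction is a weaker candidate for protection such as parity than one that holds live data for most of the run.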
Abstract:
Memory, used by a computer to store data, is generally prone to faults, including permanent faults (i.e. relating to a lifetime of the memory hardware), and also transient faults (i.e. relating to some external cause) which are otherwise known as soft errors. Since soft errors can change the state of the data in the memory and thus cause errors in applications reading and processing the data, there is a desire to characterize the degree of vulnerability of the memory to soft errors. In particular, once the vulnerability for a particular memory to soft errors has been characterized, cost/reliability trade-offs can be determined, or soft error detection mechanisms (e.g. parity) may be selectively employed for the memory. A method, computer readable medium, and system are provided for using liveness as a factor to evaluate memory vulnerability to soft errors.
Abstract:
Memory, used by a computer to store data, is generally prone to faults, including permanent faults (i.e. relating to a lifetime of the memory hardware), and also transient faults (i.e. relating to some external cause) which are otherwise known as soft errors. Since soft errors can change the state of the data in the memory and thus cause errors in applications reading and processing the data, there is a desire to characterize the degree of vulnerability of the memory to soft errors. In particular, once the vulnerability for a particular memory to soft errors has been characterized, cost/reliability trade-offs can be determined, or soft error detection mechanisms (e.g. parity) may be selectively employed for the memory. In some cases, memory faults can be diagnosed by redundant execution and a diagnostic coverage may be determined.
Abstract:
Memory, used by a computer to store data, is generally prone to faults, including permanent faults (i.e. relating to a lifetime of the memory hardware), and also transient faults (i.e. relating to some external cause) which are otherwise known as soft errors. Since soft errors can change the state of the data in the memory and thus cause errors in applications reading and processing the data, there is a desire to characterize the degree of vulnerability of the memory to soft errors. In particular, once the vulnerability for a particular memory to soft errors has been characterized, cost/reliability trade-offs can be determined, or soft error detection mechanisms (e.g. parity) may be selectively employed for the memory. In some cases, memory faults can be diagnosed by redundant execution and a diagnostic coverage may be determined.
Abstract:
A system, method, and computer program product for generating flow-control signals for a processing pipeline are disclosed. The method includes generating, by a first pipeline stage, a delayed ready signal based on a throttle disable signal and a downstream ready signal received from a second pipeline stage. A downstream valid signal is generated by the first pipeline stage based on an upstream valid signal and the delayed ready signal. An upstream ready signal is generated by the first pipeline stage based on the delayed ready signal and the downstream valid signal.
Abstract:
In various examples, motifs, watermarks, and/or signature inputs are applied to a deep neural network (DNN) to detect faults in the underlying hardware and/or software executing the DNN. Information corresponding to the motifs, watermarks, and/or signatures may be compared to the outputs of the DNN generated using the motifs, watermarks, and/or signatures. When the accuracy of the predictions is below a threshold, or the predictions do not correspond to the expected predictions of the DNN, the hardware and/or software may be determined to have a fault, such as a transient, an intermittent, or a permanent fault. Where a fault is determined, portions of the system that rely on the computations of the DNN may be shut down, or redundant systems may be used in place of the primary system. Where no fault is determined, the computations of the DNN may be relied upon by the system.