-
1.
Publication No.: US10331537B2
Publication Date: 2019-06-25
Application No.: US15389573
Filing Date: 2016-12-23
Applicant: Advanced Micro Devices, Inc.
Inventor: Manish Gupta, Vilas Sridharan, David A. Roberts
IPC: G06F11/00, G06F11/34, G06F12/0891
Abstract: Described herein are waterfall counters and an application to architectural vulnerability factor (AVF) estimation. Waterfall counters count events that are generated at event generation logic. The waterfall counters are a combination of small, fast counters local to the event generation logic, and larger, global counters in fast memory. The local counters can be saturation or oscillation counters. When a local counter is saturated or evicted, the value from the local counter is added to the global counter. This addition can be done using logic local to the local or global counter. The waterfall counters provide a full-accuracy event count without the high bandwidth that is needed to maintain the global counters. An AVF estimation can be determined based on ratios from counts of read events, write events, and total events using the waterfall counters.
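The two-level counting scheme in this abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the class name `WaterfallCounter`, the 4-bit local width, the source label `"bank0"`, and the `global_updates` bookkeeping field are all assumptions made for illustration, and the AVF ratio computation is left out because the abstract does not give its formula.

```python
class WaterfallCounter:
    """Small saturating local counters that spill into one wide global counter."""

    def __init__(self, local_bits=4):
        self.local_max = (1 << local_bits) - 1  # saturation point of a local counter
        self.local = {}          # event source -> small local count
        self.global_count = 0    # wide counter (stands in for the one in fast memory)
        self.global_updates = 0  # number of times the global counter was written

    def _flush(self, source):
        # Add the local value to the global counter and clear the local counter.
        self.global_count += self.local.pop(source, 0)
        self.global_updates += 1

    def record(self, source):
        # Count one event locally; spill to the global counter on saturation.
        self.local[source] = self.local.get(source, 0) + 1
        if self.local[source] == self.local_max:
            self._flush(source)

    def evict(self, source):
        # Eviction also pushes whatever the local counter holds to the global one.
        if source in self.local:
            self._flush(source)


counter = WaterfallCounter(local_bits=4)
for _ in range(100):
    counter.record("bank0")
counter.evict("bank0")
```

In this run the global counter ends up holding all 100 events but is written only 7 times (six saturations at 15 events each, plus one eviction), which is the bandwidth saving the abstract describes.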
-
2.
Publication No.: US10042687B2
Publication Date: 2018-08-07
Application No.: US15231251
Filing Date: 2016-08-08
Applicant: Advanced Micro Devices, Inc.
Inventor: Daniel I. Lowell, Manish Gupta
Abstract: Techniques for performing redundant multi-threading (“RMT”) include the use of an RMT compare instruction by two program instances (“work-items”). The RMT compare instruction specifies a value from each work-item to be compared. Upon executing the RMT compare instructions, the work-items transmit the values to a hardware comparator unit. The hardware comparator unit compares the received values and performs an error action if the values do not match. The error action may include sending an error code in a return value back to the work-items that requested the comparison or emitting a trap signal. Optionally, the work-items also send addresses for comparison to the comparator unit. If the addresses and values match, then the comparator stores the value at the specified address. If either or both of the values or the addresses do not match, then the comparator performs an error action.
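The comparator behaviour described in this abstract can be modelled in software as a rough sketch. The function name `rmt_compare`, the error-code constants, and the dictionary standing in for memory are hypothetical; the real mechanism is a hardware unit driven by an instruction, and only the error-code variant of the error action is shown here (not the trap signal).

```python
OK = 0
ERR_MISMATCH = 1  # hypothetical error code returned to both work-items


def rmt_compare(first, second, memory):
    """Compare (address, value) pairs submitted by two redundant work-items.

    On a full match the value is stored at the address; otherwise nothing is
    stored and an error code is returned.
    """
    if first != second:
        # Either the addresses or the values (or both) disagree: error action.
        return ERR_MISMATCH
    address, value = first
    memory[address] = value  # both copies agree, so commit the store
    return OK


memory = {}
match_result = rmt_compare((0x10, 42), (0x10, 42), memory)
mismatch_result = rmt_compare((0x14, 7), (0x14, 9), memory)
```

Only the matching pair is committed, so `memory` holds a single entry after both calls.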
-
3.
Publication No.: US20180039531A1
Publication Date: 2018-02-08
Application No.: US15231251
Filing Date: 2016-08-08
Applicant: Advanced Micro Devices, Inc.
Inventor: Daniel I. Lowell, Manish Gupta
CPC classification number: G06F11/0763, G06F9/30021, G06F9/30101, G06F9/3851, G06F9/3861, G06F9/3887, G06F11/0721, G06F11/0784
Abstract: Techniques for performing redundant multi-threading (“RMT”) include the use of an RMT compare instruction by two program instances (“work-items”). The RMT compare instruction specifies a value from each work-item to be compared. Upon executing the RMT compare instructions, the work-items transmit the values to a hardware comparator unit. The hardware comparator unit compares the received values and performs an error action if the values do not match. The error action may include sending an error code in a return value back to the work-items that requested the comparison or emitting a trap signal. Optionally, the work-items also send addresses for comparison to the comparator unit. If the addresses and values match, then the comparator stores the value at the specified address. If either or both of the values or the addresses do not match, then the comparator performs an error action.
-
4.
Publication No.: US20170277441A1
Publication Date: 2017-09-28
Application No.: US15331270
Filing Date: 2016-10-21
Applicant: Advanced Micro Devices, Inc.
Inventor: Manish Gupta, David A. Roberts, Mitesh R. Meswani, Vilas Sridharan, Steven Raasch, Daniel I. Lowell
IPC: G06F3/06
CPC classification number: G06F12/02
Abstract: Techniques for selecting one of a plurality of heterogeneous memory units for placement of blocks of data (e.g., memory pages), based on both reliability and performance, are disclosed. A “cost” for each data block/memory unit combination is determined, based on the frequency of access of the data block, the latency of the memory unit, and, optionally, an architectural vulnerability factor (which represents the level of exposure of a particular memory data value to memory faults such as bit flips). A memory unit is selected for the data block for which the determined cost is the lowest, out of all memory units considered, and the data block is placed into that memory unit.
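A lowest-cost selection of this kind might look like the Python sketch below. The abstract names the inputs (access frequency, latency, optional AVF) but not how they are combined, so the cost formula, the `fault_rate` field, the `reliability_weight` parameter, and the unit names are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class MemoryUnit:
    name: str
    latency_ns: float
    fault_rate: float  # hypothetical relative fault rate of the unit


def placement_cost(access_freq, avf, unit, reliability_weight=0.0):
    # Assumed cost model: a performance term plus an optional reliability term.
    performance = access_freq * unit.latency_ns
    reliability = reliability_weight * avf * unit.fault_rate
    return performance + reliability


def choose_unit(access_freq, avf, units, reliability_weight=0.0):
    # Pick the memory unit with the lowest cost for this data block.
    return min(
        units,
        key=lambda u: placement_cost(access_freq, avf, u, reliability_weight),
    )


units = [
    MemoryUnit("fast_less_reliable", latency_ns=50.0, fault_rate=10.0),
    MemoryUnit("slow_more_reliable", latency_ns=100.0, fault_rate=1.0),
]
by_performance = choose_unit(1000, 0.9, units)
by_both = choose_unit(1000, 0.9, units, reliability_weight=1e5)
```

With the reliability term switched off, the hot block goes to the faster unit; with a large weight on the AVF term, the same block goes to the more reliable unit instead.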
-
5.
Publication No.: US10365996B2
Publication Date: 2019-07-30
Application No.: US15331270
Filing Date: 2016-10-21
Applicant: Advanced Micro Devices, Inc.
Inventor: Manish Gupta, David A. Roberts, Mitesh R. Meswani, Vilas Sridharan, Steven Raasch, Daniel I. Lowell
Abstract: Techniques for selecting one of a plurality of heterogeneous memory units for placement of blocks of data (e.g., memory pages), based on both reliability and performance, are disclosed. A “cost” for each data block/memory unit combination is determined, based on the frequency of access of the data block, the latency of the memory unit, and, optionally, an architectural vulnerability factor (which represents the level of exposure of a particular memory data value to memory faults such as bit flips). A memory unit is selected for the data block for which the determined cost is the lowest, out of all memory units considered, and the data block is placed into that memory unit.
-
6.
Publication No.: US20180181492A1
Publication Date: 2018-06-28
Application No.: US15389573
Filing Date: 2016-12-23
Applicant: Advanced Micro Devices, Inc.
Inventor: Manish Gupta, Vilas Sridharan, David A. Roberts
IPC: G06F12/0891
CPC classification number: G06F11/34, G06F12/0891, G06F12/12, G06F2201/885, G06F2212/1032, G06F2212/60
Abstract: Described herein are waterfall counters and an application to architectural vulnerability factor (AVF) estimation. Waterfall counters count events that are generated at event generation logic. The waterfall counters are a combination of small, fast counters local to the event generation logic, and larger, global counters in fast memory. The local counters can be saturation or oscillation counters. When a local counter is saturated or evicted, the value from the local counter is added to the global counter. This addition can be done using logic local to the local or global counter. The waterfall counters provide a full-accuracy event count without the high bandwidth that is needed to maintain the global counters. An AVF estimation can be determined based on ratios from counts of read events, write events, and total events using the waterfall counters.
-
7.
Publication No.: US10303472B2
Publication Date: 2019-05-28
Application No.: US15359236
Filing Date: 2016-11-22
Applicant: Advanced Micro Devices, Inc.
Inventor: Daniel I. Lowell, Manish Gupta
Abstract: Systems, apparatuses, and methods for implementing bufferless communication for redundant multithreading applications using register permutation are disclosed. In one embodiment, a system includes a parallel processing unit, a register file, and a scheduler. The scheduler is configured to cause execution of a plurality of threads to be performed in lockstep on the parallel processing unit. The plurality of threads include a first thread and a second thread executing on adjacent first and second lanes, respectively, of the parallel processing unit. The second thread is configured to perform a register permute operation from a first register location to a second register location in a first instruction cycle, with the second register location associated with the second processing lane. The second thread is configured to read from the second register location in a second instruction cycle, wherein the first and second instruction cycles are successive instruction cycles.
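The two-cycle permute-then-read sequence can be pictured with a toy Python model of two adjacent lanes. Everything here is an assumption for illustration: the register names `r0` and `r1`, the function `run_pair`, and the final equality check (the abstract only says the second thread reads the permuted value; comparing it against the thread's own result is the usual purpose in redundant multithreading, not something the abstract states).

```python
def run_pair(value_lane0, value_lane1):
    """Model two redundant threads on adjacent lanes 0 and 1 in lockstep."""
    # register_file[lane][register]; r0 holds each thread's own result.
    register_file = [
        {"r0": value_lane0, "r1": None},
        {"r0": value_lane1, "r1": None},
    ]

    # Cycle 1: the second thread permutes lane 0's r0 into its own lane's r1.
    # No intermediate memory buffer is involved.
    register_file[1]["r1"] = register_file[0]["r0"]

    # Cycle 2 (the very next cycle): the second thread reads r1.
    received = register_file[1]["r1"]

    # Assumed follow-up: compare the received value with the thread's own.
    return received == register_file[1]["r0"]


agree = run_pair(42, 42)
disagree = run_pair(42, 43)
```

The first pair agrees and the second does not, so a mismatch between the redundant copies becomes visible one cycle after the permute.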
-
8.
Publication No.: US20180143829A1
Publication Date: 2018-05-24
Application No.: US15359236
Filing Date: 2016-11-22
Applicant: Advanced Micro Devices, Inc.
Inventor: Daniel I. Lowell, Manish Gupta
CPC classification number: G06F9/30032, G06F9/3824, G06F9/3851, G06F9/3861, G06F9/3887, G06F9/46, G06F11/00
Abstract: Systems, apparatuses, and methods for implementing bufferless communication for redundant multithreading applications using register permutation are disclosed. In one embodiment, a system includes a parallel processing unit, a register file, and a scheduler. The scheduler is configured to cause execution of a plurality of threads to be performed in lockstep on the parallel processing unit. The plurality of threads include a first thread and a second thread executing on adjacent first and second lanes, respectively, of the parallel processing unit. The second thread is configured to perform a register permute operation from a first register location to a second register location in a first instruction cycle, with the second register location associated with the second processing lane. The second thread is configured to read from the second register location in a second instruction cycle, wherein the first and second instruction cycles are successive instruction cycles.