-
Publication No.: US11726546B2
Publication Date: 2023-08-15
Application No.: US17033000
Filing Date: 2020-09-25
Applicant: Advanced Micro Devices, Inc.
Inventor: Vedula Venkata Srikant Bharadwaj, Shomit N. Das, Anthony T. Gutierrez, Vignesh Adhinarayanan
IPC: G06F1/00, G06F1/3287, G06F9/50, G06F1/3296, G06F1/324
CPC classification number: G06F1/3287, G06F1/324, G06F1/3296, G06F9/50
Abstract: Systems, methods, devices, and computer-implemented instructions for processor power management implemented in a compiler. In some implementations, a characteristic of code is determined. An instruction based on the determined characteristic is inserted into the code. The code and inserted instruction are compiled to generate compiled code. The compiled code is output.
-
Publication No.: US11119665B2
Publication Date: 2021-09-14
Application No.: US16212388
Filing Date: 2018-12-06
Applicant: ADVANCED MICRO DEVICES, INC.
Inventor: Shomit N. Das, Kishore Punniyamurthy
IPC: G06F3/06
Abstract: A processing system scales power to memory and memory channels based on identifying causes of stalls of threads of a wavefront. If the cause is other than an outstanding memory request, the processing system throttles power to the memory to save power. If the stall is due to memory stalls for a subset of the memory channels servicing memory access requests for threads of a wavefront, the processing system adjusts power of the memory channels servicing memory access requests for the wavefront based on the subset. By boosting power to the subset of channels, the processing system enables the wavefront to complete processing more quickly, resulting in increased processing speed. Conversely, by throttling power to the remainder of channels, the processing system saves power without affecting processing speed.
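The per-channel decision described in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the function name `plan_channel_power`, the string labels `"memory"`, `"boost"`, and `"throttle"`, and the channel identifiers are hypothetical and do not come from the patent.

```python
def plan_channel_power(stall_cause, stalled_channels, all_channels):
    """Return a hypothetical power action per memory channel.

    stall_cause: "memory" if the wavefront waits on outstanding memory
                 requests, any other string otherwise (assumed labels).
    stalled_channels: channels servicing the stalled wavefront's requests.
    all_channels: every memory channel in the system.
    """
    if stall_cause != "memory":
        # The stall is not caused by memory, so memory power can be throttled.
        return {ch: "throttle" for ch in all_channels}
    # Boost only the channels the wavefront is waiting on; throttle the rest.
    return {
        ch: ("boost" if ch in stalled_channels else "throttle")
        for ch in all_channels
    }


channels = [0, 1, 2, 3]
memory_bound = plan_channel_power("memory", {1, 3}, channels)
compute_bound = plan_channel_power("alu_dependency", set(), channels)
```

Here `memory_bound` boosts channels 1 and 3 and throttles the others, while `compute_bound` throttles every channel.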
-
Publication No.: US11061429B2
Publication Date: 2021-07-13
Application No.: US15795214
Filing Date: 2017-10-26
Applicant: Advanced Micro Devices, Inc.
Inventor: Greg Sadowski, Shomit N. Das
IPC: G06F1/08, G01R31/317, G01R31/28, H03K5/00, G11C7/22
Abstract: A technique for fine-granularity speed binning for a processing device is provided. The processing device includes a plurality of clock domains, each of which may be clocked with independent clock signals. The clock frequency at which a particular clock domain may operate is determined based on the longest propagation delay between clocked elements in that particular clock domain. The processing device includes measurement circuits for each clock domain that measure such propagation delay. The measurement circuits are replica propagation delay paths of actual circuit elements within each particular clock domain. A speed bin for each clock domain is determined based on the propagation delay measured for the measurement circuits for a particular clock domain. Specifically, a speed bin is chosen that is associated with the fastest clock speed whose clock period is longer than the slowest propagation delay measured for the measurement circuit for the clock domain.
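The bin-selection rule in the last sentence of this abstract can be illustrated with a short sketch. The bin frequencies, domain names, and delay values below are invented for the example, and `pick_speed_bin` is a hypothetical helper, not something named in the patent.

```python
def pick_speed_bin(bins_mhz, measured_delays_ns):
    """Pick the fastest bin whose clock period exceeds the worst delay.

    bins_mhz: candidate clock frequencies in MHz (assumed bin table).
    measured_delays_ns: propagation delays measured on the replica paths
                        of one clock domain, in nanoseconds.
    Returns the chosen frequency in MHz, or None if no bin is slow enough.
    """
    worst_delay_ns = max(measured_delays_ns)
    # Clock period in ns for a frequency f in MHz is 1000 / f.
    eligible = [f for f in bins_mhz if 1000.0 / f > worst_delay_ns]
    return max(eligible) if eligible else None


bins_mhz = [1200, 1600, 2000, 2400]
domain_delays_ns = {"core": [0.41, 0.46], "cache": [0.52, 0.61]}

# Each clock domain gets its own bin, which is the point of fine granularity.
chosen = {
    domain: pick_speed_bin(bins_mhz, delays)
    for domain, delays in domain_delays_ns.items()
}
```

With these made-up numbers the core domain lands in the 2000 MHz bin (period 0.5 ns, worst delay 0.46 ns) and the cache domain in the 1600 MHz bin (period 0.625 ns, worst delay 0.61 ns).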
-
Publication No.: US10944693B2
Publication Date: 2021-03-09
Application No.: US16188900
Filing Date: 2018-11-13
Applicant: Advanced Micro Devices, Inc.
Inventor: Srikant Bharadwaj, Shomit N. Das
IPC: H04L12/933, H04L12/775
Abstract: A system is described that includes an integrated circuit chip having a network-on-chip. The network-on-chip includes multiple routers arranged in a topology and a separate communication link coupled between each router and each of one or more neighboring routers of that router among the multiple routers in the topology. The integrated circuit chip also includes multiple nodes, each node coupled to a router of the multiple routers. When operating, a given router of the multiple routers keeps a record of operating states of some or all of the multiple routers and corresponding communication links. The given router then routes flits to destination nodes via one or more other routers of the multiple routers based at least in part on the operating states of the some or all of the multiple routers and the corresponding communication links.
-
Publication No.: US10558606B1
Publication Date: 2020-02-11
Application No.: US16118172
Filing Date: 2018-08-30
Applicant: Advanced Micro Devices, Inc.
Inventor: Shomit N. Das, Matthew Tomei, Shrikanth Ganapathy, John Kalamatianos
IPC: G06F13/42, G06F1/3296, H03M13/05
Abstract: Systems, apparatuses, and methods for reliably transmitting data over voltage scaled links are disclosed. A computing system includes at least first and second devices connected via a link. In one implementation, if a data block can be compressed to less than or equal to half the original size of the data block, then the data block is compressed and sent on the link in a single clock cycle rather than two clock cycles. If the data block cannot be compressed to half the original size, but if the data block can be compressed enough to include error correction code (ECC) bits without exceeding the original size, then ECC bits are added to the compressed block which is sent on the link at a reduced voltage. The ECC bits help to correct for any errors that are generated as a result of operating the link at the reduced voltage.
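The two-tier decision in this abstract can be sketched as a small function. The sizes, the mode labels, and the name `plan_transfer` are hypothetical. The final fallback branch is also an assumption, since the abstract does not say what happens when a block cannot be compressed enough to fit the ECC bits.

```python
def plan_transfer(original_size, compressed_size, ecc_size):
    """Choose how to send one data block over the link (sizes in bytes).

    Returns one of three hypothetical mode labels.
    """
    if 2 * compressed_size <= original_size:
        # Compressed to half or less: one clock cycle instead of two.
        return "compressed_single_cycle"
    if compressed_size + ecc_size <= original_size:
        # Room for ECC within the original footprint: send at reduced voltage.
        return "compressed_with_ecc_reduced_voltage"
    # Assumption: otherwise send the block unmodified at nominal voltage.
    return "uncompressed_nominal_voltage"


modes = [
    plan_transfer(64, 32, 8),
    plan_transfer(64, 50, 8),
    plan_transfer(64, 60, 8),
]
```

For a 64-byte block with 8 bytes of ECC, a 32-byte compressed result takes the single-cycle path, a 50-byte result takes the ECC path, and a 60-byte result falls through to the assumed fallback.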
-
Publication No.: US20180121312A1
Publication Date: 2018-05-03
Application No.: US15338172
Filing Date: 2016-10-28
Applicant: Advanced Micro Devices, Inc.
Inventor: Greg Sadowski, Steven E. Raasch, Shomit N. Das, Wayne Burleson
CPC classification number: G06F11/008, G06F11/3058, G06F11/3452, Y02D10/34
Abstract: A system and method for managing operating parameters within a system for optimal power and reliability are described. A device includes a functional unit and a corresponding reliability evaluator. The functional unit provides reliability information to one or more reliability monitors, which translate the information to reliability values. The reliability evaluator determines an overall reliability level for the system based on the reliability values. The reliability monitor compares the actual usage values and the expected usage values. When the system has maintained a relatively high level of reliability for a given time interval, the reliability evaluator sends an indication to update operating parameters to reduce reliability of the system, which also reduces power consumption for the system.
-
Publication No.: US11362673B2
Publication Date: 2022-06-14
Application No.: US17089360
Filing Date: 2020-11-04
Applicant: ADVANCED MICRO DEVICES, INC.
Inventor: Seyedmohammad Seyedzadehdelcheh, Shomit N. Das
IPC: H03M7/30, G06F17/16, G06F40/151
Abstract: Entropy agnostic data encoding includes: receiving, by an encoder, input data including a bit string; generating a plurality of candidate codewords, including encoding the input data bit string with a plurality of binary vectors, wherein the plurality of binary vectors includes a set of deterministic biased binary vectors and a set of random binary vectors; selecting, in dependence upon a predefined criterion, one of the plurality of candidate codewords; and transmitting the selected candidate codeword to a decoder.
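One way to picture the candidate-codeword scheme is the sketch below. Several details are assumptions not stated in the abstract: the encoding operation is taken to be XOR, the selection criterion is taken to be the fewest 1 bits, the vector index is assumed to travel with the codeword, and the random vectors come from a seeded generator that the decoder is assumed to share. All names are hypothetical.

```python
import random

WIDTH = 16
MASK = (1 << WIDTH) - 1


def popcount(x):
    """Count the 1 bits in a non-negative integer."""
    return bin(x).count("1")


def make_vectors(seed=0, n_random=4):
    """Build deterministic biased vectors plus seeded random vectors."""
    biased = [0x0000, 0xFFFF, 0x5555, 0xAAAA]
    rng = random.Random(seed)  # the same seed is assumed on the decoder side
    return biased + [rng.getrandbits(WIDTH) for _ in range(n_random)]


def encode(word, vectors):
    """XOR the word with every vector and keep the candidate with fewest 1s."""
    candidates = [(word ^ v) & MASK for v in vectors]
    index = min(range(len(candidates)), key=lambda i: popcount(candidates[i]))
    return index, candidates[index]


def decode(index, codeword, vectors):
    """Undo the XOR using the vector the encoder selected."""
    return (codeword ^ vectors[index]) & MASK


vectors = make_vectors()
index, codeword = encode(0xFFFF, vectors)
restored = decode(index, codeword, vectors)
```

Because the all-zeros vector is among the candidates, the selected codeword never has more 1 bits than the input under this criterion.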
-
Publication No.: US20220100257A1
Publication Date: 2022-03-31
Application No.: US17033000
Filing Date: 2020-09-25
Applicant: Advanced Micro Devices, Inc.
Inventor: Vedula Venkata Srikant Bharadwaj, Shomit N. Das, Anthony T. Gutierrez, Vignesh Adhinarayanan
IPC: G06F1/3287, G06F1/324, G06F1/3296, G06F9/50
Abstract: Systems, methods, devices, and computer-implemented instructions for processor power management implemented in a compiler. In some implementations, a characteristic of code is determined. An instruction based on the determined characteristic is inserted into the code. The code and inserted instruction are compiled to generate compiled code. The compiled code is output.
-
Publication No.: US11194634B2
Publication Date: 2021-12-07
Application No.: US16220827
Filing Date: 2018-12-14
Applicant: Advanced Micro Devices, Inc.
Inventor: Karthik Rao, Shomit N. Das, Xudong An, Wei Huang
IPC: G06F9/46, G06F9/50, G06F9/48, G06F9/38, H04L29/08, G06F1/3206, G06F13/40, G06F3/06, H04N19/436
Abstract: In some examples, thermal aware optimization logic determines a characteristic (e.g., a workload or type) of a wavefront (e.g., multiple threads). For example, the characteristic indicates whether the wavefront is compute intensive, memory intensive, mixed, and/or another type of wavefront. The thermal aware optimization logic determines temperature information for one or more compute units (CUs) in one or more processing cores. The temperature information includes predictive thermal information indicating expected temperatures corresponding to the one or more CUs and historical thermal information indicating current or past thermal temperatures of at least a portion of a graphics processing unit (GPU). The logic selects the one or more compute units to process the plurality of threads based on the determined characteristic and the temperature information. The logic provides instructions to the selected subset of the plurality of CUs to execute the wavefront.
-
Publication No.: US11144208B2
Publication Date: 2021-10-12
Application No.: US16724609
Filing Date: 2019-12-23
Applicant: ADVANCED MICRO DEVICES, INC.
Inventor: SeyedMohammad Seyedzadehdelcheh, Xianwei Zhang, Bradford Beckmann, Shomit N. Das
IPC: G06K9/36, G06F3/06, G06F12/0875, G06T1/20
Abstract: In some embodiments, a memory controller in a processor includes a base value cache, a compressor, and a metadata cache. The compressor is coupled to the base value cache and the metadata cache. The compressor compresses a data block using at least a base value and delta values. The compressor determines whether the size of the data block exceeds a data block threshold value. Based on the determination of whether the size of the compressed data block generated by the compressor exceeds the data block threshold value, the memory controller transfers only a set of the compressed delta values to memory for storage. A decompressor located in the lower level cache of the processor decompresses the compressed data block using the base value stored in the base value cache, metadata stored in the metadata cache and the delta values stored in memory.
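Base-plus-delta compression in general can be sketched as below. This is a generic illustration, not the patented controller: taking the first value as the base, the 1-byte signed delta width, and the function names are all assumptions, and the base value cache and metadata cache are reduced to plain return values.

```python
def compress_base_delta(values, delta_bytes=1):
    """Try to express a block as one base value plus small signed deltas.

    Returns (base, deltas, compressed_size_bytes), or None when some
    delta does not fit in delta_bytes. Only the deltas are counted in
    the size, on the assumption that the base is held elsewhere.
    """
    base = values[0]  # assumption: the first value serves as the base
    deltas = [v - base for v in values]
    limit = 1 << (8 * delta_bytes - 1)
    if all(-limit <= d < limit for d in deltas):
        return base, deltas, len(deltas) * delta_bytes
    return None


def decompress_base_delta(base, deltas):
    """Rebuild the original values from the base and the stored deltas."""
    return [base + d for d in deltas]


block = [1000, 1003, 998, 1010]
base, deltas, size = compress_base_delta(block)
restored = decompress_base_delta(base, deltas)
```

Four 4-byte values (16 bytes) shrink to 4 bytes of deltas here, while a block such as `[0, 1000]` is rejected because its delta does not fit in one signed byte.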
-