-
Publication number: US10698692B2
Publication date: 2020-06-30
Application number: US15216094
Application date: 2016-07-21
Applicant: Advanced Micro Devices, Inc.
Inventor: Greg Sadowski, John Kalamatianos, Shomit N. Das
IPC: G06F9/38
Abstract: An asynchronous pipeline includes a first stage and one or more second stages. A controller provides control signals to the first stage to indicate a modification to an operating speed of the first stage. The modification is determined based on a comparison of a completion status of the first stage to one or more completion statuses of the one or more second stages. In some cases, the controller provides control signals indicating modifications to an operating voltage applied to the first stage and a drive strength of a buffer in the first stage. Modules can be used to determine the completion statuses of the first stage and the one or more second stages based on the monitored output signals generated by the stages, output signals from replica critical paths associated with the stages, or a lookup table that indicates estimated completion times.
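The comparison step in this abstract can be illustrated with a minimal sketch. Everything named here is hypothetical: the function `speed_adjustment`, the use of completion times as the "completion status", the `tolerance` margin, and the policy direction (speed up a lagging stage, slow down a stage that finishes early) are assumptions, since the abstract does not specify them.

```python
def speed_adjustment(first_done, others_done, tolerance=0.05):
    """Return +1 (speed up), -1 (slow down) or 0 (hold) for the first stage.

    first_done and others_done are completion times in arbitrary units,
    a hypothetical stand-in for the completion statuses in the abstract.
    """
    if not others_done:
        return 0  # nothing to compare against
    slowest = max(others_done)
    fastest = min(others_done)
    if first_done > slowest * (1 + tolerance):
        return +1  # assumed policy: first stage lags every other stage
    if first_done < fastest * (1 - tolerance):
        return -1  # assumed policy: first stage finishes well ahead
    return 0
```

A controller could map the returned value onto the operating voltage or buffer drive strength mentioned in the abstract; that mapping is not shown here.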
-
Publication number: US20180024837A1
Publication date: 2018-01-25
Application number: US15216094
Application date: 2016-07-21
Applicant: Advanced Micro Devices, Inc.
Inventor: Greg Sadowski, John Kalamatianos, Shomit N. Das
Abstract: An asynchronous pipeline includes a first stage and one or more second stages. A controller provides control signals to the first stage to indicate a modification to an operating speed of the first stage. The modification is determined based on a comparison of a completion status of the first stage to one or more completion statuses of the one or more second stages. In some cases, the controller provides control signals indicating modifications to an operating voltage applied to the first stage and a drive strength of a buffer in the first stage. Modules can be used to determine the completion statuses of the first stage and the one or more second stages based on the monitored output signals generated by the stages, output signals from replica critical paths associated with the stages, or a lookup table that indicates estimated completion times.
-
Publication number: US12210398B2
Publication date: 2025-01-28
Application number: US18346380
Application date: 2023-07-03
Applicant: Advanced Micro Devices, Inc.
Inventor: Vedula Venkata Srikant Bharadwaj, Shomit N. Das, Anthony T. Gutierrez, Vignesh Adhinarayanan
IPC: G06F1/32, G06F1/26, G06F1/324, G06F1/3287, G06F1/3296, G06F9/50
Abstract: Systems, methods, devices, and computer-implemented instructions for processor power management implemented in a compiler. In some implementations, a characteristic of code is determined. An instruction based on the determined characteristic is inserted into the code. The code and inserted instruction are compiled to generate compiled code. The compiled code is output.
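As a rough illustration of the determine-then-insert flow, the sketch below treats code as a list of textual instructions. The characteristic (fraction of memory operations), the `power_hint` pseudo-instructions, the threshold, and the function name `insert_power_hint` are all assumptions made for the example; the abstract names none of them, and the compilation step itself is omitted.

```python
MEMORY_OPS = {"load", "store"}  # hypothetical opcode names

def insert_power_hint(instructions, memory_bound_threshold=0.5):
    """Prepend a hypothetical power hint chosen from a code characteristic."""
    if not instructions:
        return []
    # Characteristic: share of instructions whose opcode is a memory operation.
    memory_count = sum(1 for ins in instructions if ins.split()[0] in MEMORY_OPS)
    ratio = memory_count / len(instructions)
    if ratio >= memory_bound_threshold:
        hint = "power_hint low_freq"   # assumed: memory-bound code tolerates a lower clock
    else:
        hint = "power_hint high_freq"  # assumed: compute-bound code benefits from a higher clock
    return [hint] + list(instructions)
```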
-
Publication number: US12169782B2
Publication date: 2024-12-17
Application number: US16425403
Application date: 2019-05-29
Applicant: ADVANCED MICRO DEVICES, INC.
Inventor: Shomit N. Das, Abhinav Vishnu
Abstract: A processor determines losses of samples within an input volume that is provided to a neural network during a first epoch, groups the samples into subsets based on losses, and assigns the subsets to operands in the neural network that represent the samples at different precisions. Each subset is associated with a different precision. The processor then processes the subsets in the neural network at the different precisions during the first epoch. In some cases, the samples in the subsets are used in a forward pass and a backward pass through the neural network. A memory is configured to store information representing the samples in the subsets at the different precisions. In some cases, the processor stores information representing model parameters of the neural network in the memory at the different precisions of the subsets of the corresponding samples.
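The grouping step can be sketched as follows. The function name `assign_precisions`, the equal-sized rank buckets, and the choice to give low-loss samples the lower precision are assumptions; the abstract only says that subsets are formed from losses and that each subset gets a different precision.

```python
def assign_precisions(losses, precisions=("float16", "float32")):
    """Group sample indices by loss rank and map each group to a precision."""
    order = sorted(range(len(losses)), key=lambda i: losses[i])
    groups = {p: [] for p in precisions}
    n = len(precisions)
    for rank, idx in enumerate(order):
        # Split the ranked samples into n roughly equal buckets.
        bucket = min(rank * n // max(len(order), 1), n - 1)
        groups[precisions[bucket]].append(idx)
    return groups
```

For example, `assign_precisions([0.9, 0.1, 0.5, 0.2])` places samples 1 and 3 in the `float16` group and samples 2 and 0 in the `float32` group.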
-
Publication number: US12001237B2
Publication date: 2024-06-04
Application number: US17029158
Application date: 2020-09-23
Applicant: Advanced Micro Devices, Inc.
Inventor: Matthew Tomei, Shomit N. Das, David A. Wood
IPC: G06F12/00, G06F3/06, G06F12/0802
CPC classification number: G06F3/0608, G06F3/0655, G06F3/0676, G06F3/0679, G06F12/0802
Abstract: Systems, methods, and devices for performing pattern-based cache block compression and decompression. An uncompressed cache block is input to the compressor. Byte values are identified within the uncompressed cache block. A cache block pattern is searched for in a set of cache block patterns based on the byte values. A compressed cache block is output based on the byte values and the cache block pattern. A compressed cache block is input to the decompressor. A cache block pattern is identified based on metadata of the cache block. The cache block pattern is applied to a byte dictionary of the cache block. An uncompressed cache block is output based on the cache block pattern and the byte dictionary. A subset of cache block patterns is determined from a training cache trace based on a set of compressed sizes and a target number of patterns for each size.
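One way to picture the compressor and decompressor is the sketch below. It assumes that a "pattern" is the sequence of byte-dictionary indices for each position in the block and that the metadata is the pattern's index in a known list; the tuple layout and the names `compress` and `decompress` are hypothetical, and the training step that selects the pattern subset is not shown.

```python
def compress(block, patterns):
    """Return (kind, pattern_id, payload) for a block of bytes."""
    dictionary = []  # distinct byte values in order of first appearance
    indices = []     # dictionary index used at each position
    for b in block:
        if b not in dictionary:
            dictionary.append(b)
        indices.append(dictionary.index(b))
    pattern = tuple(indices)
    if pattern in patterns:
        return ("compressed", patterns.index(pattern), bytes(dictionary))
    return ("raw", None, bytes(block))  # no known pattern: keep uncompressed

def decompress(entry, patterns):
    """Apply the stored pattern to the byte dictionary to rebuild the block."""
    kind, pattern_id, payload = entry
    if kind == "raw":
        return payload
    return bytes(payload[i] for i in patterns[pattern_id])
```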
-
Publication number: US11842199B2
Publication date: 2023-12-12
Application number: US16913146
Application date: 2020-06-26
Applicant: ADVANCED MICRO DEVICES, INC.
Inventor: Greg Sadowski, John Kalamatianos, Shomit N. Das
IPC: G06F9/38
CPC classification number: G06F9/3871, G06F9/3836, G06F9/3869
Abstract: An asynchronous pipeline includes a first stage and one or more second stages. A controller provides control signals to the first stage to indicate a modification to an operating speed of the first stage. The modification is determined based on a comparison of a completion status of the first stage to one or more completion statuses of the one or more second stages. In some cases, the controller provides control signals indicating modifications to an operating voltage applied to the first stage and a drive strength of a buffer in the first stage. Modules can be used to determine the completion statuses of the first stage and the one or more second stages based on the monitored output signals generated by the stages, output signals from replica critical paths associated with the stages, or a lookup table that indicates estimated completion times.
-
Publication number: US20230110376A1
Publication date: 2023-04-13
Application number: US18058534
Application date: 2022-11-23
Applicant: Advanced Micro Devices, Inc.
IPC: G06F12/0871, G06F11/30, G06F12/0897, G06F12/02
Abstract: Systems, apparatuses, and methods for implementing a multi-tiered approach to cache compression are disclosed. A cache includes a cache controller, light compressor, and heavy compressor. The decision on which compressor to use for compressing cache lines is made based on certain resource availability such as cache capacity or memory bandwidth. This allows the cache to opportunistically use complex algorithms for compression while limiting the adverse effects of high decompression latency on system performance. To that end, the proposed design uses the heavy compressors to reduce memory bandwidth in high bandwidth memory (HBM) interfaces as long as they do not sacrifice system performance. Accordingly, the cache combines light and heavy compressors with a decision-making unit to achieve reduced off-chip memory traffic without sacrificing system performance.
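The decision-making unit might look like the sketch below. The thresholds, the function names, and the rule "use the heavy compressor only when cache capacity is short or bandwidth is saturated" are assumptions. Python's `zlib` at level 1 and `lzma` merely stand in for the light and heavy compressors, which the abstract does not identify.

```python
import lzma
import zlib

def choose_compressor(cache_free_fraction, bandwidth_utilization,
                      capacity_floor=0.1, bandwidth_ceiling=0.8):
    """Pick 'heavy' only when a resource is scarce (assumed policy)."""
    if cache_free_fraction < capacity_floor:
        return "heavy"
    if bandwidth_utilization > bandwidth_ceiling:
        return "heavy"
    return "light"

def compress_line(line, cache_free_fraction, bandwidth_utilization):
    """Compress one cache line with the selected stand-in compressor."""
    kind = choose_compressor(cache_free_fraction, bandwidth_utilization)
    if kind == "heavy":
        return kind, lzma.compress(line)
    return kind, zlib.compress(line, 1)
```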
-
Publication number: US20200210343A1
Publication date: 2020-07-02
Application number: US16232314
Application date: 2018-12-26
Applicant: Advanced Micro Devices, Inc.
Inventor: Matthew J. Tomei, Philip B. Bedoukian, Shomit N. Das
IPC: G06F12/0897, G06F12/0815
Abstract: An electronic device includes at least one compression-decompression functional block and a hierarchy of cache memories with a first cache memory and a second cache memory. The at least one compression-decompression functional block receives data in an uncompressed state, compresses the data using one of a first compression or a second compression, and, after compressing the data, provides the data to the first cache memory for storage therein. When the data is retrieved from the first cache memory to be stored in the second cache memory and the data is compressed using the first compression, the compression-decompression functional block decompresses the data to reverse effects of the first compression on the data, thereby restoring the data to the uncompressed state, and provides the data compressed using the second compression or in the uncompressed state to the second cache memory for storage therein.
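The transfer between the two caches can be sketched as below. The `(kind, payload)` entry layout and the function name are hypothetical, `zlib` is only a stand-in for the "first compression", and the sketch covers just the uncompressed-output branch for first-compression data; re-compressing with the second compression is omitted.

```python
import zlib

def move_to_second_cache(entry):
    """Prepare a first-cache entry for storage in the second cache.

    entry is a hypothetical (kind, payload) pair where kind is
    'first', 'second' or 'raw'.
    """
    kind, payload = entry
    if kind == "first":
        # Reverse the first compression before handing the data on.
        return ("raw", zlib.decompress(payload))
    # Second-compression and uncompressed data pass through unchanged.
    return entry
```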
-
Publication number: US10411731B1
Publication date: 2019-09-10
Application number: US16140025
Application date: 2018-09-24
Applicant: Advanced Micro Devices, Inc.
Inventor: Shomit N. Das, Matthew Tomei
IPC: H03M7/34, H03M7/40, H03M7/30, H03M13/00, H03M5/00, G06T9/00, H03M7/00, H04N19/134, H04N19/103
Abstract: A processing device is provided which includes a plurality of encoders each configured to compress a portion of data using a different compression algorithm. The processing device also includes one or more processors configured to cause an encoder, of the plurality of encoders, to compress the portion of data when it is determined that the portion of data, which is compressed by another encoder configured to compress the portion of data prior to the encoder in an encoder hierarchy, is not successfully compressed according to a compression metric by the other encoder in the encoder hierarchy. The one or more processors are also configured to prevent the encoder from compressing the portion of data when it is determined that the portion of data is successfully compressed according to the compression metric by the other encoder in the encoder hierarchy.
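The encoder hierarchy can be read as a fall-through chain, sketched below. The compression metric (output size as a fraction of input size), the `max_ratio` value, and the use of Python's `zlib`, `bz2` and `lzma` as the encoders are assumptions chosen for the example, not details from the abstract.

```python
import bz2
import lzma
import zlib

# Hypothetical hierarchy: earlier encoders are tried first.
ENCODERS = [("zlib", zlib.compress), ("bz2", bz2.compress), ("lzma", lzma.compress)]

def compress_with_hierarchy(data, max_ratio=0.5, encoders=ENCODERS):
    """Run encoders in order and stop at the first that meets the metric."""
    for name, encode in encoders:
        out = encode(data)
        if len(out) <= max_ratio * len(data):
            return name, out  # success: later encoders are never invoked
    return None, data         # no encoder met the metric
```

Returning as soon as one encoder succeeds is what prevents the encoders further down the hierarchy from compressing the same portion of data.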
-
Publication number: US11740791B2
Publication date: 2023-08-29
Application number: US17497286
Application date: 2021-10-08
Applicant: ADVANCED MICRO DEVICES, INC.
Inventor: Seyed Mohammad Seyedzadehdelcheh, Xianwei Zhang, Bradford Beckmann, Shomit N. Das
IPC: G06F3/06, G06F12/0875, G06T1/20
CPC classification number: G06F3/0608, G06F3/064, G06F3/0659, G06F3/0673, G06F12/0875, G06F2212/1044, G06T1/20
Abstract: In some embodiments, a memory controller in a processor includes a base value cache, a compressor, and a metadata cache. The compressor is coupled to the base value cache and the metadata cache. The compressor compresses a data block using at least a base value and delta values. The compressor determines whether the size of the data block exceeds a data block threshold value. Based on the determination of whether the size of the compressed data block generated by the compressor exceeds the data block threshold value, the memory controller transfers only a set of the compressed delta values to memory for storage. A decompressor located in the lower level cache of the processor decompresses the compressed data block using the base value stored in the base value cache, metadata stored in the metadata cache and the delta values stored in memory.
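Base-plus-delta compression with a size check can be sketched as follows. The choice of the minimum word as the base, the one-byte delta width, the 16-byte threshold, and the dictionary layout are assumptions for illustration; the base value cache and metadata cache are represented only by comments.

```python
def compress_block(words, delta_bytes=1, threshold_bytes=16):
    """Return base and deltas if the block fits the threshold, else None."""
    base = min(words)  # assumed base choice
    deltas = [w - base for w in words]
    fits = all(d < (1 << (8 * delta_bytes)) for d in deltas)
    compressed_size = len(words) * delta_bytes
    if fits and compressed_size <= threshold_bytes:
        # In the abstract's terms: base goes to the base value cache,
        # and only the deltas are transferred to memory.
        return {"base": base, "deltas": deltas}
    return None  # fall back to storing the block uncompressed

def decompress_block(compressed):
    """Rebuild the original words from the base and the stored deltas."""
    return [compressed["base"] + d for d in compressed["deltas"]]
```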