-
Publication No.: US20210191869A1
Publication Date: 2021-06-24
Application No.: US16725971
Filing Date: 2019-12-23
Applicant: Advanced Micro Devices, Inc.
IPC: G06F12/0871 , G06F12/0897 , G06F12/02 , G06F11/30
Abstract: Systems, apparatuses, and methods for implementing a multi-tiered approach to cache compression are disclosed. A cache includes a cache controller, a light compressor, and a heavy compressor. The decision on which compressor to use for compressing cache lines is made based on resource availability, such as cache capacity or memory bandwidth. This allows the cache to opportunistically use complex algorithms for compression while limiting the adverse effects of high decompression latency on system performance. The proposed design takes advantage of heavy compressors to reduce memory bandwidth usage in high bandwidth memory (HBM) interfaces, as long as they do not sacrifice system performance. Accordingly, the cache combines light and heavy compressors with a decision-making unit to reduce off-chip memory traffic without sacrificing system performance.
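The selection step described in the abstract above can be illustrated with a minimal sketch. The abstract only says that the choice depends on resource availability such as cache capacity or memory bandwidth; the `ResourceState` type, the `choose_compressor` function, the threshold values, and the rule that pressure on either resource favors the heavy compressor are all assumptions made for illustration, not details taken from the application.

```python
from dataclasses import dataclass


@dataclass
class ResourceState:
    """Hypothetical snapshot of the resources named in the abstract."""
    cache_occupancy: float        # fraction of cache capacity in use, 0.0 to 1.0
    bandwidth_utilization: float  # fraction of memory bandwidth in use, 0.0 to 1.0


def choose_compressor(state: ResourceState,
                      occupancy_threshold: float = 0.9,
                      bandwidth_threshold: float = 0.8) -> str:
    """Pick "heavy" when either resource is scarce, otherwise "light".

    Assumption: the heavy compressor's extra decompression latency is only
    worth paying when capacity or bandwidth is under pressure.
    """
    if (state.cache_occupancy >= occupancy_threshold
            or state.bandwidth_utilization >= bandwidth_threshold):
        return "heavy"
    return "light"


if __name__ == "__main__":
    print(choose_compressor(ResourceState(0.50, 0.30)))
    print(choose_compressor(ResourceState(0.95, 0.30)))
```

A real decision-making unit would likely track these quantities in hardware counters; the two thresholds here stand in for whatever policy the cache controller applies.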
-
Publication No.: US20210089324A1
Publication Date: 2021-03-25
Application No.: US16913146
Filing Date: 2020-06-26
Applicant: Advanced Micro Devices, Inc.
Inventor: Greg Sadowski , John Kalamatianos , Shomit N. Das
IPC: G06F9/38
Abstract: An asynchronous pipeline includes a first stage and one or more second stages. A controller provides control signals to the first stage to indicate a modification to an operating speed of the first stage. The modification is determined based on a comparison of a completion status of the first stage to one or more completion statuses of the one or more second stages. In some cases, the controller provides control signals indicating modifications to an operating voltage applied to the first stage and a drive strength of a buffer in the first stage. Modules can be used to determine the completion statuses of the first stage and the one or more second stages based on the monitored output signals generated by the stages, output signals from replica critical paths associated with the stages, or a lookup table that indicates estimated completion times.
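As a rough illustration of the comparison described in the abstract above, the sketch below maps completion times to a speed modification. The abstract does not state how the comparison is resolved, so the function name `speed_adjustment`, the use of the slowest other stage as the reference, the `tolerance_ns` parameter, and the three action labels are all hypothetical.

```python
from typing import Sequence


def speed_adjustment(first_stage_time_ns: float,
                     other_stage_times_ns: Sequence[float],
                     tolerance_ns: float = 0.0) -> str:
    """Return a hypothetical control action for the first stage.

    Assumption: a stage that completes later than the slowest of the other
    stages should speed up, and one that completes earlier can slow down.
    """
    slowest_other = max(other_stage_times_ns)
    if first_stage_time_ns > slowest_other + tolerance_ns:
        return "speed_up"
    if first_stage_time_ns < slowest_other - tolerance_ns:
        return "slow_down"
    return "hold"


if __name__ == "__main__":
    print(speed_adjustment(5.0, [3.0, 4.0]))
    print(speed_adjustment(2.0, [3.0, 4.0]))
```

In the design described above, the resulting action would be realized through control signals such as operating voltage or buffer drive strength; that mapping is not modeled here.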
-
Publication No.: US20200133866A1
Publication Date: 2020-04-30
Application No.: US16176828
Filing Date: 2018-10-31
Applicant: Advanced Micro Devices, Inc.
Inventor: Shomit N. Das , Matthew Tomei , David A. Wood
IPC: G06F12/0871 , G06F17/50 , H03M7/30
Abstract: The disclosure herein provides techniques for designing cache compression algorithms that control how data in caches are compressed. The techniques generate a custom “byte select algorithm” by repeatedly applying transforms to an initial compression algorithm until a set of suitability criteria is met. The suitability criteria include that the “cost” is below a threshold and that a metadata constraint is met. The “cost” is the number of blocks that can be compressed by an algorithm as compared with the “ideal” algorithm. The metadata constraint is the number of bits required for metadata.
-
Publication No.: US20200073845A1
Publication Date: 2020-03-05
Application No.: US16118172
Filing Date: 2018-08-30
Applicant: Advanced Micro Devices, Inc.
Inventor: Shomit N. Das , Matthew Tomei , Shrikanth Ganapathy , John Kalamatianos
Abstract: Systems, apparatuses, and methods for reliably transmitting data over voltage scaled links are disclosed. A computing system includes at least first and second devices connected via a link. In one implementation, if a data block can be compressed to less than or equal to half the original size of the data block, then the data block is compressed and sent on the link in a single clock cycle rather than two clock cycles. If the data block cannot be compressed to half the original size, but can be compressed enough to include error correction code (ECC) bits without exceeding the original size, then ECC bits are added to the compressed block, which is sent on the link at a reduced voltage. The ECC bits help to correct any errors that are generated as a result of operating the link at the reduced voltage.
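The two size checks in the abstract above can be written out as a short decision function. This is a sketch under stated assumptions: the names `LinkMode` and `choose_link_mode` are hypothetical, sizes are expressed in bits, and the final fallback (sending the block uncompressed at nominal voltage) is assumed, since the abstract does not say what happens when neither condition holds.

```python
from enum import Enum


class LinkMode(Enum):
    COMPRESSED_SINGLE_CYCLE = "compressed, sent in one clock cycle"
    COMPRESSED_ECC_LOW_VOLTAGE = "compressed with ECC, sent at reduced voltage"
    UNCOMPRESSED = "uncompressed (assumed fallback)"


def choose_link_mode(original_bits: int,
                     compressed_bits: int,
                     ecc_bits: int) -> LinkMode:
    """Apply the two size checks from the abstract, in order."""
    # Case 1: block compresses to at most half its original size.
    if 2 * compressed_bits <= original_bits:
        return LinkMode.COMPRESSED_SINGLE_CYCLE
    # Case 2: compressed block plus ECC bits still fits in the original size.
    if compressed_bits + ecc_bits <= original_bits:
        return LinkMode.COMPRESSED_ECC_LOW_VOLTAGE
    # Assumption: otherwise send the block as-is.
    return LinkMode.UNCOMPRESSED


if __name__ == "__main__":
    print(choose_link_mode(512, 256, 64).value)
    print(choose_link_mode(512, 400, 64).value)
    print(choose_link_mode(512, 480, 64).value)
```

The 512-bit block and 64 ECC bits in the usage lines are arbitrary example values, not figures from the application.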
-
Publication No.: US20190129463A1
Publication Date: 2019-05-02
Application No.: US15795214
Filing Date: 2017-10-26
Applicant: Advanced Micro Devices, Inc.
Inventor: Greg Sadowski , Shomit N. Das
IPC: G06F1/08 , G01R31/317 , G11C7/22
Abstract: A technique for fine-granularity speed binning for a processing device is provided. The processing device includes a plurality of clock domains, each of which may be clocked with independent clock signals. The clock frequency at which a particular clock domain may operate is determined based on the longest propagation delay between clocked elements in that particular clock domain. The processing device includes measurement circuits for each clock domain that measure such propagation delay. The measurement circuits are replica propagation delay paths of actual circuit elements within each particular clock domain. A speed bin for each clock domain is determined based on the propagation delay measured for the measurement circuits for that clock domain. Specifically, a speed bin is chosen that is associated with the fastest clock speed whose clock period is longer than the longest propagation delay measured for the measurement circuit for the clock domain.
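The bin selection rule stated at the end of the abstract above can be expressed in a few lines. In this sketch the function name `select_speed_bin`, the units (MHz for bin frequencies, nanoseconds for delay), and the example bin list are assumptions; the rule itself (the fastest clock whose period exceeds the measured delay) comes from the abstract.

```python
from typing import Optional, Sequence


def select_speed_bin(bin_frequencies_mhz: Sequence[float],
                     measured_delay_ns: float) -> Optional[float]:
    """Return the fastest bin frequency whose clock period is longer than
    the measured propagation delay, or None if no bin qualifies."""
    eligible = [
        f for f in bin_frequencies_mhz
        if 1000.0 / f > measured_delay_ns  # period in ns for f in MHz
    ]
    return max(eligible) if eligible else None


if __name__ == "__main__":
    bins = [2000, 2500, 3000, 3500]  # hypothetical speed bins in MHz
    # Each clock domain is binned independently from its own measurement.
    delays_ns = {"domain_a": 0.30, "domain_b": 0.35}
    for domain, delay in delays_ns.items():
        print(domain, select_speed_bin(bins, delay))
```

Running the rule once per clock domain, as in the usage lines, reflects the fine-granularity aspect: each domain gets its own bin instead of the whole device sharing one.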