-
Publication No.: US20230021492A1
Publication date: 2023-01-26
Application No.: US17385783
Filing date: 2021-07-26
Applicant: Advanced Micro Devices, Inc.
Inventor: Shaizeen Aga, Nuwan Jayasena, John Kalamatianos
IPC: G06F12/0891, G06F12/0811, G06F12/02, G06F13/16
Abstract: A technical solution to the technical problem of how to support memory-centric operations on cached data uses a novel memory-centric memory operation that invokes write back functionality on cache controllers and memory controllers. The write back functionality enforces selective flushing of dirty, i.e., modified, cached data that is needed for memory-centric memory operations from caches to the completion level of the memory-centric memory operations, and updates the coherence state appropriately at each cache level. The technical solution ensures that commands to implement the selective cache flushing are ordered before the memory-centric memory operation at the completion level of the memory-centric memory operation.
-
Publication No.: US11481331B2
Publication date: 2022-10-25
Application No.: US17135832
Filing date: 2020-12-28
Applicant: Advanced Micro Devices, Inc.
Inventor: Jagadish Kotra, John Kalamatianos
IPC: G06F12/0862
Abstract: An electronic device includes a processor having a cache memory, a plurality of physical registers, and a promotion logic functional block. The promotion logic functional block promotes prefetched data from a portion of a cache block in the cache memory into a given physical register, the promoting including storing the prefetched data in the given physical register. Upon encountering a load micro-operation that loads data from the portion of the cache block into a destination physical register, the promotion logic functional block sets the processor so that the prefetched data stored in the given physical register is provided to micro-operations that depend on the load micro-operation.
-
Publication No.: US20220261350A1
Publication date: 2022-08-18
Application No.: US17730754
Filing date: 2022-04-27
Applicant: Advanced Micro Devices, Inc.
Inventor: Jagadish Kotra, John Kalamatianos
IPC: G06F12/0862
Abstract: An electronic device includes a processor, the processor having a cache memory, a set of physical registers, and a promotion logic functional block. When one or more promotion conditions are met, the promotion logic functional block promotes prefetched data from a portion of a cache block in the cache memory to a physical register among the set of physical registers. For promoting the prefetched data, the promotion logic functional block acquires the prefetched data from the portion of the cache block and stores the prefetched data in the physical register.
-
Publication No.: US11309911B2
Publication date: 2022-04-19
Application No.: US16542872
Filing date: 2019-08-16
Applicant: Advanced Micro Devices, Inc.
Inventor: Alexander D. Breslow, Nuwan Jayasena, John Kalamatianos
Abstract: A data processing platform, method, and program product perform compression and decompression of a set of data items. Suffix data and a prefix are selected for each respective data item in the set of data items based on data content of the respective data item. The set of data items is sorted based on the prefixes. The prefixes are encoded by querying multiple encoding tables to create a code word containing compressed information representing values of all prefixes for the set of data items. The code word and suffix data for each of the data items are stored in memory. The code word is decompressed to recover the prefixes. The recovered prefixes are paired with their respective suffix data.
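The prefix/suffix flow in this abstract can be illustrated with a minimal sketch. Several simplifications are assumptions, not taken from the patent: items are 8-bit integers, the prefix/suffix split is fixed at 4 bits each rather than chosen per item from its content, and a histogram of prefix counts stands in for the multi-table encoding of the code word. The names `compress` and `decompress` are hypothetical.

```python
from collections import Counter

PREFIX_BITS = 4  # assumed fixed split; the abstract selects it per item
SUFFIX_BITS = 4

def compress(items):
    """Split 8-bit items into (prefix, suffix), sort by prefix, encode prefixes."""
    mask = (1 << SUFFIX_BITS) - 1
    split = [(x >> SUFFIX_BITS, x & mask) for x in items]
    split.sort(key=lambda pair: pair[0])  # sort the set based on the prefixes
    counts = Counter(prefix for prefix, _ in split)
    # Stand-in code word: how many items carry each possible prefix value.
    code_word = tuple(counts[p] for p in range(1 << PREFIX_BITS))
    suffixes = [suffix for _, suffix in split]
    return code_word, suffixes

def decompress(code_word, suffixes):
    """Recover the sorted prefixes from the code word and pair them with suffixes."""
    prefixes = [p for p, n in enumerate(code_word) for _ in range(n)]
    return [(p << SUFFIX_BITS) | s for p, s in zip(prefixes, suffixes)]
```

Because the items are sorted by prefix before encoding, the sketch recovers the same set of items but not their original order.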
-
Publication No.: US11243884B2
Publication date: 2022-02-08
Application No.: US16190111
Filing date: 2018-11-13
Applicant: Advanced Micro Devices, Inc.
Inventor: Susumu Mashimo, John Kalamatianos
IPC: G06F12/0862
Abstract: A method of prefetching target data includes, in response to detecting a lock-prefixed instruction for execution in a processor, determining a predicted target memory location for the lock-prefixed instruction based on control flow information associating the lock-prefixed instruction with the predicted target memory location. Target data is prefetched from the predicted target memory location to a cache coupled with the processor, and after completion of the prefetching, the lock-prefixed instruction is executed in the processor using the prefetched target data.
-
Publication No.: US11023242B2
Publication date: 2021-06-01
Application No.: US15417555
Filing date: 2017-01-27
Applicant: Advanced Micro Devices, Inc., ATI Technologies ULC
Inventor: John Kalamatianos, Greg Sadowski, Syed Zohaib M. Gilani
IPC: G06F9/38, G06F9/30, G06F15/80, G06F12/0875
Abstract: A method and apparatus of asynchronous scheduling in a graphics device includes sending one or more instructions from an instruction scheduler to one or more instruction first-in/first-out (FIFO) devices. An instruction in the one or more FIFO devices is selected for execution by a single-instruction/multiple-data (SIMD) pipeline unit. It is determined whether all operands for the selected instruction are available for execution of the instruction, and if all the operands are available, the selected instruction is executed on the SIMD pipeline unit. The self-timed arithmetic pipeline unit (SIMD pipeline unit) is effectively encapsulated in a synchronous (e.g., clocked by a global clock) scheduler and register file environment.
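The operand-availability check described in this abstract can be sketched in software as follows. This is only an illustrative model under stated assumptions: each FIFO entry is a hypothetical `(name, sources, destination)` tuple, only the head of each FIFO is considered, and results become visible to other instructions after the current pass. None of these details come from the patent, and the self-timed hardware behavior is not modeled.

```python
from collections import deque

def issue_ready(fifos, ready_regs):
    """Issue the head of each FIFO whose source operands are all available."""
    issued = []
    produced = []
    for fifo in fifos:
        if not fifo:
            continue
        name, sources, dest = fifo[0]
        if all(src in ready_regs for src in sources):
            fifo.popleft()           # instruction leaves the FIFO and executes
            issued.append(name)
            produced.append(dest)
    ready_regs.update(produced)      # results are visible from the next pass on
    return issued
```

Calling `issue_ready` repeatedly models successive scheduling passes: an instruction that waits on a result produced in one pass can issue in the next.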
-
Publication No.: US20210056036A1
Publication date: 2021-02-25
Application No.: US16544468
Filing date: 2019-08-19
Applicant: Advanced Micro Devices, Inc.
Inventor: Alexander D. Breslow, John Kalamatianos
IPC: G06F12/0895, H03M7/30
Abstract: Systems, apparatuses, and methods for implementing flexible dictionary sharing techniques for caches are disclosed. A set-associative cache includes a dictionary for each data array set. When a cache line is to be allocated in the cache, a cache controller determines to which set a base index of the cache line address maps. Then, a selector unit determines which dictionary of a group of dictionaries stored by those sets neighboring this set would achieve the most compression for the cache line. This dictionary is then selected to compress the cache line. An offset is added to the base index of the cache line to generate a full index in order to map the cache line to the set corresponding to this chosen dictionary. The compressed cache line is stored in this set with the chosen dictionary, and the offset is stored in the corresponding tag array entry.
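The dictionary selection step in this abstract can be sketched roughly as below. The model is an assumption made for illustration: a dictionary is a Python set of words, "most compression" is approximated by the number of cache-line words found in the dictionary, and the neighboring sets are taken to be the `group_size` sets starting at the base index. The function name `choose_set` is hypothetical.

```python
def choose_set(line_words, dictionaries, base_index, group_size=4):
    """Return (full_index, offset) of the neighboring set whose dictionary fits best."""
    best_offset, best_hits = 0, -1
    for offset in range(group_size):
        dictionary = dictionaries[base_index + offset]
        # Proxy for compression: words of the line already present in the dictionary.
        hits = sum(1 for word in line_words if word in dictionary)
        if hits > best_hits:
            best_offset, best_hits = offset, hits
    full_index = base_index + best_offset  # offset added to the base index
    return full_index, best_offset
```

In this sketch the returned offset is what would be kept in the tag array entry, so that a later lookup can reconstruct the full index from the address's base index.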
-
Publication No.: US10908991B2
Publication date: 2021-02-02
Application No.: US16123489
Filing date: 2018-09-06
Applicant: Advanced Micro Devices, Inc.
Inventor: John Kalamatianos, Shrikanth Ganapathy
IPC: G06F11/10, G06F12/0815, G06F12/0804
Abstract: A computing device having a cache memory that is configured in a write-back mode is described. A cache controller in the cache memory acquires, from a record of bit errors that are present in each of a plurality of portions of the cache memory, a number of bit errors in a portion of the cache memory. The cache controller detects a coherency state of data stored in the portion of the cache memory. Based on the coherency state and the number of bit errors, the cache controller selects an error protection from among a plurality of error protections. The cache controller uses the selected error protection to protect the data stored in the portion of the cache memory from errors.
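The selection described here is, in essence, a lookup keyed by coherency state and bit-error count. The sketch below shows that shape only. The policy table, the protection names, the MOESI-style state letters, and the fallback are all hypothetical placeholders; the abstract does not state which protection is chosen in which case.

```python
# Hypothetical policy: (data is dirty, recorded bit errors) -> protection scheme.
EXAMPLE_POLICY = {
    (False, 0): "parity",
    (False, 1): "SECDED",
    (True, 0): "SECDED",
    (True, 1): "DECTED",
}

def select_protection(state, bit_errors, policy=EXAMPLE_POLICY, fallback="disable"):
    """Pick an error protection from the coherency state and the error count."""
    dirty = state in ("M", "O")  # assumed: modified/owned lines hold the only valid copy
    return policy.get((dirty, bit_errors), fallback)
```

The intuition behind the example table is that clean data can be refetched from a lower level, so it may tolerate weaker protection than dirty data in a write-back cache.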
-
Publication No.: US10558606B1
Publication date: 2020-02-11
Application No.: US16118172
Filing date: 2018-08-30
Applicant: Advanced Micro Devices, Inc.
Inventor: Shomit N. Das, Matthew Tomei, Shrikanth Ganapathy, John Kalamatianos
IPC: G06F13/42, G06F1/3296, H03M13/05
Abstract: Systems, apparatuses, and methods for reliably transmitting data over voltage scaled links are disclosed. A computing system includes at least first and second devices connected via a link. In one implementation, if a data block can be compressed to less than or equal to half the original size of the data block, then the data block is compressed and sent on the link in a single clock cycle rather than two clock cycles. If the data block cannot be compressed to half the original size, but if the data block can be compressed enough to include error correction code (ECC) bits without exceeding the original size, then ECC bits are added to the compressed block which is sent on the link at a reduced voltage. The ECC bits help to correct for any errors that are generated as a result of operating the link at the reduced voltage.
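The two cases in this abstract amount to a size comparison, sketched below. The function name, the mode labels, and the third branch (sending the block uncompressed when neither condition holds) are assumptions added for completeness; the abstract does not say what happens in that case.

```python
def plan_link_transfer(original_bits, compressed_bits, ecc_bits):
    """Choose how to send one data block, given its compressed size and ECC cost."""
    # Case 1: compresses to half or less, so it fits in one cycle instead of two.
    if compressed_bits * 2 <= original_bits:
        return "compressed_single_cycle"
    # Case 2: compressed block plus ECC still fits in the original size,
    # so the link can run at reduced voltage with ECC covering induced errors.
    if compressed_bits + ecc_bits <= original_bits:
        return "compressed_with_ecc_reduced_voltage"
    # Assumed fallback, not stated in the abstract.
    return "uncompressed"
```

For example, with a 512-bit block and 64 ECC bits, a 256-bit compressed form takes the single-cycle path, while a 400-bit compressed form takes the ECC path.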
-
Publication No.: US20190187964A1
Publication date: 2019-06-20
Application No.: US15848476
Filing date: 2017-12-20
Applicant: Advanced Micro Devices, Inc.
IPC: G06F8/41
CPC classification number: G06F8/4434, G06F8/433
Abstract: Systems, apparatuses, and methods for converting computer program source code from a first high-level language to functionally equivalent executable program code are disclosed. Source code in a first high-level language is analyzed by a code compilation tool. In response to identifying a potential bank conflict in a multi-bank register file, operands of one or more instructions are remapped such that they map to different physical banks of the multi-bank register file. Identifying a potential bank conflict comprises one or more of identifying an intra-instruction bank conflict, an inter-instruction bank conflict, and identifying a multi-word operand with a potential bank conflict.
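A minimal sketch of the intra-instruction case follows. It assumes, purely for illustration, that a register's bank is its number modulo the bank count and that a list of free registers is available for remapping; the names `bank_of` and `remap_intra_instruction` are hypothetical and the patent's actual analysis is not shown here.

```python
def bank_of(reg, num_banks=4):
    """Assumed mapping: registers are interleaved across banks by register number."""
    return reg % num_banks

def remap_intra_instruction(operands, free_regs, num_banks=4):
    """Remap operands of one instruction so that no two share a physical bank."""
    used_banks = set()
    free = list(free_regs)
    result = []
    for reg in operands:
        if bank_of(reg, num_banks) in used_banks:
            # Conflict: look for a free register that lives in an unused bank.
            for candidate in free:
                if bank_of(candidate, num_banks) not in used_banks:
                    free.remove(candidate)
                    reg = candidate
                    break
        used_banks.add(bank_of(reg, num_banks))
        result.append(reg)
    return result
```

A real compiler pass would also have to rename the register consistently at every definition and use, which this sketch leaves out.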
-