-
Publication number: US20190310845A1
Publication date: 2019-10-10
Application number: US16450897
Application date: 2019-06-24
Applicant: Advanced Micro Devices, Inc.
Inventor: Krishnan V. Ramani , Kai Troester , Frank C. Galloway , David N. Suggs , Michael D. Achenbach , Betty Ann McDaniel , Marius Evers
Abstract: A system and method for tracking stores and loads to reduce load latency when forming the same memory address by bypassing a load store unit within an execution unit is disclosed. Store-load pairs which have a strong history of store-to-load forwarding are identified. Once identified, the load is memory renamed to the register stored by the store. The memory dependency predictor may also be used to detect loads that are dependent on a store but cannot be renamed. In such a configuration, the dependence is signaled to the load store unit and the load store unit uses the information to issue the load after the identified store has its physical address.
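Illustrative sketch (not taken from the patent): one way to picture "a strong history of store-to-load forwarding" is a table of saturating counters, one per store-load pair. The class name `ForwardingHistory`, the counter width, the threshold, and the use of instruction addresses as keys are all assumptions made for this example.

```python
class ForwardingHistory:
    """Hypothetical predictor: one saturating counter per (store, load) pair."""

    def __init__(self, threshold=2, max_count=3):
        self.threshold = threshold   # assumed confidence needed before renaming
        self.max_count = max_count   # assumed saturation limit (2-bit counter)
        self.counters = {}           # (store_pc, load_pc) -> counter value

    def train(self, store_pc, load_pc, forwarded):
        """Raise the counter when the pair forwarded, lower it otherwise."""
        key = (store_pc, load_pc)
        count = self.counters.get(key, 0)
        if forwarded:
            count = min(count + 1, self.max_count)
        else:
            count = max(count - 1, 0)
        self.counters[key] = count

    def should_rename(self, store_pc, load_pc):
        """True when the pair's history is strong enough to rename the load."""
        return self.counters.get((store_pc, load_pc), 0) >= self.threshold


history = ForwardingHistory()
for _ in range(2):
    history.train(0x400, 0x410, forwarded=True)

strong_pair = history.should_rename(0x400, 0x410)  # trained twice
weak_pair = history.should_rename(0x400, 0x420)    # never observed
```

In this sketch a pair that is below the threshold would simply fall back to the normal load store unit path; the abstract's second case (a dependent load that cannot be renamed) is not modeled.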
-
Publication number: US20200225956A1
Publication date: 2020-07-16
Application number: US16834834
Application date: 2020-03-30
Applicant: Advanced Micro Devices, Inc.
Inventor: David N. Suggs
IPC: G06F9/38 , G06F12/0875 , G06F12/0855 , G06F12/0862
Abstract: A system and method for using an operation (op) cache is disclosed. The system and method include an op cache for caching previously decoded instructions. The op cache includes a plurality of physically indexed and tagged instructions allowing sharing of instructions between threads. The op cache is chained through multiple ways allowing service of a plurality of instructions in a cache line. The op cache is stored between a shared operation storage and immediate/displacement storage to maximize capacity.
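Illustrative sketch (not taken from the patent): the sharing benefit of physical indexing and tagging can be shown with a toy cache keyed by physical address. The `OpCache` class, the per-thread page-table dictionaries, and the `decode` stand-in are assumptions for this example; chaining through ways and the split op/immediate storage are not modeled.

```python
class OpCache:
    """Hypothetical op cache keyed by physical address."""

    def __init__(self):
        self.lines = {}    # physical address -> list of decoded ops
        self.decodes = 0   # how many times the decoder had to run

    def fetch(self, phys_addr, decode):
        """Return cached ops, decoding only on a miss."""
        if phys_addr not in self.lines:
            self.lines[phys_addr] = decode(phys_addr)
            self.decodes += 1
        return self.lines[phys_addr]


def decode(phys_addr):
    # Stand-in for the real decoder: produce a placeholder op list.
    return [f"op@{phys_addr:#x}"]


# Two threads map different virtual addresses to the same physical code page.
page_tables = {
    0: {0x1000: 0x8000},
    1: {0x5000: 0x8000},
}

cache = OpCache()
ops_thread0 = cache.fetch(page_tables[0][0x1000], decode)
ops_thread1 = cache.fetch(page_tables[1][0x5000], decode)
```

Because both lookups resolve to the same physical address, the second thread hits the entry filled by the first and the decoder runs only once.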
-
Publication number: US10606599B2
Publication date: 2020-03-31
Application number: US15374727
Application date: 2016-12-09
Applicant: Advanced Micro Devices, Inc.
Inventor: David N. Suggs
IPC: G06F9/38 , G06F12/0875 , G06F12/0855 , G06F12/0862 , G06F9/30
Abstract: A system and method for using an operation (op) cache is disclosed. The system and method include an op cache for caching previously decoded instructions. The op cache includes a plurality of physically indexed and tagged instructions allowing sharing of instructions between threads. The op cache is chained through multiple ways allowing service of a plurality of instructions in a cache line. The op cache is stored between a shared operation storage and immediate/displacement storage to maximize capacity.
-
Publication number: US20140136822A1
Publication date: 2014-05-15
Application number: US13673244
Application date: 2012-11-09
Applicant: ADVANCED MICRO DEVICES, INC.
Inventor: David N. Suggs , Luke Yen , Steven Beigelmacher
IPC: G06F9/38
CPC classification number: G06F9/381 , G06F9/30065 , G06F9/30145 , G06F9/325 , G06F9/3802 , G06F9/3808 , G06F9/3814 , G06F9/3867
Abstract: In a normal, non-loop mode a uOp buffer receives and stores for dispatch the uOps generated by a decode stage based on a received instruction sequence. In response to detecting a loop in the instruction sequence, the uOp buffer is placed into a loop mode whereby, after the uOps associated with the loop have been stored at the uOp buffer, storage of further uOps at the buffer is suspended. To execute the loop, the uOp buffer repeatedly dispatches the uOps associated with the loop's instructions until the end condition of the loop is met and the uOp buffer exits the loop mode.
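Illustrative sketch (not taken from the patent): the two modes described in the abstract can be modeled as a small state machine. The `UopBuffer` class, its method names, and the `end_condition_met` callback are assumptions for this example; loop detection itself and normal-mode dispatch are left out.

```python
class UopBuffer:
    """Hypothetical uOp buffer with a normal mode and a loop mode."""

    def __init__(self):
        self.entries = []       # uOps currently held for dispatch
        self.loop_mode = False

    def store(self, uop):
        """Accept a uOp from decode; refuse while in loop mode."""
        if self.loop_mode:
            return False        # storage of further uOps is suspended
        self.entries.append(uop)
        return True

    def enter_loop_mode(self):
        """Called once the whole loop body is held in the buffer."""
        self.loop_mode = True

    def run_loop(self, end_condition_met):
        """Dispatch the stored loop body repeatedly until the loop ends."""
        dispatched = []
        iteration = 0
        while True:
            dispatched.extend(self.entries)
            iteration += 1
            if end_condition_met(iteration):
                break
        self.loop_mode = False  # exit loop mode when the loop finishes
        return dispatched


buf = UopBuffer()
for uop in ["load", "add", "branch"]:
    buf.store(uop)
buf.enter_loop_mode()
accepted_in_loop_mode = buf.store("extra")        # refused
dispatched = buf.run_loop(lambda i: i >= 3)       # three iterations
```

The point of the sketch is that the decode stage is not needed while the loop runs: every iteration is served from the uOps already in the buffer.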
-
Publication number: US20230195517A1
Publication date: 2023-06-22
Application number: US17559251
Application date: 2021-12-22
Applicant: Advanced Micro Devices, Inc.
Inventor: David N. Suggs
IPC: G06F9/48
CPC classification number: G06F9/4881
Abstract: A multi-cycle scheduler for a processor includes early wake circuitry, late wake circuitry, and picker circuitry. In a first cycle of a clock, the early wake circuitry speculatively identifies child micro-operations as ready whose dependencies are satisfied by a set of ready parent micro-operations. In a second cycle of the clock, the picker circuitry picks at least one of the child micro-operations identified as ready for issue to execution circuitry. In addition, the late wake circuitry blocks from issue at least one picked child micro-operation speculatively identified as ready upon determining that a respective parent micro-operation did not issue to execution circuitry.
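Illustrative sketch (not taken from the patent): the early wake, pick, and late wake steps can be written as three small functions over sets of micro-operation names. The function names, the dictionary layout, and the fixed pick width are assumptions for this example, and the real circuitry works on hardware signals rather than Python sets.

```python
def early_wake(children, ready_parents):
    """Cycle 1: speculatively mark children whose parents are all ready."""
    return [c for c, parents in children.items() if parents <= ready_parents]


def pick(ready_children, width):
    """Cycle 2: pick up to `width` of the children marked ready."""
    return ready_children[:width]


def late_wake(picked, children, issued_parents):
    """Block any picked child with a parent that did not actually issue."""
    return [c for c in picked if children[c] <= issued_parents]


# child micro-op -> set of parent micro-ops it depends on
children = {"c1": {"p1"}, "c2": {"p2"}, "c3": {"p1", "p3"}}
ready_parents = {"p1", "p2"}   # parents that were ready in cycle 1
issued_parents = {"p1"}        # p2 was ready but did not issue

speculative = early_wake(children, ready_parents)
picked = pick(speculative, width=2)
issued = late_wake(picked, children, issued_parents)
```

Here `c2` is woken speculatively and picked, then blocked because its parent `p2` was ready but never issued; `c3` is never woken because `p3` was not ready.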
-
Publication number: US10331357B2
Publication date: 2019-06-25
Application number: US15380778
Application date: 2016-12-15
Applicant: Advanced Micro Devices, Inc.
Inventor: Betty Ann McDaniel , Michael D. Achenbach , David N. Suggs , Frank C. Galloway , Kai Troester , Krishnan V. Ramani
IPC: G06F3/06 , G06F12/0871 , G06F12/0897 , G06F9/30
Abstract: A system and method for tracking stores and loads to reduce load latency when forming the same memory address by bypassing a load store unit within an execution unit is disclosed. The system and method include storing data in one or more memory dependent architectural register numbers (MdArns), allocating the one or more MdArns to a MEMFILE, writing the allocated one or more MdArns to a map file, wherein the map file contains a MdArn map to enable subsequent access to an entry in the MEMFILE, upon receipt of a load request, checking a base, an index, a displacement and a match/hit via the map file to identify an entry in the MEMFILE and an associated store, and on a hit, providing the entry responsive to the load request from the one or more MdArns.
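Illustrative sketch (not taken from the patent): the store-then-load flow in the abstract can be pictured as a small table keyed by the address components. The `MemFile` class, the round-robin MdArn allocation, the table size, and the merging of the map file into one dictionary are assumptions for this example; invalidation when a base or index register is overwritten is not modeled.

```python
class MemFile:
    """Hypothetical MEMFILE: maps (base, index, displacement) to an MdArn."""

    def __init__(self, num_mdarns=4):
        self.num_mdarns = num_mdarns
        self.next_mdarn = 0
        self.entries = {}     # (base, index, disp) -> MdArn number
        self.mdarn_data = {}  # MdArn number -> data written by the store

    def record_store(self, base, index, disp, data):
        """Allocate an MdArn for a store and remember its address form."""
        mdarn = self.next_mdarn
        self.next_mdarn = (self.next_mdarn + 1) % self.num_mdarns
        # Drop any older entry that was using the MdArn being reallocated.
        self.entries = {k: v for k, v in self.entries.items() if v != mdarn}
        self.entries[(base, index, disp)] = mdarn
        self.mdarn_data[mdarn] = data
        return mdarn

    def lookup_load(self, base, index, disp):
        """On a hit, return the store's data from its MdArn; else None."""
        mdarn = self.entries.get((base, index, disp))
        if mdarn is None:
            return None
        return self.mdarn_data[mdarn]


memfile = MemFile()
memfile.record_store("rsp", None, 8, data=42)
hit_value = memfile.lookup_load("rsp", None, 8)    # same base/index/disp
miss_value = memfile.lookup_load("rsp", None, 16)  # different displacement
```

A hit means the load can take its value from the MdArn without waiting on the load store unit; a miss falls back to the normal load path.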
-
Publication number: US20180052613A1
Publication date: 2018-02-22
Application number: US15380778
Application date: 2016-12-15
Applicant: Advanced Micro Devices, Inc.
Inventor: Betty Ann McDaniel , Michael D. Achenbach , David N. Suggs , Frank C. Galloway , Kai Troester , Krishnan V. Ramani
IPC: G06F3/06 , G06F12/0871 , G06F12/0897
CPC classification number: G06F3/0611 , G06F3/0631 , G06F3/0643 , G06F3/0659 , G06F3/0673 , G06F9/30 , G06F12/0871 , G06F12/0897 , G06F2212/1024 , G06F2212/304 , G06F2212/463 , G06F2212/604
Abstract: A system and method for tracking stores and loads to reduce load latency when forming the same memory address by bypassing a load store unit within an execution unit is disclosed. The system and method include storing data in one or more memory dependent architectural register numbers (MdArns), allocating the one or more MdArns to a MEMFILE, writing the allocated one or more MdArns to a map file, wherein the map file contains a MdArn map to enable subsequent access to an entry in the MEMFILE, upon receipt of a load request, checking a base, an index, a displacement and a match/hit via the map file to identify an entry in the MEMFILE and an associated store, and on a hit, providing the entry responsive to the load request from the one or more MdArns.
-
Publication number: US09710276B2
Publication date: 2017-07-18
Application number: US13673244
Application date: 2012-11-09
Applicant: Advanced Micro Devices, Inc.
Inventor: David N. Suggs , Luke Yen , Steven Beigelmacher
CPC classification number: G06F9/381 , G06F9/30065 , G06F9/30145 , G06F9/325 , G06F9/3802 , G06F9/3808 , G06F9/3814 , G06F9/3867
Abstract: In a normal, non-loop mode a uOp buffer receives and stores for dispatch the uOps generated by a decode stage based on a received instruction sequence. In response to detecting a loop in the instruction sequence, the uOp buffer is placed into a loop mode whereby, after the uOps associated with the loop have been stored at the uOp buffer, storage of further uOps at the buffer is suspended. To execute the loop, the uOp buffer repeatedly dispatches the uOps associated with the loop's instructions until the end condition of the loop is met and the uOp buffer exits the loop mode.
-
Publication number: US11048506B2
Publication date: 2021-06-29
Application number: US16450897
Application date: 2019-06-24
Applicant: Advanced Micro Devices, Inc.
Inventor: Krishnan V. Ramani , Kai Troester , Frank C. Galloway , David N. Suggs , Michael D. Achenbach , Betty Ann McDaniel , Marius Evers
Abstract: A system and method for tracking stores and loads to reduce load latency when forming the same memory address by bypassing a load store unit within an execution unit is disclosed. Store-load pairs which have a strong history of store-to-load forwarding are identified. Once identified, the load is memory renamed to the register stored by the store. The memory dependency predictor may also be used to detect loads that are dependent on a store but cannot be renamed. In such a configuration, the dependence is signaled to the load store unit and the load store unit uses the information to issue the load after the identified store has its physical address.
-
Publication number: US20180165096A1
Publication date: 2018-06-14
Application number: US15374727
Application date: 2016-12-09
Applicant: Advanced Micro Devices, Inc.
Inventor: David N. Suggs
IPC: G06F9/38 , G06F12/0875
CPC classification number: G06F9/3808 , G06F9/3012 , G06F9/30152 , G06F9/30167 , G06F9/30174 , G06F9/3802 , G06F9/3806 , G06F9/3816 , G06F9/382 , G06F9/3826 , G06F9/3828 , G06F9/3836 , G06F9/3838 , G06F9/384 , G06F9/3842 , G06F9/3846 , G06F9/3851 , G06F9/3857 , G06F9/3885 , G06F9/3891 , G06F12/0855 , G06F12/0862 , G06F12/0875 , G06F2212/1024 , G06F2212/452 , G06F2212/6028
Abstract: A system and method for using an operation (op) cache is disclosed. The system and method include an op cache for caching previously decoded instructions. The op cache includes a plurality of physically indexed and tagged instructions allowing sharing of instructions between threads. The op cache is chained through multiple ways allowing service of a plurality of instructions in a cache line. The op cache is stored between a shared operation storage and immediate/displacement storage to maximize capacity.