-
Publication No.: US11048506B2
Publication date: 2021-06-29
Application No.: US16450897
Filing date: 2019-06-24
Applicant: Advanced Micro Devices, Inc.
Inventor: Krishnan V. Ramani, Kai Troester, Frank C. Galloway, David N. Suggs, Michael D. Achenbach, Betty Ann McDaniel, Marius Evers
Abstract: A system and method for tracking stores and loads to reduce load latency when forming the same memory address by bypassing a load store unit within an execution unit is disclosed. Store-load pairs which have a strong history of store-to-load forwarding are identified. Once identified, the load is memory renamed to the register stored by the store. The memory dependency predictor may also be used to detect loads that are dependent on a store but cannot be renamed. In such a configuration, the dependence is signaled to the load store unit and the load store unit uses the information to issue the load after the identified store has its physical address.
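The identification of store-load pairs with "a strong history of store-to-load forwarding" can be pictured with a small confidence table. The following is a minimal sketch under stated assumptions, not the patented design: the class name `StoreLoadPredictor`, the indexing by load address, the 2-bit saturating counter, and the threshold of 2 are all hypothetical choices made for illustration.

```python
from typing import Dict, Optional, Tuple


class StoreLoadPredictor:
    """Toy predictor: one saturating confidence counter per load, tagged with a store."""

    def __init__(self, threshold: int = 2, max_count: int = 3) -> None:
        self.threshold = threshold  # assumed confidence needed before renaming
        self.max_count = max_count  # assumed 2-bit saturating counter
        # Hypothetical table layout: load_pc -> (store_pc, confidence count)
        self.table: Dict[int, Tuple[int, int]] = {}

    def train(self, load_pc: int, store_pc: Optional[int]) -> None:
        """Update after a load executes; store_pc is the forwarding store, if any."""
        prev = self.table.get(load_pc)
        if store_pc is None:
            # No forwarding observed: decay confidence for a known load.
            if prev is not None:
                self.table[load_pc] = (prev[0], max(prev[1] - 1, 0))
        elif prev is not None and prev[0] == store_pc:
            # Same store forwarded again: strengthen the pairing.
            self.table[load_pc] = (store_pc, min(prev[1] + 1, self.max_count))
        else:
            # New or different store: start over with low confidence.
            self.table[load_pc] = (store_pc, 1)

    def predict(self, load_pc: int) -> Optional[int]:
        """Return the store to rename from, or None if confidence is too low."""
        entry = self.table.get(load_pc)
        if entry is not None and entry[1] >= self.threshold:
            return entry[0]
        return None
```

In this sketch a load is only treated as renameable after the same store has forwarded to it at least twice in a row, which is one plausible reading of "strong history"; the abstract does not state the actual criterion.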
-
Publication No.: US20190310845A1
Publication date: 2019-10-10
Application No.: US16450897
Filing date: 2019-06-24
Applicant: Advanced Micro Devices, Inc.
Inventor: Krishnan V. Ramani, Kai Troester, Frank C. Galloway, David N. Suggs, Michael D. Achenbach, Betty Ann McDaniel, Marius Evers
Abstract: A system and method for tracking stores and loads to reduce load latency when forming the same memory address by bypassing a load store unit within an execution unit is disclosed. Store-load pairs which have a strong history of store-to-load forwarding are identified. Once identified, the load is memory renamed to the register stored by the store. The memory dependency predictor may also be used to detect loads that are dependent on a store but cannot be renamed. In such a configuration, the dependence is signaled to the load store unit and the load store unit uses the information to issue the load after the identified store has its physical address.
-
Publication No.: US10331357B2
Publication date: 2019-06-25
Application No.: US15380778
Filing date: 2016-12-15
Applicant: Advanced Micro Devices, Inc.
Inventor: Betty Ann McDaniel, Michael D. Achenbach, David N. Suggs, Frank C. Galloway, Kai Troester, Krishnan V. Ramani
IPC: G06F3/06, G06F12/0871, G06F12/0897, G06F9/30
Abstract: A system and method for tracking stores and loads to reduce load latency when forming the same memory address by bypassing a load store unit within an execution unit is disclosed. The system and method include storing data in one or more memory dependent architectural register numbers (MdArns), allocating the one or more MdArns to a MEMFILE, writing the allocated one or more MdArns to a map file, wherein the map file contains a MdArn map to enable subsequent access to an entry in the MEMFILE, upon receipt of a load request, checking a base, an index, a displacement and a match/hit via the map file to identify an entry in the MEMFILE and an associated store, and on a hit, providing the entry responsive to the load request from the one or more MdArns.
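The lookup this abstract describes (match a load's base, index, and displacement against recorded stores and, on a hit, serve the load from an MdArn) can be sketched as a small keyed table. This is an illustrative model only: the class name `MemFile`, the capacity of four entries, oldest-first replacement, the register names, and the invalidation rule are assumptions that do not come from the abstract.

```python
from collections import OrderedDict
from typing import Optional, Tuple

# Hypothetical key: (base register, index register or None, displacement).
AddrKey = Tuple[str, Optional[str], int]


class MemFile:
    """Toy model: maps a store's address components to the MdArn holding its data."""

    def __init__(self, capacity: int = 4) -> None:  # capacity is an assumption
        self.capacity = capacity
        self.entries: "OrderedDict[AddrKey, int]" = OrderedDict()

    def record_store(self, key: AddrKey, md_arn: int) -> None:
        """Allocate an entry for a store, evicting the oldest entry when full."""
        if key in self.entries:
            del self.entries[key]  # a newer store to the same address wins
        elif len(self.entries) >= self.capacity:
            self.entries.popitem(last=False)  # assumed oldest-first replacement
        self.entries[key] = md_arn

    def lookup_load(self, key: AddrKey) -> Optional[int]:
        """Return the MdArn on a hit, or None on a miss (load takes the normal path)."""
        return self.entries.get(key)

    def invalidate_register(self, reg: str) -> None:
        """Assumed rule: drop entries whose base or index register was overwritten,
        since their recorded address components no longer name the same address."""
        stale = [k for k in self.entries if reg in (k[0], k[1])]
        for k in stale:
            del self.entries[k]
```

For example, after `mf.record_store(("rsp", None, 8), md_arn=0)`, a load with the same three components hits and can be served from MdArn 0 without going through the load store unit, while a load at displacement 16 misses.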
-
Publication No.: US20180052613A1
Publication date: 2018-02-22
Application No.: US15380778
Filing date: 2016-12-15
Applicant: Advanced Micro Devices, Inc.
Inventor: Betty Ann McDaniel, Michael D. Achenbach, David N. Suggs, Frank C. Galloway, Kai Troester, Krishnan V. Ramani
IPC: G06F3/06, G06F12/0871, G06F12/0897
CPC classification number: G06F3/0611, G06F3/0631, G06F3/0643, G06F3/0659, G06F3/0673, G06F9/30, G06F12/0871, G06F12/0897, G06F2212/1024, G06F2212/304, G06F2212/463, G06F2212/604
Abstract: A system and method for tracking stores and loads to reduce load latency when forming the same memory address by bypassing a load store unit within an execution unit is disclosed. The system and method include storing data in one or more memory dependent architectural register numbers (MdArns), allocating the one or more MdArns to a MEMFILE, writing the allocated one or more MdArns to a map file, wherein the map file contains a MdArn map to enable subsequent access to an entry in the MEMFILE, upon receipt of a load request, checking a base, an index, a displacement and a match/hit via the map file to identify an entry in the MEMFILE and an associated store, and on a hit, providing the entry responsive to the load request from the one or more MdArns.
-
Publication No.: US20150067389A1
Publication date: 2015-03-05
Application No.: US14014220
Filing date: 2013-08-29
Applicant: ADVANCED MICRO DEVICES, INC.
Inventor: Frank C. Galloway
CPC classification number: G06F9/3016, G06F9/30014, G06F9/30072, G06F9/3013, G06F11/36, G06F11/3636, G06F11/3644
Abstract: The apparatuses, systems, and methods in accordance with the embodiments disclosed herein may facilitate modifying post silicon instruction behavior. Embodiments herein may provide registers in predetermined locations in an integrated circuit. These registers may be mapped to generic instructions, which can modify an operation of the integrated circuit. In some embodiments, these registers may be used to implement a patch routine to change the behavior of at least a portion of the integrated circuit. In this manner, the original design of the integrated circuit may be altered.
-
Publication No.: US12039337B2
Publication date: 2024-07-16
Application No.: US17032494
Filing date: 2020-09-25
Applicant: ADVANCED MICRO DEVICES, INC.
Inventor: Robert B. Cohen, Tzu-Wei Lin, Anthony J. Bybell, Bill Kai Chiu Kwan, Frank C. Galloway
CPC classification number: G06F9/3804, G06F9/30058, G06F9/3822, G06F9/3867
Abstract: A processor employs a plurality of fetch and decode pipelines by dividing an instruction stream into instruction blocks with identified boundaries. The processor includes a branch predictor that generates branch predictions. Each branch prediction corresponds to a branch instruction and includes a prediction that the corresponding branch is to be taken or not taken. In addition, each branch prediction identifies both an end of the current branch prediction window and the start of another branch prediction window. Using these known boundaries, the processor provides different sequential fetch streams to different ones of the plurality of fetch and decode states, which concurrently process the instructions of the different fetch streams, thereby improving overall instruction throughput at the processor.
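The idea of feeding different fetch streams to different pipelines once window boundaries are known can be sketched as follows. This is a simplified illustration, not the patented mechanism: the function name `assign_windows`, the representation of a prediction window as a `(start, end)` address pair, and the round-robin assignment policy are all assumptions.

```python
from typing import List, Tuple

Window = Tuple[int, int]  # hypothetical (start address, end address) of one window


def assign_windows(windows: List[Window], num_pipelines: int) -> List[List[Window]]:
    """Distribute consecutive prediction windows across pipelines round-robin.

    Because each window's start and end are already known from the branch
    prediction, every pipeline can begin fetching and decoding its windows
    without waiting for earlier windows to be decoded.
    """
    if num_pipelines < 1:
        raise ValueError("num_pipelines must be at least 1")
    streams: List[List[Window]] = [[] for _ in range(num_pipelines)]
    for i, window in enumerate(windows):
        streams[i % num_pipelines].append(window)
    return streams
```

With two pipelines and three windows in program order, the first and third windows go to pipeline 0 and the second to pipeline 1. A real design would also have to restore program order after decode, which this sketch leaves out.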
-