-
Publication No.: US10228956B2
Publication date: 2019-03-12
Application No.: US15282266
Filing date: 2016-09-30
Applicant: Intel Corporation
Inventor: Vineeth Mekkat, Mark J. Dechene, Zhongying Zhang, Jason Agron, Sebastian Winkel
Abstract: In one implementation, a processing device is provided that includes a memory to store instructions and a processor core to execute the instructions. The processor core is to receive a sequence of instructions reordered by a binary translator for execution. A first load of the sequence of instructions is identified. The first load references a memory location that stores a data item to be loaded. An occurrence of a second load is detected. The second load is to access the memory location subsequent to an execution of the first load. A protection field in the first load is enabled based on the detected occurrence of the second load. The enabled protection field indicates that the first load is to be checked for an aliasing associated with the memory location with respect to a subsequent store instruction. The second load is eliminated based on the enabling of the protection field.
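The flow in this abstract can be illustrated with a small sketch. It is not the patented implementation: the `Insn` record, the `protected` flag, the symbolic-address matching in `eliminate_redundant_loads`, and the toy `execute` checker are all assumptions introduced here. The translator-side pass marks the first load as protected and turns a later load of the same symbolic address into a register move; the executor then reports aliasing if a subsequent store hits a protected address.

```python
from dataclasses import dataclass

@dataclass
class Insn:
    op: str                  # "load", "store", or "mov"
    reg: str                 # destination (load/mov) or source (store) register
    addr: str = ""           # symbolic address for load/store (hypothetical)
    src: str = ""            # source register for mov
    protected: bool = False  # hypothetical protection field on a load

def eliminate_redundant_loads(seq):
    """Translator-side sketch: protect the first load and replace a later
    load of the same symbolic address with a register move."""
    out = []
    first_load = {}  # symbolic address -> first load still holding its value
    for insn in seq:
        prior = first_load.get(insn.addr) if insn.op == "load" else None
        if prior is not None:
            prior.protected = True
            insn = Insn("mov", insn.reg, src=prior.reg)
        elif insn.op == "store":
            # A store to the very same symbolic address makes the old value stale.
            first_load.pop(insn.addr, None)
        if insn.op in ("load", "mov"):
            # The destination register is overwritten; forget loads held there.
            first_load = {a: l for a, l in first_load.items() if l.reg != insn.reg}
        if insn.op == "load":
            first_load[insn.addr] = insn
        out.append(insn)
    return out

def execute(seq, memory, env):
    """Toy executor: env maps symbolic addresses to concrete ones.
    Returns (registers, alias_detected)."""
    regs, guarded = {}, set()
    for insn in seq:
        if insn.op == "load":
            regs[insn.reg] = memory.get(env[insn.addr], 0)
            if insn.protected:
                guarded.add(env[insn.addr])
        elif insn.op == "store":
            if env[insn.addr] in guarded:
                return regs, True  # aliasing: the optimized sequence is unsafe
            memory[env[insn.addr]] = regs[insn.reg]
        elif insn.op == "mov":
            regs[insn.reg] = regs[insn.src]
    return regs, False

seq = [Insn("load", "r1", "p"), Insn("store", "r1", "q"), Insn("load", "r2", "p")]
opt = eliminate_redundant_loads(seq)
no_alias = execute(opt, {100: 7}, {"p": 100, "q": 200})
alias = execute(opt, {100: 7}, {"p": 100, "q": 100})
```

When `p` and `q` resolve to different addresses the eliminated load is harmless; when they resolve to the same address the sketch flags the alias, which is the case the protection field exists to catch.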
-
Publication No.: US20180095765A1
Publication date: 2018-04-05
Application No.: US15282266
Filing date: 2016-09-30
Applicant: Intel Corporation
Inventor: Vineeth Mekkat, Mark J. Dechene, Zhongying Zhang, Jason Agron, Sebastian Winkel
CPC classification number: G06F9/455, G06F8/443, G06F8/4432, G06F8/4434, G06F8/4435, G06F8/445, G06F8/52, G06F9/30043, G06F9/30145, G06F9/3017, G06F9/30181, G06F9/30185, G06F9/3834
Abstract: In one implementation, a processing device is provided that includes a memory to store instructions and a processor core to execute the instructions. The processor core is to receive a sequence of instructions reordered by a binary translator for execution. A first load of the sequence of instructions is identified. The first load references a memory location that stores a data item to be loaded. An occurrence of a second load is detected. The second load is to access the memory location subsequent to an execution of the first load. A protection field in the first load is enabled based on the detected occurrence of the second load. The enabled protection field indicates that the first load is to be checked for an aliasing associated with the memory location with respect to a subsequent store instruction. The second load is eliminated based on the enabling of the protection field.
-
Publication No.: US20180074827A1
Publication date: 2018-03-15
Application No.: US15265587
Filing date: 2016-09-14
Applicant: Intel Corporation
Inventor: Vineeth Mekkat, Youfeng Wu, Sebastian Winkel, Oleg Margulis
IPC: G06F9/38, G06F9/30, G06F12/0875
Abstract: A processor for redundant stores includes a front end including circuitry to decode instructions from an instruction stream, a data cache unit including circuitry to cache data for the processor, a binary translator, and a memory execution unit. The binary translator includes circuitry to identify a first region of the instruction stream including a redundant store, mark a first starting instruction of the first region with a protection designator, mark a first ending instruction of the first region with a clear designator, and store an amended instruction stream with the markings. The memory execution unit includes circuitry to track the first redundant store based on the protection designator and the clear designator to eliminate the first redundant store.
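One possible reading of the protection and clear designators is sketched below. The abstract does not say how the memory execution unit tracks the region, so the buffering scheme, the `run_stores` function, and the `"protect"` / `"clear"` marker strings are assumptions made only for illustration. Inside a marked region, a later store to the same address supersedes an earlier one, and only the surviving value is written when the clear designator is reached.

```python
def run_stores(stream):
    """stream: iterable of (addr, value, mark), where mark is None,
    "protect" (starts a region) or "clear" (ends it). Hypothetical encoding."""
    memory_writes = []  # stores that actually reach memory, in order
    pending = {}        # address -> latest value while a region is open
    in_region = False
    for addr, value, mark in stream:
        if mark == "protect":
            in_region = True
        if in_region:
            pending[addr] = value  # an earlier store to addr becomes redundant
        else:
            memory_writes.append((addr, value))
        if mark == "clear":
            memory_writes.extend(pending.items())
            pending.clear()
            in_region = False
    return memory_writes

writes = run_stores([
    (0x10, 1, "protect"),  # redundant: overwritten before the region ends
    (0x20, 5, None),
    (0x10, 2, "clear"),
    (0x30, 9, None),
])
```

Four stores are issued but only three reach memory, because the first store to `0x10` is eliminated. A real design would also have to handle loads and aliasing inside the region, which this sketch ignores.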
-
Publication No.: US20170286110A1
Publication date: 2017-10-05
Application No.: US15087786
Filing date: 2016-03-31
Applicant: Intel Corporation
Inventor: Jason M. Agron, Alex Merrick, Vineeth Mekkat
CPC classification number: G06F9/30145, G06F9/30167, G06F9/3017, G06F9/30181, G06F9/3832, G06F12/0875, G06F2212/452
Abstract: A hardware-software co-designed processor includes a front end to decode an instruction, an execution unit to execute the instruction, an auxiliary cache to store auxiliary information for consumption during execution of the instruction, an instruction blender, and a retirement unit to retire the instruction. The auxiliary information may include long immediate values, non-working instructions for emulating an untranslated instruction stream, or execution hints, and is not decoded by the front end. The auxiliary cache includes circuitry to receive the auxiliary information from a binary translator, to store the auxiliary information in the auxiliary cache, and to provide the auxiliary information to the instruction blender prior to execution. The instruction blender includes circuitry to receive the auxiliary information, to blend the instruction with the auxiliary information, and to provide the blended instruction to the execution unit. Use of the auxiliary cache may reduce fetch and decode bandwidth requirements.
-
Publication No.: US09710389B2
Publication date: 2017-07-18
Application No.: US14643354
Filing date: 2015-03-10
Applicant: INTEL CORPORATION
Inventor: Oleg Margulis, Sumit Ahuja, Polychronis Xekalakis, Yongjun Park, Vineeth Mekkat, Igor Yanover, Sebastian Winkel, Ethan Schuchman
IPC: G06F12/06, G06F12/0875, G06F9/38, G06F9/46
CPC classification number: G06F12/0875, G06F9/38, G06F9/3834, G06F9/3838, G06F9/3855, G06F9/467, G06F2212/1008, G06F2212/452
Abstract: A processor and method are described for alias detection. For example, one embodiment of an apparatus comprises: reordering logic to receive a set of read and write operations in a program order and to responsively reorder the read and write operations; adjustment information attachment logic to associate adjustment information with one or more of the set of read and write operations, wherein for a read operation the adjustment information is to indicate a number of write operations which the read operation has bypassed and for a write operation the adjustment information is to indicate a number of read operations which have bypassed the write operation; and out-of-order processing logic to determine whether execution of the reordered read and write operations will result in a conflict based, at least in part, on the adjustment information associated with the one or more reads and writes.
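The adjustment information described in this abstract can be computed from the program order and the reordered order. The sketch below shows only that counting step; the `adjustment_info` function and the `(name, kind)` encoding are assumptions, and how the out-of-order logic consumes the counts to detect a conflict is not modeled.

```python
def adjustment_info(program, reordered):
    """program, reordered: lists of (name, kind) with kind "R" or "W".
    Both lists hold the same operations in different orders."""
    pos = {name: i for i, (name, _) in enumerate(program)}
    info = {}
    for j, (name, kind) in enumerate(reordered):
        if kind == "R":
            # Writes that precede this read in program order
            # but now execute after it: the read has bypassed them.
            info[name] = sum(1 for n, k in reordered[j + 1:]
                             if k == "W" and pos[n] < pos[name])
        else:
            # Reads that follow this write in program order
            # but now execute before it: they have bypassed the write.
            info[name] = sum(1 for n, k in reordered[:j]
                             if k == "R" and pos[n] > pos[name])
    return info

program = [("W0", "W"), ("W1", "W"), ("R0", "R"), ("R1", "R")]
reordered = [("R0", "R"), ("W0", "W"), ("R1", "R"), ("W1", "W")]
info = adjustment_info(program, reordered)
```

Here `R0` was hoisted above both writes, so its count is 2, while `W1` ends up executing after both reads that originally followed it, so its count is also 2.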
-
Publication No.: US12216581B2
Publication date: 2025-02-04
Application No.: US18320780
Filing date: 2023-05-19
Applicant: Intel Corporation
Inventor: Sreenivas Subramoney, Stanislav Shwartsman, Anant Nori, Shankar Balachandran, Elad Shtiegmann, Vineeth Mekkat, Manjunath Shevgoor, Sourabh Alurkar
IPC: G06F12/0862
Abstract: System and method for prefetching pointer-referenced data. A method embodiment includes: tracking a plurality of load instructions which includes a first load instruction to access a first data that identifies a first memory location; detecting a second load instruction which accesses a second memory location for a second data, the second memory location matching the first memory location identified by the first data; responsive to the detecting, updating a list of pointer load instructions to include information identifying the first load instruction as a pointer load instruction; prefetching a third data for a third load instruction prior to executing the third load instruction; identifying the third load instruction as a pointer load instruction based on information from the list of pointer load instructions and responsively prefetching a fourth data from a fourth memory location, wherein the fourth memory location is identified by the third data.
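The method steps in this abstract map onto a small software model. The sketch below is an assumption-laden illustration, not the hardware design: the `PointerPrefetcher` class, its unbounded tables, and the dictionary standing in for memory are all hypothetical. A load is recorded as a pointer load once a later load accesses the address that the first load returned as data; a prefetch on behalf of a known pointer load then also prefetches the location its data points to.

```python
class PointerPrefetcher:
    def __init__(self, memory):
        self.memory = memory        # address -> value (toy memory)
        self.recent = {}            # loaded value -> pc of the load that produced it
        self.pointer_loads = set()  # pcs identified as pointer loads
        self.prefetched = []        # addresses this sketch would prefetch

    def observe_load(self, pc, addr):
        # If an earlier load produced this address as its data,
        # that earlier load is a pointer load.
        producer = self.recent.get(addr)
        if producer is not None:
            self.pointer_loads.add(producer)
        value = self.memory.get(addr, 0)
        self.recent[value] = pc
        return value

    def prefetch(self, pc, addr):
        self.prefetched.append(addr)
        if pc in self.pointer_loads:
            # The prefetched data is itself an address: chase it.
            self.prefetched.append(self.memory.get(addr, 0))

memory = {0x100: 0x200, 0x200: 42, 0x300: 0x400, 0x400: 7}
p = PointerPrefetcher(memory)
p.observe_load(pc=1, addr=0x100)  # returns 0x200
p.observe_load(pc=2, addr=0x200)  # address matches data of pc=1
p.prefetch(pc=1, addr=0x300)      # pc=1 is now a known pointer load
```

After the second load, `pc=1` is in the pointer-load list, so prefetching `0x300` for it also prefetches `0x400`, the address stored at `0x300`.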
-
Publication No.: US20210365377A1
Publication date: 2021-11-25
Application No.: US17391962
Filing date: 2021-08-02
Applicant: Intel Corporation
Inventor: Sreenivas Subramoney, Stanislav Shwartsman, Anant Nori, Shankar Balachandran, Elad Shtiegmann, Vineeth Mekkat, Manjunath Shevgoor, Sourabh Alurkar
IPC: G06F12/0862
Abstract: System and method for prefetching pointer-referenced data. A method embodiment includes: tracking a plurality of load instructions which includes a first load instruction to access a first data that identifies a first memory location; detecting a second load instruction which accesses a second memory location for a second data, the second memory location matching the first memory location identified by the first data; responsive to the detecting, updating a list of pointer load instructions to include information identifying the first load instruction as a pointer load instruction; prefetching a third data for a third load instruction prior to executing the third load instruction; identifying the third load instruction as a pointer load instruction based on information from the list of pointer load instructions and responsively prefetching a fourth data from a fourth memory location, wherein the fourth memory location is identified by the third data.
-
Publication No.: US11080194B2
Publication date: 2021-08-03
Application No.: US16234135
Filing date: 2018-12-27
Applicant: Intel Corporation
Inventor: Sreenivas Subramoney, Stanislav Shwartsman, Anant Nori, Shankar Balachandran, Elad Shtiegmann, Vineeth Mekkat, Manjunath Shevgoor, Sourabh Alurkar
IPC: G06F12/0862
Abstract: System and method for prefetching pointer-referenced data. A method embodiment includes: tracking a plurality of load instructions which includes a first load instruction to access a first data that identifies a first memory location; detecting a second load instruction which accesses a second memory location for a second data, the second memory location matching the first memory location identified by the first data; responsive to the detecting, updating a list of pointer load instructions to include information identifying the first load instruction as a pointer load instruction; prefetching a third data for a third load instruction prior to executing the third load instruction; identifying the third load instruction as a pointer load instruction based on information from the list of pointer load instructions and responsively prefetching a fourth data from a fourth memory location, wherein the fourth memory location is identified by the third data.
-
Publication No.: US10853078B2
Publication date: 2020-12-01
Application No.: US16231313
Filing date: 2018-12-21
Applicant: Intel Corporation
Inventor: Vineeth Mekkat, Mark Dechene, Zhongying Zhang, John Faistl, Janghaeng Lee, Hou-Jen Ko, Sebastian Winkel, Oleg Margulis
Abstract: A processor includes a store buffer to store store instructions to be processed to store data in main memory, a load buffer to store load instructions to be processed to load data from main memory, and a loop invariant code motion (LICM) protection structure (the LPT) coupled to the store buffer and the load buffer. The LPT tracks information to compare an address of a store or snoop microoperation with entries in the LICM protection structure and re-loads a load microoperation of a matching entry.
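The compare-and-reload behavior of the LICM protection structure (the abstract's LPT) can be modeled in a few lines. This is a minimal sketch under stated assumptions: the `LicmProtectionTable` class, its one-register-per-address layout, and the dictionaries standing in for memory and the register file are hypothetical and say nothing about the actual hardware organization.

```python
class LicmProtectionTable:
    """Toy model: tracks loads hoisted out of a loop by address."""

    def __init__(self):
        self.entries = {}  # address -> destination register of a hoisted load

    def protect(self, addr, reg):
        # Record a hoisted load so later stores or snoops can be checked against it.
        self.entries[addr] = reg

    def on_store_or_snoop(self, addr, memory, regs):
        # Compare the incoming address with the tracked entries;
        # on a match, re-execute the load so the register is fresh.
        reg = self.entries.get(addr)
        if reg is None:
            return False
        regs[reg] = memory[addr]
        return True

memory = {0x40: 1}
regs = {"r1": memory[0x40]}  # the load was hoisted above the loop
lpt = LicmProtectionTable()
lpt.protect(0x40, "r1")

memory[0x40] = 9             # a store inside the loop hits the same address
hit = lpt.on_store_or_snoop(0x40, memory, regs)
miss = lpt.on_store_or_snoop(0x80, memory, regs)
```

Without the reload, `r1` would keep the stale value 1 for the rest of the loop; with it, the hoisted load stays correct even though the translator could not prove the store did not alias.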
-
Publication No.: US10296343B2
Publication date: 2019-05-21
Application No.: US15474666
Filing date: 2017-03-30
Applicant: Intel Corporation
Inventor: Vineeth Mekkat, Jason M. Agron, Youfeng Wu
Abstract: A processing device including a first shadow register, a second shadow register, and an instruction execution circuit, communicatively coupled to the first shadow register and the second shadow register, to receive a sequence of instructions comprising a first local commit marker, a first global commit marker, and a first register access instruction referencing an architectural register, speculatively execute the first register access instruction to generate a speculative register state value associated with a physical register, responsive to identifying the first local commit marker, store, in the first shadow register, the speculative register state value, and responsive to identifying the first global commit marker, store, in the second shadow register, the speculative register state value.
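The two-level commit scheme in this abstract can be pictured with a small model. The sketch below is illustrative only: the `SpeculativeRegister` class is hypothetical, and the two rollback methods are an assumption about how the shadow copies would be used, since the abstract describes only the stores into the shadow registers.

```python
class SpeculativeRegister:
    """Toy model of one register with two shadow copies."""

    def __init__(self, value=0):
        self.speculative = value    # value produced by speculative execution
        self.local_shadow = value   # first shadow register
        self.global_shadow = value  # second shadow register

    def write(self, value):
        self.speculative = value

    def local_commit(self):
        # On a local commit marker, snapshot into the first shadow register.
        self.local_shadow = self.speculative

    def global_commit(self):
        # On a global commit marker, snapshot into the second shadow register.
        self.global_shadow = self.speculative

    def rollback_local(self):
        # Assumed recovery path: return to the last local commit point.
        self.speculative = self.local_shadow

    def rollback_global(self):
        # Assumed recovery path: return to the last global commit point.
        self.speculative = self.global_shadow
        self.local_shadow = self.global_shadow

r = SpeculativeRegister(1)
r.write(2)
r.local_commit()
r.write(3)
r.rollback_local()       # back to 2
after_local = r.speculative
r.global_commit()        # global shadow now holds 2
r.write(5)
r.local_commit()
r.rollback_global()      # back to 2, local shadow reset as well
```

Keeping two snapshots lets a small mis-speculation be undone cheaply from the local copy while the global copy preserves a coarser, known-good state.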