-
Publication No.: US12216581B2
Publication date: 2025-02-04
Application No.: US18320780
Filing date: 2023-05-19
Applicant: Intel Corporation
Inventor: Sreenivas Subramoney , Stanislav Shwartsman , Anant Nori , Shankar Balachandran , Elad Shtiegmann , Vineeth Mekkat , Manjunath Shevgoor , Sourabh Alurkar
IPC: G06F12/0862
Abstract: System and method for prefetching pointer-referenced data. A method embodiment includes: tracking a plurality of load instructions which includes a first load instruction to access a first data that identifies a first memory location; detecting a second load instruction which accesses a second memory location for a second data, the second memory location matching the first memory location identified by the first data; responsive to the detecting, updating a list of pointer load instructions to include information identifying the first load instruction as a pointer load instruction; prefetching a third data for a third load instruction prior to executing the third load instruction; identifying the third load instruction as a pointer load instruction based on information from the list of pointer load instructions and responsively prefetching a fourth data from a fourth memory location, wherein the fourth memory location is identified by the third data.
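The detection and chained-prefetch steps in this abstract can be illustrated with a toy software model. This is only a sketch under stated assumptions, not the patented hardware: the class name `PointerPrefetcher`, the dictionary-based `memory`, the fixed-size tracking window, and the method names are all hypothetical and do not come from the patent.

```python
from collections import deque


class PointerPrefetcher:
    """Toy model: learn which loads return pointers, then chain prefetches."""

    def __init__(self, memory, window=8):
        self.memory = memory                # hypothetical flat memory: address -> value
        self.recent = deque(maxlen=window)  # tracked loads as (pc, loaded_value)
        self.pointer_pcs = set()            # stands in for the "list of pointer load instructions"
        self.prefetched = []                # addresses prefetched, in issue order

    def observe_load(self, pc, address):
        """Track an executed load; mark earlier loads whose data equals this address."""
        for earlier_pc, earlier_value in self.recent:
            if earlier_value == address:
                self.pointer_pcs.add(earlier_pc)
        value = self.memory.get(address)
        self.recent.append((pc, value))
        return value

    def prefetch(self, pc, address):
        """Prefetch data for a load; if it is a known pointer load, prefetch its target too."""
        self.prefetched.append(address)
        value = self.memory.get(address)
        if pc in self.pointer_pcs and value in self.memory:
            self.prefetched.append(value)   # address identified by the prefetched data
        return value


memory = {0x100: 0x200, 0x200: 7, 0x300: 0x400, 0x400: 9}
prefetcher = PointerPrefetcher(memory)
prefetcher.observe_load(pc=1, address=0x100)  # first load: its data is 0x200
prefetcher.observe_load(pc=2, address=0x200)  # second load hits that address, so pc 1 is a pointer load
prefetcher.prefetch(pc=1, address=0x300)      # third load: also prefetches 0x400
```

In this model the load at `pc=1` plays the role of the first and third load instructions, and the address `0x400` is the "fourth memory location" identified by the prefetched data.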
-
Publication No.: US12153925B2
Publication date: 2024-11-26
Application No.: US17130706
Filing date: 2020-12-22
Applicant: Intel Corporation
Inventor: Niranjan Soundararajan , Sreenivas Subramoney
Abstract: An embodiment of an integrated circuit may comprise a core, a front end unit coupled to the core to decode one or more instructions, wherein the front end unit includes a first decode path, a second decode path, and circuitry to: predict a taken branch of a conditional branch instruction of the one or more instructions, decode a predicted path of the taken branch on the first decode path, determine if the conditional branch instruction corresponds to a hard-to-predict conditional branch instruction and if the second decode path is available and, if so determined, decode an alternate path of a not-taken branch of the hard-to-predict conditional branch instruction on the second decode path. Other embodiments are disclosed and claimed.
-
Publication No.: US12112171B2
Publication date: 2024-10-08
Application No.: US17134367
Filing date: 2020-12-26
Applicant: Intel Corporation
Inventor: Anant Nori , Shankar Balachandran , Sreenivas Subramoney , Joydeep Rakshit , Vedvyas Shanbhogue , Avishaii Abuhatzera , Belliappa Kuttanna
CPC classification number: G06F9/30145 , G06F9/30065 , G06F9/3836 , G06F9/4881
Abstract: Techniques for processing loops are described. An exemplary apparatus at least includes decoder circuitry to decode a single instruction, the single instruction to include a field for an opcode, the opcode to indicate execution circuitry is to perform an operation to configure execution of one or more loops, wherein the one or more loops are to include a plurality of configuration instructions and instructions that are to use metadata generated by ones of the plurality of configuration instructions; and execution circuitry to perform the operation as indicated by the opcode.
-
Publication No.: US20230195464A1
Publication date: 2023-06-22
Application No.: US17553780
Filing date: 2021-12-16
Applicant: Intel Corporation
Inventor: Anant Vithal Nori , Prathmesh Kallurkar , Sreenivas Subramoney , Niranjan Kumar Soundararajan
IPC: G06F9/38
CPC classification number: G06F9/3802
Abstract: Methods and apparatus relating to throttling a code fetch for speculative code paths are described. In an embodiment, a first storage structure stores a reference to a code line in response to a request to be received from a cache. A second storage structure stores a reference to the code line in response to an update to an Instruction Dispatch Queue (IDQ). Logic circuitry controls additional code line fetch operations based at least in part on a comparison of a number of ongoing speculative code fetches and a determination that the code line is speculative. Other embodiments are also disclosed and claimed.
-
Publication No.: US11656971B2
Publication date: 2023-05-23
Application No.: US17582051
Filing date: 2022-01-24
Applicant: Intel Corporation
Inventor: Adarsh Chauhan , Jayesh Gaur , Franck Sala , Lihu Rappoport , Zeev Sperber , Adi Yoaz , Sreenivas Subramoney
CPC classification number: G06F11/3476 , G06F9/24 , G06F9/3836 , G06F11/3024 , G06F11/3055 , G06F15/7875
Abstract: A processor comprises a microarchitectural feature and dynamic tuning unit (DTU) circuitry. The processor executes a program for first and second execution windows with the microarchitectural feature disabled and enabled, respectively. The DTU circuitry automatically determines whether the processor achieved worse performance in the second execution window. In response to determining that the processor achieved worse performance in the second execution window, the DTU circuitry updates a usefulness state for a selected address of the program to denote worse performance. In response to multiple consecutive determinations that the processor achieved worse performance with the microarchitectural feature enabled, the DTU circuitry automatically updates the usefulness state to denote a confirmed bad state. In response to the usefulness state denoting the confirmed bad state, the DTU circuitry automatically disables the microarchitectural feature for the selected address for execution windows after the second execution window. Other embodiments are described and claimed.
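The usefulness-state update described in this abstract can be sketched as a small per-address state machine. The following is a minimal illustration under assumptions, not the claimed circuitry: the class name `DynamicTuningUnit`, the state labels, the `confirm_after` threshold (the abstract only says "multiple consecutive"), the higher-is-better performance metric, and the reset to a neutral state when performance is not worse are all hypothetical.

```python
class DynamicTuningUnit:
    """Toy model of a per-address usefulness state for one microarchitectural feature."""

    def __init__(self, confirm_after=2):
        self.confirm_after = confirm_after  # assumed count of consecutive "worse" results
        self.worse_streak = {}              # address -> consecutive worse determinations
        self.state = {}                     # address -> "neutral" | "worse" | "confirmed_bad"

    def record_windows(self, address, perf_disabled, perf_enabled):
        """Compare a feature-disabled window with a feature-enabled window (higher is better)."""
        if self.state.get(address) == "confirmed_bad":
            return "confirmed_bad"          # feature stays disabled for this address
        if perf_enabled < perf_disabled:
            streak = self.worse_streak.get(address, 0) + 1
            self.worse_streak[address] = streak
            if streak >= self.confirm_after:
                self.state[address] = "confirmed_bad"
            else:
                self.state[address] = "worse"
        else:
            # Assumption: a window that is not worse clears the streak.
            self.worse_streak[address] = 0
            self.state[address] = "neutral"
        return self.state[address]

    def feature_enabled(self, address):
        """The feature is disabled only once the address reaches the confirmed bad state."""
        return self.state.get(address) != "confirmed_bad"


dtu = DynamicTuningUnit(confirm_after=2)
dtu.record_windows(0x40, perf_disabled=2.0, perf_enabled=1.5)  # first worse result
dtu.record_windows(0x40, perf_disabled=2.0, perf_enabled=1.4)  # second consecutive worse result
```

After the second call, `dtu.feature_enabled(0x40)` is false in this model, mirroring the step in which the feature is disabled for the selected address in later execution windows.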
-
Publication No.: US20230100693A1
Publication date: 2023-03-30
Application No.: US17448795
Filing date: 2021-09-24
Applicant: Intel Corporation
Inventor: Saurabh Gupta , Ragavendra Natarajan , Niranjan K. Soundararajan , Jared W. Stark, IV , Sreenivas Subramoney
IPC: G06F9/38
Abstract: In an embodiment, a processor may include an execution circuit to execute a plurality of instructions. The processor may also include a prediction circuit to: in response to a detection of a first target instruction in a program, identify a prediction data entry associated with a path history for the first target instruction, the identified prediction data entry to indicate an offset distance from the first target instruction to a predicted next taken branch of the program; and determine the predicted next taken branch of the program based on the offset distance indicated by the identified prediction data entry. Other embodiments are described and claimed.
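The lookup described in this abstract, a prediction entry selected by path history that yields an offset to the next taken branch, can be sketched as a table keyed on the target instruction and its history. This is a simplified illustration under assumptions: the class name `NextTakenBranchPredictor`, the tuple representation of path history, and the training method are hypothetical and are not taken from the patent.

```python
class NextTakenBranchPredictor:
    """Toy model: map (target instruction, path history) to an offset distance."""

    def __init__(self):
        self.table = {}  # (target_pc, path_history) -> offset to the next taken branch

    def train(self, target_pc, path_history, taken_branch_pc):
        """Record the distance from a target instruction to the taken branch that followed it."""
        self.table[(target_pc, path_history)] = taken_branch_pc - target_pc

    def predict(self, target_pc, path_history):
        """Return the predicted next taken branch address, or None when no entry exists."""
        offset = self.table.get((target_pc, path_history))
        if offset is None:
            return None
        return target_pc + offset


predictor = NextTakenBranchPredictor()
history = (0x0F00, 0x0F80)  # assumed encoding: addresses of recent branch targets
predictor.train(0x1000, history, taken_branch_pc=0x1024)
predicted = predictor.predict(0x1000, history)
```

Here `predicted` equals `0x1024`, the target address plus the stored offset of `0x24`; a different path history for the same target selects a different entry, or none.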
-
Publication No.: US20220206925A1
Publication date: 2022-06-30
Application No.: US17582051
Filing date: 2022-01-24
Applicant: Intel Corporation
Inventor: Adarsh Chauhan , Jayesh Gaur , Franck Sala , Lihu Rappoport , Zeev Sperber , Adi Yoaz , Sreenivas Subramoney
Abstract: A processor comprises a microarchitectural feature and dynamic tuning unit (DTU) circuitry. The processor executes a program for first and second execution windows with the microarchitectural feature disabled and enabled, respectively. The DTU circuitry automatically determines whether the processor achieved worse performance in the second execution window. In response to determining that the processor achieved worse performance in the second execution window, the DTU circuitry updates a usefulness state for a selected address of the program to denote worse performance. In response to multiple consecutive determinations that the processor achieved worse performance with the microarchitectural feature enabled, the DTU circuitry automatically updates the usefulness state to denote a confirmed bad state. In response to the usefulness state denoting the confirmed bad state, the DTU circuitry automatically disables the microarchitectural feature for the selected address for execution windows after the second execution window. Other embodiments are described and claimed.
-
Publication No.: US20220197813A1
Publication date: 2022-06-23
Application No.: US17133624
Filing date: 2020-12-23
Applicant: Intel Corporation
Inventor: Jayesh Gaur , Adarsh Chauhan , Vinodh Gopal , Vedvyas Shanbhogue , Sreenivas Subramoney , Wajdi Feghali
IPC: G06F12/0875 , G06F12/0813 , G06F12/0811 , G06F12/1045
Abstract: Methods and apparatus relating to techniques for increasing per core memory bandwidth by using forget store operations are described. In an embodiment, a cache stores a buffer. Execution circuitry executes an instruction. The instruction causes one or more cachelines in the cache to be marked based on a start address for the buffer and a size of the buffer. A marked cacheline in the cache is to be prevented from being written back to memory. Other embodiments are also disclosed and claimed.
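The marking step in this abstract can be illustrated with a toy cache model in which marked lines are dropped on eviction instead of being written back. This is a sketch under assumptions only: the 64-byte `LINE_SIZE`, the `Cache` class and its method names, and the choice to mark only lines that lie entirely inside the buffer are hypothetical and are not specified by the abstract.

```python
LINE_SIZE = 64  # assumed cacheline size in bytes


class Cache:
    """Toy write-back cache with a per-line 'forget' mark."""

    def __init__(self):
        self.lines = {}         # line index -> {"dirty": bool, "forget": bool}
        self.written_back = []  # line indices written back to memory on eviction

    def store(self, address):
        """A store allocates the line if needed and makes it dirty."""
        line = self.lines.setdefault(address // LINE_SIZE, {"dirty": False, "forget": False})
        line["dirty"] = True

    def forget(self, start, size):
        """Mark cached lines fully covered by the buffer [start, start + size)."""
        first = -(-start // LINE_SIZE)     # first line boundary at or after start
        end = (start + size) // LINE_SIZE  # index just past the last fully covered line
        for index in range(first, end):
            if index in self.lines:
                self.lines[index]["forget"] = True

    def evict(self, index):
        """Write back a dirty line unless it carries the forget mark."""
        line = self.lines.pop(index)
        if line["dirty"] and not line["forget"]:
            self.written_back.append(index)


cache = Cache()
for address in (0, 64, 128):
    cache.store(address)
cache.forget(start=0, size=128)  # buffer covers lines 0 and 1
for index in (0, 1, 2):
    cache.evict(index)
```

Only line 2 reaches `cache.written_back` in this run, which is where the memory bandwidth saving would come from: the two marked lines never generate write-back traffic.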
-
Publication No.: US20210365377A1
Publication date: 2021-11-25
Application No.: US17391962
Filing date: 2021-08-02
Applicant: Intel Corporation
Inventor: Sreenivas Subramoney , Stanislav Shwartsman , Anant Nori , Shankar Balachandran , Elad Shtiegmann , Vineeth Mekkat , Manjunath Shevgoor , Sourabh Alurkar
IPC: G06F12/0862
Abstract: System and method for prefetching pointer-referenced data. A method embodiment includes: tracking a plurality of load instructions which includes a first load instruction to access a first data that identifies a first memory location; detecting a second load instruction which accesses a second memory location for a second data, the second memory location matching the first memory location identified by the first data; responsive to the detecting, updating a list of pointer load instructions to include information identifying the first load instruction as a pointer load instruction; prefetching a third data for a third load instruction prior to executing the third load instruction; identifying the third load instruction as a pointer load instruction based on information from the list of pointer load instructions and responsively prefetching a fourth data from a fourth memory location, wherein the fourth memory location is identified by the third data.
-
Publication No.: US11080194B2
Publication date: 2021-08-03
Application No.: US16234135
Filing date: 2018-12-27
Applicant: Intel Corporation
Inventor: Sreenivas Subramoney , Stanislav Shwartsman , Anant Nori , Shankar Balachandran , Elad Shtiegmann , Vineeth Mekkat , Manjunath Shevgoor , Sourabh Alurkar
IPC: G06F12/0862
Abstract: System and method for prefetching pointer-referenced data. A method embodiment includes: tracking a plurality of load instructions which includes a first load instruction to access a first data that identifies a first memory location; detecting a second load instruction which accesses a second memory location for a second data, the second memory location matching the first memory location identified by the first data; responsive to the detecting, updating a list of pointer load instructions to include information identifying the first load instruction as a pointer load instruction; prefetching a third data for a third load instruction prior to executing the third load instruction; identifying the third load instruction as a pointer load instruction based on information from the list of pointer load instructions and responsively prefetching a fourth data from a fourth memory location, wherein the fourth memory location is identified by the third data.