-
Publication No.: US20220308876A1
Publication date: 2022-09-29
Application No.: US17214698
Filing date: 2021-03-26
Applicant: Intel Corporation
Abstract: Techniques and mechanisms for determining a relative order in which a load instruction and a store instruction are to be executed. In an embodiment, a processor detects an address collision event wherein two instructions, corresponding to different respective instruction pointer values, target the same memory address. Based on the address collision event, the processor identifies respective instruction types of the two instructions as an aliasing instruction type pair. The processor further determines a count of decisions each to forego a reversal of an order of execution of instructions. Each decision represented in the count is based on instructions which are each of a different respective instruction type of the aliasing instruction type pair. In another embodiment, the processor determines, based on the count of decisions, whether a later load instruction is to be advanced in an order of instruction execution.
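The counting scheme in this abstract can be illustrated with a small software model. This is only a sketch under assumptions, not the patented design: the class name `AliasPairPredictor`, the `threshold` parameter, and the policy of allowing a load to be advanced again once enough reorderings have been forgone are all hypothetical choices made for illustration.

```python
from collections import defaultdict


class AliasPairPredictor:
    """Toy model of a per-type-pair reordering predictor (hypothetical)."""

    def __init__(self, threshold=3):
        # Assumed policy knob: how many forgone reorderings are counted
        # before a load of this pair may be advanced again.
        self.threshold = threshold
        self.aliasing_pairs = set()
        self.forgo_count = defaultdict(int)

    def record_collision(self, store_type, load_type):
        # An address collision marks the two instruction types as an
        # aliasing pair and restarts the count for that pair.
        pair = (store_type, load_type)
        self.aliasing_pairs.add(pair)
        self.forgo_count[pair] = 0

    def may_advance_load(self, store_type, load_type):
        pair = (store_type, load_type)
        if pair not in self.aliasing_pairs:
            return True  # no known aliasing: reorder freely
        if self.forgo_count[pair] >= self.threshold:
            return True  # enough conservative decisions were counted
        self.forgo_count[pair] += 1  # count one more forgone reordering
        return False
```

With `threshold=2`, a pair that has collided is kept in program order for two decisions and then allowed to reorder again.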
-
Publication No.: US20200210339A1
Publication date: 2020-07-02
Application No.: US16234135
Filing date: 2018-12-27
Applicant: Intel Corporation
Inventors: Sreenivas Subramoney, Stanislav Shwartsman, Anant Nori, Shankar Balachandran, Elad Shtiegmann, Vineeth Mekkat, Manjunath Shevgoor, Sourabh Alurkar
IPC classification: G06F12/0862
Abstract: System and method for prefetching pointer-referenced data. A method embodiment includes: tracking a plurality of load instructions which includes a first load instruction to access a first data that identifies a first memory location; detecting a second load instruction which accesses a second memory location for a second data, the second memory location matching the first memory location identified by the first data; responsive to the detecting, updating a list of pointer load instructions to include information identifying the first load instruction as a pointer load instruction; prefetching a third data for a third load instruction prior to executing the third load instruction; identifying the third load instruction as a pointer load instruction based on information from the list of pointer load instructions and responsively prefetching a fourth data from a fourth memory location, wherein the fourth memory location is identified by the third data.
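The steps of this method embodiment can be sketched as a small simulation. It is a minimal model under assumptions: memory is a Python dict from address to value, loads are identified by an integer instruction pointer, and the names `PointerPrefetcher`, `observe_load`, and `prefetch` are hypothetical, not taken from the patent.

```python
class PointerPrefetcher:
    """Toy model of pointer-load detection and chained prefetch."""

    def __init__(self, memory):
        self.memory = memory          # address -> stored value
        self.recent_values = {}       # loaded value -> IP of producing load
        self.pointer_load_ips = set() # the "list of pointer load instructions"
        self.prefetched = set()       # addresses that were prefetched

    def observe_load(self, ip, address):
        # If this address was produced as data by an earlier load,
        # that earlier load is recorded as a pointer load.
        producer = self.recent_values.get(address)
        if producer is not None:
            self.pointer_load_ips.add(producer)
        value = self.memory.get(address)
        if value is not None:
            self.recent_values[value] = ip
        return value

    def prefetch(self, ip, address):
        # Prefetch the data for the load itself.
        self.prefetched.add(address)
        data = self.memory.get(address)
        # For a known pointer load, also prefetch what the data points to.
        if ip in self.pointer_load_ips and data in self.memory:
            self.prefetched.add(data)
```

In this model the first and second loads train the list, and a later prefetch for the same instruction pointer follows the loaded value one level further.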
-
Publication No.: US20190243684A1
Publication date: 2019-08-08
Application No.: US15890984
Filing date: 2018-02-07
Applicant: Intel Corporation
Inventors: Pooja Roy, Jayesh Gaur, Sreenivas Subramoney, Zeev Sperber, Alexandr Titov, Lihu Rappoport, Stanislav Shwartsman, Hong Wang, Adi Yoaz, Ronak Singhal, Robert S. Chappell
Abstract: A processor including an execution unit, an instruction scheduler circuit to identify a first instruction of an instruction stream, identify a second instruction on which execution of the first instruction depends, and assign a first dispatch priority value to the first instruction and the second instruction, and a dispatch circuit to dispatch, based on the first dispatch priority value, the first instruction and the second instruction to an instruction execution circuit.
-
Publication No.: US20190171515A1
Publication date: 2019-06-06
Application No.: US15831195
Filing date: 2017-12-04
Applicant: Intel Corporation
Inventors: Zeev Sperber, Stanislav Shwartsman, Jared W. Stark, IV, Lihu Rappoport, Igor Yanover, George Leifman
CPC classification: G06F11/0793, G06F9/30043, G06F9/30058, G06F9/3802, G06F9/3855, G06F11/0721, G06F12/0215, G06F12/0253, G06F2212/654, G06F2212/702
Abstract: A method for handling load faults in an out-of-order processor is described. The method includes detecting, by a memory ordering buffer of the out-of-order processor, a load fault corresponding to a load instruction that was executed out-of-order by the out-of-order processor; determining, by the memory ordering buffer, whether instant reclamation is available for resolving the load fault of the load instruction; and performing, in response to determining that instant reclamation is available for resolving the load fault of the load instruction, instant reclamation to re-fetch the load instruction for execution prior to attempting to retire the load instruction.
-
Publication No.: US10303605B2
Publication date: 2019-05-28
Application No.: US15214895
Filing date: 2016-07-20
Applicant: Intel Corporation
Inventors: Raanan Sade, Joseph Nuzman, Stanislav Shwartsman, Igor Yanover, Liron Zur
IPC classification: G06F12/00, G06F13/00, G06F12/0815, G06F12/0893
Abstract: An example system on a chip (SoC) includes a processor, a cache, and a main memory. The SoC can include a first memory to store data in a memory line, wherein the memory line is set to an invalid state. The SoC can include a processor coupled to the first memory. The processor can determine that a data size of a first data set received from an application is within a data size range. The processor can determine that an aggregate data size of the first data set and a second data set received from the application is at least as large as the data size of the memory line. The processor can perform an invalid-to-modify (I2M) operation to change the memory line from the invalid state to a modified state. The processor can write the first data set and the second data set to the memory line.
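The size checks and the state change described here can be sketched as a single function. This is an illustrative model only: the 64-byte `LINE_SIZE`, the default `size_range`, the dict representation of a memory line, and the function name `write_with_i2m` are assumptions, not details from the patent.

```python
LINE_SIZE = 64  # assumed memory line size in bytes


def write_with_i2m(line, first, second, size_range=(1, 64)):
    """Write two data sets to an invalid line if together they cover it.

    `line` is a dict with keys "state" ("I" or "M") and "data".
    Returns True when the invalid-to-modified transition was performed.
    """
    lo, hi = size_range
    in_range = lo <= len(first) <= hi                    # first data set size check
    covers_line = len(first) + len(second) >= LINE_SIZE  # aggregate size check
    if line["state"] == "I" and in_range and covers_line:
        line["state"] = "M"  # I2M: no prior read of the old line contents
        line["data"] = (first + second)[:LINE_SIZE]
        return True
    return False
```

The point of the aggregate check in this sketch is that a line which will be fully overwritten does not need its old contents fetched first.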
-
Publication No.: US09558127B2
Publication date: 2017-01-31
Application No.: US14481266
Filing date: 2014-09-09
Applicant: Intel Corporation
Inventors: Stanislav Shwartsman, Robert S. Chappell, Ronak Singhal, Ryan L. Carlson, Raanan Sade, Omar M. Shaikh, Liron Zur, Yiftach Gilad
IPC classification: G06F12/08
CPC classification: G06F12/0897, G06F12/0862, G06F2212/1021, G06F2212/402, G06F2212/602
Abstract: A processor includes a cache hierarchy and an execution unit. The cache hierarchy includes a lower level cache and a higher level cache. The execution unit includes logic to issue a memory operation to access the cache hierarchy. The lower level cache includes logic to determine that a requested cache line of the memory operation is unavailable in the lower level cache, determine that a line fill buffer of the lower level cache is full, and initiate prefetching of the requested cache line from the higher level cache based upon the determination that the line fill buffer of the lower level cache is full. The line fill buffer is to forward miss requests to the higher level cache.
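The decision made by the lower level cache can be sketched as follows. This is a simplified model under assumptions: the function name `handle_l1_access`, the list-based line fill buffer, and the separate `l2_prefetch_queue` are hypothetical, and reading "initiate prefetching" as enqueueing a request toward the higher level cache is an interpretation.

```python
def handle_l1_access(addr, l1_lines, lfb, lfb_capacity, l2_prefetch_queue):
    """Classify one access to the lower level cache.

    l1_lines: set of line addresses present in the lower level cache.
    lfb: list of outstanding miss addresses (the line fill buffer).
    l2_prefetch_queue: list of prefetch requests for the higher level cache.
    """
    if addr in l1_lines:
        return "hit"
    if len(lfb) < lfb_capacity:
        lfb.append(addr)  # normal miss: the buffer forwards it upward
        return "miss_allocated"
    # Buffer full: start a prefetch in the higher level cache so the
    # line is closer by the time a buffer entry frees up.
    l2_prefetch_queue.append(addr)
    return "lfb_full_prefetch"
```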
-
Publication No.: US20160103785A1
Publication date: 2016-04-14
Application No.: US14881111
Filing date: 2015-10-12
Applicant: Intel Corporation
Inventors: Zeev Sperber, Robert Valentine, Guy Patkin, Stanislav Shwartsman, Shlomo Raikin, Igor Yanover, Gal Ofir
CPC classification: G06F15/8007, G06F9/30018, G06F9/30036, G06F9/30043, G06F9/30145, G06F9/345, G06F9/3887
Abstract: Methods and apparatus are disclosed for using an index array and finite state machine for scatter/gather operations. Embodiments of the apparatus may comprise: decode logic to decode a scatter/gather instruction and generate a set of micro-operations, and an index array to hold a set of indices and a corresponding set of mask elements. A finite state machine facilitates the gather operation. Address generation logic generates an address from an index of the set of indices for at least each of the corresponding mask elements having a first value. An address is accessed to load a corresponding data element if the mask element had the first value. The data element is written at an in-register position in a destination vector register according to a respective in-register position of the index. Values of corresponding mask elements are changed from the first value to a second value responsive to completion of their respective loads.
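The masked gather semantics in this abstract can be modeled in a few lines. The sketch assumes that the "first value" of a mask element is 1 and the "second value" is 0, that memory is a flat Python list, and that `masked_gather`, `base`, and `scale` are illustrative names; it does not model the micro-operations or the finite state machine itself.

```python
def masked_gather(memory, base, indices, mask, dest, scale=1):
    """Load memory[base + index * scale] into dest for each set mask element.

    mask and dest are updated in place; dest is also returned.
    """
    for pos, idx in enumerate(indices):
        if mask[pos] == 1:              # only elements still pending
            address = base + idx * scale
            dest[pos] = memory[address] # same position as the index
            mask[pos] = 0               # mark this element as completed
    return dest
```

Because each mask element is cleared only after its load completes, a gather modeled this way can be interrupted and resumed without repeating finished elements.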
-
Publication No.: US20230409481A1
Publication date: 2023-12-21
Application No.: US18320780
Filing date: 2023-05-19
Applicant: Intel Corporation
Inventors: Sreenivas Subramoney, Stanislav Shwartsman, Anant Nori, Shankar Balachandran, Elad Shtiegmann, Vineeth Mekkat, Manjunath Shevgoor, Sourabh Alurkar
IPC classification: G06F12/0862
CPC classification: G06F12/0862, G06F2212/602
Abstract: System and method for prefetching pointer-referenced data. A method embodiment includes: tracking a plurality of load instructions which includes a first load instruction to access a first data that identifies a first memory location; detecting a second load instruction which accesses a second memory location for a second data, the second memory location matching the first memory location identified by the first data; responsive to the detecting, updating a list of pointer load instructions to include information identifying the first load instruction as a pointer load instruction; prefetching a third data for a third load instruction prior to executing the third load instruction; identifying the third load instruction as a pointer load instruction based on information from the list of pointer load instructions and responsively prefetching a fourth data from a fourth memory location, wherein the fourth memory location is identified by the third data.
-
Publication No.: US11360770B2
Publication date: 2022-06-14
Application No.: US16487784
Filing date: 2017-07-01
Applicant: Intel Corporation
Inventors: Robert Valentine, Menachem Adelman, Zeev Sperber, Mark J. Charney, Bret L. Toll, Jesus Corbal, Alexander F. Heinecke, Barukh Ziv, Elmoustapha Ould-Ahmed-Vall, Stanislav Shwartsman
Abstract: Embodiments detailed herein relate to matrix operations. In particular, performing a matrix operation of zeroing a matrix in response to a single instruction. For example, a processor is detailed which includes decode circuitry to decode an instruction having fields for an opcode and a source/destination matrix operand identifier; and execution circuitry to execute the decoded instruction to zero each data element of the identified source/destination matrix.
-
Publication No.: US11321469B2
Publication date: 2022-05-03
Application No.: US16724105
Filing date: 2019-12-20
Applicant: Intel Corporation
Inventors: Michael E. Kounavis, Santosh Ghosh, Sergej Deutsch, Michael LeMay, David M. Durham, Stanislav Shwartsman
IPC classification: G06F9/30, G06F12/06, G06F21/79, H04L9/08, G06F9/50, H04L9/14, G06F9/48, H04L9/06, G06F9/455, G06F21/60, G06F12/0897, G06F21/72, G06F12/0875, G06F12/0811, G06F21/12, G06F12/14, G06F9/32, G06F12/02, G06F21/62
Abstract: In one embodiment, a processor of a cryptographic computing system includes data cache units storing encrypted data and circuitry coupled to the data cache units. The circuitry accesses a sequence of cryptographic-based instructions to execute based on the encrypted data, decrypts the encrypted data based on a first pointer value, executes the cryptographic-based instruction using the decrypted data, encrypts a result of the execution of the cryptographic-based instruction based on a second pointer value, and stores the encrypted result in the data cache units. In some embodiments, the circuitry generates, for each cryptographic-based instruction, at least one encryption-based microoperation and at least one non-encryption-based microoperation. The circuitry also schedules the at least one encryption-based microoperation and the at least one non-encryption-based microoperation for execution based on timings of the encryption-based microoperation.