-
Publication No.: US12086591B2
Publication date: 2024-09-10
Application No.: US17214698
Filing date: 2021-03-26
Applicant: Intel Corporation
CPC classification: G06F9/30043, G06F9/3856
Abstract: Techniques and mechanisms for determining a relative order in which a load instruction and a store instruction are to be executed. In an embodiment, a processor detects an address collision event wherein two instructions, corresponding to different respective instruction pointer values, target the same memory address. Based on the address collision event, the processor identifies respective instruction types of the two instructions as an aliasing instruction type pair. The processor further determines a count of decisions each to forego a reversal of an order of execution of instructions. Each decision represented in the count is based on instructions which are each of a different respective instruction type of the aliasing instruction type pair. In another embodiment, the processor determines, based on the count of decisions, whether a later load instruction is to be advanced in an order of instruction execution.
-
Publication No.: US12020033B2
Publication date: 2024-06-25
Application No.: US17133899
Filing date: 2020-12-24
Applicant: Intel Corporation
CPC classification: G06F9/3836, G06F9/223, G06F9/3838
Abstract: Apparatus and method for memoizing repeat function calls are described herein. An apparatus embodiment includes: uop buffer circuitry to identify a function for memoization based on retiring micro-operations (uops) from a processing pipeline; memoization retirement circuitry to generate a signature of the function which includes input and output data of the function; a memoization data structure to store the signature; and predictor circuitry to detect an instance of the function to be executed by the processing pipeline and to responsively exclude a first subset of uops associated with the instance from execution when a confidence level associated with the function is above a threshold. One or more instructions that are data-dependent on execution of the instance are then provided with the output data of the function from the memoization data structure.
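The confidence-gated reuse described in this abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the patented circuitry: the names `MemoEntry` and `MemoPredictor`, the one-entry-per-function table, and the confidence update rule are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass
class MemoEntry:
    """Signature of a function instance: its input data and output data."""
    inputs: tuple
    output: object
    confidence: int = 0


class MemoPredictor:
    """Toy model of confidence-gated memoization (illustrative assumption)."""

    def __init__(self, threshold=2):
        self.threshold = threshold
        self.table = {}    # function name -> MemoEntry (the "data structure")
        self.executed = 0  # how many times the function body actually ran

    def call(self, func, *inputs):
        entry = self.table.get(func.__name__)
        matches = entry is not None and entry.inputs == inputs
        if matches and entry.confidence > self.threshold:
            # Confidence is above the threshold: skip execution and hand the
            # stored output to whatever depends on the result.
            return entry.output
        output = func(*inputs)
        self.executed += 1
        if matches and entry.output == output:
            entry.confidence += 1  # same signature seen again
        else:
            self.table[func.__name__] = MemoEntry(inputs, output)
        return output


def square(x):
    return x * x


predictor = MemoPredictor(threshold=2)
results = [predictor.call(square, 7) for _ in range(6)]
```

In this model the first calls execute normally and only build confidence; once the counter exceeds the threshold, later calls with the same inputs return the stored output without running `square` again.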
-
Publication No.: US11972126B2
Publication date: 2024-04-30
Application No.: US17472272
Filing date: 2021-09-10
Applicant: Intel Corporation
Inventors: David M. Durham, Michael D. LeMay, Sergej Deutsch, Joydeep Rakshit, Anant Vithal Nori, Jayesh Gaur, Sreenivas Subramoney
IPC classification: G06F3/06, G06F12/02, G06F12/1027
CPC classification: G06F3/0631, G06F3/0604, G06F3/0659, G06F3/0679, G06F12/0238, G06F12/1027
Abstract: Technologies disclosed herein provide one example of a system that includes processor circuitry to be communicatively coupled to a memory circuitry. The processor circuitry is to receive a memory access request corresponding to an application for access to an address range in a memory allocation of the memory circuitry and to locate a metadata region within the memory allocation. The processor circuitry is also to, in response to a determination that the address range includes at least a portion of the metadata region, obtain first metadata stored in the metadata region, use the first metadata to determine an alternate memory address in a relocation region, and read, at the alternate memory address, displaced data from the portion of the metadata region included in the address range of the memory allocation. The address range includes one or more bytes of an expected allocation region of the memory allocation.
-
Publication No.: US20220197643A1
Publication date: 2022-06-23
Application No.: US17133618
Filing date: 2020-12-23
Applicant: Intel Corporation
Inventors: Jayesh Gaur, Adarsh Chauhan, Vinodh Gopal, Vedvyas Shanbhogue, Sreenivas Subramoney, Wajdi Feghali
IPC classification: G06F9/30, G06F12/0875
Abstract: Methods and apparatus relating to speculative decompression within processor core caches are described. In an embodiment, decode circuitry decodes a decompression instruction into a first micro operation and a second micro operation. The first micro operation causes one or more load operations to fetch data into a plurality of cachelines of a cache of a processor core. Decompression Engine (DE) circuitry decompresses the fetched data from the plurality of cachelines of the cache of the processor core in response to the second micro operation. The decompression instruction causes the DE circuitry to perform an out-of-order decompression of the plurality of cachelines. Other embodiments are also disclosed and claimed.
-
Publication No.: US10754655B2
Publication date: 2020-08-25
Application No.: US16021838
Filing date: 2018-06-28
Applicant: Intel Corporation
Inventors: Adarsh Chauhan, Hong Wang, Jayesh Gaur, Zeev Sperber, Sumeet Bandishte, Lihu Rappoport, Stanislav Shwartsman, Kamil Garifullin, Sreenivas Subramoney, Adi Yoaz
Abstract: A processing device includes a branch IP table and branch predication circuitry coupled to the branch IP table. The branch predication circuitry is to: determine a dynamic convergence point in a conditional branch of a set of instructions; store the dynamic convergence point in the branch IP table; fetch a first and second speculative path of the conditional branch; while determining which of the first speculative path and the second speculative path is a taken path of the conditional branch and determining whether a dynamic convergence point is fetched corresponding to the stored dynamic convergence point, stall scheduling of instructions of the first speculative path and the second speculative path; and in response to determining that one of the first speculative path and the second speculative path is the taken path and the fetched dynamic convergence point corresponds to the stored convergence point, resume scheduling of the instructions of the taken path.
-
Publication No.: US20180285268A1
Publication date: 2018-10-04
Application No.: US15475197
Filing date: 2017-03-31
Applicant: Intel Corporation
Inventors: Kunal Kishore Korgaonkar, Ishwar S. Bhati, Huichu Liu, Jayesh Gaur, Sasikanth Manipatruni, Sreenivas Subramoney, Tanay Karnik, Hong Wang, Ian A. Young
IPC classification: G06F12/0811, G06F12/0808, G06F12/1045, G06F13/40
Abstract: In one embodiment, a processor comprises a processing core, a last level cache (LLC), and a mid-level cache. The mid-level cache is to determine that an idle indicator has been set, wherein the idle indicator is set based on an amount of activity at the LLC, and based on the determination that the idle indicator has been set, identify a first cache line to be evicted from a first set of cache lines of the mid-level cache and send a request to write the first cache line to the LLC.
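The idle-triggered eviction in this abstract can be sketched in software as follows. The model is a rough illustration only: the class names, the use of a pending-request count as the "amount of activity", the `idle_threshold` parameter, and the oldest-first victim choice are assumptions that the abstract does not specify.

```python
from collections import OrderedDict


class LastLevelCache:
    """Toy LLC that exposes an idle indicator (illustrative assumption)."""

    def __init__(self, idle_threshold=1):
        self.idle_threshold = idle_threshold
        self.pending_requests = 0  # stand-in for "amount of activity"
        self.lines = {}            # address -> data

    @property
    def idle_indicator(self):
        # Set when activity is at or below the hypothetical threshold.
        return self.pending_requests <= self.idle_threshold

    def write(self, addr, data):
        self.lines[addr] = data


class MidLevelCache:
    """Toy mid-level cache that evicts early while the LLC is idle."""

    def __init__(self, llc):
        self.llc = llc
        self.lines = OrderedDict()  # address -> data, oldest first

    def fill(self, addr, data):
        self.lines[addr] = data
        self.lines.move_to_end(addr)

    def tick(self):
        """Evict one line to the LLC if it is idle; return the evicted address."""
        if not self.llc.idle_indicator or not self.lines:
            return None
        addr, data = self.lines.popitem(last=False)  # oldest line is the victim
        self.llc.write(addr, data)
        return addr


llc = LastLevelCache(idle_threshold=1)
mlc = MidLevelCache(llc)
mlc.fill(0x40, "a")
mlc.fill(0x80, "b")

llc.pending_requests = 5       # LLC busy: no early eviction
busy_result = mlc.tick()
llc.pending_requests = 0       # LLC idle: evict the oldest line
idle_result = mlc.tick()
```

The point of the sketch is the gating: the write to the LLC is issued only while the idle indicator is set, so the eviction traffic avoids periods when the LLC is busy.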
-
Publication No.: US20180232311A1
Publication date: 2018-08-16
Application No.: US15430765
Filing date: 2017-02-13
Applicant: Intel Corporation
Inventors: Ishwar S. Bhati, Huichu Liu, Jayesh Gaur, Kunal Korgaonkar, Sasikanth Manipatruni, Sreenivas Subramoney, Tanay Karnik, Hong Wang, Ian A. Young
IPC classification: G06F12/0831, G06F12/0875, G06F12/0811
CPC classification: G06F13/1642, G06F12/0811
Abstract: A processor includes a processing core and a cache controller including a read queue and a separate write queue. The read queue is to buffer read requests of the processing core to a non-volatile memory, last level cache (NVM-LLC), and the write queue is to buffer write requests to the NVM-LLC. The cache controller is to detect whether the write queue is full. The cache controller further prioritizes a first order of sending requests to the NVM-LLC when the write queue contains an empty slot, the first order specifying a first pattern of sending the read requests before the write requests, and prioritizes a second order of sending requests to the NVM-LLC in response to a determination that the write queue is full, the second order specifying a second pattern of alternating between sending a write request from the write queue and a read request from the read queue.
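The two request orders in this abstract can be sketched as a small scheduling policy. This is a minimal model under assumptions: the class name `NvmLlcController`, the `write_capacity` parameter, and the choice to start the alternating pattern with a write are hypothetical and are not taken from the patent.

```python
from collections import deque


class NvmLlcController:
    """Toy read/write queue arbiter (illustrative assumption)."""

    def __init__(self, write_capacity=2):
        self.reads = deque()
        self.writes = deque()
        self.write_capacity = write_capacity
        self._write_turn = True  # whose turn it is in the alternating pattern

    def write_queue_full(self):
        return len(self.writes) >= self.write_capacity

    def next_request(self):
        """Return the next request to send to the NVM-LLC, or None if idle."""
        if self.write_queue_full():
            # Second order: alternate between a write and a read.
            if self._write_turn or not self.reads:
                self._write_turn = False
                return self.writes.popleft()
            self._write_turn = True
            return self.reads.popleft()
        # First order: the write queue has an empty slot, so reads go first.
        self._write_turn = True
        if self.reads:
            return self.reads.popleft()
        if self.writes:
            return self.writes.popleft()
        return None


ctrl = NvmLlcController(write_capacity=2)
ctrl.reads.extend(["R0", "R1"])
ctrl.writes.extend(["W0", "W1"])

order = [ctrl.next_request()]   # write queue full: a write is sent
ctrl.writes.append("W2")        # a new write arrives and fills the queue again
order += [ctrl.next_request() for _ in range(5)]
```

While the write queue stays full the controller alternates between the two queues; as soon as a slot frees up it falls back to serving reads ahead of writes.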
-
Publication No.: US20180203799A1
Publication date: 2018-07-19
Application No.: US15408731
Filing date: 2017-01-18
Applicant: Intel Corporation
Inventors: Jayesh Gaur, Ayan Mandal, Anant Nori, Sreenivas Subramoney
IPC classification: G06F12/0811
CPC classification: G06F12/0811, G06F11/34, G06F12/0804, G06F12/084, G06F12/0888, G06F2212/1016, G06F2212/502
Abstract: A memory-efficient last level cache (LLC) architecture is described. A processor implementing an LLC architecture may include a processor core, a last level cache (LLC) operatively coupled to the processor core, and a cache controller operatively coupled to the LLC. The cache controller is to monitor a bandwidth demand of a channel between the processor core and a dynamic random-access memory (DRAM) device associated with the LLC. The cache controller is further to perform a first defined number of consecutive reads from the DRAM device when the bandwidth demand exceeds a first threshold value and perform a first defined number of consecutive writes of modified lines from the LLC to the DRAM device when the bandwidth demand exceeds the first threshold value.
-
Publication No.: US20180011790A1
Publication date: 2018-01-11
Application No.: US15206589
Filing date: 2016-07-11
Applicant: Intel Corporation
IPC classification: G11C11/406, G06F15/78
CPC classification: G06F12/0808, G06F12/0811, G06F12/0831, G06F12/0864, G06F12/0897, G06F15/781, G06F2212/283, G06F2212/621, G11C11/40615
Abstract: An apparatus includes a cache controller, the cache controller to receive, from a requestor, a memory access request referencing a memory address of a memory. The cache controller may identify a cache entry associated with the memory address, and responsive to determining that a first data item stored in the cache entry matches a data pattern indicating cache entry invalidity, read a second data item from a memory location identified by the memory address. The cache controller may then return, to the requestor, a response comprising the second data item.
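The idea of marking an invalid cache entry with a reserved data pattern, rather than a separate valid bit, can be sketched as below. This is an illustrative model only: the sentinel value `INVALID_PATTERN`, the class `PatternCache`, and the refill-on-miss behaviour are assumptions, since the abstract gives neither a pattern value nor a refill policy.

```python
# Hypothetical sentinel; the abstract does not say what the pattern is.
INVALID_PATTERN = b"\xde\xad\xbe\xef"


class PatternCache:
    """Toy cache whose entries signal invalidity through their data."""

    def __init__(self, memory):
        self.memory = memory   # address -> bytes, stands in for main memory
        self.entries = {}      # address -> bytes, stands in for cache entries
        self.memory_reads = 0  # counts fallbacks to memory

    def invalidate(self, addr):
        # Invalidation overwrites the data with the pattern.
        self.entries[addr] = INVALID_PATTERN

    def read(self, addr):
        # An untouched entry is modelled as holding the invalid pattern.
        data = self.entries.get(addr, INVALID_PATTERN)
        if data == INVALID_PATTERN:
            # The first data item matches the pattern: read the second data
            # item from memory and return that instead.
            data = self.memory[addr]
            self.memory_reads += 1
            self.entries[addr] = data  # refill (assumption, not in abstract)
        return data


memory = {0x10: b"\x01\x02\x03\x04"}
cache = PatternCache(memory)

first = cache.read(0x10)    # entry invalid: served from memory
second = cache.read(0x10)   # entry valid: served from the cache
reads_before_invalidate = cache.memory_reads
cache.invalidate(0x10)
third = cache.read(0x10)    # pattern matches again: back to memory
```

One consequence visible in the sketch is that a memory location that genuinely contains the pattern is always re-read from memory, which stays correct but never hits in the cache.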
-
Publication No.: US20230195465A1
Publication date: 2023-06-22
Application No.: US17558368
Filing date: 2021-12-21
Applicant: Intel Corporation
Inventors: Stanislav Shwartsman, Elad Shtiegmann, Sumeet Bandishte, Lihu Rappoport, Zeev Sperber, Jayesh Gaur
CPC classification: G06F9/3802, G06F9/3818, G06F9/30032
Abstract: Techniques and mechanisms for efficiently making value prediction information available for use in a processor. In an embodiment, execution of an instruction is to include a loading of some data to a first location (e.g., a first register). A decoder of the processor accesses reference information which indicates that the execution is to comprise multiple micro-operations (μops) including a LoadCheck μop and a Move μop. The LoadCheck μop loads a first value to the first location, and checks whether the loaded first value is the same as a previously-determined second value which represents a prediction of what the first value would be. The Move μop moves the second value to the first location. In another embodiment, the Move μop is scheduled for execution out-of-order with respect to the LoadCheck μop, resulting in an early availability of the second value for access in a register file by another μop.
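The split into a Move μop and a LoadCheck μop can be illustrated with a small software model. This is a sketch under assumptions, not the processor's implementation: the class `ValuePredictedLoad`, the dictionary-based register file and prediction table, and the retraining of the prediction after each check are hypothetical.

```python
class ValuePredictedLoad:
    """Toy model of a load split into Move and LoadCheck micro-operations."""

    def __init__(self, memory, predictions):
        self.memory = memory            # address -> actual value
        self.predictions = predictions  # address -> predicted value
        self.registers = {}             # register name -> value

    def move(self, dest, addr):
        """Move uop: place the predicted value in the destination early."""
        self.registers[dest] = self.predictions[addr]

    def load_check(self, dest, addr):
        """LoadCheck uop: load the real value and compare it with the prediction.

        Returns True when the prediction was correct.
        """
        actual = self.memory[addr]
        correct = actual == self.predictions[addr]
        self.registers[dest] = actual
        self.predictions[addr] = actual  # retrain (assumption, not in abstract)
        return correct


memory = {0x100: 42}
core = ValuePredictedLoad(memory, predictions={0x100: 42})

core.move("rax", 0x100)                  # runs first, out of order
early_value = core.registers["rax"]      # dependents can already read this
hit = core.load_check("rax", 0x100)      # later confirms the prediction

memory[0x100] = 7                        # the value in memory changes
core.move("rax", 0x100)
stale_value = core.registers["rax"]      # prediction is now stale
miss = core.load_check("rax", 0x100)     # detects the misprediction
final_value = core.registers["rax"]
```

In the sketch, a failed check simply overwrites the register with the loaded value; in a real pipeline any μop that consumed the stale prediction would presumably also have to be re-executed, which this model does not attempt to show.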