-
Publication No.: US10157063B2
Publication date: 2018-12-18
Application No.: US13631402
Filing date: 2012-09-28
Applicant: Intel Corporation
Inventors: Polychronis Xekalakis, Pedro Marcuello, Alejandro Vicente Martinez, Christos E. Kotselidis, Grigorios Magklis, Fernando Latorre, Raul Martinez, Josep M. Codina, Enric Gibert Codina, Crispin Gomez Requena, Antonio Gonzalez, Mirem Hyuseinova, Pedro Lopez, Marc Lupon, Carlos Madriles, Daniel Ortega, Demos Pavlou, Kyriakos A. Stavrou, Georgios Tournavitis
Abstract: A computer-readable storage medium, method and system for optimization-level aware branch prediction is described. A gear level is assigned to a set of application instructions that have been optimized. The gear level is also stored in a register of a branch prediction unit of a processor. Branch prediction is then performed by the processor based upon the gear level.
-
Publication No.: US09870209B2
Publication date: 2018-01-16
Application No.: US14228697
Filing date: 2014-03-28
Applicant: Intel Corporation
Inventors: John H. Kelm, Demos Pavlou, Mirem Hyuseinova
IPC classification: G06F9/45, G06F9/38, G06F12/08, G06F12/12, G06F12/0804, G06F12/0811
CPC classification: G06F8/52, G06F9/3834, G06F9/3836, G06F9/384, G06F9/3842, G06F9/3859, G06F12/0804, G06F12/0811, G06F12/12, G06F2212/1016, G06F2212/621
Abstract: A processor includes a resource scheduler, a dispatcher, and a memory execution unit. The memory execution unit includes logic to identify an executed, unretired store operation in a memory ordered buffer, determine that the store operation is speculative, determine whether an associated cache line in a data cache is non-speculative, and determine whether to block a write of the store operation results to the data cache based upon the determination that the store operation is speculative and a determination that the associated cache line is non-speculative.
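The decision described in this abstract can be sketched in software as a simple predicate. This is only an illustrative model, not the patented logic: the names `StoreOp`, `CacheLine`, and `should_block_write` are hypothetical, and the sketch assumes the write is always blocked when a speculative store targets a non-speculative line, whereas the abstract only says the decision is based on those two determinations.

```python
from dataclasses import dataclass


@dataclass
class StoreOp:
    """Hypothetical model of an entry in the memory ordered buffer."""
    executed: bool
    retired: bool
    speculative: bool


@dataclass
class CacheLine:
    """Hypothetical model of the data-cache line the store targets."""
    speculative: bool


def should_block_write(store: StoreOp, line: CacheLine) -> bool:
    """Return True when the store's result should be held back from the cache."""
    # Only executed, unretired stores are candidates.
    if not store.executed or store.retired:
        return False
    # Assumption: block exactly when a speculative store would overwrite
    # a cache line that currently holds non-speculative data.
    return store.speculative and not line.speculative
```

For example, `should_block_write(StoreOp(True, False, True), CacheLine(False))` evaluates to `True`, while the same store against a line already marked speculative is allowed through.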
-
Publication No.: US10409763B2
Publication date: 2019-09-10
Application No.: US14319265
Filing date: 2014-06-30
Applicant: Intel Corporation
Inventors: Patrick P. Lai, Ethan Schuchman, David Keppel, Denis M. Khartikov, Polychronis Xekalakis, Joshua B. Fryman, Allan D. Knies, Naveen Neelakantam, Gregor Stellpflug, John H. Kelm, Mirem Hyuseinova Seidahmedova, Demos Pavlou, Jaroslaw Topp
Abstract: Various different embodiments of the invention are described including: (1) a method and apparatus for intelligently allocating threads within a binary translation system; (2) data cache way prediction guided by binary translation code morphing software; (3) fast interpreter hardware support on the data-side; (4) out-of-order retirement; (5) decoupled load retirement in an atomic OOO processor; (6) handling transactional and atomic memory in an out-of-order binary translation based processor; and (7) speculative memory management in a binary translation based out of order processor.
-
Publication No.: US20190004916A1
Publication date: 2019-01-03
Application No.: US16026870
Filing date: 2018-07-03
Applicant: Intel Corporation
Inventors: Raul Martinez, Enric Gibert Codina, Pedro Lopez, Marti Torrents Lapuerta, Polychronis Xekalakis, Georgios Tournavitis, Kyriakos A. Stavrou, Demos Pavlou, Daniel Ortega, Alejandro Martinez Vicente, Pedro Marcuello, Grigorios Magklis, Josep M. Codina, Crispin Gomez Requena, Antonio Gonzalez, Mirem Hyuseinova, Christos Kotselidis, Fernando Latorre, Marc Lupon, Carlos Madriles
IPC classification: G06F11/30, G06F12/0862, G06F11/34
Abstract: A combination of hardware and software collect profile data for asynchronous events, at code region granularity. An exemplary embodiment is directed to collecting metrics for prefetching events, which are asynchronous in nature. Instructions that belong to a code region are identified using one of several alternative techniques, causing a profile bit to be set for the instruction, as a marker. Each line of a data block that is prefetched is similarly marked. Events corresponding to the profile data being collected and resulting from instructions within the code region are then identified. Each time that one of the different types of events is identified, a corresponding counter is incremented. Following execution of the instructions within the code region, the profile data accumulated in the counters are collected, and the counters are reset for use with a new code region.
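The count, collect, and reset cycle in this abstract can be illustrated with a small software model. It is a sketch under assumptions, not the hardware described in the application: `RegionProfiler` and the event names used below are hypothetical, and the profile bit is modeled as a plain boolean passed along with each event.

```python
from collections import Counter


class RegionProfiler:
    """Hypothetical per-region event counters, one counter per event type."""

    def __init__(self) -> None:
        self.counters: Counter = Counter()

    def record(self, event_type: str, profile_bit: bool) -> None:
        # Only events caused by marked instructions or marked prefetched
        # lines (profile bit set) belong to the region being profiled.
        if profile_bit:
            self.counters[event_type] += 1

    def collect_and_reset(self) -> dict:
        # After the region finishes, read out the accumulated counts and
        # clear the counters so they can serve the next code region.
        snapshot = dict(self.counters)
        self.counters.clear()
        return snapshot
```

Calling `record("prefetch_hit", True)` twice and `record("prefetch_late", False)` once, then `collect_and_reset()`, yields `{"prefetch_hit": 2}` and leaves the counters empty.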
-
Publication No.: US09374542B2
Publication date: 2016-06-21
Application No.: US14228684
Filing date: 2014-03-28
Applicant: Intel Corporation
Inventors: Kyriakos Stavrou, Pedro Marcuello, Grigorios Magklis, Javier Carretero Casado, Juan Fernandez, Carlos Madriles, Daniel Ortega, Demos Pavlou
CPC classification: H04N5/357, H04N5/23229, H04N5/378
Abstract: An image signal processor is described. The image signal processor includes a block checking circuit. The block checking circuit comprises comparison circuitry to compare a block of luminous pixel values against respective blocks of luminous pixel values that are processed by the image signal processor after the block of luminous pixel values. The block checking circuitry further comprises circuitry to record an entry in a table if one of the blocks of respective luminous pixel values match the block of luminous pixel values. The image signal processor is to store an image signal processing resultant of the block of luminous pixel values and present the stored resultant as a respective resultant for the one of the blocks of respective luminous pixel values if the one of the blocks of respective luminous pixel values matches the block of pixel values.
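The reuse idea in this abstract resembles memoization: if a later pixel block matches an earlier one, the stored processing result is presented instead of recomputing it. The sketch below models that in software under stated assumptions. `process_blocks` is a hypothetical name, blocks are modeled as flat lists of pixel values, matching is exact equality, and the table is unbounded, none of which the abstract specifies.

```python
from typing import Callable, Dict, List, Sequence, Tuple


def process_blocks(
    blocks: Sequence[Sequence[int]],
    process: Callable[[Sequence[int]], List[int]],
) -> Tuple[List[List[int]], int]:
    """Process pixel blocks in order, reusing results for repeated blocks."""
    table: Dict[Tuple[int, ...], List[int]] = {}  # block contents -> stored result
    results: List[List[int]] = []
    reused = 0
    for block in blocks:
        key = tuple(block)
        if key in table:
            reused += 1  # match found: present the stored result
        else:
            table[key] = process(block)  # first occurrence: compute and store
        results.append(table[key])
    return results, reused
```

With three input blocks where the third equals the first, `process` runs twice and `reused` is 1.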
-
Publication No.: US09280474B2
Publication date: 2016-03-08
Application No.: US13976325
Filing date: 2013-01-03
Applicant: Intel Corporation
Inventors: Demos Pavlou, Pedro Lopez, Mirem Hyuseinova, Fernando Latorre, Steffen Kosinski, Ralf Goettsche, Varun K. Mohandru
CPC classification: G06F12/0862, G06F9/06, G06F9/30, G06F9/3455, G06F9/383, G06F12/02, G06F2212/6026
Abstract: A system and method for adaptive data prefetching in a processor enables adaptive modification of parameters associated with a prefetch operation. A stride pattern in successive addresses of a memory operation may be detected, including determining a stride length (L). Prefetching of memory operations may be based on a prefetch address determined from a base memory address, the stride length L, and a prefetch distance (D). A number of prefetch misses may be counted at a miss prefetch count (C). Based on the value of the miss prefetch count C, the prefetch distance D may be modified. As a result of adaptive modification of the prefetch distance D, an improved rate of cache hits may be realized.
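A minimal software sketch of the scheme in this abstract follows. It assumes the prefetch address is computed as base + L × D, which is one natural reading of "determined from a base memory address, the stride length L, and a prefetch distance (D)". The abstract does not say how C changes D, so the policy here (double D once C reaches a threshold, then reset C) is purely an assumption, as are the class name `StridePrefetcher` and its parameters.

```python
from typing import Optional


class StridePrefetcher:
    """Hypothetical stride prefetcher with an adaptive prefetch distance."""

    def __init__(self, distance: int = 1, miss_threshold: int = 4,
                 max_distance: int = 16) -> None:
        self.distance = distance              # D: how many strides ahead to fetch
        self.miss_threshold = miss_threshold  # assumed trigger value for C
        self.max_distance = max_distance      # assumed upper bound on D
        self.miss_count = 0                   # C: counted prefetch misses
        self.last_addr: Optional[int] = None
        self.stride: Optional[int] = None     # L: last observed stride

    def observe(self, addr: int) -> Optional[int]:
        """Record an access; return a prefetch address once a stride repeats."""
        prefetch = None
        if self.last_addr is not None:
            delta = addr - self.last_addr
            if delta != 0 and delta == self.stride:
                # Stride confirmed: prefetch address = base + L * D.
                prefetch = addr + self.stride * self.distance
            self.stride = delta
        self.last_addr = addr
        return prefetch

    def record_prefetch_miss(self) -> None:
        """Count a prefetch miss and adapt D (assumed policy: double it)."""
        self.miss_count += 1
        if self.miss_count >= self.miss_threshold:
            self.distance = min(self.distance * 2, self.max_distance)
            self.miss_count = 0
```

With the defaults, accesses at 100, 164, and 228 confirm a stride of 64 and the third call returns 292. After four recorded misses D becomes 2, so the next access at 292 returns 420.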
-
Publication No.: US20150143057A1
Publication date: 2015-05-21
Application No.: US13976325
Filing date: 2013-01-03
Applicant: Intel Corporation
Inventors: Demos Pavlou, Pedro Lopez, Mirem Hyuseinova, Fernando Latorre, Steffen Kosinski, Ralf Goettsche, Varun K. Mohandru
IPC classification: G06F12/08
CPC classification: G06F12/0862, G06F9/06, G06F9/30, G06F9/3455, G06F9/383, G06F12/02, G06F2212/6026
Abstract: A system and method for adaptive data prefetching in a processor enables adaptive modification of parameters associated with a prefetch operation. A stride pattern in successive addresses of a memory operation may be detected, including determining a stride length (L). Prefetching of memory operations may be based on a prefetch address determined from a base memory address, the stride length L, and a prefetch distance (D). A number of prefetch misses may be counted at a miss prefetch count (C). Based on the value of the miss prefetch count C, the prefetch distance D may be modified. As a result of adaptive modification of the prefetch distance D, an improved rate of cache hits may be realized.