Patent search ap:("Intel Corporation") AND inv:"Steffen Kosinski" Page 1

1.

Granted patent
Two-level cache locking mechanism [In force]

Publication number: US09558121B2

Publication date: 2017-01-31

Application number: US13729840

Filing date: 2012-12-28

Applicant: INTEL CORPORATION

Inventor: Li-Gao Zei , Fernando Latorre , Steffen Kosinski , Jaroslaw Topp , Varun Mohandru , Lutz Naethke

IPC: G06F12/00 , G06F12/08 , G06F12/10

CPC classification number: G06F12/0846 , G06F12/0864 , G06F12/1063

Abstract: A virtually tagged cache may be configured to index virtual address entries in the cache into lockable sets based on a page offset value. When a memory operation misses on the virtually tagged cache, only the one set of virtual address entries with the same page offset may be locked. Thereafter, this general lock may be released and only an address stored in the physical tag array matching the physical address and a virtual address in the virtual tag array corresponding to the matching address stored in the physical tag array may be locked to reduce the amount and duration of locked addresses. The machine may be stalled only if a particular memory address request hits and/or tries to access one or more entries in a locked set. Devices, systems, methods, and computer readable media are provided.
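The two locking levels described in the abstract can be sketched in a few lines of Python. This is an illustrative model only, not the patented implementation: the 4 KiB page size, 64-byte line size, the class name `TwoLevelLock`, and the choice to key the fine-grained lock by (set index, physical page number) are all assumptions made for the example.

```python
PAGE_SHIFT = 12   # assumed 4 KiB pages
LINE_SHIFT = 6    # assumed 64-byte cache lines


def set_index(addr):
    """Set index taken only from page-offset bits, so virtual and
    physical addresses of the same location map to the same set."""
    page_offset = addr & ((1 << PAGE_SHIFT) - 1)
    return page_offset >> LINE_SHIFT


class TwoLevelLock:
    """Hypothetical model: a coarse per-set lock, later narrowed to one entry."""

    def __init__(self):
        self.locked_sets = set()     # level 1: whole sets, by index
        self.locked_entries = set()  # level 2: (set index, physical page number)

    def on_miss(self, vaddr):
        # Level 1: lock only the set that shares this page offset.
        self.locked_sets.add(set_index(vaddr))

    def narrow(self, vaddr, paddr):
        # Level 2: release the general lock, keep only the matching entry locked.
        idx = set_index(vaddr)
        self.locked_sets.discard(idx)
        self.locked_entries.add((idx, paddr >> PAGE_SHIFT))

    def release(self, vaddr, paddr):
        self.locked_entries.discard((set_index(vaddr), paddr >> PAGE_SHIFT))

    def must_stall(self, vaddr, paddr=None):
        # Stall only when the request touches a locked set or a locked entry.
        idx = set_index(vaddr)
        if idx in self.locked_sets:
            return True
        return paddr is not None and (idx, paddr >> PAGE_SHIFT) in self.locked_entries
```

In this model a request to a different page offset never stalls, and after `narrow` only a request that resolves to the locked physical entry does.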

2.

Granted patent
Adaptive data prefetching [In force]

Publication number: US09280474B2

Publication date: 2016-03-08

Application number: US13976325

Filing date: 2013-01-03

Applicant: Intel Corporation

Inventor: Demos Pavlou , Pedro Lopez , Mirem Hyuseinova , Fernando Latorre , Steffen Kosinski , Ralf Goettsche , Varun K. Mohandru

IPC: G06F12/08 , G06F12/02 , G06F9/06 , G06F9/30 , G06F9/345 , G06F9/38

CPC classification number: G06F12/0862 , G06F9/06 , G06F9/30 , G06F9/3455 , G06F9/383 , G06F12/02 , G06F2212/6026

Abstract: A system and method for adaptive data prefetching in a processor enables adaptive modification of parameters associated with a prefetch operation. A stride pattern in successive addresses of a memory operation may be detected, including determining a stride length (L). Prefetching of memory operations may be based on a prefetch address determined from a base memory address, the stride length L, and a prefetch distance (D). A number of prefetch misses may be counted at a miss prefetch count (C). Based on the value of the miss prefetch count C, the prefetch distance D may be modified. As a result of adaptive modification of the prefetch distance D, an improved rate of cache hits may be realized.
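The interaction between stride length L, prefetch distance D, and miss prefetch count C can be illustrated with a small Python sketch. The abstract does not give the formula or the adaptation policy, so several details here are assumptions: the prefetch address is computed as base + L × D, a stride is considered detected once it repeats, and D is doubled (up to a cap) whenever C reaches a threshold. The class and parameter names are hypothetical.

```python
class AdaptivePrefetcher:
    """Hypothetical stride prefetcher whose distance D adapts to miss count C."""

    def __init__(self, distance=1, max_distance=16, miss_threshold=4):
        self.last_addr = None
        self.stride = None            # stride length L, once observed
        self.distance = distance      # prefetch distance D
        self.max_distance = max_distance
        self.miss_threshold = miss_threshold
        self.miss_count = 0           # miss prefetch count C

    def observe(self, addr):
        """Record a memory access; return a prefetch address or None."""
        prefetch = None
        if self.last_addr is not None:
            stride = addr - self.last_addr
            # Assumed detection rule: the same non-zero stride seen twice in a row.
            if stride != 0 and stride == self.stride:
                prefetch = addr + self.stride * self.distance
            self.stride = stride
        self.last_addr = addr
        return prefetch

    def record_prefetch_miss(self):
        """Count a prefetch miss; grow D once C reaches the threshold."""
        self.miss_count += 1
        if self.miss_count >= self.miss_threshold:
            self.distance = min(self.distance * 2, self.max_distance)
            self.miss_count = 0
```

For example, with D = 2 and accesses at 100, 164, 228, the third access yields a prefetch for 228 + 64 × 2 = 356. After enough recorded misses D grows, so later prefetches run further ahead of the access stream.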

3.

Patent application
ADAPTIVE DATA PREFETCHING [In force]

Publication number: US20150143057A1

Publication date: 2015-05-21

Application number: US13976325

Filing date: 2013-01-03

Applicant: Intel Corporation

Inventor: Demos Pavlou , Pedro Lopez , Mirem Hyuseinova , Fernando Latorre , Steffen Kosinski , Ralf Goettsche , Varun K. Mohandru

IPC: G06F12/08

CPC classification number: G06F12/0862 , G06F9/06 , G06F9/30 , G06F9/3455 , G06F9/383 , G06F12/02 , G06F2212/6026

Abstract: A system and method for adaptive data prefetching in a processor enables adaptive modification of parameters associated with a prefetch operation. A stride pattern in successive addresses of a memory operation may be detected, including determining a stride length (L). Prefetching of memory operations may be based on a prefetch address determined from a base memory address, the stride length L, and a prefetch distance (D). A number of prefetch misses may be counted at a miss prefetch count (C). Based on the value of the miss prefetch count C, the prefetch distance D may be modified. As a result of adaptive modification of the prefetch distance D, an improved rate of cache hits may be realized.

4.

Granted patent
Method and apparatus for managing application state in a network interface controller in a high performance computing system [In force]

Publication number: US09973417B2

Publication date: 2018-05-15

Application number: US15256390

Filing date: 2016-09-02

Applicant: Intel Corporation

Inventor: Keith D. Underwood , Steffen Kosinski , Jaroslaw Topp , Jan Norden , Michael Redeker

IPC: H04L29/04 , H04L12/721 , G06F15/167 , G06F13/38 , H04L12/773 , H04L1/12 , H04L12/26 , H04L12/801 , H04L29/08 , H04L1/18

CPC classification number: H04L45/38 , G06F13/385 , G06F15/167 , H04L1/12 , H04L1/1835 , H04L43/103 , H04L45/60 , H04L47/34 , H04L67/10 , Y02D10/14 , Y02D10/151

Abstract: Methods related to communication between and within nodes in a high performance computing system are presented. Processing time for message exchange between a processing unit and a network controller interface in a node is reduced. Resources required to manage application state in the network interface controller are minimized. In the network interface controller, multiple contexts are multiplexed into one physical Direct Memory Access engine. Virtual to physical address translation in the network interface controller is accelerated by using a plurality of independent caches, with each level of the page table hierarchy cached in an independent cache. A memory management scheme for data structures distributed between the processing unit and the network controller interface is provided. The state required to implement end-to-end reliability is reduced by limiting the transmit sequence number space to the currently in-flight messages.
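One of the techniques named in the abstract, caching each level of the page table hierarchy in its own independent cache, can be sketched as follows. This is a toy model under stated assumptions, not the patented design: a three-level table with 9 index bits per level and 4 KiB pages, nested dictionaries standing in for page tables in memory, unbounded caches, and the hypothetical names `Walker` and `indices`.

```python
LEVELS = 3        # assumed depth of the page table hierarchy
BITS = 9          # assumed index bits per level
PAGE_SHIFT = 12   # assumed 4 KiB pages


def indices(vaddr):
    """Split the virtual page number into one index per level, top level first."""
    vpn = vaddr >> PAGE_SHIFT
    mask = (1 << BITS) - 1
    return [(vpn >> (BITS * (LEVELS - 1 - i))) & mask for i in range(LEVELS)]


class Walker:
    """Hypothetical page walker with one independent cache per table level."""

    def __init__(self, root):
        self.root = root                               # nested dicts model the tables
        self.caches = [dict() for _ in range(LEVELS)]  # one cache per level
        self.memory_reads = 0                          # table reads that missed all caches

    def translate(self, vaddr):
        idx = indices(vaddr)
        start, node = 0, self.root
        # Resume the walk from the deepest level whose cache holds this prefix.
        for level in range(LEVELS - 1, -1, -1):
            key = tuple(idx[:level + 1])
            if key in self.caches[level]:
                start, node = level + 1, self.caches[level][key]
                break
        for level in range(start, LEVELS):
            self.memory_reads += 1
            node = node[idx[level]]                    # KeyError models a page fault
            self.caches[level][tuple(idx[:level + 1])] = node
        offset = vaddr & ((1 << PAGE_SHIFT) - 1)
        return (node << PAGE_SHIFT) | offset
```

A first translation walks all three levels. A neighbouring page that shares the upper levels then needs only one table read, because the upper-level entries are served from their own caches.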

5.

Granted patent
Store forwarding for data caches [In force]

Publication number: US09507725B2

Publication date: 2016-11-29

Application number: US13729945

Filing date: 2012-12-28

Applicant: INTEL CORPORATION

Inventor: Steffen Kosinski , Fernando Latorre , Niranjan Cooray , Stanislav Shwartsman , Ethan Kalifon , Varun Mohandru , Pedro Lopez , Tom Aviram-Rosenfeld , Jaroslav Topp , Li-Gao Zei

IPC: G06F12/00 , G06F12/08

CPC classification number: G06F12/0895 , G06F12/0855 , G06F12/0866

Abstract: A bit or other vector may be used to identify whether an address range entered into an intermediate buffer corresponds to most recently updated data associated with the address range. A bit or other vector may also be used to identify whether an address range entered into an intermediate buffer overlaps with an address range of data that is to be loaded. A processing device may then determine whether to obtain data that is to be loaded entirely from a cache, entirely from an intermediate buffer which temporarily buffers data destined for a cache until the cache is ready to accept the data, or from both the cache and the intermediate buffer depending on the particular vector settings. Systems, devices, methods, and computer readable media are provided.
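The decision described in the abstract (load entirely from the cache, entirely from the intermediate buffer, or from both) can be modelled with a short Python sketch. It is a simplification under stated assumptions: the "most recent" indicator is a single `latest` flag per buffered store, the overlap vector is tracked per byte, and the names `BufferedStore`, `buffer_store`, and `load` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class BufferedStore:
    addr: int
    data: bytes
    latest: bool = True   # cleared when a younger store covers the same range


def buffer_store(buffer, addr, data):
    """Append a store; mark older entries for the identical range as stale."""
    for entry in buffer:
        if entry.addr == addr and len(entry.data) == len(data):
            entry.latest = False
    buffer.append(BufferedStore(addr, data))


def load(cache, buffer, addr, size):
    """Return (data, source), where source is 'cache', 'buffer', or 'both'."""
    result = bytearray(cache[addr:addr + size])
    from_buffer = [False] * size              # per-byte overlap vector
    for store in buffer:                      # oldest to youngest
        if not store.latest:
            continue
        lo = max(addr, store.addr)
        hi = min(addr + size, store.addr + len(store.data))
        for a in range(lo, hi):               # empty range when there is no overlap
            result[a - addr] = store.data[a - store.addr]
            from_buffer[a - addr] = True
    if all(from_buffer):
        source = "buffer"
    elif any(from_buffer):
        source = "both"
    else:
        source = "cache"
    return bytes(result), source
```

A load that falls wholly inside a buffered store is served from the buffer, a load that only partly overlaps merges buffered bytes with cache bytes, and a load with no overlap reads the cache alone.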

6.

Granted patent
Power efficient level one data cache access with pre-validated tags [In force]

Publication number: US09311239B2

Publication date: 2016-04-12

Application number: US13976313

Filing date: 2013-03-14

Applicant: Intel Corporation

Inventor: Niranjan Cooray , Steffen Kosinski , Rami May , Doron Gershon , Jaroslaw Topp , Varun Mohandru

IPC: G06F12/00 , G06F12/08 , G06F12/10

CPC classification number: G06F12/0811 , G06F12/0864 , G06F12/10 , G06F12/1027 , G06F12/1045 , G06F2212/283 , G06F2212/6032 , G06F2212/65 , G06F2212/681 , Y02D10/13

Abstract: A system and method to implement a tag structure for a cache memory that includes a multi-way, set-associative translation lookaside buffer. The tag structure may store vectors in an L1 tag array to enable an L1 tag lookup that has fewer bits per entry and consumes less power. The vectors may identify entries in a translation lookaside buffer tag array. When a virtual memory address associated with a memory access instruction hits in the translation lookaside buffer, the translation lookaside buffer may generate a vector identifying the set and the way of the translation lookaside buffer entry that matched. This vector may then be compared to a group of vectors stored in a set of the L1 tag arrays to determine whether the virtual memory address hits in the L1 cache.
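The lookup flow in the abstract can be sketched in Python: the L1 tag array stores a short (TLB set, TLB way) vector per line instead of a full address tag, and a hit is declared when the vector produced by the TLB matches one stored in the indexed L1 set. The sizes below (4-set, 2-way TLB; 64-set, 8-way L1; 4 KiB pages; 64-byte lines) and the class names are assumptions for illustration. Invalidating L1 vectors when a TLB entry is replaced is omitted.

```python
TLB_SETS, TLB_WAYS = 4, 2                     # assumed TLB geometry
PAGE_SHIFT, LINE_SHIFT, L1_SETS = 12, 6, 64   # assumed page, line, and L1 geometry


class Tlb:
    """Hypothetical set-associative TLB tag array."""

    def __init__(self):
        self.tags = [[None] * TLB_WAYS for _ in range(TLB_SETS)]

    def install(self, vaddr, way):
        vpn = vaddr >> PAGE_SHIFT
        self.tags[vpn % TLB_SETS][way] = vpn
        return (vpn % TLB_SETS, way)

    def lookup(self, vaddr):
        """On a hit, return the (set, way) vector of the matching entry."""
        vpn = vaddr >> PAGE_SHIFT
        s = vpn % TLB_SETS
        for w, tag in enumerate(self.tags[s]):
            if tag == vpn:
                return (s, w)
        return None


def l1_set(vaddr):
    # With 64 sets of 64-byte lines, the index uses page-offset bits only.
    return (vaddr >> LINE_SHIFT) % L1_SETS


class L1Tags:
    """Hypothetical L1 tag array holding TLB vectors instead of address tags."""

    def __init__(self, ways=8):
        self.vectors = [[None] * ways for _ in range(L1_SETS)]

    def fill(self, vaddr, way, vector):
        self.vectors[l1_set(vaddr)][way] = vector

    def hit_way(self, vaddr, vector):
        """Compare the TLB vector with the vectors stored in the indexed set."""
        if vector is None:                    # TLB miss: no vector to compare
            return None
        for w, stored in enumerate(self.vectors[l1_set(vaddr)]):
            if stored == vector:
                return w
        return None
```

With this geometry each stored vector needs only 3 bits (2 for the TLB set, 1 for the way), far fewer than a full address tag, which is the source of the smaller, lower-power tag comparison the abstract describes.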

7.

Patent application
Power Efficient Level One Data Cache Access With Pre-Validated Tags [In force]

Publication number: US20150220436A1

Publication date: 2015-08-06

Application number: US13976313

Filing date: 2013-03-14

Applicant: Intel Corporation

Inventor: Niranjan Cooray , Steffen Kosinski , Rami May , Doron Gershon , Jaroslaw Topp , Varun Mohandru

IPC: G06F12/08 , G06F12/10

CPC classification number: G06F12/0811 , G06F12/0864 , G06F12/10 , G06F12/1027 , G06F12/1045 , G06F2212/283 , G06F2212/6032 , G06F2212/65 , G06F2212/681 , Y02D10/13

Abstract: A system and method to implement a tag structure for a cache memory that includes a multi-way, set-associative translation lookaside buffer. The tag structure may store vectors in an L1 tag array to enable an L1 tag lookup that has fewer bits per entry and consumes less power. The vectors may identify entries in a translation lookaside buffer tag array. When a virtual memory address associated with a memory access instruction hits in the translation lookaside buffer, the translation lookaside buffer may generate a vector identifying the set and the way of the translation lookaside buffer entry that matched. This vector may then be compared to a group of vectors stored in a set of the L1 tag arrays to determine whether the virtual memory address hits in the L1 cache.
