Patent search ap:("Advanced Micro Devices Page Inc.") AND inv:"John Kalamatianos"

111.

发明申请
Bit Error Protection in Cache Memories 审中-公开

公开(公告)号：US20180302105A1

公开(公告)日：2018-10-18

申请号：US15489438

申请日：2017-04-17

Applicant: Advanced Micro Devices, Inc.

Inventor： John Kalamatianos , Shrikanth Ganapathy , Steven Raasch

IPC: H03M13/11 , H03M13/00 , G11C16/08 , G11C11/56

CPC classification number: G06F11/1064 , G11C29/42 , H03M13/19

Abstract: A computing device having a cache memory (or “cache”) is described, as is a method for operating the cache. The method for operating the cache includes maintaining, in a history record, a representation of a number of bit errors detected in a portion of the cache. When the history record indicates that no bit errors or a single-bit bit error was detected in the portion of the cache memory, the method includes selecting, based on the history record, an error protection to be used for the portion of the cache memory. When the history record indicates that a multi-bit bit error was detected in the portion of the cache memory, the method includes disabling the portion of the cache memory.

112.

发明授权
Detecting multiple stride sequences for prefetching 有权
Title translation: 检测多个步幅序列进行预取

公开(公告)号：US09304919B2

公开(公告)日：2016-04-05

申请号：US13907209

申请日：2013-05-31

Applicant: Advanced Micro Devices, Inc.

Inventor： John Kalamatianos , Paul E. Keltcher

IPC: G06F12/08

CPC classification number: G06F12/0811 , G06F12/0862 , G06F2212/6026

Abstract: The present application describes some embodiments of a prefetcher that tracks multiple stride sequences for prefetching. Some embodiments of the prefetcher implement a method including generating a sum-of-strides for each of a plurality of stride lengths that are larger than one by summing a number of previous strides that is equal to the stride length. Some embodiments of the method also include prefetching data in response to repetition of one or more of the sum-of-strides for one or more of the plurality of stride lengths.

Abstract translation: 本申请描述了预取器的一些实施例，该预取器跟踪用于预取的多个步幅序列。预取器的一些实施例实现一种方法，该方法包括通过对等于步幅长度的先前步幅的数量进行求和来产生大于1的多个步幅长度中的每一个步长的总和。该方法的一些实施例还包括响应于对于多个步幅长度中的一个或多个步长的一个或多个步数的重复来预取数据。

113.

发明申请
DYNAMIC REMAPPING OF CACHE LINES 有权
Title translation: 高速缓存行动态更新

公开(公告)号：US20150293854A1

公开(公告)日：2015-10-15

申请号：US14253785

申请日：2014-04-15

Applicant: Advanced Micro Devices, Inc.

Inventor： John Kalamatianos , Johnsy Kanjirapallil John , Phillip E. Nevius , Robert G. Gelinas

IPC: G06F12/08 , G11C29/44

CPC classification number: G06F12/0864 , G06F11/1064 , G06F12/0802 , G06F12/0811 , G06F12/0873 , G06F12/0888 , G06F12/121 , G06F2212/1032 , G06F2212/28 , G06F2212/283 , G06F2212/284 , G06F2212/403 , G11C15/00 , G11C29/44 , G11C29/4401 , G11C2029/0409

Abstract: A method of managing cache memory includes accessing a cache memory at a primary index that corresponds to an address specified in an access request. A determination is made that accessing the cache memory at the primary index does not result in a cache hit on a cache line with an error-free status. In response to this determination, the primary index is mapped to a secondary index and data for the address is written to a cache line at the secondary index.

Abstract translation: 一种管理高速缓存存储器的方法包括以对应于在访问请求中指定的地址的主要索引访问高速缓冲存储器。确定在主索引处访问高速缓冲存储器不会导致在具有无错状态的高速缓存行上的高速缓存命中。响应于该确定，主索引被映射到次索引，并且该地址的数据被写入到次级索引处的高速缓存行。

114.

发明授权
Dynamic evaluation and reconfiguration of a data prefetcher 有权
Title translation: 数据预取器的动态评估和重新配置

公开(公告)号：US09058277B2

公开(公告)日：2015-06-16

申请号：US13671801

申请日：2012-11-08

Applicant: ADVANCED MICRO DEVICES, INC.

Inventor： Sharad Dilip Bade , Alok Garg , John Kalamatianos , Paul Keltcher , Marius Evers , Chitresh Narasimhaiah

IPC: G06F12/08 , G06F9/38

CPC classification number: G06F12/0862 , G06F9/3842 , G06F11/30 , G06F2212/6024 , G06F2212/6026 , Y02D10/13

Abstract: Methods and systems for prefetching data for a processor are provided. A system is configured for and a method includes selecting one of a first prefetching control logic and a second prefetching control logic of the processor as a candidate feature, capturing the performance metric of the processor over an inactive sample period when the candidate feature is inactive, capturing a performance metric of the processor over an active sample period when the candidate feature is active, comparing the performance metric of the processor for the active and inactive sample periods, and setting a status of the candidate feature as enabled when the performance metric in the active period indicates improvement over the performance metric in the inactive period, and as disabled when the performance metric in the inactive period indicates improvement over the performance metric in the active period.

Abstract translation: 提供了用于为处理器预取数据的方法和系统。系统被配置用于并且方法包括选择处理器的第一预取控制逻辑和第二预取控制逻辑之一作为候选特征，当候选特征不活动时，在非活动采样周期捕获处理器的性能度量，当候选特征处于活动状态时，在活动采样周期捕获处理器的性能度量，比较处于活动和非活动采样周期的处理器的性能度量，并且将候选特征的状态设置为使能时的性能度量活动期间表示在非活动期间的性能指标改善，当非活动期间的性能指标表示改善了活动期间的绩效指标时被禁用。

115.

发明申请
VARIABLE DISTANCE BYPASS BETWEEN TAG ARRAY AND DATA ARRAY PIPELINES IN A CACHE 有权
Title translation: 标签阵列之间的可变距离旁路和缓存中的数据阵列管道

公开(公告)号：US20140365729A1

公开(公告)日：2014-12-11

申请号：US13912809

申请日：2013-06-07

Applicant: Advanced Micro Devices, Inc.

Inventor： Marius Evers , John Kalamatianos , Carl D. Dietz , Richard E. Klass , Ravindra N. Bhargava

IPC: G06F12/08

CPC classification number: G06F12/0855 , G06F12/0844 , G06F12/0846

Abstract: The present application describes embodiments of techniques for picking a data array lookup request for execution in a data array pipeline a variable number of cycles behind a corresponding tag array lookup request that is concurrently executing in a tag array pipeline. Some embodiments of a method for picking the data array lookup request include picking the data array lookup request for execution in a data array pipeline of a cache concurrently with execution of a tag array lookup request in a tag array pipeline of the cache. The data array lookup request is picked for execution in response to resources of the data array pipeline becoming available after picking the tag array lookup request for execution. Some embodiments of the method may be implemented in a cache.

Abstract translation: 本申请描述了用于在数据阵列流水线中选择用于执行数据阵列查找请求的技术的实施例，该数据阵列查找请求在标签阵列管线中同时执行的对应的标签数组查找请求后面的可变数量的循环。用于选择数据阵列查找请求的方法的一些实施例包括在高速缓存的标签阵列管线中执行标签阵列查找请求的同时，在高速缓存的数据阵列流水线中选择用于执行的数据阵列查找请求。选择数据数组查找请求以在执行标签数组查找请求之后响应于数据数组流水线变得可用的资源进行执行。该方法的一些实施例可以在高速缓存中实现。

116.

发明申请
TRACKING PREFETCHER ACCURACY AND COVERAGE 有权
Title translation: 跟踪提前准确性和覆盖

公开(公告)号：US20140173217A1

公开(公告)日：2014-06-19

申请号：US13720072

申请日：2012-12-19

Applicant: ADVANCED MICRO DEVICES, INC.

Inventor： John Kalamatianos , Paul Keltcher

IPC: G06F12/08

CPC classification number: G06F12/0862 , G06F2212/602 , G06F2212/6026

Abstract: A method, an apparatus, and a non-transitory computer readable medium for tracking accuracy and coverage of a prefetcher in a processor are presented. A table is maintained and indexed by an address, wherein each entry in the table corresponds to one address. A number of demand requests that hit in the table on a prefetch, a total number of demand requests, and a number of prefetch requests are counted. The accuracy of the prefetcher is calculated by dividing the number of demand requests that hit in the table on a prefetch by the number of prefetch requests. The coverage of the prefetcher is calculated by dividing the number of demand requests that hit in the table on a prefetch by the total number of demand requests. The table and the counters are reset when a reset condition is reached.

Abstract translation: 提出了一种用于跟踪处理器中的预取器的精度和覆盖率的方法，装置和非暂时计算机可读介质。表由地址维护和索引，其中表中的每个条目对应于一个地址。对预取中的表中的一些需求请求，需求请求的总数以及预取请求的数量进行计数。通过将预取中的表中命中的请求请求数除以预取请求数来计算预取器的准确性。预取器的覆盖率是通过将预取中的表中命中的请求请求数除以请求请求总数来计算的。当达到复位条件时，表和计数器被复位。

117.

发明公开
Managing a Cache Using Per Memory Region Reuse Distance Estimation 审中-公开

公开(公告)号：US20240211407A1

公开(公告)日：2024-06-27

申请号：US18146883

申请日：2022-12-27

Applicant: Advanced Micro Devices, Inc.

Inventor： John Kalamatianos , Jagadish B. Kotra , Asmita Pal

IPC: G06F12/0888

CPC classification number: G06F12/0888 , G06F2212/1016

Abstract: A memory request issue counter (MRIC) is maintained that is incremented for every memory access a central processing unit core makes. A region reuse distance table is also maintained that includes multiple entries each of which stores the region reuse distance for a corresponding region. When a memory access request for a physical address is received, a reuse distance for the physical address is calculated. This reuse distance is the difference between the current MRIC value and a previous MRIC value for the physical address. The previous MRIC value for the physical address is the MRIC value the MRIC had when a memory access request for the physical address was last received. A region reuse distance for a region that includes the physical address is generated based on the reuse distance for the physical address and used to manage the cache.

118.

发明授权
Dispatch bandwidth of memory-centric requests by bypassing storage array address checking 有权

公开(公告)号：US12019547B2

公开(公告)日：2024-06-25

申请号：US17386115

申请日：2021-07-27

Applicant: Advanced Micro Devices, Inc.

Inventor： Jagadish B. Kotra , John Kalamatianos , Gagandeep Panwar

IPC: G06F12/08 , G06F9/30 , G06F12/02 , G06F12/06 , G06F12/0815 , G06F12/0817

CPC classification number: G06F12/0815 , G06F9/30047 , G06F12/0238 , G06F12/0653 , G06F12/0817

Abstract: A technical solution to the technical problem of how to improve dispatch throughput for memory-centric commands bypasses address checking for certain memory-centric commands. Implementations include using an Address Check Bypass (ACB) bit to specify whether address checking should be performed for a memory-centric command. ACB bit values are specified in memory-centric instructions, automatically specified by a process, such as a compiler, or by host hardware, such as dispatch hardware, based upon whether a memory-centric command explicitly references memory. Implementations include bypassing, i.e., not performing, address checking for memory-centric commands that do not access memory and also for memory-centric commands that do access memory, but that have the same physical address as a prior memory-centric command that explicitly accessed memory to ensure that any data in caches was flushed to memory and/or invalidated.

119.

发明授权
Method and apparatus for reducing the latency of long latency memory requests 有权

公开(公告)号：US11960404B2

公开(公告)日：2024-04-16

申请号：US17029976

申请日：2020-09-23

Applicant: Advanced Micro Devices, Inc.

Inventor： Jagadish B. Kotra , John Kalamatianos

IPC: G06F12/0875

CPC classification number: G06F12/0875 , G06F2212/452

Abstract: Systems, apparatuses, and methods for efficiently processing memory requests are disclosed. A computing system includes at least one processing unit coupled to a memory. Circuitry in the processing unit determines a memory request becomes a long-latency request based on detecting a translation lookaside buffer (TLB) miss, a branch misprediction, a memory dependence misprediction, or a precise exception has occurred. The circuitry marks the memory request as a long-latency request such as storing an indication of a long-latency request in an instruction tag of the memory request. The circuitry uses weighted criteria for scheduling out-of-order issue and servicing of memory requests. However, the indication of a long-latency request is not combined with other criteria in a weighted sum. Rather, the indication of the long-latency request is a separate value. The circuitry prioritizes memory requests marked as long-latency requests over memory requests not marked as long-latency requests.

120.

发明授权
Leveraging processing-in-memory (PIM) resources to expedite non-PIM instructions executed on a host 有权

公开(公告)号：US11921634B2

公开(公告)日：2024-03-05

申请号：US17564155

申请日：2021-12-28

Applicant: ADVANCED MICRO DEVICES, INC.

Inventor： Jagadish B. Kotra , John Kalamatianos , Yasuko Eckert , Yonghae Kim

IPC: G06F12/0811

CPC classification number: G06F12/0811

Abstract: Leveraging processing-in-memory (PIM) resources to expedite non-PIM instructions executed on a host is disclosed. In an implementation, a memory controller identifies a first write instruction to write first data to a first memory location, where the first write instruction is not a processing-in-memory (PIM) instruction. The memory controller then writes the first data to a first PIM register. Opportunistically, the memory controller moves the first data from the first PIM register to the first memory location. In another implementation, a memory controller identifies a first memory location associated with a first read instruction, where the first read instruction is not a processing-in-memory (PIM) instruction. The memory controller identifies that a PIM register is associated with the first memory location. The memory controller then reads, in response to the first read instruction, first data from the PIM register.

Search Results

Country/Region

Patent validity

Application date

Publication (announcement) day

applicant

The country/region where the applicant is located

Inventor

IPC

IPC Department

IPC class

IPC subclass

IPC group

IPC team

Appearance classification