Patent search ap:"Aamer Jaleel", page 1

1.

Patent application
HARDWARE/SOFTWARE CO-OPTIMIZATION TO IMPROVE PERFORMANCE AND ENERGY FOR INTER-VM COMMUNICATION FOR NFVS AND OTHER PRODUCER-CONSUMER WORKLOADS [Active]

Publication No.: US20210004328A1

Publication date: 2021-01-07

Application No.: US17027248

Filing date: 2020-09-21

Applicant: Ren Wang , Andrew J. Herdrich , Yen-cheng Liu , Herbert H. Hum , Jong Soo Park , Christopher J. Hughes , Namakkal N. Venkatesan , Adrian C. Moga , Aamer Jaleel , Zeshan A. Chishti , Mesut A. Ergin , Jr-shian Tsai , Alexander W. Min , Tsung-yuan C. Tai , Christian Maciocco , Rajesh Sankaran

Inventor: Ren Wang , Andrew J. Herdrich , Yen-cheng Liu , Herbert H. Hum , Jong Soo Park , Christopher J. Hughes , Namakkal N. Venkatesan , Adrian C. Moga , Aamer Jaleel , Zeshan A. Chishti , Mesut A. Ergin , Jr-shian Tsai , Alexander W. Min , Tsung-yuan C. Tai , Christian Maciocco , Rajesh Sankaran

IPC classification: G06F12/0842 , G06F12/0831 , G06F12/0893 , G06F12/109 , G06F12/0813 , G06F9/455

Abstract: Methods and apparatus implementing hardware/software co-optimization to improve performance and energy for inter-VM communication for NFVs and other producer-consumer workloads. The apparatus include multi-core processors with multi-level cache hierarchies including an L1 and L2 cache for each core and a shared last-level cache (LLC). One or more machine-level instructions are provided for proactively demoting cachelines from lower cache levels to higher cache levels, including demoting cachelines from L1/L2 caches to an LLC. Techniques are also provided for implementing hardware/software co-optimization in a multi-socket NUMA architecture system, wherein cachelines may be selectively demoted and pushed to an LLC in a remote socket. In addition, techniques are disclosed for implementing early snooping in multi-socket systems to reduce latency when accessing cachelines on remote sockets.
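
As a rough illustration of why proactive demotion can help a producer-consumer pair, the toy model below tracks where a cache line lives and what a consumer's read would cost. It is only a sketch under stated assumptions: the class `ToyHierarchy`, its methods, and every latency constant are hypothetical placeholders, not values or interfaces taken from the patent.

```python
from dataclasses import dataclass, field
from typing import Dict, Set

# Placeholder latencies in cycles, chosen only to make the ordering visible.
L1_HIT = 4
LLC_HIT = 40
CROSS_CORE_SNOOP = 100
MEMORY = 200


@dataclass
class ToyHierarchy:
    """Tracks where each cache line currently lives in a two-level toy model."""
    private: Dict[int, int] = field(default_factory=dict)  # line -> owning core
    llc: Set[int] = field(default_factory=set)              # lines held in the shared LLC

    def write(self, core: int, line: int) -> None:
        # A write leaves the line modified in the writer's private L1/L2.
        self.private[line] = core
        self.llc.discard(line)

    def demote(self, line: int) -> None:
        # Models the proactive demotion: move the line from L1/L2 to the LLC.
        if self.private.pop(line, None) is not None:
            self.llc.add(line)

    def read_latency(self, core: int, line: int) -> int:
        # Modeled cost of a read by `core`; the model's state is not changed.
        owner = self.private.get(line)
        if owner == core:
            return L1_HIT
        if owner is not None:
            return CROSS_CORE_SNOOP
        return LLC_HIT if line in self.llc else MEMORY
```

In this model, a consumer on another core pays the snoop cost when the producer's line is still in the producer's private cache, and only the LLC hit cost once the producer has demoted it.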

2.

Patent application
Instruction and Logic for Run-time Evaluation of Multiple Prefetchers [Active]

Publication No.: US20150234663A1

Publication date: 2015-08-20

Application No.: US14181032

Filing date: 2014-02-14

Applicant: Zeshan A. Chishti , Christopher B. Wilkerson , Seth Pugsley , Peng-Fei Chuang , Robert L. Scott , Aamer Jaleel , Shih-Lien L. Lu , Kingsum Chow

Inventor: Zeshan A. Chishti , Christopher B. Wilkerson , Seth Pugsley , Peng-Fei Chuang , Robert L. Scott , Aamer Jaleel , Shih-Lien L. Lu , Kingsum Chow

IPC classification: G06F9/38 , G06F12/08

CPC classification: G06F12/0862 , G06F9/3802 , G06F9/3808 , G06F9/383 , G06F2212/6026

Abstract: A processor includes a cache, a prefetcher module to select information according to a prefetcher algorithm, and a prefetcher algorithm selection module. The prefetcher algorithm selection module includes logic to select a candidate prefetcher algorithm, determine and store memory addresses of predicted memory accesses of the candidate prefetcher algorithm when performed by the prefetcher module, determine cache lines accessed during memory operations, and evaluate whether the determined cache lines match the stored memory addresses. The prefetcher algorithm selection module further includes logic to adjust an accuracy ratio of the candidate prefetcher algorithm, compare the accuracy ratio with a threshold accuracy ratio, and determine whether to apply the first candidate prefetcher algorithm to the prefetcher module.
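
The evaluation loop described in this abstract can be sketched in software as follows. This is a minimal sketch, not the patented logic: the function names, the 64-byte line size, the next-line candidate, and the 0.5 threshold are all assumptions made for illustration.

```python
from typing import Callable, Iterable, List, Set, Tuple

LINE_BYTES = 64  # assumed cache-line size


def next_line_prefetcher(addr: int) -> List[int]:
    """Hypothetical candidate algorithm: predict the next sequential line."""
    return [addr + LINE_BYTES]


def evaluate_candidate(
    candidate: Callable[[int], List[int]],
    accesses: Iterable[int],
    threshold: float = 0.5,  # assumed threshold accuracy ratio
) -> Tuple[float, bool]:
    """Shadow-run `candidate` over `accesses` and score its predictions.

    Returns (accuracy ratio, whether the candidate should be applied).
    """
    predicted: Set[int] = set()  # stored line numbers of predicted accesses
    hits = 0                     # predictions later matched by a real access
    issued = 0                   # distinct predictions recorded
    for addr in accesses:
        line = addr // LINE_BYTES
        if line in predicted:    # accessed line matches a stored prediction
            hits += 1
            predicted.discard(line)
        for target in candidate(addr):
            target_line = target // LINE_BYTES
            if target_line not in predicted:
                predicted.add(target_line)
                issued += 1
    accuracy = hits / issued if issued else 0.0
    return accuracy, accuracy >= threshold
```

With a sequential access stream the next-line candidate scores close to 1.0 and would be applied; with a stream that strides over two lines at a time it scores 0.0 and would be rejected.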

3.

Patent application
DISTRIBUTED MEMORY OPERATIONS [Pending (published)]

Publication No.: US20150089162A1

Publication date: 2015-03-26

Application No.: US14037468

Filing date: 2013-09-26

Applicant: Bushra Ahsan , Michael C. Adler , Neal C. Crago , Joel S. Emer , Aamer Jaleel , Angshuman Parashar , Michael I. Pellauer

Inventor: Bushra Ahsan , Michael C. Adler , Neal C. Crago , Joel S. Emer , Aamer Jaleel , Angshuman Parashar , Michael I. Pellauer

IPC classification: G06F3/06

CPC classification: G06F13/1663 , G06F12/0646 , G06F13/1652 , G06F13/1684 , G06F2213/00

Abstract: A technology for implementing a method for distributed memory operations. A method of the disclosure includes obtaining distributed channel information for an algorithm to be executed by a plurality of spatially distributed processing elements. For each distributed channel in the distributed channel information, the method further associates one or more of the plurality of spatially distributed processing elements with the distributed channel based on the algorithm.

4.

Granted patent
Thread scheduling on multiprocessor systems [Active]

Publication No.: US08839259B2

Publication date: 2014-09-16

Application No.: US13355611

Filing date: 2012-01-23

Applicant: Wenlong Li , Tao Wang , Aamer Jaleel , Yimin Zhang

Inventor: Wenlong Li , Tao Wang , Aamer Jaleel , Yimin Zhang

IPC classification: G06F9/46 , G06F9/38 , G06F9/50

CPC classification: G06F9/5044 , G06F9/3836 , G06F9/3851 , G06F9/3885 , G06F9/3891

Abstract: A thread scheduler may be used in a chip multiprocessor or symmetric multiprocessor system to schedule threads to processors. The scheduler may determine the bandwidth utilization of the two threads in combination and whether that utilization exceeds the threshold value. If so, the threads may be scheduled on different processor clusters that do not have the same paths between the common memory and the processors. If not, then the threads may be allocated on the same processor cluster that shares cache among processors.
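
The placement rule in this abstract reduces to a single threshold comparison. The sketch below assumes bandwidth utilization is expressed as a fraction of the peak bandwidth of one memory path; the function name and return labels are hypothetical.

```python
def place_threads(bw_a: float, bw_b: float, threshold: float) -> str:
    """Decide where to place two threads from their combined bandwidth use.

    bw_a, bw_b: each thread's measured bandwidth utilization (assumed here
    to be a fraction of one memory path's peak bandwidth).
    threshold: combined utilization above which a shared path is assumed
    to become the bottleneck.
    """
    combined = bw_a + bw_b
    if combined > threshold:
        # Too much traffic for one path: use clusters with separate paths.
        return "different_clusters"
    # Traffic fits: co-locate the threads so they can share a cache.
    return "same_cluster"
```

For example, `place_threads(0.5, 0.4, 0.8)` separates the threads, while `place_threads(0.2, 0.3, 0.8)` keeps them on one cluster.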

5.

Granted patent
Cache spill management techniques using cache spill prediction [Lapsed]

Publication No.: US08407421B2

Publication date: 2013-03-26

Application No.: US12639214

Filing date: 2009-12-16

Applicant: Simon C. Steely, Jr. , William C. Hasenplaugh , Aamer Jaleel , George Z. Chrysos

Inventor: Simon C. Steely, Jr. , William C. Hasenplaugh , Aamer Jaleel , George Z. Chrysos

IPC classification: G06F12/00

CPC classification: G06F12/0806 , G06F12/12

Abstract: An apparatus and method are described herein for intelligently spilling cache lines. Usefulness of cache lines previously spilled from a source cache is learned, such that later evictions of useful cache lines from a source cache are intelligently selected for spill. Furthermore, another learning mechanism, cache spill prediction, may be implemented separately or in conjunction with usefulness prediction. The cache spill prediction is capable of learning the effectiveness of remote caches at holding spilled cache lines for the source cache. As a result, cache lines are capable of being intelligently selected for spill and intelligently distributed among remote caches based on the effectiveness of each remote cache in holding spilled cache lines for the source cache.
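
One plausible way to learn "the effectiveness of each remote cache" is a small saturating counter per remote cache. The abstract does not say which mechanism is used, so the sketch below is an assumption throughout: the `SpillPredictor` class, the 2-bit counter range, and the tie-breaking rule are all hypothetical.

```python
from typing import Dict, Iterable, Optional


class SpillPredictor:
    """Per-remote-cache saturating counters scoring past spill outcomes."""

    def __init__(self, remote_ids: Iterable[str], max_count: int = 3) -> None:
        self.max_count = max_count
        # Start every remote cache at a weakly positive score of 1.
        self.counters: Dict[str, int] = {r: 1 for r in remote_ids}

    def record_outcome(self, remote: str, was_reused: bool) -> None:
        # Reward a remote cache whose spilled line was hit before eviction,
        # penalize one that evicted the spilled line unused.
        delta = 1 if was_reused else -1
        value = self.counters[remote] + delta
        self.counters[remote] = max(0, min(self.max_count, value))

    def choose_target(self) -> Optional[str]:
        # Spill to the best-scoring remote cache; ties go to the first one
        # registered. Return None (do not spill) when every score is zero.
        best = max(self.counters, key=self.counters.get)
        return best if self.counters[best] > 0 else None
```

A source cache would call `choose_target()` on eviction and `record_outcome()` when it learns whether the spilled line was reused.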

6.

Patent application
CACHE SPILL MANAGEMENT TECHNIQUES [Lapsed]

Publication No.: US20110145501A1

Publication date: 2011-06-16

Application No.: US12639214

Filing date: 2009-12-16

Applicant: Simon C. Steely, Jr. , William C. Hasenplaugh , Aamer Jaleel , George Z. Chrysos

Inventor: Simon C. Steely, Jr. , William C. Hasenplaugh , Aamer Jaleel , George Z. Chrysos

IPC classification: G06F12/08 , G06F12/00

CPC classification: G06F12/0806 , G06F12/12

Abstract: An apparatus and method are described herein for intelligently spilling cache lines. Usefulness of cache lines previously spilled from a source cache is learned, such that later evictions of useful cache lines from a source cache are intelligently selected for spill. Furthermore, another learning mechanism, cache spill prediction, may be implemented separately or in conjunction with usefulness prediction. The cache spill prediction is capable of learning the effectiveness of remote caches at holding spilled cache lines for the source cache. As a result, cache lines are capable of being intelligently selected for spill and intelligently distributed among remote caches based on the effectiveness of each remote cache in holding spilled cache lines for the source cache.

7.

Patent application
Thread scheduling on multiprocessor systems [Active]

Publication No.: US20080244587A1

Publication date: 2008-10-02

Application No.: US11728350

Filing date: 2007-03-26

Applicant: Wenlong Li , Tao Wang , Aamer Jaleel , Yimin Zhang

Inventor: Wenlong Li , Tao Wang , Aamer Jaleel , Yimin Zhang

IPC classification: G06F9/30 , G06F9/46

CPC classification: G06F9/5044 , G06F9/3836 , G06F9/3851 , G06F9/3885 , G06F9/3891

Abstract: A thread scheduler may be used in a chip multiprocessor or symmetric multiprocessor system to schedule threads to processors. The scheduler may determine the bandwidth utilization of the two threads in combination and whether that utilization exceeds the threshold value. If so, the threads may be scheduled on different processor clusters that do not have the same paths between the common memory and the processors. If not, then the threads may be allocated on the same processor cluster that shares cache among processors.

8.

Patent application
MANAGING SHARED CACHE BY MULTI-CORE PROCESSOR [Active]

Publication No.: US20150067259A1

Publication date: 2015-03-05

Application No.: US14013220

Filing date: 2013-08-29

Applicant: Ren Wang , Kevin B. Theobald , Zeshan A. Chishti , Zhaojuan Bian , Aamer Jaleel , Tsung-Yuan C. Tai

Inventor: Ren Wang , Kevin B. Theobald , Zeshan A. Chishti , Zhaojuan Bian , Aamer Jaleel , Tsung-Yuan C. Tai

IPC classification: G06F12/08

CPC classification: G06F12/0811 , G06F12/0895 , G06F2212/1028 , G06F2212/601 , Y02D10/13

Abstract: Systems and methods for managing shared cache by multi-core processor. An example processing system comprises: a plurality of processing cores, each processing core communicatively coupled to a last level cache (LLC) slice; and a cache control logic coupled to the plurality of processing cores, the cache control logic configured to perform one of: making an LLC slice of an inactive processing core available to an active processing core or power gating the LLC slice, based on estimating cache requirements by active processing cores.
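
The share-or-gate choice in this abstract can be sketched as a capacity check. The abstract does not specify how cache requirements are estimated or compared, so the rule below (compare total estimated need against the active cores' own slices) and the function name are assumptions.

```python
from typing import Sequence


def inactive_slice_action(estimated_need_mb: Sequence[float], slice_mb: float) -> str:
    """Choose what to do with the LLC slice of an inactive core.

    estimated_need_mb: estimated cache requirement of each active core, in MB.
    slice_mb: capacity of one LLC slice, in MB (assumed equal for all cores).
    """
    own_capacity = len(estimated_need_mb) * slice_mb
    if sum(estimated_need_mb) > own_capacity:
        # Active cores need more than their own slices: lend them the slice.
        return "share"
    # Active cores fit in their own slices: save energy by gating the slice.
    return "power_gate"
```

For two active cores with 2.5 MB slices, estimated needs of 3 MB and 4 MB lead to sharing, while needs of 1 MB and 1.5 MB lead to power gating.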

9.

Patent application
Processor Scheduling With Thread Performance Estimation On Core Of Different Type [Active]

Publication No.: US20140282565A1

Publication date: 2014-09-18

Application No.: US13843496

Filing date: 2013-03-15

Applicant: AAMER JALEEL , KENZO VAN CRAEYNEST , PAOLO NARVAEZ , JOEL EMER

Inventor: AAMER JALEEL , KENZO VAN CRAEYNEST , PAOLO NARVAEZ , JOEL EMER

IPC classification: G06F9/46

CPC classification: G06F9/5044 , G06F9/3836 , G06F9/3851 , G06F9/4881 , G06F11/30 , G06F11/3404 , G06F11/3419 , G06F11/3433 , G06F11/3452

Abstract: A processor is described having an out-of-order core to execute a first thread and a non-out-of-order core to execute a second thread. The processor also includes statistics collection circuitry to support calculation of the following: the first thread's performance on the out-of-order core; an estimate of the first thread's performance on the non-out-of-order core; the second thread's performance on the non-out-of-order core; an estimate of the second thread's performance on the out-of-order core.

10.

Patent application
THREAD SCHEDULING ON MULTIPROCESSOR SYSTEMS [Active]

Publication No.: US20120124587A1

Publication date: 2012-05-17

Application No.: US13355611

Filing date: 2012-01-23

Applicant: Wenlong Li , Tao Wang , Aamer Jaleel , Yimin Zhang

Inventor: Wenlong Li , Tao Wang , Aamer Jaleel , Yimin Zhang

IPC classification: G06F9/46 , G06F17/50

CPC classification: G06F9/5044 , G06F9/3836 , G06F9/3851 , G06F9/3885 , G06F9/3891

Abstract: A thread scheduler may be used in a chip multiprocessor or symmetric multiprocessor system to schedule threads to processors. The scheduler may determine the bandwidth utilization of the two threads in combination and whether that utilization exceeds the threshold value. If so, the threads may be scheduled on different processor clusters that do not have the same paths between the common memory and the processors. If not, then the threads may be allocated on the same processor cluster that shares cache among processors.
