Patent search: ap:("Gordon B. Bell" OR "Gordon T. Davis" OR "Jeffrey H. Derby" OR "Anil Krishna" OR "Srinivasan Ramani" OR "Ken Vu" OR "Steve Woolet") AND inv:"Srinivasan Ramani" (page 1)

1.

Granted invention patent
Prefetching with multiple processors and threads via a coherency bus (status: lapsed)

Publication No.: US08543767B2

Publication date: 2013-09-24

Application No.: US13488215

Filing date: 2012-06-04

Applicants: Gordon B. Bell, Gordon T. Davis, Jeffrey H. Derby, Anil Krishna, Srinivasan Ramani, Ken Vu, Steve Woolet

Inventors: Gordon B. Bell, Gordon T. Davis, Jeffrey H. Derby, Anil Krishna, Srinivasan Ramani, Ken Vu, Steve Woolet

IPC classification: G06F13/00

CPC classification: G06F12/0862, G06F12/0831, G06F2212/6026

Abstract: A processing system includes a memory and a first core configured to process applications. The first core includes a first cache. The processing system includes a mechanism configured to capture a sequence of addresses of the application that miss the first cache in the first core and to place the sequence of addresses in a storage array; and a second core configured to process at least one software algorithm. The at least one software algorithm utilizes the sequence of addresses from the storage array to generate a sequence of prefetch addresses. The second core issues prefetch requests for the sequence of the prefetch addresses to the memory to obtain prefetched data and the prefetched data is provided to the first core if requested.

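
The abstract does not say which software algorithm the second core runs. As a minimal sketch, assuming a simple constant-stride detector (a swapped-in technique for illustration, not the patented method), the path from captured miss addresses to prefetch addresses might look like the following. `MissLog`, `generate_prefetch_addresses`, and the `depth` parameter are hypothetical names.

```python
from collections import deque


class MissLog:
    """Bounded stand-in for the storage array of first-core cache-miss addresses."""

    def __init__(self, capacity=8):
        # Oldest entries are discarded automatically once capacity is reached.
        self.entries = deque(maxlen=capacity)

    def record(self, address):
        self.entries.append(address)


def generate_prefetch_addresses(miss_log, depth=2):
    """Return `depth` predicted addresses if the last three misses share one stride.

    Assumption: a constant-stride heuristic; the source does not name an algorithm.
    """
    recent = list(miss_log.entries)[-3:]
    if len(recent) < 3:
        return []
    stride = recent[1] - recent[0]
    if stride == 0 or recent[2] - recent[1] != stride:
        return []
    return [recent[2] + stride * i for i in range(1, depth + 1)]


# The "first core" misses on three consecutive 64-byte lines.
log = MissLog()
for addr in (0x1000, 0x1040, 0x1080):
    log.record(addr)

# The "second core" would issue prefetch requests for these addresses.
prefetches = generate_prefetch_addresses(log)
print([hex(a) for a in prefetches])
```

Running the prediction on a separate core keeps the heuristic in software, so it could be swapped without changing the capture mechanism.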

2.

Granted invention patent
Effective prefetching with multiple processors and threads (status: lapsed)

Publication No.: US08200905B2

Publication date: 2012-06-12

Application No.: US12192072

Filing date: 2008-08-14

Applicants: Gordon Bernard Bell, Gordon Taylor Davis, Jeffrey Haskell Derby, Anil Krishna, Srinivasan Ramani, Ken Vu, Steve Woolet

Inventors: Gordon Bernard Bell, Gordon Taylor Davis, Jeffrey Haskell Derby, Anil Krishna, Srinivasan Ramani, Ken Vu, Steve Woolet

IPC classification: G06F13/00

CPC classification: G06F12/0862, G06F12/0831, G06F2212/6026

Abstract: A processing system includes a memory and a first core configured to process applications. The first core includes a first cache. The processing system includes a mechanism configured to capture a sequence of addresses of the application that miss the first cache in the first core and to place the sequence of addresses in a storage array; and a second core configured to process at least one software algorithm. The at least one software algorithm utilizes the sequence of addresses from the storage array to generate a sequence of prefetch addresses. The second core issues prefetch requests for the sequence of the prefetch addresses to the memory to obtain prefetched data and the prefetched data is provided to the first core if requested.

3.

Invention application
EFFECTIVE PREFETCHING WITH MULTIPLE PROCESSORS AND THREADS (status: lapsed)

Publication No.: US20120246406A1

Publication date: 2012-09-27

Application No.: US13488215

Filing date: 2012-06-04

Applicants: Gordon Bernard Bell, Gordon Taylor Davis, Jeffrey Haskell Derby, Anil Krishna, Srinivasan Ramani, Ken Vu, Steve Woolet

Inventors: Gordon Bernard Bell, Gordon Taylor Davis, Jeffrey Haskell Derby, Anil Krishna, Srinivasan Ramani, Ken Vu, Steve Woolet

IPC classification: G06F12/08

CPC classification: G06F12/0862, G06F12/0831, G06F2212/6026

Abstract: A processing system includes a memory and a first core configured to process applications. The first core includes a first cache. The processing system includes a mechanism configured to capture a sequence of addresses of the application that miss the first cache in the first core and to place the sequence of addresses in a storage array; and a second core configured to process at least one software algorithm. The at least one software algorithm utilizes the sequence of addresses from the storage array to generate a sequence of prefetch addresses. The second core issues prefetch requests for the sequence of the prefetch addresses to the memory to obtain prefetched data and the prefetched data is provided to the first core if requested.

4.

Invention application
DATA REORGANIZATION IN NON-UNIFORM CACHE ACCESS CACHES (status: in force)

Publication No.: US20100274973A1

Publication date: 2010-10-28

Application No.: US12429754

Filing date: 2009-04-24

Applicants: Ganesh Balakrishnan, Gordon B. Bell, Anil Krishna, Srinivasan Ramani

Inventors: Ganesh Balakrishnan, Gordon B. Bell, Anil Krishna, Srinivasan Ramani

IPC classification: G06F12/08, G06F12/00

CPC classification: G06F12/0846, G06F12/0811

Abstract: Embodiments that dynamically reorganize data of cache lines in non-uniform cache access (NUCA) caches are contemplated. Various embodiments comprise a computing device, having one or more processors coupled with one or more NUCA cache elements. The NUCA cache elements may comprise one or more banks of cache memory, wherein ways of the cache are horizontally distributed across multiple banks. To improve access latency of the data by the processors, the computing devices may dynamically propagate cache lines into banks closer to the processors using the cache lines. To accomplish such dynamic reorganization, embodiments may maintain “direction” bits for cache lines. The direction bits may indicate to which processor the data should be moved. Further, embodiments may use the direction bits to make cache line movement decisions.

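
The abstract leaves the encoding of the direction bits open. The sketch below is one possible reading, under loudly stated assumptions: four banks in a row, processor "A" beside bank 0 and processor "B" beside bank 3, and a small signed saturating counter standing in for the direction bits. `CacheLine`, `access`, `NUM_BANKS`, and `THRESHOLD` are hypothetical names, not taken from the patent.

```python
from dataclasses import dataclass

NUM_BANKS = 4   # assumed layout: banks 0..3, CPU "A" next to bank 0, CPU "B" next to bank 3
THRESHOLD = 2   # assumed: net accesses from one side needed before a line migrates


@dataclass
class CacheLine:
    bank: int
    direction: int = 0  # signed saturating counter standing in for the "direction" bits


def access(line, cpu):
    """Update the direction counter and migrate the line one bank when it saturates."""
    step = -1 if cpu == "A" else 1
    line.direction = max(-THRESHOLD, min(THRESHOLD, line.direction + step))
    if abs(line.direction) == THRESHOLD:
        # Move one bank toward the processor that dominates recent accesses.
        line.bank = max(0, min(NUM_BANKS - 1, line.bank + step))
        line.direction = 0
    return line.bank


line = CacheLine(bank=2)
for cpu in ("A", "A", "B", "A", "A"):
    access(line, cpu)
print(line.bank, line.direction)
```

Moving one bank at a time, only after repeated accesses from the same side, keeps a line shared by both processors from bouncing back and forth on every access.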

5.

Granted invention patent
Data reorganization in non-uniform cache access caches (status: in force)

Publication No.: US08140758B2

Publication date: 2012-03-20

Application No.: US12429754

Filing date: 2009-04-24

Applicants: Ganesh Balakrishnan, Gordon B. Bell, Anil Krishna, Srinivasan Ramani

Inventors: Ganesh Balakrishnan, Gordon B. Bell, Anil Krishna, Srinivasan Ramani

IPC classification: G06F15/163

CPC classification: G06F12/0846, G06F12/0811

Abstract: Embodiments that dynamically reorganize data of cache lines in non-uniform cache access (NUCA) caches are contemplated. Various embodiments comprise a computing device, having one or more processors coupled with one or more NUCA cache elements. The NUCA cache elements may comprise one or more banks of cache memory, wherein ways of the cache are horizontally distributed across multiple banks. To improve access latency of the data by the processors, the computing devices may dynamically propagate cache lines into banks closer to the processors using the cache lines. To accomplish such dynamic reorganization, embodiments may maintain “direction” bits for cache lines. The direction bits may indicate to which processor the data should be moved. Further, embodiments may use the direction bits to make cache line movement decisions.

6.

Granted invention patent
Active flow management with hysteresis (status: lapsed)

Publication No.: US07453798B2

Publication date: 2008-11-18

Application No.: US10782617

Filing date: 2004-02-19

Applicants: Jeffrey P. Bradford, Gordon T. Davis, Dongming Hwang, Clark D. Jeffries, Srinivasan Ramani, Kartik Sudeep, Ken V. Vu

Inventors: Jeffrey P. Bradford, Gordon T. Davis, Dongming Hwang, Clark D. Jeffries, Srinivasan Ramani, Kartik Sudeep, Ken V. Vu

IPC classification: H04L12/00

CPC classification: H04L47/10, H04L47/29, H04L47/323

Abstract: The present invention provides for a computer network method and system that applies “hysteresis” to an active queue management algorithm. If a queue is at a level below a certain low threshold and a burst of packets arrives at a network node, then the probability of dropping the initial packets in the burst is recalculated, but the packets are not dropped. However, if the queue level crosses beyond a hysteresis threshold, then packets are discarded pursuant to a drop probability. Also, according to the present invention, queue level may be decreased until it becomes less than the hysteresis threshold, with packets dropped per the drop probability until the queue level decreases to at least a low threshold. In one embodiment, an adaptive algorithm is also provided to adjust the transmit probability for each flow together with hysteresis to increase the packet transmit rates to absorb bursty traffic.

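
The two-threshold behavior described in the abstract can be sketched as a small state machine. This is an illustrative reading, not the patented implementation: `HysteresisDropper`, `admit`, and the threshold values are hypothetical, and the drop probability is assumed to be computed elsewhere and passed in by the caller.

```python
import random


class HysteresisDropper:
    """Drop packets probabilistically only between crossing `hysteresis` and falling to `low`."""

    def __init__(self, low, hysteresis, rng=None):
        if low >= hysteresis:
            raise ValueError("low threshold must be below the hysteresis threshold")
        self.low = low
        self.hysteresis = hysteresis
        self.dropping = False
        # Seeded generator so the sketch is repeatable; an assumption, not from the source.
        self.rng = rng or random.Random(0)

    def admit(self, queue_level, drop_probability):
        """Return True to enqueue the packet, False to drop it."""
        if queue_level > self.hysteresis:
            self.dropping = True       # queue crossed the hysteresis threshold
        elif queue_level <= self.low:
            self.dropping = False      # queue drained back to the low threshold
        if not self.dropping:
            return True                # bursts are absorbed without drops
        return self.rng.random() >= drop_probability


dropper = HysteresisDropper(low=10, hysteresis=50)
# A drop probability of 1.0 makes the decisions deterministic for the demonstration.
decisions = [dropper.admit(level, 1.0) for level in (5, 30, 60, 30, 8, 30)]
print(decisions)
```

Note that a queue level of 30 is admitted on the way up but dropped on the way down: the decision depends on which threshold was crossed last, which is the hysteresis effect.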

7.

Granted invention patent
Store-to-load forwarding mechanism for processor runahead mode operation (status: lapsed)

Publication No.: US08639886B2

Publication date: 2014-01-28

Application No.: US12364984

Filing date: 2009-02-03

Applicants: Gordon Bell, Anil Krishna, Srinivasan Ramani

Inventors: Gordon Bell, Anil Krishna, Srinivasan Ramani

IPC classification: G06F12/08

CPC classification: G06F12/0875, G06F9/3826, G06F9/3834, G06F9/3857

Abstract: A system and method to optimize runahead operation for a processor without use of a separate explicit runahead cache structure. Rather than simply dropping store instructions in a processor runahead mode, store instructions write their results in an existing processor store queue, although store instructions are not allowed to update processor caches and system memory. Use of the store queue during runahead mode to hold store instruction results allows more recent runahead load instructions to search retired store queue entries in the store queue for matching addresses to utilize data from the retired, but still searchable, store instructions. Retired store instructions could be either runahead store instructions retired, or retired store instructions that executed before entering runahead mode.

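
A toy model can make the forwarding path concrete. The sketch below assumes a fixed-size store queue whose entries stay searchable until they are overwritten, and a dictionary standing in for caches and memory; `StoreQueue`, `store`, and `load` are hypothetical names, and real hardware timing (retirement, cache writeback) is deliberately simplified away.

```python
from collections import deque


class StoreQueue:
    """Toy store queue: entries remain searchable until pushed out by newer stores."""

    def __init__(self, capacity=4):
        self.entries = deque(maxlen=capacity)

    def store(self, address, value, memory, runahead):
        self.entries.append((address, value))
        if not runahead:
            # Only stores outside runahead mode are allowed to update memory.
            memory[address] = value

    def load(self, address, memory):
        # Search youngest-first so the most recent matching store wins.
        for entry_address, value in reversed(self.entries):
            if entry_address == address:
                return value
        return memory.get(address)


memory = {0x10: 1}
queue = StoreQueue()

# A runahead store writes only to the store queue...
queue.store(0x10, 2, memory, runahead=True)
# ...yet a later runahead load still sees its value via forwarding.
forwarded = queue.load(0x10, memory)
print(forwarded, memory[0x10])
```

The load observes the runahead store's value while memory keeps the architecturally correct one, which is the property that lets the design avoid a separate runahead cache.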

8.

Invention application
STORE-TO-LOAD FORWARDING MECHANISM FOR PROCESSOR RUNAHEAD MODE OPERATION (status: lapsed)

Publication No.: US20100199045A1

Publication date: 2010-08-05

Application No.: US12364984

Filing date: 2009-02-03

Applicants: Gordon Bell, Anil Krishna, Srinivasan Ramani

Inventors: Gordon Bell, Anil Krishna, Srinivasan Ramani

IPC classification: G06F12/08, G06F9/312

CPC classification: G06F12/0875, G06F9/3826, G06F9/3834, G06F9/3857

Abstract: A system and method to optimize runahead operation for a processor without use of a separate explicit runahead cache structure. Rather than simply dropping store instructions in a processor runahead mode, store instructions write their results in an existing processor store queue, although store instructions are not allowed to update processor caches and system memory. Use of the store queue during runahead mode to hold store instruction results allows more recent runahead load instructions to search retired store queue entries in the store queue for matching addresses to utilize data from the retired, but still searchable, store instructions. Retired store instructions could be either runahead store instructions retired, or retired store instructions that executed before entering runahead mode.

9.

Granted invention patent
Architectural level throughput based power modeling methodology and apparatus for pervasively clock-gated processor cores (status: in force)

Publication No.: US07818696B2

Publication date: 2010-10-19

Application No.: US11780712

Filing date: 2007-07-20

Applicants: Pradip Bose, Tejas S. Karkhanis, Srinivasan Ramani, Malcolm Scott Ware, Ken Vu

Inventors: Pradip Bose, Tejas S. Karkhanis, Srinivasan Ramani, Malcolm Scott Ware, Ken Vu

IPC classification: G06F17/50

CPC classification: G06F17/5022, G06F2217/78

Abstract: A method for estimating power dissipated by a processor core processing a workload includes analyzing a reference test case to generate a reference workload characteristic, analyzing an actual workload to generate an actual workload characteristic, performing a power analysis for the reference test case to establish a reference power dissipation value and estimating an actual workload power dissipation value responsive to the actual and reference workload characteristics and the reference power dissipation value.

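
The abstract does not give the estimation formula. As a hedged illustration only, the sketch below assumes that the workload characteristic is throughput in instructions per cycle (IPC) and that the clock-gated portion of power scales linearly with it; both are assumptions suggested by the title, not statements from the patent. `estimate_power` and its parameters are hypothetical.

```python
def estimate_power(ref_power_w, ref_ipc, actual_ipc, idle_power_w=0.0):
    """Scale the activity-dependent share of reference power by the throughput ratio.

    Assumed model: P_actual = P_idle + (P_ref - P_idle) * (IPC_actual / IPC_ref).
    """
    if ref_ipc <= 0:
        raise ValueError("reference IPC must be positive")
    active_power_w = ref_power_w - idle_power_w
    return idle_power_w + active_power_w * (actual_ipc / ref_ipc)


# Reference test case: 20 W at IPC 2.0, of which 4 W is assumed to be idle power.
estimate = estimate_power(ref_power_w=20.0, ref_ipc=2.0, actual_ipc=1.0, idle_power_w=4.0)
print(estimate)  # 12.0
```

Under this assumed model, the expensive power analysis is run once for the reference test case, and other workloads are estimated from their throughput alone.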

10.

Granted invention patent
Architectural level throughput based power modeling methodology and apparatus for pervasively clock-gated processor cores (status: in force)

Publication No.: US07249331B2

Publication date: 2007-07-24

Application No.: US10960730

Filing date: 2004-10-07

Applicants: Pradip Bose, Tejas S. Karkhanis, Srinivasan Ramani, Malcolm Scott Ware, Ken Vu

Inventors: Pradip Bose, Tejas S. Karkhanis, Srinivasan Ramani, Malcolm Scott Ware, Ken Vu

IPC classification: G06F17/50

CPC classification: G06F17/5022, G06F2217/78

Abstract: A method for estimating power dissipated by a processor core processing a workload includes analyzing a reference test case to generate a reference workload characteristic, analyzing an actual workload to generate an actual workload characteristic, performing a power analysis for the reference test case to establish a reference power dissipation value and estimating an actual workload power dissipation value responsive to the actual and reference workload characteristics and the reference power dissipation value.
