Patent search: ap:("Patrick Joseph Bohrer" OR "Orran Yaakov Krieger" OR "Ramakrishnan Rajamony" OR "Michael Rosenfield" OR "Hazim Shafi" OR "Balaram Sinharoy" OR "Robert Brett Tremaine") AND inv:"Balaram Sinharoy" (page 2)

11.

Granted patent
Remote asynchronous data mover (status: lapsed)

Publication No.: US07996564B2

Publication date: 2011-08-09

Application No.: US12425093

Filing date: 2009-04-16

Applicants: Lakshminarayana B. Arimilli, Ravi K. Arimilli, Ronald N. Kalla, Ramakrishnan Rajamony, Balaram Sinharoy, William E. Speight, William J. Starke

Inventors: Lakshminarayana B. Arimilli, Ravi K. Arimilli, Ronald N. Kalla, Ramakrishnan Rajamony, Balaram Sinharoy, William E. Speight, William J. Starke

IPC classification: G06F12/00

CPC classification: G06F9/54, G06F12/10, G06F12/1081

Abstract: A distributed data processing system executes multiple tasks within a parallel job, including a first local task on a local node and at least one task executing on a remote node, with a remote memory having real address (RA) locations mapped to one or more of the source effective addresses (EA) and destination EA of a data move operation initiated by a task executing on the local node. On initiation of the data move operation, remote asynchronous data move (RADM) logic identifies that the operation moves data to/from a first EA that is memory mapped to an RA of the remote memory. The local processor/RADM logic initiates a RADM operation that moves a copy of the data directly from/to the first remote memory by completing the RADM operation using the network interface cards (NICs) of the source and destination processing nodes, determined by accessing a data center for the node IDs of remote memory.

12.

Patent application
Remote Asynchronous Data Mover (status: lapsed)

Publication No.: US20100268788A1

Publication date: 2010-10-21

Application No.: US12425093

Filing date: 2009-04-16

Applicants: Lakshminarayana B. Arimilli, Ravi K. Arimilli, Ronald N. Kalla, Ramakrishnan Rajamony, Balaram Sinharoy, William E. Speight, William J. Starke

Inventors: Lakshminarayana B. Arimilli, Ravi K. Arimilli, Ronald N. Kalla, Ramakrishnan Rajamony, Balaram Sinharoy, William E. Speight, William J. Starke

IPC classification: G06F15/167, G06F9/46, G06F12/00, G06F12/10

CPC classification: G06F9/54, G06F12/10, G06F12/1081

Abstract: A distributed data processing system executes multiple tasks within a parallel job, including a first local task on a local node and at least one task executing on a remote node, with a remote memory having real address (RA) locations mapped to one or more of the source effective addresses (EA) and destination EA of a data move operation initiated by a task executing on the local node. On initiation of the data move operation, remote asynchronous data move (RADM) logic identifies that the operation moves data to/from a first EA that is memory mapped to an RA of the remote memory. The local processor/RADM logic initiates a RADM operation that moves a copy of the data directly from/to the first remote memory by completing the RADM operation using the network interface cards (NICs) of the source and destination processing nodes, determined by accessing a data center for the node IDs of remote memory.

13.

Patent application
Data replication in multiprocessor NUCA systems to reduce horizontal cache thrashing (status: lapsed)

Publication No.: US20060080506A1

Publication date: 2006-04-13

Application No.: US10960611

Filing date: 2004-10-07

Applicants: Ramakrishnan Rajamony, Xiaowei Shen, Balaram Sinharoy

Inventors: Ramakrishnan Rajamony, Xiaowei Shen, Balaram Sinharoy

IPC classification: G06F12/00

CPC classification: G06F12/0846, G06F12/0813, G06F12/084, G06F12/122, G06F2212/271

Abstract: A method of managing a distributed cache structure having separate cache banks, by detecting that a given cache line has been repeatedly accessed by two or more processors which share the cache, and replicating that cache line in at least two separate cache banks. The cache line is optimally replicated in a cache bank having the lowest latency with respect to the given accessing processor. A currently accessed line in a different cache bank can be exchanged with a cache line in the cache bank with the lowest latency, and another line in the cache bank with lowest latency is moved to the different cache bank prior to the currently accessed line being moved to the cache bank with the lowest latency. Further replication of the cache line can be disabled when two or more processors alternately write to the cache line.
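Editorial illustration (not part of the patent record): the replication policy in this abstract can be pictured with a toy software model. The sketch below is only a rough reading of the abstract, not the patented hardware design. The class `NucaCache`, the `latency` table, and the `threshold` parameter are hypothetical names introduced here, and the line-exchange step between banks is not modeled.

```python
from collections import defaultdict


class NucaCache:
    """Toy model of a banked cache shared by several processors (hypothetical)."""

    def __init__(self, latency, threshold=2):
        # latency[cpu][bank]: assumed access latency from a CPU to a bank.
        self.latency = latency
        # Reads per CPU before a line counts as "repeatedly accessed" (assumption).
        self.threshold = threshold
        self.banks = defaultdict(set)        # bank id -> line addresses held
        self.reads = defaultdict(lambda: defaultdict(int))  # line -> cpu -> reads
        self.last_writer = {}                # line -> CPU that wrote most recently
        self.no_replicate = set()            # lines with replication disabled

    def nearest_bank(self, cpu):
        # Bank with the lowest latency with respect to this CPU.
        row = self.latency[cpu]
        return min(range(len(row)), key=row.__getitem__)

    def install(self, line, bank):
        self.banks[bank].add(line)

    def read(self, cpu, line):
        self.reads[line][cpu] += 1
        sharers = [c for c, n in self.reads[line].items() if n >= self.threshold]
        # Replicate once two or more CPUs have repeatedly accessed the line.
        if len(sharers) >= 2 and line not in self.no_replicate:
            for c in sharers:
                self.banks[self.nearest_bank(c)].add(line)

    def write(self, cpu, line):
        prev = self.last_writer.get(line)
        if prev is not None and prev != cpu:
            # Different CPUs are alternately writing: disable further replication.
            self.no_replicate.add(line)
        self.last_writer[line] = cpu

    def holders(self, line):
        return sorted(b for b, lines in self.banks.items() if line in lines)


# Two CPUs, two banks; each CPU is closest to its own bank.
latency = [[1, 5], [5, 1]]
cache = NucaCache(latency)
cache.install(0x40, bank=0)
for cpu in (0, 0, 1, 1):
    cache.read(cpu, 0x40)
print(cache.holders(0x40))
```

In this model the line ends up in both banks once each CPU has read it twice, while a line that two CPUs write in alternation stays where it was installed.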

14.

Granted patent
Specifying an access hint for prefetching partial cache block data in a cache hierarchy (status: lapsed)

Publication No.: US08140759B2

Publication date: 2012-03-20

Application No.: US12424716

Filing date: 2009-04-16

Applicants: Bradly George Frey, Guy Lynn Guthrie, Cathy May, Ramakrishnan Rajamony, Balaram Sinharoy, William John Starke, Peter Kenneth Szwed

Inventors: Bradly George Frey, Guy Lynn Guthrie, Cathy May, Ramakrishnan Rajamony, Balaram Sinharoy, William John Starke, Peter Kenneth Szwed

IPC classification: G06F13/00

CPC classification: G06F12/0862, G06F12/0811, G06F12/0817, G06F2212/6028

Abstract: A system and method for specifying an access hint for prefetching only a subsection of cache block data, for more efficient system interconnect usage by the processor core. A processing unit receives a data cache block touch (DCBT) instruction containing an access hint and identifying a specific size portion of data to be prefetched. Both the access hint and a value corresponding to an amount of data to be prefetched are contained in separate subfields of the DCBT instruction. In response to detecting that the code point is set to a specific value, only the specific size of data identified in a sub-field of the DCBT and addressed in the DCBT instruction is prefetched into an entry in the lower level cache.

15.

Patent application
SPECIFYING AN ACCESS HINT FOR PREFETCHING PARTIAL CACHE BLOCK DATA IN A CACHE HIERARCHY (status: lapsed)

Publication No.: US20100268886A1

Publication date: 2010-10-21

Application No.: US12424716

Filing date: 2009-04-16

Applicants: Bradly George Frey, Guy Lynn Guthrie, Cathy May, Ramakrishnan Rajamony, Balaram Sinharoy, William John Starke, Peter Kenneth Szwed

Inventors: Bradly George Frey, Guy Lynn Guthrie, Cathy May, Ramakrishnan Rajamony, Balaram Sinharoy, William John Starke, Peter Kenneth Szwed

IPC classification: G06F12/08, G06F12/00

CPC classification: G06F12/0862, G06F12/0811, G06F12/0817, G06F2212/6028

Abstract: A system and method for specifying an access hint for prefetching only a subsection of cache block data, for more efficient system interconnect usage by the processor core. A processing unit receives a data cache block touch (DCBT) instruction containing an access hint and identifying a specific size portion of data to be prefetched. Both the access hint and a value corresponding to an amount of data to be prefetched are contained in separate subfields of the DCBT instruction. In response to detecting that the code point is set to a specific value, only the specific size of data identified in a sub-field of the DCBT and addressed in the DCBT instruction is prefetched into an entry in the lower level cache.

16.

Granted patent
Data replication in multiprocessor NUCA systems to reduce horizontal cache thrashing (status: lapsed)

Publication No.: US07287122B2

Publication date: 2007-10-23

Application No.: US10960611

Filing date: 2004-10-07

Applicants: Ramakrishnan Rajamony, Xiaowei Shen, Balaram Sinharoy

Inventors: Ramakrishnan Rajamony, Xiaowei Shen, Balaram Sinharoy

IPC classification: G06F15/163, G06F13/36

CPC classification: G06F12/0846, G06F12/0813, G06F12/084, G06F12/122, G06F2212/271

Abstract: A method of managing a distributed cache structure having separate cache banks, by detecting that a given cache line has been repeatedly accessed by two or more processors which share the cache, and replicating that cache line in at least two separate cache banks. The cache line is optimally replicated in a cache bank having the lowest latency with respect to the given accessing processor. A currently accessed line in a different cache bank can be exchanged with a cache line in the cache bank with the lowest latency, and another line in the cache bank with lowest latency is moved to the different cache bank prior to the currently accessed line being moved to the cache bank with the lowest latency. Further replication of the cache line can be disabled when two or more processors alternately write to the cache line.

17.

Granted patent
Thread partitioning in a multi-core environment (status: in force)

Publication No.: US08707016B2

Publication date: 2014-04-22

Application No.: US12024211

Filing date: 2008-02-01

Applicants: Ravi K. Arimilli, Juan C. Rubio, Balaram Sinharoy

Inventors: Ravi K. Arimilli, Juan C. Rubio, Balaram Sinharoy

IPC classification: G06F9/30

CPC classification: G06F9/4843, G06F9/3851

Abstract: A set of helper thread binaries is created to retrieve data used by a set of main thread binaries. The set of helper thread binaries and the set of main thread binaries are partitioned according to common instruction boundaries. As a first partition in the set of main thread binaries executes within a first core, a second partition in the set of helper thread binaries executes within a second core, thus “warming up” the cache in the second core. When the first partition of the main completes execution, a second partition of the main core moves to the second core, and executes using the warmed up cache in the second core.

18.

Granted patent
Hardware assist thread for dynamic performance profiling (status: lapsed)

Publication No.: US08612730B2

Publication date: 2013-12-17

Application No.: US12796124

Filing date: 2010-06-08

Applicants: Ronald P. Hall, Venkat Rajeev Indukuru, Alexander Erik Mericas, Balaram Sinharoy, Zhong Liang Wang

Inventors: Ronald P. Hall, Venkat Rajeev Indukuru, Alexander Erik Mericas, Balaram Sinharoy, Zhong Liang Wang

IPC classification: G06F9/00

CPC classification: G06F9/3851, G06F9/3009, G06F9/327, G06F11/3466, G06F2201/865, G06F2201/88

Abstract: A method and data processing system for managing running of instructions in a program. A processor of the data processing system receives a monitoring instruction of a monitoring unit. The processor determines if at least one secondary thread of a set of secondary threads is available for use as an assist thread. The processor selects the at least one secondary thread from the set of secondary threads to become the assist thread in response to a determination that the at least one secondary thread of the set of secondary threads is available for use as an assist thread. The processor changes profiling of running of instructions in the program from the main thread to the assist thread.

19.

Granted patent
Speculative popcount data creation (status: in force)

Publication No.: US08387065B2

Publication date: 2013-02-26

Application No.: US12425343

Filing date: 2009-04-16

Applicants: Ravi K. Arimilli, Ronald N. Kalla, Balaram Sinharoy

Inventors: Ravi K. Arimilli, Ronald N. Kalla, Balaram Sinharoy

IPC classification: G06F9/46, G06F9/45, G06F9/30, G06F9/40

CPC classification: G06F9/3001, G06F9/30018, G06F9/3842

Abstract: A method and a data processing system by which population count (popcount) operations are efficiently performed without incurring the latency and loss of critical processing cycles and bandwidth of real time processing. The method comprises: identifying data to be stored to memory for which a popcount may need to be determined; speculatively performing a popcount operation on the data as a background process of the processor while the data is being stored to memory; storing the data to a first memory location; and storing a value of the popcount generated by the popcount operation within a second memory location. The method further comprises: determining a size of data; determining a granular level at which the popcount operation on the data will be performed; and reserving a size of said second memory location that is sufficiently large to hold the value of the popcount.
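Editorial illustration (not part of the patent record): a minimal software sketch of the idea in this abstract, assuming a dictionary stands in for memory. The function names, the `granule` parameter, and the two dictionaries are hypothetical. The popcount is computed eagerly at store time here, whereas the abstract describes a background process in the processor.

```python
def popcount_field_bits(granule_bytes):
    """Bits needed to hold the largest popcount of one granule (0..8*granule_bytes)."""
    return (8 * granule_bytes).bit_length()


def store_with_popcount(memory, popcounts, addr, data, granule=8):
    """Store `data` at `addr` and keep per-granule popcounts in a second location."""
    memory[addr] = data                       # first memory location: the data
    counts = [
        sum(bin(byte).count("1") for byte in data[i:i + granule])
        for i in range(0, len(data), granule)
    ]
    popcounts[addr] = counts                  # second memory location: the popcounts
    return counts


def popcount(memory, popcounts, addr):
    """Return the popcount, using the precomputed value when one exists."""
    if addr in popcounts:
        return sum(popcounts[addr])
    return sum(bin(byte).count("1") for byte in memory[addr])


memory, popcounts = {}, {}
counts = store_with_popcount(memory, popcounts, 0x1000, bytes([0xFF, 0x0F, 0x01]), granule=2)
total = popcount(memory, popcounts, 0x1000)
```

A later popcount request on the same address is then a lookup plus a short sum rather than a pass over the data, which is the latency saving the abstract points at.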

20.

Granted patent
Helper thread for pre-fetching data (status: lapsed)

Publication No.: US08359589B2

Publication date: 2013-01-22

Application No.: US12024191

Filing date: 2008-02-01

Applicants: Ravi K. Arimilli, Juan C. Rubio, Balaram Sinharoy

Inventors: Ravi K. Arimilli, Juan C. Rubio, Balaram Sinharoy

IPC classification: G06F9/44, G06F9/45, G06F15/167, G06F9/30, G06F9/46

CPC classification: G06F8/41, G06F9/383, G06F9/3851

Abstract: A set of helper thread binaries is created to retrieve data used by a set of main thread binaries. If executing a portion of the set of helper thread binaries results in the retrieval of data needed by the set of main thread binaries, then that retrieved data is utilized by the set of main thread binaries.
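Editorial illustration (not part of the patent record): a loose software analogy for the helper-thread idea, assuming a shared dictionary plays the role of the cache. The patent concerns compiled thread binaries and hardware caches; `slow_load`, `helper`, and `main_thread` are hypothetical names used only for this sketch.

```python
import threading


def slow_load(key):
    # Stand-in for an expensive memory access (assumption for illustration).
    return key * key


def helper(keys, cache, lock):
    # Helper thread: retrieve data the main thread is expected to need.
    for k in keys:
        value = slow_load(k)
        with lock:
            cache.setdefault(k, value)


def main_thread(keys, cache, lock):
    # Main thread: use prefetched data when it is there, otherwise load it.
    total, hits = 0, 0
    for k in keys:
        with lock:
            value = cache.get(k)
        if value is None:
            value = slow_load(k)
        else:
            hits += 1
        total += value
    return total, hits


keys = list(range(100))
cache, lock = {}, threading.Lock()
worker = threading.Thread(target=helper, args=(keys, cache, lock))
worker.start()
total, hits = main_thread(keys, cache, lock)
worker.join()
```

The result is the same whether or not the helper got ahead of the main thread; only the number of hits varies from run to run, which mirrors the abstract's "if ... then that retrieved data is utilized" wording.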
