Patent search: ap:("Ralph E. Bellofatto" OR "Paul W. Coteus" OR "Paul G. Crumley" OR "Alan G. Gara" OR "Mark E. Giampapa" OR "Thomas M. Gooding" OR "Rudolf A. Haring" OR "Mark G. Megerian" OR "Martin Ohmacht" OR "Don D. Reed" OR "Richard A. Swetz" OR "Todd Takken") AND inv:"Alan G. Gara" (page 1)

1.

Granted patent
Power throttling of collections of computing elements (status: lapsed)

Publication number: US08001401B2

Publication date: 2011-08-16

Application number: US11768752

Filing date: 2007-06-26

Applicants: Ralph E. Bellofatto, Paul W. Coteus, Paul G. Crumley, Alan G. Gara, Mark E. Giampapa, Thomas M. Gooding, Rudolf A. Haring, Mark G. Megerian, Martin Ohmacht, Don D. Reed, Richard A. Swetz, Todd Takken

Inventors: Ralph E. Bellofatto, Paul W. Coteus, Paul G. Crumley, Alan G. Gara, Mark E. Giampapa, Thomas M. Gooding, Rudolf A. Haring, Mark G. Megerian, Martin Ohmacht, Don D. Reed, Richard A. Swetz, Todd Takken

IPC classification: G06F1/26

CPC classification: G06F1/3203, G06F1/206

Abstract: An apparatus and method for controlling power usage in a computer includes a plurality of computers communicating with a local control device, and a power source supplying power to the local control device and the computer. A plurality of sensors communicate with the computer for ascertaining power usage of the computer, and a system control device communicates with the computer for controlling power usage of the computer.

2.

Patent application
POWER THROTTLING OF COLLECTIONS OF COMPUTING ELEMENTS (status: lapsed)

Publication number: US20090006873A1

Publication date: 2009-01-01

Application number: US11768752

Filing date: 2007-06-26

Applicants: Ralph E. Bellofatto, Paul W. Coteus, Paul G. Crumley, Alan G. Gara, Mark E. Giampapa, Thomas M. Gooding, Rudolf Haring, Mark G. Megerian, Martin Ohmacht, Don D. Reed, Richard A. Swetz, Todd Takken

Inventors: Ralph E. Bellofatto, Paul W. Coteus, Paul G. Crumley, Alan G. Gara, Mark E. Giampapa, Thomas M. Gooding, Rudolf Haring, Mark G. Megerian, Martin Ohmacht, Don D. Reed, Richard A. Swetz, Todd Takken

IPC classification: G06F1/26

CPC classification: G06F1/3203, G06F1/206

Abstract: An apparatus and method for controlling power usage in a computer includes a plurality of computers communicating with a local control device, and a power source supplying power to the local control device and the computer. A plurality of sensors communicate with the computer for ascertaining power usage of the computer, and a system control device communicates with the computer for controlling power usage of the computer.

3.

Granted patent
Method and apparatus to debug an integrated circuit chip via synchronous clock stop and scan (status: lapsed)

Publication number: US08140925B2

Publication date: 2012-03-20

Application number: US11768791

Filing date: 2007-06-26

Applicants: Ralph E. Bellofatto, Matthew R. Ellavsky, Alan G. Gara, Mark E. Giampapa, Thomas M. Gooding, Rudolf A. Haring, Lance G. Hehenberger, Martin Ohmacht

Inventors: Ralph E. Bellofatto, Matthew R. Ellavsky, Alan G. Gara, Mark E. Giampapa, Thomas M. Gooding, Rudolf A. Haring, Lance G. Hehenberger, Martin Ohmacht

IPC classification: G01R31/28, G06F1/12

CPC classification: G06F11/2236

Abstract: An apparatus and method for evaluating a state of an electronic or integrated circuit (IC), each IC including one or more processor elements for controlling operations of IC sub-units, and each IC supporting multiple frequency clock domains. The method comprises: generating a synchronized set of enable signals in correspondence with one or more IC sub-units for starting operation of one or more IC sub-units according to a determined timing configuration; counting, in response to one signal of the synchronized set of enable signals, a number of main processor IC clock cycles; and, upon attaining a desired clock cycle number, generating a stop signal for each unique frequency clock domain to synchronously stop a functional clock for each respective frequency clock domain; and, upon synchronously stopping all on-chip functional clocks on all frequency clock domains in a deterministic fashion, scanning out data values at a desired IC chip state. The apparatus and methodology enable construction of a cycle-by-cycle view of any part of the state of a running IC chip, using a combination of on-chip circuitry and software.
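The steps in this abstract (synchronized enable, main-clock cycle count, per-domain stop, scan-out) can be illustrated with a toy software model. This is only a sketch under stated assumptions, not the patented circuitry: the `ClockDomain` class, the integer `divider` used to model each frequency domain, and the single integer `state` standing in for scannable latches are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ClockDomain:
    """Toy clock domain that ticks once every `divider` main-clock cycles (assumption)."""
    name: str
    divider: int
    state: int = 0          # stand-in for the latches a scan chain would read
    running: bool = False

    def tick(self, main_cycle: int) -> None:
        # Advance state only while enabled and only on this domain's own clock edge.
        if self.running and main_cycle % self.divider == 0:
            self.state += 1


def run_and_scan(dividers: dict, stop_at: int) -> dict:
    """Enable all domains together, stop them at main-clock cycle `stop_at`, scan out."""
    domains = [ClockDomain(name, div) for name, div in dividers.items()]
    for dom in domains:                      # synchronized set of enable signals
        dom.running = True
    for cycle in range(1, stop_at + 1):      # count main-clock cycles
        for dom in domains:
            dom.tick(cycle)
    for dom in domains:                      # one stop signal per clock domain
        dom.running = False
    return {dom.name: dom.state for dom in domains}   # "scan out" the stopped state


def cycle_by_cycle(dividers: dict, last_cycle: int) -> list:
    """Re-run from the start with stop counts 1..last_cycle to build a per-cycle view."""
    return [run_and_scan(dividers, n) for n in range(1, last_cycle + 1)]
```

Because each run is deterministic, repeating it with a stop count one cycle larger each time yields consecutive snapshots, which is the sense in which the model builds a cycle-by-cycle view.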

4.

Patent application
METHOD AND APPARATUS TO DEBUG AN INTEGRATED CIRCUIT CHIP VIA SYNCHRONOUS CLOCK STOP AND SCAN (status: lapsed)

Publication number: US20090006894A1

Publication date: 2009-01-01

Application number: US11768791

Filing date: 2007-06-26

Applicants: Ralph E. Bellofatto, Matthew R. Ellavsky, Alan G. Gara, Mark E. Giampapa, Thomas M. Gooding, Rudolf A. Haring, Lance G. Hehenberger, Martin Ohmacht

Inventors: Ralph E. Bellofatto, Matthew R. Ellavsky, Alan G. Gara, Mark E. Giampapa, Thomas M. Gooding, Rudolf A. Haring, Lance G. Hehenberger, Martin Ohmacht

IPC classification: G06F11/00

CPC classification: G06F11/2236

Abstract: An apparatus and method for evaluating a state of an electronic or integrated circuit (IC), each IC including one or more processor elements for controlling operations of IC sub-units, and each IC supporting multiple frequency clock domains. The method comprises: generating a synchronized set of enable signals in correspondence with one or more IC sub-units for starting operation of one or more IC sub-units according to a determined timing configuration; counting, in response to one signal of the synchronized set of enable signals, a number of main processor IC clock cycles; and, upon attaining a desired clock cycle number, generating a stop signal for each unique frequency clock domain to synchronously stop a functional clock for each respective frequency clock domain; and, upon synchronously stopping all on-chip functional clocks on all frequency clock domains in a deterministic fashion, scanning out data values at a desired IC chip state. The apparatus and methodology enable construction of a cycle-by-cycle view of any part of the state of a running IC chip, using a combination of on-chip circuitry and software.

5.

Granted patent
Ultrascalable petaflop parallel supercomputer (status: lapsed)

Publication number: US07761687B2

Publication date: 2010-07-20

Application number: US11768905

Filing date: 2007-06-26

Applicants: Matthias A. Blumrich, Dong Chen, George Chiu, Thomas M. Cipolla, Paul W. Coteus, Alan G. Gara, Mark E. Giampapa, Shawn Hall, Rudolf A. Haring, Philip Heidelberger, Gerard V. Kopcsay, Martin Ohmacht, Valentina Salapura, Krishnan Sugavanam, Todd Takken

Inventors: Matthias A. Blumrich, Dong Chen, George Chiu, Thomas M. Cipolla, Paul W. Coteus, Alan G. Gara, Mark E. Giampapa, Shawn Hall, Rudolf A. Haring, Philip Heidelberger, Gerard V. Kopcsay, Martin Ohmacht, Valentina Salapura, Krishnan Sugavanam, Todd Takken

IPC classification: G06F15/173

CPC classification: G06F15/17337

Abstract: A massively parallel supercomputer of petaOPS-scale includes node architectures based upon System-On-a-Chip technology, where each processing node comprises a single Application Specific Integrated Circuit (ASIC) having up to four processing elements. The ASIC nodes are interconnected by multiple independent networks that optimally maximize the throughput of packet communications between nodes with minimal latency. The multiple networks may include three high-speed networks for parallel algorithm message passing including a Torus, collective network, and a Global Asynchronous network that provides global barrier and notification functions. These multiple independent networks may be collaboratively or independently utilized according to the needs or phases of an algorithm for optimizing algorithm processing performance. The use of a DMA engine is provided to facilitate message passing among the nodes without the expenditure of processing resources at the node.
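The abstract names a Torus as one of the message-passing networks but gives no dimensions or routing details. As a generic illustration of torus addressing only, the sketch below computes a node's nearest neighbours and the minimal hop count between two nodes. The three-dimensional 4 x 4 x 4 shape in the usage line and the function names `torus_neighbors` and `torus_hops` are assumptions for the example, not values taken from the patent.

```python
def torus_neighbors(coord, dims):
    """Return the nearest neighbours of `coord` on a torus with wraparound links."""
    neighbors = []
    for axis, size in enumerate(dims):
        for step in (-1, 1):
            nxt = list(coord)
            nxt[axis] = (coord[axis] + step) % size   # wrap around at the edges
            neighbors.append(tuple(nxt))
    return neighbors


def torus_hops(a, b, dims):
    """Minimal hop count between nodes a and b, taking the shorter way around each ring."""
    return sum(min((x - y) % n, (y - x) % n) for x, y, n in zip(a, b, dims))


# Hypothetical 4 x 4 x 4 torus: the corner node still has six neighbours.
corner_links = torus_neighbors((0, 0, 0), (4, 4, 4))
```

The wraparound links are what distinguish a torus from a plain mesh: edge nodes have the same number of neighbours as interior ones, and the worst-case distance along each axis is half the ring size.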

6.

Patent application
ULTRASCALABLE PETAFLOP PARALLEL SUPERCOMPUTER (status: lapsed)

Publication number: US20090006808A1

Publication date: 2009-01-01

Application number: US11768905

Filing date: 2007-06-26

Applicants: Matthias A. Blumrich, Dong Chen, George Chiu, Thomas M. Cipolla, Paul W. Coteus, Alan G. Gara, Mark E. Giampapa, Shawn Hall, Rudolf A. Haring, Philip Heidelberger, Gerard V. Kopcsay, Martin Ohmacht, Valentina Salapura, Krishnan Sugavanam, Todd Takken

Inventors: Matthias A. Blumrich, Dong Chen, George Chiu, Thomas M. Cipolla, Paul W. Coteus, Alan G. Gara, Mark E. Giampapa, Shawn Hall, Rudolf A. Haring, Philip Heidelberger, Gerard V. Kopcsay, Martin Ohmacht, Valentina Salapura, Krishnan Sugavanam, Todd Takken

IPC classification: G06F15/80, G06F9/06

CPC classification: G06F15/17337

Abstract: A novel massively parallel supercomputer of petaOPS-scale includes node architectures based upon System-On-a-Chip technology, where each processing node comprises a single Application Specific Integrated Circuit (ASIC) having up to four processing elements. The ASIC nodes are interconnected by multiple independent networks that optimally maximize the throughput of packet communications between nodes with minimal latency. The multiple networks may include three high-speed networks for parallel algorithm message passing including a Torus, collective network, and a Global Asynchronous network that provides global barrier and notification functions. These multiple independent networks may be collaboratively or independently utilized according to the needs or phases of an algorithm for optimizing algorithm processing performance. Novel use of a DMA engine is provided to facilitate message passing among the nodes without the expenditure of processing resources at the node.

7.

Granted patent
Configurable memory system and method for providing atomic counting operations in a memory device (status: in force)

Publication number: US07797503B2

Publication date: 2010-09-14

Application number: US11768812

Filing date: 2007-06-26

Applicants: Ralph E. Bellofatto, Alan G. Gara, Mark E. Giampapa, Martin Ohmacht

Inventors: Ralph E. Bellofatto, Alan G. Gara, Mark E. Giampapa, Martin Ohmacht

IPC classification: G06F12/00

CPC classification: G06F12/10, G06F9/3004, G06F9/30185, G06F9/3834, G06F9/3851, G06F9/3861, G06F12/0292, G06F12/1027, G06F2212/1044, G06F2212/206

Abstract: A memory system and method for providing atomic memory-based counter operations to operating systems and applications that make most efficient use of counter-backing memory and virtual and physical address space, while simplifying operating system memory management, and enabling the counter-backing memory to be used for purposes other than counter-backing storage when desired. The encoding and address decoding enabled by the invention provides all this functionality through a combination of software and hardware.

8.

Patent application
CONFIGURABLE MEMORY SYSTEM AND METHOD FOR PROVIDING ATOMIC COUNTING OPERATIONS IN A MEMORY DEVICE (status: in force)

Publication number: US20090006800A1

Publication date: 2009-01-01

Application number: US11768812

Filing date: 2007-06-26

Applicants: Ralph E. Bellofatto, Alan G. Gara, Mark E. Giampapa, Martin Ohmacht

Inventors: Ralph E. Bellofatto, Alan G. Gara, Mark E. Giampapa, Martin Ohmacht

IPC classification: G06F12/00

CPC classification: G06F12/10, G06F9/3004, G06F9/30185, G06F9/3834, G06F9/3851, G06F9/3861, G06F12/0292, G06F12/1027, G06F2212/1044, G06F2212/206

Abstract: A memory system and method for providing atomic memory-based counter operations to operating systems and applications that make most efficient use of counter-backing memory and virtual and physical address space, while simplifying operating system memory management, and enabling the counter-backing memory to be used for purposes other than counter-backing storage when desired. The encoding and address decoding enabled by the invention provides all this functionality through a combination of software and hardware.

9.

Granted patent
Low latency memory access and synchronization (status: lapsed)

Publication number: US07174434B2

Publication date: 2007-02-06

Application number: US10468994

Filing date: 2002-02-25

Applicants: Matthias A. Blumrich, Dong Chen, Paul W. Coteus, Alan G. Gara, Mark E. Giampapa, Philip Heidelberger, Dirk Hoenicke, Martin Ohmacht, Burkhard D. Steinmacher-Burow, Todd E. Takken, Pavlos M. Vranas

Inventors: Matthias A. Blumrich, Dong Chen, Paul W. Coteus, Alan G. Gara, Mark E. Giampapa, Philip Heidelberger, Dirk Hoenicke, Martin Ohmacht, Burkhard D. Steinmacher-Burow, Todd E. Takken, Pavlos M. Vranas

IPC classification: G06F12/12

CPC classification: G06F9/52

Abstract: A low latency memory system access is provided in association with a weakly-ordered multiprocessor system. Each processor in the multiprocessor shares resources, and each shared resource has an associated lock within a locking device that provides support for synchronization between the multiple processors in the multiprocessor and the orderly sharing of the resources. A processor only has permission to access a resource when it owns the lock associated with that resource, and an attempt by a processor to own a lock requires only a single load operation, rather than a traditional atomic load followed by store, such that the processor only performs a read operation and the hardware locking device performs a subsequent write operation rather than the processor. A simple prefetching for non-contiguous data structures is also disclosed. A memory line is redefined so that in addition to the normal physical memory data, every line includes a pointer that is large enough to point to any other line in the memory, wherein the pointers are used to determine which memory line to prefetch, rather than some other predictive algorithm. This enables hardware to effectively prefetch memory access patterns that are non-contiguous, but repetitive.
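The two mechanisms in this abstract (a lock acquired with a single load, and per-line pointers that drive prefetching) can be modelled in a few lines. The following is a minimal sketch under stated assumptions: `LockDevice`, `MemoryLine`, and `PointerPrefetcher` are hypothetical names, the `release` path is an assumption because the abstract does not describe how a lock is given back, and real hardware would perform the read-then-set step atomically.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


class LockDevice:
    """Toy model of a hardware lock device (hypothetical interface)."""

    def __init__(self, num_locks: int) -> None:
        self._held: List[bool] = [False] * num_locks

    def load(self, lock_id: int) -> bool:
        """Model of the single load: True means the caller now owns the lock."""
        was_free = not self._held[lock_id]
        self._held[lock_id] = True   # the device, not the processor, does the write
        return was_free

    def release(self, lock_id: int) -> None:
        """Assumed release path; not described in the abstract."""
        self._held[lock_id] = False


@dataclass
class MemoryLine:
    """A memory line carrying data plus a pointer to the line to prefetch next."""
    data: str
    next_line: Optional[int] = None


class PointerPrefetcher:
    """Follows each line's stored pointer instead of running a predictive algorithm."""

    def __init__(self, memory: Dict[int, MemoryLine]) -> None:
        self.memory = memory
        self.prefetched = set()      # addresses the model has prefetched so far

    def read(self, addr: int) -> str:
        line = self.memory[addr]
        if line.next_line is not None:
            self.prefetched.add(line.next_line)   # prefetch the pointed-to line
        return line.data
```

In this model a second `load` on a held lock simply returns `False`, so the processor never issues a store to take ownership. The prefetcher handles non-contiguous but repetitive patterns because the pointer, once set, names the next line regardless of its address.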

10.

Granted patent
Simplifying and speeding the management of intra-node cache coherence (status: lapsed)

Publication number: US08161248B2

Publication date: 2012-04-17

Application number: US12953770

Filing date: 2010-11-24

Applicants: Matthias A. Blumrich, Dong Chen, Paul W. Coteus, Alan G. Gara, Mark E. Giampapa, Phillip Heidelberger, Dirk Hoenicke, Martin Ohmacht

Inventors: Matthias A. Blumrich, Dong Chen, Paul W. Coteus, Alan G. Gara, Mark E. Giampapa, Phillip Heidelberger, Dirk Hoenicke, Martin Ohmacht

IPC classification: G06F12/00, G06F13/00, G06F13/28, G06F15/167

CPC classification: H05K7/20836, F24F11/77, G06F9/52, G06F9/526, G06F15/17381, G06F17/142, G09G5/008, H04L7/0338

Abstract: A method and apparatus for managing coherence between two processors of a two-processor node of a multi-processor computer system. Generally the present invention relates to a software algorithm that simplifies and significantly speeds the management of cache coherence in a message passing parallel computer, and to hardware apparatus that assists this cache coherence algorithm. The software algorithm uses the opening and closing of put/get windows to coordinate the activities required to achieve cache coherence. The hardware apparatus may be an extension to the hardware address decode, that creates, in the physical memory address space of the node, an area of virtual memory that (a) does not actually exist, and (b) is therefore able to respond instantly to read and write requests from the processing elements.
