Patent search: ap:("Sameer Kumar" OR "Amith R. Mamidala" OR "Joseph D. Ratterman" OR "Michael Blocksome" OR "Douglas Miller") AND inv:"Amith R. Mamidala" (page 1)

1.

Granted patent
Mechanism of supporting sub-communicator collectives with O(64) counters as opposed to one counter for each sub-communicator
Legal status: Active

Publication No.: US08527740B2

Publication date: 2013-09-03

Application No.: US12697164

Filing date: 2010-01-29

Applicants: Sameer Kumar, Amith R. Mamidala, Joseph D. Ratterman, Michael Blocksome, Douglas Miller

Inventors: Sameer Kumar, Amith R. Mamidala, Joseph D. Ratterman, Michael Blocksome, Douglas Miller

IPC classification: G06F9/30

CPC classification: G06F9/50, G06F9/522

Abstract: A system and method for enhancing barrier collective synchronization on a computer system comprises a computer system including a data storage device. The computer system includes a program stored in the data storage device and steps of the program being executed by a processor. The system includes providing a plurality of communicators for storing state information for a barrier algorithm. Each communicator designates a master core in a multi-processor environment of the computer system. The system allocates or designates one counter for each of a plurality of threads. The system configures a table with a number of entries equal to the maximum number of threads. The system sets a table entry with an ID associated with a communicator when a process thread initiates a collective. The system determines an allocated or designated counter by searching entries in the table.
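The table lookup described in this abstract can be pictured with a minimal Python sketch. It is an illustration under stated assumptions, not the patented implementation: the class name `CounterTable`, the limit `MAX_THREADS = 64` (taken from the "O(64)" in the title), the first-free-slot policy, and the `release` step are all hypothetical.

```python
from typing import List, Optional

MAX_THREADS = 64  # assumption: one table entry and one counter per possible thread


class CounterTable:
    """Fixed-size table mapping communicator IDs to a small pool of counters."""

    def __init__(self, max_threads: int = MAX_THREADS) -> None:
        # entries[i] holds the communicator ID that currently owns counter i.
        self.entries: List[Optional[int]] = [None] * max_threads
        self.counters: List[int] = [0] * max_threads

    def find(self, comm_id: int) -> Optional[int]:
        """Search the table for the counter assigned to comm_id."""
        for slot, owner in enumerate(self.entries):
            if owner == comm_id:
                return slot
        return None

    def acquire(self, comm_id: int) -> int:
        """Called when a thread initiates a collective on comm_id."""
        slot = self.find(comm_id)
        if slot is not None:
            return slot
        # Hypothetical policy: claim the first free entry.
        for slot, owner in enumerate(self.entries):
            if owner is None:
                self.entries[slot] = comm_id
                self.counters[slot] = 0
                return slot
        raise RuntimeError("no free counter")

    def release(self, comm_id: int) -> None:
        """Hypothetical cleanup once the collective has completed."""
        slot = self.find(comm_id)
        if slot is not None:
            self.entries[slot] = None
```

Because at most `MAX_THREADS` collectives can be in flight at once, the pool never needs more counters than threads, however many sub-communicators exist.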

2.

Patent application
MECHANISM OF SUPPORTING SUB-COMMUNICATOR COLLECTIVES WITH O(64) COUNTERS AS OPPOSED TO ONE COUNTER FOR EACH SUB-COMMUNICATOR
Legal status: Active

Publication No.: US20110119468A1

Publication date: 2011-05-19

Application No.: US12697164

Filing date: 2010-01-29

Applicants: Sameer Kumar, Amith R. Mamidala, Joseph D. Ratterman, Michael Blocksome, Douglas Miller

Inventors: Sameer Kumar, Amith R. Mamidala, Joseph D. Ratterman, Michael Blocksome, Douglas Miller

IPC classification: G06F9/30

CPC classification: G06F9/50, G06F9/522

Abstract: A system and method for enhancing barrier collective synchronization on a computer system comprises a computer system including a data storage device. The computer system includes a program stored in the data storage device and steps of the program being executed by a processor. The system includes providing a plurality of communicators for storing state information for a barrier algorithm. Each communicator designates a master core in a multi-processor environment of the computer system. The system allocates or designates one counter for each of a plurality of threads. The system configures a table with a number of entries equal to the maximum number of threads. The system sets a table entry with an ID associated with a communicator when a process thread initiates a collective. The system determines an allocated or designated counter by searching entries in the table.

3.

Granted patent
Shared address collectives using counter mechanisms
Legal status: Lapsed

Publication No.: US08655962B2

Publication date: 2014-02-18

Application No.: US12568115

Filing date: 2009-09-28

Applicants: Michael Blocksome, Gabor Dozsa, Thomas M. Gooding, Philip Heidelberger, Sameer Kumar, Amith R. Mamidala, Douglas Miller

Inventors: Michael Blocksome, Gabor Dozsa, Thomas M. Gooding, Philip Heidelberger, Sameer Kumar, Amith R. Mamidala, Douglas Miller

IPC classification: G06F15/16, G06F15/167

CPC classification: G06F9/544

Abstract: A shared address space on a compute node stores data received from a network and data to transmit to the network. The shared address space includes an application buffer that can be directly operated upon by a plurality of processes, for instance, running on different cores on the compute node. A shared counter is used for one or more of signaling arrival of the data across the plurality of processes running on the compute node, signaling completion of an operation performed by one or more of the plurality of processes, obtaining reservation slots by one or more of the plurality of processes, or combinations thereof.
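Two of the counter uses named in this abstract, obtaining reservation slots and signaling completion, can be illustrated with a small Python sketch. It rests on assumptions: threads stand in for the processes on a compute node, a lock-protected integer stands in for a hardware atomic counter, and the names `SharedCounter`, `fetch_add`, `wait_until`, and `run_demo` are hypothetical.

```python
import threading
from typing import List, Optional


class SharedCounter:
    """Lock-protected counter standing in for an atomic shared counter."""

    def __init__(self) -> None:
        self._value = 0
        self._cond = threading.Condition()

    def fetch_add(self, n: int = 1) -> int:
        """Atomically add n and return the previous value."""
        with self._cond:
            old = self._value
            self._value += n
            self._cond.notify_all()
            return old

    def wait_until(self, target: int) -> None:
        """Block until the counter has reached at least target."""
        with self._cond:
            self._cond.wait_for(lambda: self._value >= target)


def run_demo(num_workers: int = 4) -> List[Optional[int]]:
    buffer: List[Optional[int]] = [None] * num_workers  # shared application buffer
    slots = SharedCounter()  # hands out reservation slots
    done = SharedCounter()   # counts workers that have finished

    def worker(worker_id: int) -> None:
        slot = slots.fetch_add()   # reserve a unique slot in the buffer
        buffer[slot] = worker_id   # operate directly on the shared buffer
        done.fetch_add()           # signal completion

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(num_workers)]
    for t in threads:
        t.start()
    done.wait_until(num_workers)   # wait until every worker has signaled
    for t in threads:
        t.join()
    return buffer
```

Each worker receives a distinct slot because `fetch_add` returns the pre-increment value under the lock, so no two callers can observe the same number.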

4.

Patent application
SHARED ADDRESS COLLECTIVES USING COUNTER MECHANISMS
Legal status: Lapsed

Publication No.: US20110078249A1

Publication date: 2011-03-31

Application No.: US12568115

Filing date: 2009-09-28

Applicants: Michael Blocksome, Gabor Dozsa, Thomas M. Gooding, Philip Heidelberger, Sameer Kumar, Amith R. Mamidala, Douglas Miller

Inventors: Michael Blocksome, Gabor Dozsa, Thomas M. Gooding, Philip Heidelberger, Sameer Kumar, Amith R. Mamidala, Douglas Miller

IPC classification: G06F15/16

CPC classification: G06F9/544

Abstract: A shared address space on a compute node stores data received from a network and data to transmit to the network. The shared address space includes an application buffer that can be directly operated upon by a plurality of processes, for instance, running on different cores on the compute node. A shared counter is used for one or more of signaling arrival of the data across the plurality of processes running on the compute node, signaling completion of an operation performed by one or more of the plurality of processes, obtaining reservation slots by one or more of the plurality of processes, or combinations thereof.

5.

Patent application
Processing Posted Receive Commands In A Parallel Computer
Legal status: Active

Publication No.: US20130312010A1

Publication date: 2013-11-21

Application No.: US13476571

Filing date: 2012-05-21

Applicants: Sameer Kumar, Amith R. Mamidala, Joseph D. Ratterman, Brian E. Smith

Inventors: Sameer Kumar, Amith R. Mamidala, Joseph D. Ratterman, Brian E. Smith

IPC classification: G06F9/54

CPC classification: G06F9/546, G06F2209/548

Abstract: Processing posted receive commands in a parallel computer, including: posting, by a parallel process of a compute node, a receive command, the receive command including a set of parameters excluding the receive command from being directed among parallel posted receive queues; flattening the parallel unexpected message queues into a single unexpected message queue; determining whether the posted receive command is satisfied by an entry in the single unexpected message queue; if the posted receive command is satisfied by an entry in the single unexpected message queue, processing the posted receive command; if the posted receive command is not satisfied by an entry in the single unexpected message queue: flattening the parallel posted receive queues into a single posted receive queue; and storing the posted receive command in the single posted receive queue.
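The sequence of steps in this abstract can be sketched in Python as follows. This is a simplified illustration, not the claimed method: the `Endpoint` class, the `(source, tag, payload)` message layout, the `ANY` wildcard, and flattening by plain concatenation (which ignores cross-queue arrival order) are all assumptions made for the example.

```python
from collections import deque
from typing import Deque, List, Optional, Tuple

ANY = None  # hypothetical wildcard: matches any source or any tag

Message = Tuple[int, int, str]                 # (source, tag, payload)
Receive = Tuple[Optional[int], Optional[int]]  # (source or ANY, tag or ANY)


def flatten(queues: List[Deque]) -> Deque:
    """Merge parallel queues into one (simplification: plain concatenation)."""
    flat: Deque = deque()
    for q in queues:
        flat.extend(q)
        q.clear()
    return flat


def matches(recv: Receive, msg: Message) -> bool:
    src, tag = recv
    return (src is ANY or src == msg[0]) and (tag is ANY or tag == msg[1])


class Endpoint:
    def __init__(self, num_queues: int) -> None:
        self.unexpected: List[Deque] = [deque() for _ in range(num_queues)]
        self.posted: List[Deque] = [deque() for _ in range(num_queues)]

    def post_wildcard(self, recv: Receive) -> Optional[Message]:
        """Post a receive that cannot be directed to one parallel queue."""
        # Step 1: flatten the parallel unexpected message queues.
        single_unexpected = flatten(self.unexpected)
        self.unexpected = [single_unexpected]
        # Step 2: if an unexpected message satisfies the receive, process it.
        for i, msg in enumerate(single_unexpected):
            if matches(recv, msg):
                del single_unexpected[i]
                return msg
        # Step 3: otherwise flatten the posted receive queues and store it.
        single_posted = flatten(self.posted)
        single_posted.append(recv)
        self.posted = [single_posted]
        return None
```

A receive with a wildcard source cannot be assigned to one per-source queue, which is why the sketch falls back to single flattened queues for both lookups.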

6.

Granted patent
Multi-petascale highly efficient parallel supercomputer
Legal status: Active

Publication No.: US09081501B2

Publication date: 2015-07-14

Application No.: US13004007

Filing date: 2011-01-10

Applicants: Sameh Asaad, Ralph E. Bellofatto, Michael A. Blocksome, Matthias A. Blumrich, Peter Boyle, Jose R. Brunheroto, Dong Chen, Chen-Yong Cher, George L. Chiu, Norman Christ, Paul W. Coteus, Kristan D. Davis, Gabor J. Dozsa, Alexandre E. Eichenberger, Noel A. Eisley, Matthew R. Ellavsky, Kahn C. Evans, Bruce M. Fleischer, Thomas W. Fox, Alan Gara, Mark E. Giampapa, Thomas M. Gooding, Michael K. Gschwind, John A. Gunnels, Shawn A. Hall, Rudolf A. Haring, Philip Heidelberger, Todd A. Inglett, Brant L. Knudson, Gerard V. Kopcsay, Sameer Kumar, Amith R. Mamidala, James A. Marcella, Mark G. Megerian, Douglas R. Miller, Samuel J. Miller, Adam J. Muff, Michael B. Mundy, John K. O'Brien, Kathryn M. O'Brien, Martin Ohmacht, Jeffrey J. Parker, Ruth J. Poole, Joseph D. Ratterman, Valentina Salapura, David L. Satterfield, Robert M. Senger, Brian Smith, Burkhard Steinmacher-Burow, William M. Stockdell, Craig B. Stunkel, Krishnan Sugavanam, Yutaka Sugawara, Todd E. Takken, Barry M. Trager, James L. Van Oosten, Charles D. Wait, Robert E. Walkup, Alfred T. Watson, Robert W. Wisniewski, Peng Wu

Inventors: Sameh Asaad, Ralph E. Bellofatto, Michael A. Blocksome, Matthias A. Blumrich, Peter Boyle, Jose R. Brunheroto, Dong Chen, Chen-Yong Cher, George L. Chiu, Norman Christ, Paul W. Coteus, Kristan D. Davis, Gabor J. Dozsa, Alexandre E. Eichenberger, Noel A. Eisley, Matthew R. Ellavsky, Kahn C. Evans, Bruce M. Fleischer, Thomas W. Fox, Alan Gara, Mark E. Giampapa, Thomas M. Gooding, Michael K. Gschwind, John A. Gunnels, Shawn A. Hall, Rudolf A. Haring, Philip Heidelberger, Todd A. Inglett, Brant L. Knudson, Gerard V. Kopcsay, Sameer Kumar, Amith R. Mamidala, James A. Marcella, Mark G. Megerian, Douglas R. Miller, Samuel J. Miller, Adam J. Muff, Michael B. Mundy, John K. O'Brien, Kathryn M. O'Brien, Martin Ohmacht, Jeffrey J. Parker, Ruth J. Poole, Joseph D. Ratterman, Valentina Salapura, David L. Satterfield, Robert M. Senger, Brian Smith, Burkhard Steinmacher-Burow, William M. Stockdell, Craig B. Stunkel, Krishnan Sugavanam, Yutaka Sugawara, Todd E. Takken, Barry M. Trager, James L. Van Oosten, Charles D. Wait, Robert E. Walkup, Alfred T. Watson, Robert W. Wisniewski, Peng Wu

IPC classification: G06F15/173, G06F9/06, G06F15/76

CPC classification: G06F13/287, G06F9/06, G06F9/3004, G06F9/30047, G06F9/3885, G06F12/0811, G06F12/0831, G06F12/0862, G06F12/0864, G06F12/1027, G06F15/17381, G06F15/17387, G06F15/76, G06F15/8069, G06F2212/1016, G06F2212/602, G06F2212/6022, G06F2212/6024, G06F2212/6032, Y02D10/13, Y02D10/14

Abstract: A Multi-Petascale Highly Efficient Parallel Supercomputer of 100 petaOPS-scale computing, at decreased cost, power and footprint, and that allows for a maximum packaging density of processing nodes from an interconnect point of view. The Supercomputer exploits technological advances in VLSI that enable a computing model where many processors can be integrated into a single Application Specific Integrated Circuit (ASIC). Each ASIC computing node comprises a system-on-chip ASIC utilizing four or more processors integrated into one die, with each having full access to all system resources and enabling adaptive partitioning of the processors to functions such as compute or messaging I/O on an application-by-application basis, and preferably, enabling adaptive partitioning of functions in accordance with various algorithmic phases within an application; if I/O or other processors are underutilized, they can participate in computation or communication. Nodes are interconnected by a five-dimensional torus network with DMA that maximizes the throughput of packet communications between nodes and minimizes latency.

7.

Patent application
MULTI-PETASCALE HIGHLY EFFICIENT PARALLEL SUPERCOMPUTER
Legal status: Active

Publication No.: US20110219208A1

Publication date: 2011-09-08

Application No.: US13004007

Filing date: 2011-01-10

Applicants: Sameh Asaad, Ralph E. Bellofatto, Michael A. Blocksome, Matthias A. Blumrich, Peter Boyle, Jose R. Brunheroto, Dong Chen, Chen-Yong Cher, George L. Chiu, Norman Christ, Paul W. Coteus, Kristan D. Davis, Gabor J. Dozsa, Alexandre E. Eichenberger, Noel A. Eisley, Matthew R. Ellavsky, Kahn C. Evans, Bruce M. Fleischer, Thomas W. Fox, Alan Gara, Mark E. Giampapa, Thomas M. Gooding, Michael K. Gschwind, John A. Gunnels, Shawn A. Hall, Rudolf A. Haring, Philip Heidelberger, Todd A. Inglett, Brant L. Knudson, Gerard V. Kopcsay, Sameer Kumar, Amith R. Mamidala, James A. Marcella, Mark G. Megerian, Douglas R. Miller, Samuel J. Miller, Adam J. Muff, Michael B. Mundy, John K. O'Brien, Kathryn M. O'Brien, Martin Ohmacht, Jeffrey J. Parker, Ruth J. Poole, Joseph D. Ratterman, Valentina Salapura, David L. Satterfield, Robert M. Senger, Brian Smith, Burkhard Steinmacher-Burow, William M. Stockdell, Craig B. Stunkel, Krishnan Sugavanam, Yutaka Sugawara, Todd E. Takken, Barry M. Trager, James L. Van Oosten, Charles D. Wait, Robert E. Walkup, Alfred T. Watson, Robert W. Wisniewski, Peng Wu

Inventors: Sameh Asaad, Ralph E. Bellofatto, Michael A. Blocksome, Matthias A. Blumrich, Peter Boyle, Jose R. Brunheroto, Dong Chen, Chen-Yong Cher, George L. Chiu, Norman Christ, Paul W. Coteus, Kristan D. Davis, Gabor J. Dozsa, Alexandre E. Eichenberger, Noel A. Eisley, Matthew R. Ellavsky, Kahn C. Evans, Bruce M. Fleischer, Thomas W. Fox, Alan Gara, Mark E. Giampapa, Thomas M. Gooding, Michael K. Gschwind, John A. Gunnels, Shawn A. Hall, Rudolf A. Haring, Philip Heidelberger, Todd A. Inglett, Brant L. Knudson, Gerard V. Kopcsay, Sameer Kumar, Amith R. Mamidala, James A. Marcella, Mark G. Megerian, Douglas R. Miller, Samuel J. Miller, Adam J. Muff, Michael B. Mundy, John K. O'Brien, Kathryn M. O'Brien, Martin Ohmacht, Jeffrey J. Parker, Ruth J. Poole, Joseph D. Ratterman, Valentina Salapura, David L. Satterfield, Robert M. Senger, Brian Smith, Burkhard Steinmacher-Burow, William M. Stockdell, Craig B. Stunkel, Krishnan Sugavanam, Yutaka Sugawara, Todd E. Takken, Barry M. Trager, James L. Van Oosten, Charles D. Wait, Robert E. Walkup, Alfred T. Watson, Robert W. Wisniewski, Peng Wu

IPC classification: G06F15/76, G06F9/06

CPC classification: G06F13/287, G06F9/06, G06F9/3004, G06F9/30047, G06F9/3885, G06F12/0811, G06F12/0831, G06F12/0862, G06F12/0864, G06F12/1027, G06F15/17381, G06F15/17387, G06F15/76, G06F15/8069, G06F2212/1016, G06F2212/602, G06F2212/6022, G06F2212/6024, G06F2212/6032, Y02D10/13, Y02D10/14

Abstract: A Multi-Petascale Highly Efficient Parallel Supercomputer of 100 petaOPS-scale computing, at decreased cost, power and footprint, and that allows for a maximum packaging density of processing nodes from an interconnect point of view. The Supercomputer exploits technological advances in VLSI that enable a computing model where many processors can be integrated into a single Application Specific Integrated Circuit (ASIC). Each ASIC computing node comprises a system-on-chip ASIC utilizing four or more processors integrated into one die, with each having full access to all system resources and enabling adaptive partitioning of the processors to functions such as compute or messaging I/O on an application-by-application basis, and preferably, enabling adaptive partitioning of functions in accordance with various algorithmic phases within an application; if I/O or other processors are underutilized, they can participate in computation or communication. Nodes are interconnected by a five-dimensional torus network with DMA that maximizes the throughput of packet communications between nodes and minimizes latency.

8.

Granted patent
Processing posted receive commands in a parallel computer
Legal status: Active

Publication No.: US09158602B2

Publication date: 2015-10-13

Application No.: US13476571

Filing date: 2012-05-21

Applicants: Sameer Kumar, Amith R. Mamidala, Joseph D. Ratterman, Brian E. Smith

Inventors: Sameer Kumar, Amith R. Mamidala, Joseph D. Ratterman, Brian E. Smith

IPC classification: G06F9/54

CPC classification: G06F9/546, G06F2209/548

Abstract: Processing posted receive commands in a parallel computer, including: posting, by a parallel process of a compute node, a receive command, the receive command including a set of parameters excluding the receive command from being directed among parallel posted receive queues; flattening the parallel unexpected message queues into a single unexpected message queue; determining whether the posted receive command is satisfied by an entry in the single unexpected message queue; if the posted receive command is satisfied by an entry in the single unexpected message queue, processing the posted receive command; if the posted receive command is not satisfied by an entry in the single unexpected message queue: flattening the parallel posted receive queues into a single posted receive queue; and storing the posted receive command in the single posted receive queue.

9.

Granted patent
Mechanisms for efficient intra-die/intra-chip collective messaging
Legal status: Active

Publication No.: US08904118B2

Publication date: 2014-12-02

Application No.: US12986528

Filing date: 2011-01-07

Applicants: Amith R. Mamidala, Valentina Salapura, Robert W. Wisniewski

Inventors: Amith R. Mamidala, Valentina Salapura, Robert W. Wisniewski

IPC classification: G06F12/10, G06F12/08, G06F15/167

CPC classification: G06F12/0831, G06F15/167

Abstract: Mechanism of efficient intra-die collective processing across the nodelets with separate shared memory coherency domains is provided. An integrated circuit die may include a hardware collective unit implemented on the integrated circuit die. A plurality of cores on the integrated circuit die is grouped into a plurality of shared memory coherence domains. Each of the plurality of shared memory coherence domains is connected to the collective unit for performing collective operations between the plurality of shared memory coherence domains.

10.

Patent application
MECHANISM FOR OPTIMIZED INTRA-DIE INTER-NODELET MESSAGING COMMUNICATION
Legal status: Active

Publication No.: US20130326180A1

Publication date: 2013-12-05

Application No.: US13485074

Filing date: 2012-05-31

Applicants: Amith R. Mamidala, Valentina Salapura, Robert W. Wisniewski

Inventors: Amith R. Mamidala, Valentina Salapura, Robert W. Wisniewski

IPC classification: G06F12/14

CPC classification: G06F9/544, G06F15/167

Abstract: Point-to-point intra-nodelet messaging support for nodelets on a single chip that obey MPI semantics may be provided. In one aspect, a local buffering mechanism is employed that obeys standard communication protocols for the network communications between the nodelets integrated in a single chip. Sending messages from one nodelet to another nodelet on the same chip may be performed not via the network, but by exchanging messages in the point-to-point messaging buckets between the nodelets. The messaging buckets need not be part of the memory system of the nodelets. Specialized hardware controllers may be used for moving data between the nodelets and each messaging bucket, and ensuring correct operation of the network protocol.
