Patent search: ap:("Alexandre E. Eichenberger" OR "Michael K. Gschwind" OR "John A. Gunnels" OR "James L. McInnes" OR "Mark P. Mendell") AND inv:"Alexandre E. Eichenberger" (page 1)

1.

Granted patent
Systems, methods and computer products for cross-thread scheduling (in force)

Publication No.: US09223580B2

Publication date: 2015-12-29

Application No.: US11847556

Filing date: 2007-08-30

Applicants: Alexandre E. Eichenberger, Michael K. Gschwind, John A. Gunnels, James L. McInnes, Mark P. Mendell

Inventors: Alexandre E. Eichenberger, Michael K. Gschwind, John A. Gunnels, James L. McInnes, Mark P. Mendell

IPC classification: G06F9/455, G06F9/46, G06F9/38, G06F9/45

CPC classification: G06F9/3851, G06F8/445, G06F9/3885

Abstract: Systems, methods and computer products for cross-thread scheduling. Exemplary embodiments include a cross thread scheduling method for compiling code, the method including scheduling a scheduling unit with a scheduler sub-operation in response to the scheduling unit being in a non-multithreaded part of the code and scheduling the scheduling unit with a cross-thread scheduler sub-operation in response to the scheduling unit being in a multithreaded part of the code.

2.

Patent application
SYSTEMS, METHODS AND COMPUTER PRODUCTS FOR CROSS-THREAD SCHEDULING (in force)

Publication No.: US20090064152A1

Publication date: 2009-03-05

Application No.: US11847556

Filing date: 2007-08-30

Applicants: Alexandre E. Eichenberger, Michael K. Gschwind, John A. Gunnels, James L. McInnes, Mark P. Mendell

Inventors: Alexandre E. Eichenberger, Michael K. Gschwind, John A. Gunnels, James L. McInnes, Mark P. Mendell

IPC classification: G06F9/46

CPC classification: G06F9/3851, G06F8/445, G06F9/3885

Abstract: Systems, methods and computer products for cross-thread scheduling. Exemplary embodiments include a cross thread scheduling method for compiling code, the method including scheduling a scheduling unit with a scheduler sub-operation in response to the scheduling unit being in a non-multithreaded part of the code and scheduling the scheduling unit with a cross-thread scheduler sub-operation in response to the scheduling unit being in a multithreaded part of the code.

3.

Patent application
Complex Matrix Multiplication Operations with Data Pre-Conditioning in a High Performance Computing Architecture (lapsed)

Publication No.: US20110040822A1

Publication date: 2011-02-17

Application No.: US12542324

Filing date: 2009-08-17

Applicants: Alexandre E. Eichenberger, Michael K. Gschwind, John A. Gunnels

Inventors: Alexandre E. Eichenberger, Michael K. Gschwind, John A. Gunnels

IPC classification: G06F17/16, G06F7/52

CPC classification: G06F17/16, G06F9/30014, G06F9/30032, G06F9/30036, G06F9/30043, G06F9/30109

Abstract: Mechanisms for performing a complex matrix multiplication operation are provided. A vector load operation is performed to load a first vector operand of the complex matrix multiplication operation to a first target vector register. The first vector operand comprises a real and imaginary part of a first complex vector value. A complex load and splat operation is performed to load a second complex vector value of a second vector operand and replicate the second complex vector value within a second target vector register. The second complex vector value has a real and imaginary part. A cross multiply add operation is performed on elements of the first target vector register and elements of the second target vector register to generate a partial product of the complex matrix multiplication operation. The partial product is accumulated with other partial products and a resulting accumulated partial product is stored in a result vector register.
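
The sequence in this abstract (vector load, complex load and splat, cross multiply add, accumulate) can be illustrated with a small software emulation. This is only a sketch under stated assumptions, not the patented hardware mechanism: the register width `VLEN`, the interleaved `[re0, im0, re1, im1]` layout, and every function name below are hypothetical choices made for illustration.

```python
VLEN = 4  # assumed register width: two complex values stored as [re0, im0, re1, im1]


def vector_load(values):
    """Emulate a vector load of VLEN interleaved real/imaginary elements."""
    return list(values[:VLEN])


def complex_load_and_splat(re, im):
    """Emulate a complex load and splat: replicate one (re, im) pair across the register."""
    return [re, im] * (VLEN // 2)


def cross_multiply_add(acc, va, vb):
    """Add the complex product of each (re, im) pair of va and vb to acc."""
    out = list(acc)
    for i in range(0, VLEN, 2):
        a_re, a_im = va[i], va[i + 1]
        b_re, b_im = vb[i], vb[i + 1]
        out[i] += a_re * b_re - a_im * b_im      # real part
        out[i + 1] += a_re * b_im + a_im * b_re  # imaginary part (the cross terms)
    return out


def complex_column_product(a_cols, b_col):
    """Accumulate sum over k of A[:, k] * b[k] for one column of the result."""
    acc = [0.0] * VLEN  # stands in for the result vector register
    for a_col, (b_re, b_im) in zip(a_cols, b_col):
        va = vector_load(a_col)                   # first operand
        vb = complex_load_and_splat(b_re, b_im)   # second operand, replicated
        acc = cross_multiply_add(acc, va, vb)     # partial product, accumulated
    return acc
```

Calling `complex_column_product([[1, 2, 3, 4]], [(5, 6)])` multiplies the complex values 1+2i and 3+4i by 5+6i. Real hardware would presumably perform the cross multiply add as a single SIMD instruction rather than a loop; the loop here only makes the per-pair arithmetic visible.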

4.

Granted patent
Matrix multiplication operations using pair-wise load and splat operations (in force)

Publication No.: US09600281B2

Publication date: 2017-03-21

Application No.: US12834464

Filing date: 2010-07-12

Applicants: Alexandre E. Eichenberger, Michael K. Gschwind, John A. Gunnels, Valentina Salapura

Inventors: Alexandre E. Eichenberger, Michael K. Gschwind, John A. Gunnels, Valentina Salapura

IPC classification: G06F9/30, G06F9/312, G06F9/38

CPC classification: G06F9/30043, G06F9/30014, G06F9/30032, G06F9/30036, G06F9/30109, G06F9/30112, G06F9/30145, G06F9/3887

Abstract: Mechanisms for performing a matrix multiplication operation are provided. A vector load operation is performed to load a first vector operand of the matrix multiplication operation to a first target vector register. A pair-wise load and splat operation is performed to load a pair of scalar values of a second vector operand and replicate the pair of scalar values within a second target vector register. An operation is performed on elements of the first target vector register and elements of the second target vector register to generate a partial product of the matrix multiplication operation. The partial product is accumulated with other partial products and a resulting accumulated partial product is stored. This operation may be repeated for a second pair of scalar values of the second vector operand.
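
A rough software emulation may help picture the pair-wise load and splat described in this abstract. The abstract does not say how the subsequent operation consumes the replicated pair, so the lane selection in `multiply_add_lane` is an assumption made for illustration, as are the register width `VLEN`, all function names, and the requirement that the operand has an even number of scalars.

```python
VLEN = 4  # assumed register width in elements


def load_vector(column):
    """Emulate a vector load of VLEN contiguous elements."""
    return list(column[:VLEN])


def pairwise_load_and_splat(scalars, k):
    """Load scalars[k] and scalars[k + 1] and replicate the pair across the register."""
    return [scalars[k], scalars[k + 1]] * (VLEN // 2)


def multiply_add_lane(acc, va, vpair, lane):
    """Return acc + va * s, where s is element `lane` (0 or 1) of each replicated pair.

    Assumption: the operation can select one element of the pair per use.
    """
    return [c + x * vpair[(i // 2) * 2 + lane]
            for i, (c, x) in enumerate(zip(acc, va))]


def pairwise_column_product(a_cols, b_col):
    """Accumulate sum over k of A[:, k] * b[k], loading two scalars of b per step."""
    acc = [0.0] * VLEN
    for k in range(0, len(b_col), 2):  # assumes len(b_col) is even
        vpair = pairwise_load_and_splat(b_col, k)
        acc = multiply_add_lane(acc, load_vector(a_cols[k]), vpair, 0)
        acc = multiply_add_lane(acc, load_vector(a_cols[k + 1]), vpair, 1)
    return acc
```

In this sketch one pair-wise load feeds two multiply-add steps, which halves the number of splat loads compared with loading and replicating each scalar separately.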

5.

Granted patent
Complex matrix multiplication operations with data pre-conditioning in a high performance computing architecture (lapsed)

Publication No.: US08650240B2

Publication date: 2014-02-11

Application No.: US12542324

Filing date: 2009-08-17

Applicants: Alexandre E. Eichenberger, Michael K. Gschwind, John A. Gunnels

Inventors: Alexandre E. Eichenberger, Michael K. Gschwind, John A. Gunnels

IPC classification: G06F7/52

CPC classification: G06F17/16, G06F9/30014, G06F9/30032, G06F9/30036, G06F9/30043, G06F9/30109

Abstract: Mechanisms for performing a complex matrix multiplication operation are provided. A vector load operation is performed to load a first vector operand of the complex matrix multiplication operation to a first target vector register. The first vector operand comprises a real and imaginary part of a first complex vector value. A complex load and splat operation is performed to load a second complex vector value of a second vector operand and replicate the second complex vector value within a second target vector register. The second complex vector value has a real and imaginary part. A cross multiply add operation is performed on elements of the first target vector register and elements of the second target vector register to generate a partial product of the complex matrix multiplication operation. The partial product is accumulated with other partial products and a resulting accumulated partial product is stored in a result vector register.

6.

Patent application
MULTI-PETASCALE HIGHLY EFFICIENT PARALLEL SUPERCOMPUTER (in force)

Publication No.: US20110219208A1

Publication date: 2011-09-08

Application No.: US13004007

Filing date: 2011-01-10

Applicants: Sameh Asaad, Ralph E. Bellofatto, Michael A. Blocksome, Matthias A. Blumrich, Peter Boyle, Jose R. Brunheroto, Dong Chen, Chen-Yong Cher, George L. Chiu, Norman Christ, Paul W. Coteus, Kristan D. Davis, Gabor J. Dozsa, Alexandre E. Eichenberger, Noel A. Eisley, Matthew R. Ellavsky, Kahn C. Evans, Bruce M. Fleischer, Thomas W. Fox, Alan Gara, Mark E. Giampapa, Thomas M. Gooding, Michael K. Gschwind, John A. Gunnels, Shawn A. Hall, Rudolf A. Haring, Philip Heidelberger, Todd A. Inglett, Brant L. Knudson, Gerard V. Kopcsay, Sameer Kumar, Amith R. Mamidala, James A. Marcella, Mark G. Megerian, Douglas R. Miller, Samuel J. Miller, Adam J. Muff, Michael B. Mundy, John K. O'Brien, Kathryn M. O'Brien, Martin Ohmacht, Jeffrey J. Parker, Ruth J. Poole, Joseph D. Ratterman, Valentina Salapura, David L. Satterfield, Robert M. Senger, Brian Smith, Burkhard Steinmacher-Burow, William M. Stockdell, Craig B. Stunkel, Krishnan Sugavanam, Yutaka Sugawara, Todd E. Takken, Barry M. Trager, James L. Van Oosten, Charles D. Wait, Robert E. Walkup, Alfred T. Watson, Robert W. Wisniewski, Peng Wu

Inventors: Sameh Asaad, Ralph E. Bellofatto, Michael A. Blocksome, Matthias A. Blumrich, Peter Boyle, Jose R. Brunheroto, Dong Chen, Chen-Yong Cher, George L. Chiu, Norman Christ, Paul W. Coteus, Kristan D. Davis, Gabor J. Dozsa, Alexandre E. Eichenberger, Noel A. Eisley, Matthew R. Ellavsky, Kahn C. Evans, Bruce M. Fleischer, Thomas W. Fox, Alan Gara, Mark E. Giampapa, Thomas M. Gooding, Michael K. Gschwind, John A. Gunnels, Shawn A. Hall, Rudolf A. Haring, Philip Heidelberger, Todd A. Inglett, Brant L. Knudson, Gerard V. Kopcsay, Sameer Kumar, Amith R. Mamidala, James A. Marcella, Mark G. Megerian, Douglas R. Miller, Samuel J. Miller, Adam J. Muff, Michael B. Mundy, John K. O'Brien, Kathryn M. O'Brien, Martin Ohmacht, Jeffrey J. Parker, Ruth J. Poole, Joseph D. Ratterman, Valentina Salapura, David L. Satterfield, Robert M. Senger, Brian Smith, Burkhard Steinmacher-Burow, William M. Stockdell, Craig B. Stunkel, Krishnan Sugavanam, Yutaka Sugawara, Todd E. Takken, Barry M. Trager, James L. Van Oosten, Charles D. Wait, Robert E. Walkup, Alfred T. Watson, Robert W. Wisniewski, Peng Wu

IPC classification: G06F15/76, G06F9/06

CPC classification: G06F13/287, G06F9/06, G06F9/3004, G06F9/30047, G06F9/3885, G06F12/0811, G06F12/0831, G06F12/0862, G06F12/0864, G06F12/1027, G06F15/17381, G06F15/17387, G06F15/76, G06F15/8069, G06F2212/1016, G06F2212/602, G06F2212/6022, G06F2212/6024, G06F2212/6032, Y02D10/13, Y02D10/14

Abstract: A Multi-Petascale Highly Efficient Parallel Supercomputer of 100 petaOPS-scale computing, at decreased cost, power and footprint, and that allows for a maximum packaging density of processing nodes from an interconnect point of view. The Supercomputer exploits technological advances in VLSI that enables a computing model where many processors can be integrated into a single Application Specific Integrated Circuit (ASIC). Each ASIC computing node comprises a system-on-chip ASIC utilizing four or more processors integrated into one die, with each having full access to all system resources and enabling adaptive partitioning of the processors to functions such as compute or messaging I/O on an application by application basis, and preferably, enable adaptive partitioning of functions in accordance with various algorithmic phases within an application, or if I/O or other processors are underutilized, then can participate in computation or communication nodes are interconnected by a five dimensional torus network with DMA that optimally maximize the throughput of packet communications between nodes and minimize latency.

7.

Patent application
Optimized Scalar Promotion with Load and Splat SIMD Instructions (lapsed)

Publication No.: US20090307656A1

Publication date: 2009-12-10

Application No.: US12134495

Filing date: 2008-06-06

Applicants: Alexandre E. Eichenberger, Michael K. Gschwind, John A. Gunnels

Inventors: Alexandre E. Eichenberger, Michael K. Gschwind, John A. Gunnels

IPC classification: G06F9/44

CPC classification: G06F8/45

Abstract: Mechanisms for optimizing scalar code executed on a single instruction multiple data (SIMD) engine are provided. Placement of vector operation-splat operations may be determined based on an identification of scalar and SIMD operations in an original code representation. The original code representation may be modified to insert the vector operation-splat operations based on the determined placement of vector operation-splat operations to generate a first modified code representation. Placement of separate splat operations may be determined based on identification of scalar and SIMD operations in the first modified code representation. The first modified code representation may be modified to insert or delete separate splat operations based on the determined placement of the separate splat operations to generate a second modified code representation. SIMD code may be output based on the second modified code representation for execution by the SIMD engine.

8.

Granted patent
Matrix multiplication operations with data pre-conditioning in a high performance computing architecture (lapsed)

Publication No.: US08577950B2

Publication date: 2013-11-05

Application No.: US12542255

Filing date: 2009-08-17

Applicants: Alexandre E. Eichenberger, Michael K. Gschwind, John A. Gunnels

Inventors: Alexandre E. Eichenberger, Michael K. Gschwind, John A. Gunnels

IPC classification: G06F7/52

CPC classification: G06F17/16, G06F9/30014, G06F9/30029, G06F9/30032, G06F9/30036, G06F9/30043, G06F9/30072, G06F9/30094, G06F9/30109, G06F9/3013, G06F9/3885, G06F9/3887

Abstract: Mechanisms for performing matrix multiplication operations with data pre-conditioning in a high performance computing architecture are provided. A vector load operation is performed to load a first vector operand of the matrix multiplication operation to a first target vector register. A load and splat operation is performed to load an element of a second vector operand and replicating the element to each of a plurality of elements of a second target vector register. A multiply add operation is performed on elements of the first target vector register and elements of the second target vector register to generate a partial product of the matrix multiplication operation. The partial product of the matrix multiplication operation is accumulated with other partial products of the matrix multiplication operation.
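
The load, splat, and multiply-add loop described in this abstract can be sketched in plain Python as follows. This is an illustrative emulation rather than the patented mechanism: registers are modeled as lists, the width `VLEN` is an assumed value, the column-wise storage of the first matrix is an assumption, and all function names are hypothetical.

```python
VLEN = 4  # assumed register width in elements


def vec_load(column):
    """Emulate a vector load of VLEN contiguous elements into a register."""
    return list(column[:VLEN])


def load_and_splat(value):
    """Emulate load and splat: replicate one scalar into every register element."""
    return [value] * VLEN


def multiply_add(acc, va, vb):
    """Element-wise multiply-add: acc + va * vb."""
    return [c + x * y for c, x, y in zip(acc, va, vb)]


def splat_matmul(a_cols, b):
    """Compute C = A * B, returned column by column.

    a_cols[k] is column k of A (VLEN rows); b[k][j] is the element of B
    in row k, column j.
    """
    c_cols = []
    for j in range(len(b[0])):
        acc = [0.0] * VLEN  # accumulator for column j of C
        for k, a_col in enumerate(a_cols):
            va = vec_load(a_col)               # first operand: a column of A
            vb = load_and_splat(b[k][j])       # second operand: one element of B, replicated
            acc = multiply_add(acc, va, vb)    # partial product, accumulated
        c_cols.append(acc)
    return c_cols
```

Each pass through the inner loop produces one partial product, so a column of the result is complete after as many multiply-add steps as A has columns.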

9.

Patent application
Optimized Scalar Promotion with Load and Splat SIMD Instructions (lapsed)

Publication No.: US20120290816A1

Publication date: 2012-11-15

Application No.: US13555435

Filing date: 2012-07-23

Applicants: Alexandre E. Eichenberger, Michael K. Gschwind, John A. Gunnels

Inventors: Alexandre E. Eichenberger, Michael K. Gschwind, John A. Gunnels

IPC classification: G06F9/30

CPC classification: G06F8/45

Abstract: Mechanisms for optimizing scalar code executed on a single instruction multiple data (SIMD) engine are provided. Placement of vector operation-splat operations may be determined based on an identification of scalar and SIMD operations in an original code representation. The original code representation may be modified to insert the vector operation-splat operations based on the determined placement of vector operation-splat operations to generate a first modified code representation. Placement of separate splat operations may be determined based on identification of scalar and SIMD operations in the first modified code representation. The first modified code representation may be modified to insert or delete separate splat operations based on the determined placement of the separate splat operations to generate a second modified code representation. SIMD code may be output based on the second modified code representation for execution by the SIMD engine.

10.

Patent application
Matrix Multiplication Operations Using Pair-Wise Load and Splat Operations (in force)

Publication No.: US20120011348A1

Publication date: 2012-01-12

Application No.: US12834464

Filing date: 2010-07-12

Applicants: Alexandre E. Eichenberger, Michael K. Gschwind, John A. Gunnels, Valentina Salapura

Inventors: Alexandre E. Eichenberger, Michael K. Gschwind, John A. Gunnels, Valentina Salapura

IPC classification: G06F9/302

CPC classification: G06F9/30043, G06F9/30014, G06F9/30032, G06F9/30036, G06F9/30109, G06F9/30112, G06F9/30145, G06F9/3887

Abstract: Mechanisms for performing a matrix multiplication operation are provided. A vector load operation is performed to load a first vector operand of the matrix multiplication operation to a first target vector register. A pair-wise load and splat operation is performed to load a pair of scalar values of a second vector operand and replicate the pair of scalar values within a second target vector register. An operation is performed on elements of the first target vector register and elements of the second target vector register to generate a partial product of the matrix multiplication operation. The partial product is accumulated with other partial products and a resulting accumulated partial product is stored. This operation may be repeated for a second pair of scalar values of the second vector operand.
