Patent search: ap:("John Gunnels" OR "Fred Gustavson" OR "Robert Engle") AND inv:"Fred Gustavson" (page 1)

1.

Patent application
System and method for detecting a faulty object in a system [Lapsed]

Publication No.: US20060179340A1

Publication date: 2006-08-10

Application No.: US11050945

Filing date: 2005-02-07

Applicants: John Gunnels, Fred Gustavson, Robert Engle

Inventors: John Gunnels, Fred Gustavson, Robert Engle

IPC classification: G06F11/00

CPC classification: G06F11/0751

Abstract: A method (and system) for detecting at least one faulty object in a system including a plurality of objects in communication with each other in an n-dimensional architecture, includes probing a first plane of objects in the n-dimensional architecture and probing at least one other plane of objects in the n-dimensional architecture which would result in identifying a faulty object in the system.
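The plane-probing idea in this abstract can be illustrated with a small sketch. It assumes a 3-dimensional grid of nodes with exactly one faulty node and a probe that reports whether every node in a plane responds; the names `probe_plane` and `locate_faulty` and the dictionary model of the grid are hypothetical and are not taken from the application.

```python
from itertools import product

def probe_plane(grid_ok, axis, index):
    """Hypothetical probe: True if every node whose coordinate on `axis` equals `index` responds."""
    return all(ok for coord, ok in grid_ok.items() if coord[axis] == index)

def locate_faulty(grid_ok, shape):
    """Find a single faulty node by intersecting the one failing plane on each axis."""
    coord = []
    for axis, extent in enumerate(shape):
        failing = [i for i in range(extent) if not probe_plane(grid_ok, axis, i)]
        if len(failing) != 1:
            return None  # no fault, or several faults: outside the scope of this sketch
        coord.append(failing[0])
    return tuple(coord)

# A 4 x 4 x 4 grid with one node marked as faulty.
shape = (4, 4, 4)
grid_ok = {c: True for c in product(*(range(n) for n in shape))}
grid_ok[(2, 0, 3)] = False
print(locate_faulty(grid_ok, shape))  # (2, 0, 3)
```

With one fault, this needs at most one probe per plane per axis (12 probes for the 4 x 4 x 4 grid) rather than one probe per node (64).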

2.

Patent application
Method and structure for an improved data reformatting procedure [Lapsed]

Publication No.: US20070162703A1

Publication date: 2007-07-12

Application No.: US11328344

Filing date: 2006-01-09

Applicants: Siddhartha Chatterjee, John Gunnels, Fred Gustavson

Inventors: Siddhartha Chatterjee, John Gunnels, Fred Gustavson

IPC classification: G06F12/00, G06F12/14

CPC classification: G06F12/0804, G06F9/30047, G06F12/0802

Abstract: A method (and structure) of managing memory in which a low-level mechanism is executed to signal, in a sequence of instructions generated at a higher level, that at least a portion of a contiguous area of memory is permitted to be overwritten.

3.

Patent application
Method and structure for algorithmic overlap in parallel processing for exploitation when load imbalance is dynamic and predictable [Under examination, published]

Publication No.: US20060167836A1

Publication date: 2006-07-27

Application No.: US11039907

Filing date: 2005-01-24

Applicants: Siddhartha Chatterjee, John Gunnels, Fred Gustavson

Inventors: Siddhartha Chatterjee, John Gunnels, Fred Gustavson

IPC classification: G06F7/00, G06F17/30

CPC classification: G06F9/5083

Abstract: A method (and structure) of processing, on a computer having a plurality of processors, includes executing a set of tasks that includes a computational bottleneck in a repetitive procedure on a first subset of the plurality of processors. A set of non-bottleneck tasks of the repetitive procedure is executed on a second subset of the plurality of processors. In a steady-state processing of the repetitive procedure, the first subset of processors and the second subset of processors are together processing the repetitive procedure in a manner such that the first subset of processors and the second subset of processors are each operating substantially full-time.

4.

Patent application
Method and structure for producing high performance linear algebra routines using level 3 prefetching for kernel routines [Under examination, published]

Publication No.: US20050071405A1

Publication date: 2005-03-31

Application No.: US10671889

Filing date: 2003-09-29

Applicants: Fred Gustavson, John Gunnels

Inventors: Fred Gustavson, John Gunnels

IPC classification: G06F7/38, G06F7/483, G06F9/38, G06F17/16

CPC classification: G06F9/383, G06F7/483, G06F9/3001, G06F17/16

Abstract: A method (and structure) for executing linear algebra subroutines includes, for an execution code controlling an operation of a floating point unit (FPU) performing a linear algebra subroutine execution, unrolling instructions to prefetch data into a cache providing data into the FPU. The unrolling causes the instructions to touch data anticipated for the linear algebra subroutine execution.

5.

Patent application
Method and structure for improving processing efficiency in parallel processing machines for rectangular and triangular matrix routines [Under examination, published]

Publication No.: US20060265445A1

Publication date: 2006-11-23

Application No.: US11133254

Filing date: 2005-05-20

Applicants: Fred Gustavson, John Gunnels

Inventors: Fred Gustavson, John Gunnels

IPC classification: G06F7/32

CPC classification: G06F17/16

Abstract: A computerized method (and structure) of linear algebra processing on a computer having a plurality of processors for parallel processing, includes, for a matrix having elements originally stored in a memory in a rectangular matrix AR format or, especially, in one of a triangular matrix AT format and a symmetric matrix AS format, distributing data of the rectangular AR or triangular or symmetric matrix (AT, AS) from the memory to the plurality of processors in such a manner that all submatrices of AR, or substantially only essential data of the triangular matrix AT or symmetric matrix AS, are represented in the distributed memories of the processors as contiguous atomic units for the processing. The linear algebra processing done on the processors with distributed memories requires that submatrices be sent and received as contiguous atomic units based on the prescribed block cyclic data layouts of the linear algebra processing. This computerized method (and structure) defines all of its submatrices as these contiguous atomic units, thereby avoiding extra data preparation before each send and after each receive. The essential data of AT or AS is that data of the triangular or symmetric matrix that is minimally necessary for maintaining the full information content of the triangular AT or symmetric matrix AS.
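A block cyclic layout with contiguous blocks, as described in this abstract, might look roughly like the sketch below. It assumes a row-major list-of-lists matrix, square nb-by-nb blocks, and a p-by-q process grid in which block (I, J) goes to process (I mod p, J mod q); the function name and the `lower_only` flag are hypothetical and only illustrate keeping the essential blocks of a triangular or symmetric matrix.

```python
def distribute_block_cyclic(a, nb, p, q, lower_only=False):
    """Deal nb-by-nb blocks of `a` to a p-by-q process grid in block cyclic order.

    Each block is copied into its own flat list, so it is one contiguous unit
    that could be sent or received without further packing.
    """
    n_rows, n_cols = len(a), len(a[0])
    local = {(pr, pc): {} for pr in range(p) for pc in range(q)}
    for bi in range(0, n_rows, nb):
        for bj in range(0, n_cols, nb):
            brow, bcol = bi // nb, bj // nb
            if lower_only and bcol > brow:
                continue  # skip blocks strictly above the block diagonal
            block = [a[i][j]
                     for i in range(bi, min(bi + nb, n_rows))
                     for j in range(bj, min(bj + nb, n_cols))]
            local[(brow % p, bcol % q)][(brow, bcol)] = block
    return local

# 6 x 6 matrix, 2 x 2 blocks, 2 x 2 process grid.
a = [[10 * i + j for j in range(6)] for i in range(6)]
full = distribute_block_cyclic(a, nb=2, p=2, q=2)
lower = distribute_block_cyclic(a, nb=2, p=2, q=2, lower_only=True)
print(sorted(full[(0, 0)]))   # block coordinates held by process (0, 0)
print(sorted(lower[(0, 0)]))  # the same, keeping only the lower block triangle
```

In the `lower_only` case the diagonal blocks are still stored in full, so this keeps slightly more than the minimal triangular data; a closer approximation would also pack the diagonal blocks.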

6.

Patent application
Method and structure for a generalized cache-register file interface with data restructuring methods for multiple cache levels and hardware pre-fetching [Under examination, published]

Publication No.: US20060161612A1

Publication date: 2006-07-20

Application No.: US11035902

Filing date: 2005-01-14

Applicants: Fred Gustavson, John Gunnels, James Sexton

Inventors: Fred Gustavson, John Gunnels, James Sexton

IPC classification: G06F7/38

CPC classification: G06F17/16, G06F12/0862, G06F12/0897

Abstract: A method and structure for executing a matrix algorithm requiring an order of N^3 operations including data reformatting operations, where N is a dimension of an operand of said algorithm on a computer, includes initially reformatting data for at least one matrix used in the matrix algorithm into a data structure stored in a memory, such that stride one data is presented for all submatrices used as operands involved in the matrix algorithm in a format required by the matrix algorithm with substantially no further data re-formatting beyond an order N data re-formatting required for executing the algorithm.

7.

Patent application
System and method for algorithmic cache-bypass [Under examination, published]

Publication No.: US20060179240A1

Publication date: 2006-08-10

Application No.: US11052877

Filing date: 2005-02-09

Applicants: Siddhartha Chatterjee, John Gunnels, Fred Gustavson

Inventors: Siddhartha Chatterjee, John Gunnels, Fred Gustavson

IPC classification: G06F13/28

CPC classification: G06F12/0897, G06F12/0888

Abstract: A system for (and method of) algorithmic cache-bypass which includes acting on at least one level of cache to at least one of bypass the at least one level of cache, stream through the at least one level of cache, force utilization of at least one other level of cache, bypass at least one level of cache, bypass all levels of cache, force utilization of a main memory, and force utilization of an out-of-core memory.

8.

Patent application
Method and structure for producing high performance linear algebra routines using a selectable one of six possible level 3 L1 kernel routines [Lapsed]

Publication No.: US20050071411A1

Publication date: 2005-03-31

Application No.: US10671935

Filing date: 2003-09-29

Applicants: Fred Gustavson, John Gunnels

Inventors: Fred Gustavson, John Gunnels

IPC classification: G06F7/38, G06F17/16

CPC classification: G06F17/16, G06F9/30014

Abstract: A method (and structure) for executing linear algebra subroutines on a computer, including selecting a matrix subroutine from among a plurality of matrix subroutines that performs the matrix multiplication.
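Selecting one matrix-multiply routine from several interchangeable ones can be sketched as a lookup table of kernels. The abstract does not say how the six kernels differ; the sketch below assumes, purely for illustration, that they are the six orderings of the i, j, k loops of C += A * B. The names `make_kernel`, `KERNELS`, and `select_kernel` are hypothetical.

```python
from itertools import permutations, product

def make_kernel(order):
    """Build a C += A * B kernel whose loops nest in the given order, e.g. ('i', 'k', 'j')."""
    def kernel(a, b, c):
        extent = {"i": len(a), "j": len(b[0]), "k": len(b)}
        # product() iterates its last range fastest, so order[0] is the outermost loop.
        for triple in product(*(range(extent[x]) for x in order)):
            ix = dict(zip(order, triple))
            c[ix["i"]][ix["j"]] += a[ix["i"]][ix["k"]] * b[ix["k"]][ix["j"]]
        return c
    return kernel

# Six kernels, keyed by loop order: 'ijk', 'ikj', 'jik', 'jki', 'kij', 'kji'.
KERNELS = {"".join(order): make_kernel(order) for order in permutations("ijk")}

def select_kernel(name):
    """Pick one of the six kernels by name (assumed selection criterion)."""
    return KERNELS[name]

a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]
c = select_kernel("ikj")(a, b, [[0, 0], [0, 0]])
print(c)  # [[19, 22], [43, 50]]
```

All six variants compute the same product; in a real library the choice would depend on which operand is meant to stay in cache and how the others are streamed, which this sketch does not model.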

9.

Patent application
Method and structure for producing high performance linear algebra routines using register block data format routines [Lapsed]

Publication No.: US20050071409A1

Publication date: 2005-03-31

Application No.: US10671888

Filing date: 2003-09-29

Applicants: Fred Gustavson, John Gunnels, James Sexton

Inventors: Fred Gustavson, John Gunnels, James Sexton

IPC classification: G06F12/00, G06F12/08, G06F17/16

CPC classification: G06F12/0875, G06F17/16

Abstract: A method (and structure) of executing a matrix operation, includes, for a matrix A, separating the matrix A into blocks, each block having a size p-by-q. The blocks of size p-by-q are then stored in a cache or memory in at least one of the two following ways. The elements in at least one of the blocks are stored in a format in which elements of the block occupy a location different from an original location in the block, and/or the blocks of size p-by-q are stored in a format in which at least one block occupies a position different relative to its original position in the matrix A.
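As a rough illustration of the p-by-q blocking in this abstract, the sketch below copies a row-major matrix into one flat list, block by block, and can optionally transpose the elements inside each block so that they end up in a different location than they had originally. The function name `to_blocked` and the `transpose_blocks` flag are hypothetical, and the sketch assumes p and q divide the matrix dimensions evenly.

```python
def to_blocked(a, p, q, transpose_blocks=False):
    """Return the elements of `a` (a list of rows) as one flat list, one p-by-q block at a time."""
    m, n = len(a), len(a[0])
    assert m % p == 0 and n % q == 0, "sketch assumes p divides m and q divides n"
    flat = []
    for bi in range(0, m, p):          # block row
        for bj in range(0, n, q):      # block column
            if transpose_blocks:
                # column-by-column inside the block: elements move within the block
                flat.extend(a[bi + i][bj + j] for j in range(q) for i in range(p))
            else:
                # row-by-row inside the block: only the block becomes contiguous
                flat.extend(a[bi + i][bj + j] for i in range(p) for j in range(q))
    return flat

a = [[4 * i + j for j in range(4)] for i in range(4)]
print(to_blocked(a, 2, 2))
# [0, 1, 4, 5, 2, 3, 6, 7, 8, 9, 12, 13, 10, 11, 14, 15]
print(to_blocked(a, 2, 2, transpose_blocks=True))
# [0, 4, 1, 5, 2, 6, 3, 7, 8, 12, 9, 13, 10, 14, 11, 15]
```

Changing the order of the two outer loops would store the blocks themselves in a different order, which corresponds to the second option the abstract mentions.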

10.

Patent application
Method and structure for a hybrid full-packed storage format as a single rectangular format data structure [Under examination, published]

Publication No.: US20060173947A1

Publication date: 2006-08-03

Application No.: US11045354

Filing date: 2005-01-31

Applicants: Fred Gustavson, John Gunnels

Inventors: Fred Gustavson, John Gunnels

IPC classification: G06F7/52

CPC classification: G06F17/16

Abstract: A method (and structure) of linear algebra processing, includes processing (real or complex) matrix data having elements originally stored in one of a triangular format and a symmetric matrix format in a subroutine designed to process matrix data in a full format. The processing uses a hybrid full packed data structure, which provides a rectangular space characteristic of the full format. The rectangular space is defined by a leading dimension (LD). Inside of the rectangular space are stored a plurality of entities that include all elements of the matrix data originally stored in the triangular or symmetric format.
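One way to fit a triangle into a rectangle with no wasted space is sketched below for the lower triangle of an n-by-n matrix with n even. The n(n+1)/2 triangle elements exactly fill an (n+1)-by-(n/2) rectangle: the first n/2 columns of the triangle are shifted down one row, and the trailing triangle is transposed into the corner that this leaves free. This resembles the layout usually called rectangular full packed storage; whether it matches the data structure in the application is an assumption, and the function names are hypothetical.

```python
def pack_lower_rfp(a):
    """Pack the lower triangle of an n-by-n matrix (n even) into an (n+1)-by-(n/2) rectangle."""
    n = len(a)
    assert n % 2 == 0, "this sketch handles even n only"
    k = n // 2
    r = [[None] * k for _ in range(n + 1)]
    for j in range(k):
        for i in range(j, n):
            r[i + 1][j] = a[i][j]        # columns 0..k-1 of the triangle, shifted down one row
    for i in range(k):
        for j in range(i + 1):
            r[j][i] = a[k + i][k + j]    # trailing triangle, transposed into the free corner
    return r

def unpack_lower_rfp(r):
    """Inverse of pack_lower_rfp; entries above the diagonal come back as zero."""
    n = len(r) - 1
    k = n // 2
    a = [[0] * n for _ in range(n)]
    for j in range(k):
        for i in range(j, n):
            a[i][j] = r[i + 1][j]
    for i in range(k):
        for j in range(i + 1):
            a[k + i][k + j] = r[j][i]
    return a

a = [[11, 0, 0, 0],
     [21, 22, 0, 0],
     [31, 32, 33, 0],
     [41, 42, 43, 44]]
r = pack_lower_rfp(a)
print(r)  # [[33, 43], [11, 44], [21, 22], [31, 32], [41, 42]]
```

If the rectangle were stored column by column, its leading dimension would be n + 1, which is one reading of the "leading dimension (LD)" the abstract mentions.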
