Patent search: ap:("Chandra M. R. Thimmannagari" OR "Sorin Iacobovici" OR "Rabin A. Sugumar" OR "Robert Nuckolls") AND inv:"Rabin A. Sugumar" (page 1)

1.

Granted invention patent
Method and a system for using same set of registers to handle both single and double precision floating point instructions in an instruction stream (status: in force)

Publication No.: US07191316B2

Publication date: 2007-03-13

Application No.: US10353662

Filing date: 2003-01-29

Applicants: Rabin A. Sugumar, Sorin Iacobovici, Robert Nuckolls, Chandra M. R. Thimmannagari

Inventors: Rabin A. Sugumar, Sorin Iacobovici, Robert Nuckolls, Chandra M. R. Thimmannagari

IPC classification: G06F9/30, G06F9/40, G06F15/00

CPC classification: G06F9/3836, G06F9/30014, G06F9/30101, G06F9/384, G06F9/3857

Abstract: A system for handling a plurality of single precision floating point instructions and a plurality of double precision floating point instructions that both index a same set of registers is provided. The system comprises a decode unit arranged to decode, stall, and forward at least one of the plurality of single precision and at least one of the plurality of double precision floating point instructions in a fetch group. The decode unit includes a first counter arranged to increment for each of the plurality of single precision floating point instructions forwarded down a pipeline; a second counter arranged to increment for each of the plurality of double precision floating point instructions forwarded down the pipeline; a first mask register and a second mask register. The first mask register is updated by each of the single precision floating point instructions forwarded and the second mask register is updated by each of the double precision floating point instructions forwarded.

2.

Granted invention patent
Register window flattening logic for dependency checking among instructions (status: in force)

Publication No.: US07080237B2

Publication date: 2006-07-18

Application No.: US10155391

Filing date: 2002-05-24

Applicants: Chandra M. R. Thimmannagari, Sorin Iacobovici, Rabin A. Sugumar, Robert Nuckolls

Inventors: Chandra M. R. Thimmannagari, Sorin Iacobovici, Rabin A. Sugumar, Robert Nuckolls

IPC classification: G06F9/30

CPC classification: G06F9/30127, G06F9/384

Abstract: A technique for flattening architectural register windows into flattened space depending on a current window pointer to a register window is provided. The technique involves converting an n-bit value of a particular register in a register window to an x-bit value dependent on the current window pointer, where x is greater than n, and where the x-bit value is used for register dependency checking among a plurality of instructions.
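
The conversion described in this abstract can be illustrated with a minimal sketch. It assumes a SPARC-style layout that the abstract does not specify: 5-bit register numbers (n = 5), 8 globals, 8 windows, and each window's "in" registers overlapping the "out" registers of the next window, giving an 8-bit flat index (x = 8). The constants and the names `flatten` and `depends` are hypothetical.

```python
NUM_WINDOWS = 8        # assumed number of register windows
NUM_GLOBALS = 8        # assumed: r0-r7 are shared by every window
REGS_PER_WINDOW = 16   # assumed: each window contributes 8 outs + 8 locals


def flatten(reg: int, cwp: int) -> int:
    """Map a 5-bit windowed register number to an 8-bit flat index."""
    if not 0 <= reg < 32:
        raise ValueError("register number must fit in 5 bits")
    if not 0 <= cwp < NUM_WINDOWS:
        raise ValueError("current window pointer out of range")
    if reg < NUM_GLOBALS:
        return reg  # globals do not depend on the window pointer
    # Outs (8-15), locals (16-23) and ins (24-31) are offset by the window;
    # the ins of window w land on the outs of window w + 1 (with wraparound).
    offset = cwp * REGS_PER_WINDOW + (reg - NUM_GLOBALS)
    return NUM_GLOBALS + offset % (NUM_WINDOWS * REGS_PER_WINDOW)


def depends(writer: tuple, reader: tuple) -> bool:
    """True if the reader's source and the writer's destination are the
    same flat register; each argument is a (reg, cwp) pair."""
    return flatten(*writer) == flatten(*reader)
```

With flat indices, two instructions issued under different window pointers can be compared directly: under these assumptions, a write to register 8 in window 1 and a read of register 24 in window 0 resolve to the same flat register.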

3.

Granted invention patent
Method and apparatus for processing a complex instruction for execution and retirement (status: in force)

Publication No.: US07124284B2

Publication date: 2006-10-17

Application No.: US10337056

Filing date: 2003-01-06

Applicants: Rabin A. Sugumar, Sorin Iacobovici, Chandra M. R. Thimmannagari

Inventors: Rabin A. Sugumar, Sorin Iacobovici, Chandra M. R. Thimmannagari

IPC classification: G06F9/30, G06F9/40, G06F15/00

CPC classification: G06F9/3017, G06F9/3013, G06F9/3836, G06F9/384, G06F9/3851, G06F9/3857

Abstract: A method and apparatus to determine readiness of a complex instruction for retirement includes decoding a complex instruction into a plurality of helper instructions; executing the plurality of helper instructions using an execution unit; indicating the plurality of helper instructions that are alive using a live instruction register; and maintaining a complex instruction identification for the complex instruction using a complex instruction identification register.
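
One way to picture the bookkeeping in this abstract is the sketch below. It assumes the live instruction register holds one bit per helper instruction and that a bit is cleared when its helper finishes executing; the abstract states neither detail. The class and method names are hypothetical.

```python
class ComplexInstructionTracker:
    """Tracks the helper instructions of one in-flight complex instruction."""

    def __init__(self):
        self.live = 0            # live instruction register: one bit per helper (assumed)
        self.complex_id = None   # complex instruction identification register

    def decode(self, complex_id: int, num_helpers: int) -> None:
        """Record the complex instruction and mark all of its helpers alive."""
        if num_helpers <= 0:
            raise ValueError("a complex instruction needs at least one helper")
        self.complex_id = complex_id
        self.live = (1 << num_helpers) - 1

    def helper_completed(self, index: int) -> None:
        """Clear the live bit of the helper that finished executing."""
        self.live &= ~(1 << index)

    def ready_to_retire(self) -> bool:
        """The complex instruction may retire once no helper is alive."""
        return self.complex_id is not None and self.live == 0
```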

4.

Granted invention patent
Method for handling condition code modifiers in an out-of-order multi-issue multi-stranded processor (status: in force)

Publication No.: US07065635B1

Publication date: 2006-06-20

Application No.: US10738576

Filing date: 2003-12-17

Applicants: Rabin A. Sugumar, Sorin Iacobovici, Chandra M. R. Thimmannagari

Inventors: Rabin A. Sugumar, Sorin Iacobovici, Chandra M. R. Thimmannagari

IPC classification: G06F9/38

CPC classification: G06F9/3842, G06F9/30094, G06F9/3013, G06F9/384, G06F9/3851

Abstract: A technique for handling a condition code modifying instruction in an out-of-order multi-stranded processor involves providing a condition code architectural register file for each strand, providing a condition code working register file, and assigning condition code architectural register file identification information (CARF_ID) and condition code working register file identification information (CWRF_ID) to the condition code modifying instruction. CARF_ID is used to index a location in a condition code rename table to which the CWRF_ID is stored. Thereafter, upon an exception-free execution of the condition code modifying instruction, a result of the execution is copied from the condition code working register file to the condition code architectural register file dependent on CARF_ID, CWRF_ID, register type information, and strand identification information.
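
The rename-then-commit flow in this abstract can be sketched roughly as follows. Several details are assumptions rather than anything the abstract states: the rename table is kept per strand, working registers come from a simple free list, and register type information is left out entirely. All class and method names are hypothetical.

```python
class ConditionCodeRenamer:
    """Toy model of per-strand architectural files plus a shared working file."""

    def __init__(self, num_strands: int, carf_size: int, cwrf_size: int):
        self.carf = [[0] * carf_size for _ in range(num_strands)]       # one CARF per strand
        self.cwrf = [0] * cwrf_size                                     # shared working file
        self.rename = [[None] * carf_size for _ in range(num_strands)]  # CARF_ID -> CWRF_ID
        self.free = list(range(cwrf_size))                              # assumed free list

    def issue(self, strand: int, carf_id: int) -> int:
        """Assign a CWRF_ID and store it in the rename table at CARF_ID."""
        cwrf_id = self.free.pop(0)
        self.rename[strand][carf_id] = cwrf_id
        return cwrf_id

    def execute(self, cwrf_id: int, value: int) -> None:
        """Write the speculative condition code result to the working file."""
        self.cwrf[cwrf_id] = value

    def retire(self, strand: int, carf_id: int, cwrf_id: int) -> None:
        """After exception-free execution, copy the result to the strand's CARF."""
        self.carf[strand][carf_id] = self.cwrf[cwrf_id]
        if self.rename[strand][carf_id] == cwrf_id:
            self.rename[strand][carf_id] = None  # no newer rename is pending
        self.free.append(cwrf_id)
```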

5.

Granted invention patent
Branch prediction structure with branch direction entries that share branch prediction qualifier entries (status: in force)

Publication No.: US07380110B1

Publication date: 2008-05-27

Application No.: US10660169

Filing date: 2003-09-11

Applicants: Robert D. Nuckolls, Rabin A. Sugumar, Chandra M. R. Thimmannagari

Inventors: Robert D. Nuckolls, Rabin A. Sugumar, Chandra M. R. Thimmannagari

IPC classification: G06F9/40, G06F9/44

CPC classification: G06F9/3848

Abstract: An efficient branch prediction structure is described that bifurcates a branch prediction structure into at least two portions where information stored in the second portion is aliased amongst multiple entries of the first portion. In this way, overall storage (and layout area) can be reduced and scaling with a branch prediction structure that includes a (2N)K×1 branch direction entries and a (N/2)K×1 branch prediction qualifier entries is less dramatic than conventional techniques. An efficient branch prediction structure includes entries for branch direction indications and entries for branch prediction qualifier indications. The branch direction indication entries are more numerous than the branch prediction qualifier entries. An entry from the branch direction entries is selected based at least in part on a corresponding instruction instance identifier and an entry from the branch prediction qualifier entries is selected based at least in part on least significant bits of the instruction instance identifier.
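
The split structure in this abstract can be sketched as two tables of different sizes. The sketch assumes the qualifier is a single strong/weak confidence bit and uses a two-bit-counter-like update rule; the abstract defines neither, and the 4:1 size ratio is taken only from the (2N)K versus (N/2)K figures it quotes. The class name and table sizes are hypothetical.

```python
class SplitBranchPredictor:
    """Direction bits indexed by identifier; fewer qualifier bits shared among them."""

    def __init__(self, direction_entries: int = 16, qualifier_entries: int = 4):
        for size in (direction_entries, qualifier_entries):
            if size <= 0 or size & (size - 1):
                raise ValueError("table sizes must be powers of two")
        if qualifier_entries >= direction_entries:
            raise ValueError("direction entries must outnumber qualifier entries")
        self.direction = [False] * direction_entries   # True = predict taken
        self.qualifier = [False] * qualifier_entries   # assumed meaning: True = strong

    def _indices(self, ident: int):
        # The qualifier index uses fewer low-order bits, so several
        # direction entries alias onto one qualifier entry.
        return ident & (len(self.direction) - 1), ident & (len(self.qualifier) - 1)

    def predict(self, ident: int) -> bool:
        d, _ = self._indices(ident)
        return self.direction[d]

    def update(self, ident: int, taken: bool) -> None:
        d, q = self._indices(ident)
        if self.direction[d] == taken:
            self.qualifier[q] = True      # correct prediction: strengthen
        elif self.qualifier[q]:
            self.qualifier[q] = False     # wrong but strong: weaken first
        else:
            self.direction[d] = taken     # wrong and weak: flip the direction
```

With the default sizes, identifiers 1 and 5 use separate direction entries but the same qualifier entry, so training one branch changes how quickly the other flips.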

6.

Invention patent application
OFFLOADING OPERATIONS FOR MAINTAINING DATA COHERENCE ACROSS A PLURALITY OF NODES (status: under examination, published)

Publication No.: US20080065835A1

Publication date: 2008-03-13

Application No.: US11530799

Filing date: 2006-09-11

Applicants: Sorin Iacobovici, Rabin A. Sugumar

Inventors: Sorin Iacobovici, Rabin A. Sugumar

IPC classification: G06F13/00

CPC classification: G06F12/0817, G06F12/0866, H04L69/16, H04L69/161

Abstract: Offloading data coherence operations from a primary processing unit(s) executing instantiated code responsible for data coherence in a shared-cache cluster to a data coherence offload engine reduces resource consumption and allows for efficient sharing of data in accordance with the data coherence protocol. Some of the data coherence operations, such as consulting and maintaining a directory, generating messages, and writing a data unit can be performed by a data coherence offload engine. The data coherence offload engine indicates availability of the data unit in the memory to the appropriate instantiated code. Hence, the instantiated code (the corresponding primary processing unit) is no longer burdened with some of the work load of data coherence operations. Migration of tasks from a primary processing unit(s) to data coherence offload engines allows for efficient retrieval and writing of a requested data unit.

7.

Invention patent application
METHOD AND SYSTEM FOR OFFLOADING COMPUTATION FLEXIBLY TO A COMMUNICATION ADAPTER (status: in force)

Publication No.: US20130007181A1

Publication date: 2013-01-03

Application No.: US13173473

Filing date: 2011-06-30

Applicants: Rabin A. Sugumar, David Brower

Inventors: Rabin A. Sugumar, David Brower

IPC classification: G06F15/167

CPC classification: G06F9/5027, G06F2209/509

Abstract: A method for offloading computation flexibly to a communication adapter includes receiving a message that includes a procedure image identifier associated with a procedure image of a host application, determining a procedure image and a communication adapter processor using the procedure image identifier, and forwarding the first message to the communication adapter processor configured to execute the procedure image. The method further includes executing, on the communication adapter processor independent of a host processor, the procedure image in communication adapter memory by acquiring a host memory latch for a memory block in host memory, reading the memory block in the host memory after acquiring the host memory latch, manipulating, by executing the procedure image, the memory block in the communication adapter memory to obtain a modified memory block, committing the modified memory block to the host memory, and releasing the host memory latch.
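
The latch, read, manipulate, commit, release sequence in this abstract can be sketched in software terms. This is only an analogy under stated assumptions: a `threading.Lock` stands in for the host memory latch, a Python dictionary stands in for host memory, and procedure images are plain functions looked up by identifier. `HostMemory`, `PROCEDURES`, `handle_message`, and the message fields are all hypothetical names.

```python
import threading


class HostMemory:
    """Host memory blocks, each guarded by its own latch (modelled as a lock)."""

    def __init__(self, blocks: dict):
        self.blocks = dict(blocks)
        self.latches = {block_id: threading.Lock() for block_id in blocks}


def increment_first_byte(block: bytearray) -> None:
    """Example procedure image: bump the first byte of the block."""
    block[0] = (block[0] + 1) % 256


# Hypothetical registry mapping procedure image identifiers to procedure images.
PROCEDURES = {"increment_first_byte": increment_first_byte}


def handle_message(host: HostMemory, message: dict) -> None:
    """Run the procedure named in the message on the adapter side."""
    procedure = PROCEDURES[message["procedure_id"]]
    block_id = message["block_id"]
    latch = host.latches[block_id]
    latch.acquire()                                        # acquire the host memory latch
    try:
        adapter_copy = bytearray(host.blocks[block_id])    # read into adapter memory
        procedure(adapter_copy)                            # manipulate the local copy
        host.blocks[block_id] = bytes(adapter_copy)        # commit to host memory
    finally:
        latch.release()                                    # release the latch
```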

8.

Invention patent application
Scalable Interface for Connecting Multiple Computer Systems Which Performs Parallel MPI Header Matching (status: in force)

Publication No.: US20120243542A1

Publication date: 2012-09-27

Application No.: US13489496

Filing date: 2012-06-06

Applicants: Rabin A. Sugumar, Lars Paul Huse, Bjørn Dag Johnsen

Inventors: Rabin A. Sugumar, Lars Paul Huse, Bjørn Dag Johnsen

IPC classification: H04L12/56

CPC classification: G06F15/17337

Abstract: An interface device for a compute node in a computer cluster which performs Message Passing Interface (MPI) header matching using parallel matching units. The interface device comprises a memory that stores posted receive queues and unexpected queues. The posted receive queues store receive requests from a process executing on the compute node. The unexpected queues store headers of send requests (e.g., from other compute nodes) that do not have a matching receive request in the posted receive queues. The interface device also comprises a plurality of hardware pipelined matcher units. The matcher units perform header matching to determine if a header in the send request matches any headers in any of the plurality of posted receive queues. Matcher units perform the header matching in parallel. In other words, the plural matching units are configured to search the memory concurrently to perform header matching.
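
A software analogy for the parallel matching in this abstract is sketched below. Threads stand in for the hardware matcher units, a header is assumed to be a `(communicator, source, tag)` tuple, and `ANY` models MPI's wildcard source and tag on the receive side. Splitting one posted receive queue into contiguous segments and taking the earliest hit is an assumption made to preserve posting order; the abstract does not describe how work is divided. All names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

ANY = -1  # stands in for a wildcard source or tag in a posted receive


def header_matches(recv: tuple, send: tuple) -> bool:
    """A posted receive matches if every field is equal or a wildcard."""
    return all(r == ANY or r == s for r, s in zip(recv, send))


def scan_segment(posted: list, start: int, stop: int, send: tuple):
    """One matcher unit: return the first matching index in its segment."""
    for i in range(start, stop):
        if header_matches(posted[i], send):
            return i
    return None


def deliver(posted: list, unexpected: list, send: tuple, num_matchers: int = 4):
    """Search the posted receive queue with several matchers at once.

    Returns the matched receive header, or None after queuing the send
    header on the unexpected queue.
    """
    n = len(posted)
    step = max(1, -(-n // num_matchers))  # ceiling division
    segments = [(s, min(s + step, n)) for s in range(0, n, step)]
    with ThreadPoolExecutor(max_workers=num_matchers) as pool:
        hits = list(pool.map(lambda seg: scan_segment(posted, seg[0], seg[1], send), segments))
    hits = [h for h in hits if h is not None]
    if hits:
        return posted.pop(min(hits))  # earliest posted receive wins
    unexpected.append(send)
    return None
```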

9.

Granted invention patent
Caching data in a cluster computing system which avoids false-sharing conflicts (status: in force)

Publication No.: US08095617B2

Publication date: 2012-01-10

Application No.: US12495635

Filing date: 2009-06-30

Applicants: Bjørn Dag Johnsen, Rabin A. Sugumar, Ben Sum, Lars Paul Huse

Inventors: Bjørn Dag Johnsen, Rabin A. Sugumar, Ben Sum, Lars Paul Huse

IPC classification: G06F15/16

CPC classification: G06F12/0817, G06F12/0813

Abstract: Managing operations in a first compute node of a multi-computer system. A remote write may be received to a first address of a remote compute node. A first data structure entry may be created in a data structure, which may include the first address and status information indicating that the remote write has been received. Upon determining that the local cache of the first compute node has been updated with the remote write, the remote write may be issued to the remote compute node. Accordingly, the first data structure entry may be released upon completion of the remote write.

10.

Invention patent application
Software Aware Throttle Based Flow Control (status: in force)

Publication No.: US20100332676A1

Publication date: 2010-12-30

Application No.: US12495452

Filing date: 2009-06-30

Applicants: Rabin A. Sugumar, Bjørn Dag Johnsen, Lars Paul Huse, William M. Ortega

Inventors: Rabin A. Sugumar, Bjørn Dag Johnsen, Lars Paul Huse, William M. Ortega

IPC classification: G06F15/16

CPC classification: H04L41/065, H04L47/10, H04L47/26, H04L47/283, H04L47/30, H04L49/00, H04L49/90

Abstract: A system, comprising a compute node and coupled network adapter (NA), that supports improved data transfer request buffering and a more efficient method of determining the completion status of data transfer requests. Transfer requests received by the NA are stored in a first buffer then transmitted on a network interface. When significant network delays are detected and the first buffer is full, the NA sets a flag to stop software issuing transfer requests. Compliant software checks this flag before sending requests and does not issue further requests. A second NA buffer stores additional received transfer requests that were perhaps in-transit. When conditions improve the flag is cleared and the first buffer used again. Completion status is efficiently determined by grouping network transfer requests. The NA counts received requests and completed network requests for each group. Software determines if a group of requests is complete by reading a count value.
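
The throttle flag and per-group counters in this abstract can be modelled with a short sketch. It makes several simplifying assumptions not found in the abstract: network delay is a boolean set from outside, a request counts as completed as soon as it is transmitted, and "conditions improve" means the delay has ended and the second buffer has drained. The class, its attributes, and `software_send` are hypothetical names.

```python
from collections import deque


class NetworkAdapter:
    """Toy network adapter with a throttle flag and per-group counters."""

    def __init__(self, primary_capacity: int):
        self.primary = deque()          # first buffer
        self.overflow = deque()         # second buffer for in-transit requests
        self.capacity = primary_capacity
        self.network_delayed = False    # assumed external delay indicator
        self.throttle = False           # flag that compliant software checks
        self.received = {}              # group -> requests received
        self.completed = {}             # group -> requests completed

    def submit(self, group: str, request: str) -> None:
        """Accept a request, buffering it in the second buffer if needed."""
        self.received[group] = self.received.get(group, 0) + 1
        if self.throttle or len(self.primary) >= self.capacity:
            self.overflow.append((group, request))
        else:
            self.primary.append((group, request))
        if self.network_delayed and len(self.primary) >= self.capacity:
            self.throttle = True

    def transmit_one(self):
        """Send one request; treated as completed immediately (simplification)."""
        if not self.primary:
            return None
        group, request = self.primary.popleft()
        self.completed[group] = self.completed.get(group, 0) + 1
        if self.overflow:
            self.primary.append(self.overflow.popleft())
        if not self.overflow and not self.network_delayed:
            self.throttle = False       # conditions improved: clear the flag
        return request

    def group_complete(self, group: str) -> bool:
        """Software compares the two counters to test a group for completion."""
        return self.received.get(group, 0) == self.completed.get(group, 0)


def software_send(adapter: NetworkAdapter, group: str, request: str) -> bool:
    """Compliant software: check the flag first and hold back if it is set."""
    if adapter.throttle:
        return False
    adapter.submit(group, request)
    return True
```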
