Patent search: ap:("Jesus Corbal San Adrian" OR "Roger Espasa Sans" OR "Robert C. Valentine" OR "Santiago Galan Duran" OR "Jeffrey G. Wiedemeier" OR "Sridhar Samudrala" OR "Milind Baburao Girkar" OR "Andrew Thomas Forsyth" OR "Victor W. Lee") AND inv:"Sridhar Samudrala", page 3

21.

Invention application
FUNCTIONAL UNIT FOR VECTOR INTEGER MULTIPLY ADD INSTRUCTION (legal status: in force)

Publication No.: US20120078992A1

Publication date: 2012-03-29

Application No.: US12890497

Filing date: 2010-09-24

Applicant: Jeff Wiedemeier, Sridhar Samudrala, Roger Golliver

Inventor: Jeff Wiedemeier, Sridhar Samudrala, Roger Golliver

IPC classification: G06F17/16, G06F7/496, G06F7/494, G06F7/485, G06F7/487

CPC classification: G06F7/483, G06F7/5443, G06F9/30014, G06F9/30018, G06F9/30036

Abstract: A vector functional unit implemented on a semiconductor chip to perform vector operations of dimension N is described. The vector functional unit includes N functional units. Each of the N functional units have logic circuitry to perform: a first integer multiply add instruction that presents highest ordered bits but not lowest ordered bits of a first integer multiply add calculation, and, a second integer multiply add instruction that presents lowest ordered bits but not highest ordered bits of a second integer multiply add calculation.
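The high-half and low-half behaviour described in this abstract can be illustrated with a minimal Python sketch. It is not the patented circuit: the 32-bit lane width, the unsigned interpretation, and the names `madd_full`, `vmadd_hi`, and `vmadd_lo` are all assumptions made for illustration, and the sketch simply takes the upper or lower half of the full-precision result `a * b + c` in each lane.

```python
WIDTH = 32                 # assumed lane width; the abstract does not state one
MASK = (1 << WIDTH) - 1    # selects the low WIDTH bits of a value


def madd_full(a, b, c):
    """Full-precision unsigned a*b + c for one lane (fits in 2*WIDTH bits)."""
    return (a & MASK) * (b & MASK) + (c & MASK)


def vmadd_hi(va, vb, vc):
    """Per lane, keep only the highest-ordered WIDTH bits of a*b + c."""
    return [(madd_full(a, b, c) >> WIDTH) & MASK for a, b, c in zip(va, vb, vc)]


def vmadd_lo(va, vb, vc):
    """Per lane, keep only the lowest-ordered WIDTH bits of a*b + c."""
    return [madd_full(a, b, c) & MASK for a, b, c in zip(va, vb, vc)]
```

Issuing both operations on the same operands yields the two halves of the double-width result, which is one plausible reason to pair the two instructions.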

22.

Granted invention patent
Method and apparatus for rounding floating point results in a digital processing system (legal status: in force)

Publication No.: US06366942B1

Publication date: 2002-04-02

Application No.: US09281501

Filing date: 1999-03-30

Applicant: Roy W. Badeau, William Robert Grundmann, Mark D. Matson, Sridhar Samudrala

Inventor: Roy W. Badeau, William Robert Grundmann, Mark D. Matson, Sridhar Samudrala

IPC classification: G06F7/50

CPC classification: G06F7/49947, G06F7/485

Abstract: A method and apparatus for operating on floating point numbers is provided that accepts two floating point numbers as operands in order to perform addition, a rounding adder circuit is provided which can accept the operands and a rounding increment bit at various bit positions. The circuit uses full adders at required bit positions to accommodate a bit from each operand and the rounding bit. Since the proper position in which the rounding bit should be injected into the addition may be unknown at the start, respective low and high increment bit addition circuits are provided to compute a result for both a low and a high increment rounding bit condition. The final result is selected based upon the most significant bit of the low rounding bit increment result. In this manner, the present rounding adder circuit eliminates the need to perform a no increment calculation used to select a result, as in the prior art. Through the use of full adders, the circuit not only accounts for the round increment bit, but can accept increment bits at any bit position to perform operations such as two's complement, thus further reducing the operations required to perform a desired floating point mathematical operation.
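The select-between-two-increments idea in this abstract can be sketched in Python as follows. This is a simplified model under stated assumptions, not the patented circuit: it uses unsigned significands with `guard` extra low-order bits, round-half-up only, and a hypothetical function name `round_add`. Both candidate sums are formed up front, and the carry-out of the low-increment sum chooses between them, so no separate no-increment addition is needed.

```python
def round_add(a, b, width, guard):
    """Add two (width + guard)-bit significands and round half up to `width` bits.

    Returns (significand, exponent_adjust). Assumes guard >= 1 and width >= 1.
    Simplification: the all-ones corner case is not renormalized afterwards.
    """
    total_bits = width + guard
    # Candidate 1: rounding increment injected at the position that is correct
    # when the sum does NOT carry out of the top bit.
    low = a + b + (1 << (guard - 1))
    # Candidate 2: increment injected one position higher, which is correct
    # when the sum DOES carry out and must be shifted right by one more bit.
    high = a + b + (1 << guard)
    # The carry-out of the low-increment result selects the final answer.
    if low >> total_bits:
        return high >> (guard + 1), 1
    return low >> guard, 0
```

In hardware the two candidates would be computed side by side; the sequential Python above only models the selection logic.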

23.

Granted invention patent
Apparatus and method for execution of floating point operations (legal status: lapsed)

Publication No.: US4849923A

Publication date: 1989-07-18

Application No.: US879337

Filing date: 1986-06-27

Applicant: Sridhar Samudrala, Victor Peng, Nachum M. Gavrielov

Inventor: Sridhar Samudrala, Victor Peng, Nachum M. Gavrielov

IPC classification: G06F7/57

CPC classification: G06F7/483, G06F7/49915, G06F7/49936, G06F7/49947

Abstract: In a floating point arithmetic execution unit, an additional adder unit and a selection network are added to the apparatus typically performing the arithmetic floating point function. The additional apparatus permits certain processes forming part of arithmetic operations to be executed in parallel. For selected arithmetic operations, the final result can be one of two values typically related by an intermediate shifting operation. By performing the processes in parallel and selecting the appropriate result, the execution time can be reduced when compared to the execution of the process in a serial implementation. The fundamental arithmetic operations of addition, subtraction, multiplication and division can each have the execution time decreased using the disclosed additional apparatus.

24.

Invention publication
I/O ACCELERATION IN A MULTI-NODE ARCHITECTURE (legal status: under examination, published)

Publication No.: US20240126622A1

Publication date: 2024-04-18

Application No.: US18397590

Filing date: 2023-12-27

Applicant: Anil Vasudevan, Sridhar Samudrala, Tushar S. Gohad, Nash A. Kleppan, Stefan T. Peters

Inventor: Anil Vasudevan, Sridhar Samudrala, Tushar S. Gohad, Nash A. Kleppan, Stefan T. Peters

IPC classification: G06F9/54, G06F13/16

CPC classification: G06F9/542, G06F13/1668

Abstract: A set of threads of an application are identified to be executed on a platform, where the platform comprises a multi-node architecture. A set of queues of an I/O device of the platform are reserved and associated with one of a plurality of nodes in the multi-node architecture. Data is received at the I/O device, where the I/O device is included in a particular one of the plurality of nodes. Response data is generated through execution of a thread in the set of threads using a processing core and memory of the particular node, and the response data is caused to be sent on the I/O device based on inclusion of the I/O device in the particular node.

25.

Invention application
MECHANISM FOR FACILITATING DYNAMIC AND EFFICIENT FUSION OF COMPUTING INSTRUCTIONS IN SOFTWARE PROGRAMS (legal status: in force)

Publication No.: US20150026671A1

Publication date: 2015-01-22

Application No.: US14129956

Filing date: 2013-03-27

Applicant: Marc Lupon, Raul Martinez, Enric Gibert Codina, Kyriakos A. Stavrou, Grigorios Magklis, Sridhar Samudrala

Inventor: Marc Lupon, Raul Martinez, Enric Gibert Codina, Kyriakos A. Stavrou, Grigorios Magklis, Sridhar Samudrala

IPC classification: G06F9/45

CPC classification: G06F8/443, G06F8/4432, G06F8/4434, G06F8/4441, Y02D10/41

Abstract: A mechanism is described for facilitating dynamic and efficient fusion of computing instructions according to one embodiment. A method of embodiments, as described herein, includes monitoring a software program for a program region having fusion candidate instructions for a fusion operation at a computing system; evaluating whether the macro operation of the candidate instructions is valuable to the software program; and performing the fusion operation if it is evaluated to be valuable.

26.

Invention application
INSTRUCTION AND LOGIC TO PROVIDE VECTOR BLEND AND PERMUTE FUNCTIONALITY (legal status: under examination, published)

Publication No.: US20140372727A1

Publication date: 2014-12-18

Application No.: US13977734

Filing date: 2011-12-23

Applicant: Robert Valentine, Bret L. Toll, Jesus Corbal, Jeff G. Wiedemeier, Sridhar Samudrala

Inventor: Robert Valentine, Bret L. Toll, Jesus Corbal, Jeff G. Wiedemeier, Sridhar Samudrala

IPC classification: G06F9/30, G06F9/38

CPC classification: G06F9/30036, G06F9/3001, G06F9/30018, G06F9/30032, G06F9/3887

Abstract: Vector blend and permute functionality are provided, responsive to instructions specifying: a destination vector register comprising fields to store vector elements, a first vector register, a vector element size, a second vector register, and a third operand. Indices are read from fields in the second register. Each index has a first selector portion and a second selector portion. Corresponding unmasked vector elements are stored to fields of the destination register, wherein each vector element, responsive to the respective first selector portion having a first value, is copied to an intermediate vector from a corresponding data field of the first register, and responsive to the respective first selector portion having a second value, is copied to the intermediate vector from a corresponding data field of the third operand. Then unmasked data fields of the destination are replaced by data fields in the intermediate vector indexed by the corresponding second selector portions.
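The two-step blend-then-permute data flow in this abstract can be sketched in Python as below. The index encoding is an assumption made for illustration (the bit just above the position bits acts as the first selector portion, and the low bits act as the second selector portion), as are the function name `blend_permute` and the convention that masked-off fields keep their previous destination value. The abstract itself does not fix any of these details.

```python
def blend_permute(dest, src1, indices, src3, mask=None):
    """Sketch of a blend followed by a permute over vectors of equal length.

    Assumed index layout for a vector of n elements (n a power of two):
      bits [0, log2(n))  -> second selector portion (permute position)
      bit  log2(n)       -> first selector portion (0: src1, 1: src3)
    """
    n = len(dest)
    assert n > 0 and n & (n - 1) == 0, "sketch assumes a power-of-two length"
    pos_bits = n.bit_length() - 1
    if mask is None:
        mask = [True] * n  # no masking: every field is written

    # Step 1 (blend): element i comes from src1[i] or src3[i], chosen by the
    # first selector portion of index i.
    intermediate = [
        src3[i] if (indices[i] >> pos_bits) & 1 else src1[i]
        for i in range(n)
    ]

    # Step 2 (permute): unmasked destination field i receives the intermediate
    # element addressed by the second selector portion of index i.
    return [
        intermediate[indices[i] & (n - 1)] if mask[i] else dest[i]
        for i in range(n)
    ]
```

For example, with `src1 = [10, 11, 12, 13]`, `src3 = [20, 21, 22, 23]`, and `indices = [7, 2, 5, 0]`, the blend step produces `[20, 11, 22, 13]` and the permute step then returns `[13, 22, 11, 20]`.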

27.

Granted invention patent
Functional unit for vector integer multiply add instruction (legal status: in force)

Publication No.: US08667042B2

Publication date: 2014-03-04

Application No.: US12890497

Filing date: 2010-09-24

Applicant: Jeff Wiedemeier, Sridhar Samudrala, Roger Golliver

Inventor: Jeff Wiedemeier, Sridhar Samudrala, Roger Golliver

IPC classification: G06F7/38, G06F15/00, G06F15/76

CPC classification: G06F7/483, G06F7/5443, G06F9/30014, G06F9/30018, G06F9/30036

Abstract: A vector functional unit implemented on a semiconductor chip to perform vector operations of dimension N is described. The vector functional unit includes N functional units. Each of the N functional units have logic circuitry to perform: a first integer multiply add instruction that presents highest ordered bits but not lowest ordered bits of a first integer multiply add calculation, and, a second integer multiply add instruction that presents lowest ordered bits but not highest ordered bits of a second integer multiply add calculation.

28.

Invention application
METHOD AND APPARATUS FOR CONTROLLING A MXCSR (legal status: under examination, published)

Publication No.: US20130326199A1

Publication date: 2013-12-05

Application No.: US13995416

Filing date: 2011-12-29

Applicant: Grigorios Magklis, Josep M. Codina, Craig B. Zilles, Michael Neilly, Sridhar Samudrala, Alejandro Martinez Vicente, Polychronis Xekalakis, F. Jesus Sanchez, Marc Lupon, Georgios Tournavitis, Enric Gibert Codina, Crispin Gomez Requena, Antonio Gonzalez, Mirem Hyuseinova, Christos E. Kotselidis, Fernando Latorre, Pedro Lopez, Carlos Madriles Gimeno, Pedro Marcuello, Raul Martinez, Daniel Ortega, Demos Pavlou, Kyriakos A. Stavrou

Inventor: Grigorios Magklis, Josep M. Codina, Craig B. Zilles, Michael Neilly, Sridhar Samudrala, Alejandro Martinez Vicente, Polychronis Xekalakis, F. Jesus Sanchez, Marc Lupon, Georgios Tournavitis, Enric Gibert Codina, Crispin Gomez Requena, Antonio Gonzalez, Mirem Hyuseinova, Christos E. Kotselidis, Fernando Latorre, Pedro Lopez, Carlos Madriles Gimeno, Pedro Marcuello, Raul Martinez, Daniel Ortega, Demos Pavlou, Kyriakos A. Stavrou

IPC classification: G06F9/30

CPC classification: G06F9/3001, G06F9/30032, G06F9/30087, G06F9/30094, G06F9/30101, G06F9/3842

Abstract: Disclosed is an apparatus and method generally related to controlling a multimedia extension control and status register (MXCSR). A processor core may include a floating point unit (FPU) to perform arithmetic functions; and a multimedia extension control register (MXCR) to provide control bits to the FPU. Further an optimizer may be used to select a speculative multimedia extension status register (SPEC_MXSR) from a plurality of SPEC_MXSRs to update a multimedia extension status register (MXSR) based upon an instruction.

29.

Granted invention patent
Method and apparatus for accumulating partial quotients in a digital processor (legal status: in force)

Publication No.: US06732135B1

Publication date: 2004-05-04

Application No.: US09494593

Filing date: 2000-01-31

Applicant: Sridhar Samudrala, John D. Clouser, William R. Grundmann

Inventor: Sridhar Samudrala, John D. Clouser, William R. Grundmann

IPC classification: G06F7/52

CPC classification: G06F7/535, G06F2207/5355

Abstract: In a digital processor performing division, quotient accumulation apparatus is formed of a set of muxes and a single carry save adder. Partial quotients are accumulated in carry-save form with proper sign extension. Delay of partial quotient bit fragments from one iteration to a following iteration enables the apparatus to limit use to one carry save adder. By enlarging minimal logic, the quotient accumulation apparatus operates at a rate fast enough to support the rate of fast dividers.

30.

Granted invention patent
Leading one/zero bit detector for floating point operation (legal status: lapsed)

Publication No.: US5317527A

Publication date: 1994-05-31

Application No.: US16054

Filing date: 1993-02-10

Applicant: Sharon M. Britton, Randy Allmon, Sridhar Samudrala

Inventor: Sharon M. Britton, Randy Allmon, Sridhar Samudrala

IPC classification: G06F7/74, G06F7/00, G06F7/38

CPC classification: G06F7/74

Abstract: A circuit is provided for using the input operands of a floating point addition or subtraction operation to detect the leading one or zero bit position in parallel with the arithmetic operation. This allows the alignment to be performed on the available result in the next cycle of the floating point operation and results in a significant performance advantage. The leading 1/0 detection is decoupled from the adder that is computing the result in parallel, eliminating the need for special circuitry to compute a carry-dependent adjustment signal. The single-bit fraction overflow that can result from leading 1/0 misprediction is corrected with existing circuitry during a later stage of computation.
