Execution unit with inline pseudorandom number generator
    11.
    Type: Granted patent
    Status: Lapsed

    Publication number: US08255443B2

    Publication date: 2012-08-28

    Application number: US12132115

    Filing date: 2008-06-03

    IPC classification: G06F7/58

    Abstract: A circuit arrangement and method couple a hardware-based pseudorandom number generator (PRNG) to an execution unit in such a manner that pseudorandom numbers generated by the PRNG may be selectively output to the execution unit for use as an operand during the execution of instructions by the execution unit. A PRNG may be coupled to an input of an operand multiplexer that outputs to an operand input of an execution unit so that operands provided by instructions supplied to the execution unit are selectively overridden with pseudorandom numbers generated by the PRNG. Furthermore, overridden operands provided by instructions supplied to the execution unit may be used as seed values for the PRNG. In many instances, an instruction executed by an execution unit may be able to perform an arithmetic operation using both an operand specified by the instruction and a pseudorandom number generated by the PRNG during the execution of the instruction, so that the generation of the pseudorandom number and the performance of the arithmetic operation occur during a single pass of an execution unit.
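
The operand override described in this abstract can be pictured with a small software model. This is only a sketch under assumptions: the abstract does not name a PRNG algorithm, so xorshift32 is used as a stand-in, and the class names and the `use_prng` / `reseed` flags are hypothetical rather than taken from the patent.

```python
MASK32 = 0xFFFFFFFF


class XorShift32:
    """Stand-in PRNG (xorshift32); the abstract does not specify an algorithm."""

    def __init__(self, seed=1):
        self.state = (seed & MASK32) or 1  # xorshift state must be nonzero

    def seed(self, value):
        self.state = (value & MASK32) or 1

    def next(self):
        x = self.state
        x ^= (x << 13) & MASK32
        x ^= x >> 17
        x ^= (x << 5) & MASK32
        self.state = x
        return x


class ExecutionUnit:
    """Adder whose second operand passes through an operand multiplexer."""

    def __init__(self):
        self.prng = XorShift32()

    def execute(self, a, b, use_prng=False, reseed=False):
        if use_prng:
            if reseed:
                # The overridden instruction operand doubles as the seed.
                self.prng.seed(b)
            # The multiplexer selects the PRNG output instead of operand b.
            b = self.prng.next()
        # Number generation and the add happen in the same call ("single pass").
        return (a + b) & MASK32


eu = ExecutionUnit()
print(eu.execute(2, 3))                               # ordinary add
print(eu.execute(10, 1, use_prng=True, reseed=True))  # 10 + first xorshift32 output for seed 1
```

The point of the model is that the random draw and the addition happen in one `execute` call, which mirrors the single-pass behavior the abstract claims.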


    Designating operands with fewer bits in instruction code by indexing into destination register history table for each thread
    12.
    Type: Granted patent
    Status: Lapsed

    Publication number: US07814299B2

    Publication date: 2010-10-12

    Application number: US12274560

    Filing date: 2008-11-20

    IPC classification: G06F9/30

    Abstract: A circuit arrangement and method support instruction target history based register address indexing, whereby register addresses to be used by an instruction are decoded using a target history table of previous target register addresses, and an index into the target history table supplied by an index value in the instruction. An instruction may include at least one index value that identifies a previously used register address. During execution of the instruction, the index is retrieved from the instruction, and then a register address is retrieved from the target history table using the index.
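
The lookup described above can be sketched as follows. The table depth of 4, the 128-entry register file, and all names are assumptions made for illustration. A real design would keep one table per thread, as the title indicates; this sketch shows a single one.

```python
from collections import deque


class TargetHistoryTable:
    """Most-recent-first list of destination register addresses (hypothetical layout)."""

    def __init__(self, depth=4):
        self.entries = deque(maxlen=depth)

    def record(self, reg_addr):
        # The newest target goes to index 0; the oldest falls off the end.
        self.entries.appendleft(reg_addr)

    def lookup(self, index):
        return self.entries[index]


def execute_add(table, regs, dest, src_index_a, src_index_b):
    """Add two source registers named by short history indices, not full addresses."""
    a = regs[table.lookup(src_index_a)]
    b = regs[table.lookup(src_index_b)]
    regs[dest] = a + b
    table.record(dest)  # this instruction's target becomes the newest history entry


regs = [0] * 128
regs[100], regs[37] = 7, 5
table = TargetHistoryTable()
table.record(100)
table.record(37)
execute_add(table, regs, dest=64, src_index_a=0, src_index_b=1)
print(regs[64], table.lookup(0))
```

With a depth of 4, each source operand needs a 2-bit index instead of the 7 bits a full address into 128 registers would take, which is the saving the title refers to.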


    Method and Apparatus for an Area Efficient Transcendental Estimate Algorithm
    14.
    Type: Patent application
    Status: Lapsed

    Publication number: US20090070398A1

    Publication date: 2009-03-12

    Application number: US11851658

    Filing date: 2007-09-07

    IPC classification: G06F7/38

    CPC classification: G06F7/548

    Abstract: A method, computer-readable medium, and an apparatus for generating a transcendental value. The method includes receiving an input containing an input value and an opcode and determining whether the opcode corresponds to a trigonometric operation or a power-of-two operation. The method also includes calculating a fractional value and an integer value from the input value, generating the transcendental value based on the fractional value by adding at least a portion of the fractional value with at least one of a shifted fractional value produced by shifting the portion of the fractional value and a constant value, and providing the transcendental value in response to the request. In this fashion, the same circuit area may be used to carry out both trigonometric and power-of-two calculations, leading to greater circuit area savings and performance advantages while not sacrificing significant accuracy.
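
For the power-of-two case, the integer/fraction split can be illustrated with a very simple estimate. This is not the circuit claimed in the application. It is a textbook linear approximation, 2^f ≈ 1 + f for 0 ≤ f < 1, chosen because it has the same general shape (a constant added to the fraction, then a scale by the integer part). The function name is hypothetical and the trigonometric path is not modeled.

```python
import math


def exp2_estimate(x):
    """Rough estimate of 2**x built from a split, one add, and an exponent adjust."""
    i = math.floor(x)      # integer part: becomes the binary exponent
    f = x - i              # fractional part, 0 <= f < 1
    mantissa = 1.0 + f     # constant 1 added to the fraction (linear fit of 2**f)
    return math.ldexp(mantissa, i)  # multiply by 2**i


for x in (3.0, -1.5, 0.44, 10.25):
    print(x, exp2_estimate(x), 2.0 ** x)
```

The estimate is exact at integer inputs, and its relative error stays below about 6.2% in between, which is in the range expected of an "estimate" instruction rather than a full-precision one.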


    Operand Multiplexor Control Modifier Instruction in a Fine Grain Multithreaded Vector Microprocessor
    15.
    Type: Patent application
    Status: Lapsed

    Publication number: US20080122854A1

    Publication date: 2008-05-29

    Application number: US11564072

    Filing date: 2006-11-28

    IPC classification: G06T1/00

    CPC classification: G06T1/20

    Abstract: The present invention is generally related to the field of image processing, and more specifically to an instruction set for processing images. Vector processing may involve rearranging vector operands in one or more source registers prior to performing vector operations. Typically, rearranging of operands in source registers is done by issuing a plurality of permute instructions that require excessive usage of temporary registers. Furthermore, the permute instructions may cause dependencies between instructions executing in a pipeline, thereby adversely affecting performance. Embodiments of the invention provide a level of muxing between a register file and a vector unit that allow for rearrangement of vector operands in source registers prior to providing the operands to the vector unit, thereby obviating the need for permute instructions.
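
A cross product is a common case where operand lanes must be rearranged, so it makes a convenient illustration of muxing between the register file and the vector unit. In this sketch the lane selectors, function names, and 4-lane register layout are assumptions and are not taken from the application.

```python
IDENTITY = (0, 1, 2, 3)
YZXW = (1, 2, 0, 3)
ZXYW = (2, 0, 1, 3)


def mux(vec, select):
    """Operand multiplexer: route source lane select[i] to output lane i."""
    return [vec[i] for i in select]


def vmul(regs, dst, src_a, src_b, sel_a=IDENTITY, sel_b=IDENTITY):
    """Lane-wise multiply; operands are rearranged on the way to the vector unit."""
    a = mux(regs[src_a], sel_a)
    b = mux(regs[src_b], sel_b)
    regs[dst] = [x * y for x, y in zip(a, b)]


def vsub(regs, dst, src_a, src_b):
    regs[dst] = [x - y for x, y in zip(regs[src_a], regs[src_b])]


# Cross product a x b = a.yzx * b.zxy - a.zxy * b.yzx, with no permute instructions.
regs = {0: [1, 0, 0, 0], 1: [0, 1, 0, 0]}
vmul(regs, 2, 0, 1, sel_a=YZXW, sel_b=ZXYW)
vmul(regs, 3, 0, 1, sel_a=ZXYW, sel_b=YZXW)
vsub(regs, 4, 2, 3)
print(regs[4])
```

The source registers 0 and 1 are never overwritten and no temporary register holds a permuted copy, which is the saving the abstract describes.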


    Method and apparatus for implementing a multiple operand vector floating point summation to scalar function
    16.
    Type: Granted patent
    Status: Lapsed

    Publication number: US08239438B2

    Publication date: 2012-08-07

    Application number: US11840277

    Filing date: 2007-08-17

    IPC classification: G06F7/38

    Abstract: Embodiments of the invention provide methods and apparatus for executing a multiple operand instruction. Executing the multiple operand instruction comprises computing an arithmetic result of a pair of operands in each processing lane of a vector unit. The arithmetic results generated in each processing lane of the vector unit may be transferred to a dot product unit. The dot product unit may compute an arithmetic result using the arithmetic result computed by each processing lane of the vector unit to generate an arithmetic result of more than two operands.
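
The two-stage flow in this abstract, per-lane pair arithmetic followed by a reduction in the dot product unit, can be sketched for a summation. The 4-lane width, the pairwise adder tree, and the function name are assumptions for illustration only.

```python
def vector_sum_to_scalar(a, b):
    """Sum all eight operands of two 4-lane vectors into one scalar."""
    if len(a) != 4 or len(b) != 4:
        raise ValueError("this sketch assumes a 4-lane vector unit")

    # Stage 1: each processing lane adds its own pair of operands.
    lane_results = [x + y for x, y in zip(a, b)]

    # Stage 2: the adder tree of the dot product unit reduces the four
    # lane results to a single value.
    while len(lane_results) > 1:
        lane_results = [
            lane_results[i] + lane_results[i + 1]
            for i in range(0, len(lane_results), 2)
        ]
    return lane_results[0]


print(vector_sum_to_scalar([1.0, 2.0, 3.0, 4.0], [5.0, 6.0, 7.0, 8.0]))
```

Reusing the dot product unit's adder tree is what lets a single instruction consume more than two operands.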


    Processing unit incorporating instruction-based persistent vector multiplexer control
    19.
    Type: Granted patent
    Status: Lapsed

    Publication number: US07904699B2

    Publication date: 2011-03-08

    Application number: US12045221

    Filing date: 2008-03-10

    IPC classification: G06F9/00

    Abstract: Persistent vector multiplexer control is used in a vector-based execution unit to control the shuffling of words in operand vectors processed by the execution unit. In addition, a persistent swizzle instruction is defined in an instruction set for the vector-based execution unit and is used to cause state information to be persisted such that the operand vectors processed by subsequent vector instructions executed by the vector-based execution unit will be selectively shuffled using the persisted state information. As a result, when multiple vector instructions require a common custom word ordering for one or more operand vectors, a single persistent swizzle instruction may be used to select the desired custom word ordering for all of the vector instructions.
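
The difference from a per-instruction shuffle is that the word ordering is held as state. A minimal model, assuming a 4-word vector and a swizzle that applies only to the first operand (both assumptions, as are all the names), might look like this:

```python
class VectorUnit:
    """Vector unit whose first-operand word ordering persists across instructions."""

    IDENTITY = (0, 1, 2, 3)

    def __init__(self):
        self.swizzle = self.IDENTITY  # persisted multiplexer control state

    def set_swizzle(self, order):
        """Model of the persistent swizzle instruction: update the saved ordering."""
        self.swizzle = tuple(order)

    def _shuffle(self, vec):
        return [vec[i] for i in self.swizzle]

    def vadd(self, a, b):
        return [x + y for x, y in zip(self._shuffle(a), b)]

    def vmul(self, a, b):
        return [x * y for x, y in zip(self._shuffle(a), b)]


vu = VectorUnit()
vu.set_swizzle((3, 2, 1, 0))  # issued once
print(vu.vadd([1, 2, 3, 4], [10, 10, 10, 10]))
print(vu.vmul([1, 2, 3, 4], [1, 1, 1, 1]))  # same ordering, no second swizzle needed
```

A single `set_swizzle` call covers both vector instructions, which matches the abstract's point that one persistent swizzle instruction can serve every instruction sharing a custom word ordering.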


    Operand multiplexor control modifier instruction in a fine grain multithreaded vector microprocessor
    20.
    Type: Granted patent
    Status: Lapsed

    Publication number: US07868894B2

    Publication date: 2011-01-11

    Application number: US11564072

    Filing date: 2006-11-28

    IPC classification: G06T1/00

    CPC classification: G06T1/20

    Abstract: The present invention is generally related to the field of image processing, and more specifically to an instruction set for processing images. Vector processing may involve rearranging vector operands in one or more source registers prior to performing vector operations. Typically, rearranging of operands in source registers is done by issuing a plurality of permute instructions that require excessive usage of temporary registers. Furthermore, the permute instructions may cause dependencies between instructions executing in a pipeline, thereby adversely affecting performance. Embodiments of the invention provide a level of muxing between a register file and a vector unit that allow for rearrangement of vector operands in source registers prior to providing the operands to the vector unit, thereby obviating the need for permute instructions.
