Method and apparatus to process 4-operand SIMD integer multiply-accumulate instruction
    44.
    Granted invention patent (legal status: in force)

    Publication number: US09292297B2

    Publication date: 2016-03-22

    Application number: US13617021

    Filing date: 2012-09-14

    IPC classification: G06F9/00 G06F9/38 G06F9/30

    Abstract: According to one embodiment, a processor includes an instruction decoder to receive an instruction to process a multiply-accumulate operation, the instruction having a first operand, a second operand, a third operand, and a fourth operand. The first operand is to specify a first storage location to store an accumulated value; the second operand is to specify a second storage location to store a first value and a second value; and the third operand is to specify a third storage location to store a third value. The processor further includes an execution unit coupled to the instruction decoder to perform the multiply-accumulate operation to multiply the first value with the second value to generate a multiply result and to accumulate the multiply result and at least a portion of a third value to an accumulated value based on the fourth operand.
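    Illustrative sketch (not from the patent): the abstract describes the operand roles but not the data widths or the encoding of the fourth operand, so the scalar C model below fills those in with assumptions. It assumes a 64-bit accumulator, a second operand that packs two 32-bit values, and a fourth operand whose low bit selects which 32-bit half of the third value is added. The name `mac4` and the parameter names are hypothetical.

```c
#include <stdint.h>

/*
 * Hypothetical scalar model of one lane of a 4-operand
 * multiply-accumulate. All widths and the meaning of `sel`
 * are assumptions made for illustration only.
 *
 *   acc  : operand 1, the accumulated value
 *   src2 : operand 2, packs the first value (low 32 bits)
 *          and the second value (high 32 bits)
 *   src3 : operand 3, holds the third value
 *   sel  : operand 4, low bit picks the portion of src3 to add
 */
static uint64_t mac4(uint64_t acc, uint64_t src2, uint64_t src3, unsigned sel)
{
    uint32_t first  = (uint32_t)(src2 & 0xFFFFFFFFu);
    uint32_t second = (uint32_t)(src2 >> 32);

    /* Widen before multiplying so the 32x32 product is not truncated. */
    uint64_t product = (uint64_t)first * second;

    /* Select the low or high 32-bit portion of the third value. */
    uint64_t portion = (sel & 1u) ? (src3 >> 32) : (src3 & 0xFFFFFFFFu);

    /* Unsigned addition wraps modulo 2^64. */
    return acc + product + portion;
}
```

    With `acc = 10`, `src2` packing 2 and 3, and `src3` packing 5 (low half) and 7 (high half), the model returns 21 for `sel = 0` and 23 for `sel = 1`. A SIMD version would apply the same step independently to each lane of a vector register.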


    Enhancing performance by instruction interleaving and/or concurrent processing of multiple buffers
    46.
    Granted invention patent (legal status: in force)

    Publication number: US08930681B2

    Publication date: 2015-01-06

    Application number: US12963298

    Filing date: 2010-12-08

    IPC classification: G06F9/38 G06F9/30 G06F9/48

    Abstract: An embodiment may include circuitry to execute, at least in part, a first list of instructions and/or to concurrently process, at least in part, first and second buffers. The execution of the first list of instructions may result, at least in part, from invocation of a first function call. The first list of instructions may include at least one portion of a second list of instructions interleaved, at least in part, with at least one other portion of a third list of instructions. The portions may be concurrently carried out, at least in part, by one or more sets of execution units of the circuitry. The second and third lists of instructions may implement, at least in part, respective algorithms that are amenable to being invoked by separate respective function calls. The concurrent processing may involve, at least in part, complementary algorithms.
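    Illustrative sketch (not from the patent): the abstract does not name the algorithms being interleaved, so the C example below picks two simple stand-ins, an additive checksum and an in-place XOR mask, purely as assumptions. `checksum` and `xor_mask` are the two algorithms as they would be invoked by separate function calls; `checksum_and_xor` is a hypothetical single function whose loop body interleaves one step of each, so that a superscalar core has independent work from both buffers available in the same iteration.

```c
#include <stddef.h>
#include <stdint.h>

/* Algorithm A (assumed for illustration): additive checksum of a buffer. */
static uint32_t checksum(const uint8_t *buf, size_t n)
{
    uint32_t sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += buf[i];
    return sum;
}

/* Algorithm B (assumed for illustration): XOR every byte in place with a key. */
static void xor_mask(uint8_t *buf, size_t n, uint8_t key)
{
    for (size_t i = 0; i < n; i++)
        buf[i] ^= key;
}

/*
 * Hypothetical interleaved version: one call processes both buffers.
 * Each iteration of the main loop carries one step of algorithm A on
 * buffer `a` and one step of algorithm B on buffer `b`. The two steps
 * have no data dependence on each other, provided the buffers do not
 * overlap.
 */
static uint32_t checksum_and_xor(const uint8_t *a, size_t na,
                                 uint8_t *b, size_t nb, uint8_t key)
{
    uint32_t sum = 0;
    size_t common = na < nb ? na : nb;
    size_t i;

    for (i = 0; i < common; i++) {
        sum += a[i];   /* step of algorithm A */
        b[i] ^= key;   /* step of algorithm B */
    }

    /* Finish whichever buffer is longer. */
    for (size_t j = i; j < na; j++)
        sum += a[j];
    for (size_t j = i; j < nb; j++)
        b[j] ^= key;

    return sum;
}
```

    The interleaved function should produce the same checksum and the same masked bytes as calling `checksum` and `xor_mask` one after the other. Whether it actually runs faster depends on the compiler and the microarchitecture, which this sketch does not measure.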
