Processor and method for implementing instruction support for multiplication of large operands
    41.
    Granted patent
    Status: In force

    Publication No.: US08438208B2

    Publication date: 2013-05-07

    Application No.: US12488372

    Filing date: 2009-06-19

    IPC classification: G06F7/52 G06F7/38

    CPC classification: G06F7/4876 G06F2207/382

    Abstract: A processor including instruction support for implementing large-operand multiplication may issue, for execution, programmer-selectable instructions from a defined instruction set architecture (ISA). The processor may include an instruction execution unit comprising a hardware multiplier datapath circuit, where the hardware multiplier datapath circuit is configured to multiply operands having a maximum number of bits M. In response to receiving a single instance of a large-operand multiplication instruction defined within the ISA, wherein at least one of the operands of the large-operand multiplication instruction includes more than the maximum number of bits M, the instruction execution unit is configured to multiply operands of the large-operand multiplication instruction within the hardware multiplier datapath circuit to determine a result of the large-operand multiplication instruction without execution of programmer-selected instructions within the ISA other than the large-operand multiplication instruction.
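As a rough software illustration of the idea in this abstract, the sketch below multiplies operands wider than M bits using only M-bit by M-bit products, which is the kind of sequencing a single large-operand instruction would hide from the programmer. The limb width `M = 64`, the function names, and the schoolbook ordering are assumptions made for illustration; the abstract does not specify any of them.

```python
M = 64                 # assumed width of the hardware multiplier datapath, in bits
MASK = (1 << M) - 1


def split_limbs(x, n_limbs):
    """Split a non-negative integer into n_limbs M-bit limbs, least significant first."""
    return [(x >> (M * i)) & MASK for i in range(n_limbs)]


def large_multiply(a, b, n_limbs):
    """Schoolbook multiply built only from M-bit x M-bit products (hypothetical helper)."""
    a_limbs = split_limbs(a, n_limbs)
    b_limbs = split_limbs(b, n_limbs)
    result = [0] * (2 * n_limbs)
    for i in range(n_limbs):
        carry = 0
        for j in range(n_limbs):
            # One pass through the M-bit multiplier, plus accumulate and carry.
            t = a_limbs[i] * b_limbs[j] + result[i + j] + carry
            result[i + j] = t & MASK
            carry = t >> M
        result[i + n_limbs] = carry
    # Reassemble the 2 * n_limbs result limbs into one integer.
    return sum(limb << (M * k) for k, limb in enumerate(result))
```

For example, `large_multiply(a, b, 4)` handles 256-bit operands and should agree with Python's built-in `a * b`.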


    PIPELINED DIVIDE CIRCUIT FOR SMALL OPERAND SIZES
    42.
    Patent application
    Status: In force

    Publication No.: US20120259907A1

    Publication date: 2012-10-11

    Application No.: US13081991

    Filing date: 2011-04-07

    IPC classification: G06F7/52

    CPC classification: G06F7/535 G06F2207/3884

    Abstract: A pipelined circuit for performing a divide operation on small operand sizes. The circuit includes a plurality of stages connected together in a series to perform a subtractive divide algorithm based on iterative subtractions and shifts. Each stage computes two quotient bits and outputs a partial remainder value to the next stage in the series. The first and last stages utilize a radix-4 serial architecture with edge modifications to increase efficiency. The intermediate stages utilize a radix-4 parallel architecture. The divide architecture is pipelined such that input operands can be applied to the divider on each clock cycle.
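The stage behaviour described here (two quotient bits per stage, partial remainder passed on) can be sketched in software as below. This is a minimal model, assuming unsigned operands and an even operand width; the function names are hypothetical, and it does not model the pipeline registers or the serial versus parallel stage variants mentioned in the abstract.

```python
def radix4_stage(remainder, quotient, dividend, divisor, pos):
    """One stage: shift in two dividend bits, produce one radix-4 quotient digit."""
    remainder = (remainder << 2) | ((dividend >> pos) & 0b11)
    digit = 0
    # Compare against 3x, 2x and 1x the divisor; hardware could do this in parallel.
    for candidate in (3, 2, 1):
        if remainder >= candidate * divisor:
            digit = candidate
            break
    return remainder - digit * divisor, (quotient << 2) | digit


def radix4_divide(dividend, divisor, width=16):
    """Chain width/2 stages; returns (quotient, remainder) for unsigned inputs."""
    if width % 2 != 0 or divisor <= 0 or not 0 <= dividend < (1 << width):
        raise ValueError("unsupported operands for this sketch")
    remainder, quotient = 0, 0
    for pos in range(width - 2, -1, -2):
        remainder, quotient = radix4_stage(remainder, quotient, dividend, divisor, pos)
    return quotient, remainder
```

Because the partial remainder entering each stage is smaller than the divisor, the shifted value is below four times the divisor, so a digit in the range 0 to 3 always suffices.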


    SYSTEM AND METHOD OF BYPASSING UNROUNDED RESULTS IN A MULTIPLY-ADD PIPELINE UNIT
    43.
    Patent application
    Status: In force

    Publication No.: US20120233234A1

    Publication date: 2012-09-13

    Application No.: US13043101

    Filing date: 2011-03-08

    IPC classification: G06F7/00

    Abstract: A processing unit, system, and method for performing a multiply operation in a multiply-add pipeline. To reduce the pipeline latency, the unrounded result of a multiply-add operation is bypassed to the inputs of the multiply-add pipeline for use in a subsequent operation. If it is determined that rounding is required for the prior operation, then the rounding will occur during the subsequent operation. During the subsequent operation, a Booth encoder not utilized by the multiply operation will output a rounding correction factor as a selection input to a Booth multiplexer not utilized by the multiply operation. When the Booth multiplexer receives the rounding correction factor, the Booth multiplexer will output a rounding correction value to a carry save adder (CSA) tree, and the CSA tree will generate the correct sum from the rounding correction value and the other partial products.
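One way to see why a deferred rounding can be folded into the next multiply is the identity (x + 1) * y = x * y + y: the missing increment becomes one extra row in the partial-product sum. The sketch below is an integer model only, assuming the rounding is an increment by one unit in the last place and ignoring any renormalization; it uses plain shift-and-add rows rather than Booth encoding, and all names are hypothetical.

```python
def partial_products(x, y, width=8):
    """Shift-and-add partial products of x * y (plain radix-2 rows, not Booth)."""
    return [(x << i) if (y >> i) & 1 else 0 for i in range(width)]


def multiply_bypassed(unrounded, needs_round_up, other, width=8):
    """Multiply a bypassed, possibly unrounded value by `other`.

    If the prior result still needs a round-up, add `other` as a correction
    row, since (unrounded + 1) * other == unrounded * other + other.
    """
    rows = partial_products(unrounded, other, width)
    if needs_round_up:
        rows.append(other)  # rounding correction value fed to the adder tree
    return sum(rows)
```

With `unrounded = 11` and a pending round-up, `multiply_bypassed(11, True, 6)` gives the same product as multiplying the rounded value 12 by 6.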


    Handling multi-cycle integer operations for a multi-threaded processor
    44.
    Granted patent
    Status: In force

    Publication No.: US08195919B1

    Publication date: 2012-06-05

    Application No.: US11927177

    Filing date: 2007-10-29

    IPC classification: G06F13/00

    Abstract: An effective address of a memory is determined with a three-operand add operation in a single execution cycle of a multithreaded processor that can access both segmented memory and non-segmented memory. During that cycle, the processor determines whether a memory segment base is zero. If the segment base is zero, the processor can access a memory location at the effective address without adding the segment base. If the segment base is not zero, such as when executing legacy code, the processor consumes another cycle to add the segment base to the effective address. Similarly, the processor consumes another cycle if the effective address or the linear address is misaligned. An integer execution unit performs the three-operand add using a carry-save adder coupled to a carry look-ahead adder. If the segment base is not zero, the effective address is fed back through the integer execution unit to add the segment base.
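The three-operand add and the zero segment base shortcut can be modelled as follows. This is a behavioural sketch, assuming 64-bit addresses and operands named `base`, `index`, and `disp` (the abstract does not name them); the carry look-ahead adder is represented by an ordinary addition, and the cycle counts ignore the misalignment case.

```python
WIDTH = 64
MASK = (1 << WIDTH) - 1


def carry_save_add(a, b, c):
    """3:2 compression: reduce three operands to a sum word and a carry word."""
    sum_word = a ^ b ^ c
    carry_word = ((a & b) | (a & c) | (b & c)) << 1
    return sum_word & MASK, carry_word & MASK


def three_operand_add(a, b, c):
    """Carry-save adder followed by one carry-propagating add."""
    sum_word, carry_word = carry_save_add(a, b, c)
    return (sum_word + carry_word) & MASK


def linear_address(base, index, disp, segment_base):
    """Return (address, cycles); a non-zero segment base costs a second pass."""
    address = three_operand_add(base, index, disp)
    cycles = 1
    if segment_base != 0:
        # Feed the effective address back through the same adder.
        address = three_operand_add(address, segment_base, 0)
        cycles += 1
    return address, cycles
```

For instance, `linear_address(0x1000, 0x20, 0x4, 0)` completes in one modelled cycle, while a non-zero segment base takes two.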


    PROCESSOR AND METHOD FOR IMPLEMENTING INSTRUCTION SUPPORT FOR HASH ALGORITHMS
    45.
    Patent application
    Status: In force

    Publication No.: US20100250966A1

    Publication date: 2010-09-30

    Application No.: US12415403

    Filing date: 2009-03-31

    IPC classification: H04L9/28 G06F9/30 G06F9/312

    Abstract: A processor including instruction support for implementing hash algorithms may issue, for execution, programmer-selectable hash instructions from a defined instruction set architecture (ISA). The processor may include a cryptographic unit that may receive instructions for execution. The instructions include hash instructions defined within the ISA. In addition, the hash instructions may be executable by the cryptographic unit to implement a hash that is compliant with one or more respective hash algorithm specifications. In response to receiving a particular hash instruction defined within the ISA, the cryptographic unit may retrieve a set of input data blocks from a predetermined set of architectural registers of the processor, and generate a hash value of the set of input data blocks according to a hash algorithm that corresponds to the particular hash instruction.


    APPARATUS AND METHOD FOR IMPLEMENTING INSTRUCTION SUPPORT FOR THE KASUMI CIPHER ALGORITHM
    46.
    Patent application
    Status: Under examination (published)

    Publication No.: US20100246815A1

    Publication date: 2010-09-30

    Application No.: US12414871

    Filing date: 2009-03-31

    IPC classification: H04L9/28 G06F9/30 G06F9/312

    Abstract: A processor including instruction support for implementing the Kasumi block cipher algorithm may issue, for execution, programmer-selectable instructions from a defined instruction set architecture (ISA). The processor may include a cryptographic unit that may receive instructions for execution. The instructions include one or more Kasumi instructions defined within the ISA. In addition, the Kasumi instructions may be executable by the cryptographic unit to implement portions of a Kasumi cipher that is compliant with 3rd Generation Partnership Project (3GPP) Technical Specification TS 35.202 version 8.0.0. In response to receiving a Kasumi FL( )-operation instruction defined within the ISA, the cryptographic unit may perform an FL( ) operation, as defined by the Kasumi cipher, upon a data input operand and a subkey operand in which the data input operand and subkey operand may be specified by the Kasumi FL( )-operation instruction.
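For orientation, the FL( ) operation that such an instruction would compute can be sketched in a few lines. The sketch follows the usual description of Kasumi's FL function (a 32-bit input split into 16-bit halves, combined with two 16-bit subkey halves via AND, OR, XOR and one-bit rotates) and should be checked against TS 35.202 before use. Packing the subkey as `KL1` in the upper 16 bits and `KL2` in the lower 16 bits is an assumption about operand layout, not something the abstract states.

```python
def rol16(x, n=1):
    """Rotate a 16-bit value left by n bits."""
    x &= 0xFFFF
    return ((x << n) | (x >> (16 - n))) & 0xFFFF


def kasumi_fl(data, subkey):
    """Kasumi FL on a 32-bit data word; subkey assumed packed as KL1:KL2."""
    left, right = (data >> 16) & 0xFFFF, data & 0xFFFF
    kl1, kl2 = (subkey >> 16) & 0xFFFF, subkey & 0xFFFF
    right ^= rol16(left & kl1)   # R' = R xor ROL(L and KL1)
    left ^= rol16(right | kl2)   # L' = L xor ROL(R' or KL2)
    return (left << 16) | right
```

The two hand-checkable cases used while writing this sketch were an all-zero subkey and an all-ones subkey; they are sanity checks, not official test vectors.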


    APPARATUS AND METHOD FOR IMPLEMENTING INSTRUCTION SUPPORT FOR THE DATA ENCRYPTION STANDARD (DES) ALGORITHM
    47.
    Patent application
    Status: In force

    Publication No.: US20100246814A1

    Publication date: 2010-09-30

    Application No.: US12414755

    Filing date: 2009-03-31

    IPC classification: H04L9/06

    Abstract: A processor including instruction support for implementing the Data Encryption Standard (DES) block cipher algorithm may issue, for execution, programmer-selectable instructions from a defined instruction set architecture (ISA). The processor may include a cryptographic unit that may receive instructions for execution. The instructions include one or more DES instructions defined within the ISA. In addition, the DES instructions may be executable by the cryptographic unit to implement portions of a DES cipher that is compliant with Federal Information Processing Standards Publication 46-3 (FIPS 46-3). In response to receiving a DES key expansion instruction defined within the ISA, the cryptographic unit may generate one or more expanded cipher keys of the DES cipher key schedule from an input key.


    Apparatus and method for reducing execution latency of floating point operations having special case operands
    48.
    Granted patent
    Status: In force

    Publication No.: US07437538B1

    Publication date: 2008-10-14

    Application No.: US10881763

    Filing date: 2004-06-30

    IPC classification: G06F9/40

    Abstract: An apparatus and method for floating-point special case handling. In one embodiment, a processor may include a first execution unit configured to execute a longer-latency floating-point instruction, and a second execution unit configured to execute a shorter-latency floating-point instruction. In response to the longer-latency floating-point instruction being issued to the first execution unit, the second execution unit may be further configured to detect whether a result of the longer-latency floating-point instruction is determinable from one or more operands of the longer-latency floating-point instruction independently of the first execution unit executing the longer-latency floating-point instruction. In response to detecting that the result is determinable, the second execution unit may be further configured to flush the longer-latency floating-point instruction from the first execution unit and to determine the result.
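To make "determinable from the operands" concrete, the sketch below checks a floating-point divide for operand combinations whose IEEE 754 result is known without running the divider. Choosing divide as the longer-latency instruction, and the function name and return convention, are assumptions for illustration; the abstract does not name a specific operation, and exception flags are not modelled.

```python
import math


def special_case_divide(a, b):
    """Return (True, result) if a / b follows from the operands alone, else (False, None)."""
    if math.isnan(a) or math.isnan(b):
        return True, math.nan
    if (math.isinf(a) and math.isinf(b)) or (a == 0 and b == 0):
        return True, math.nan          # invalid operations: inf/inf and 0/0
    sign = math.copysign(1.0, a) * math.copysign(1.0, b)
    if math.isinf(a) or b == 0:
        return True, math.copysign(math.inf, sign)
    if a == 0 or math.isinf(b):
        return True, math.copysign(0.0, sign)
    return False, None                 # ordinary case: the slow unit must finish
```

In the ordinary case, such as `special_case_divide(1.0, 3.0)`, the function reports that no shortcut applies and the longer-latency unit would complete the operation.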


    Register window management using first pipeline to change current window and second pipeline to read operand from old window and write operand to new window
    49.
    Granted patent
    Status: In force

    Publication No.: US07216216B1

    Publication date: 2007-05-08

    Application No.: US10881556

    Filing date: 2004-06-30

    Abstract: In one embodiment, a processor is configured to execute a window swap instruction. The processor comprises a register file (that comprises a plurality of registers) and first and second execution units coupled to the register file. A first pipeline associated with the first execution unit has a first number of pipeline stages, and a second pipeline associated with the second execution unit has a second number of pipeline stages. The first execution unit is configured to change the current register window from the first register window to the second register window in the register file in response to the instruction. The second execution unit is configured to perform an operation defined by the instruction and write the result to the register file. The second number of pipeline stages exceeds the first number, whereby the second register window is established in the register file prior to writing the result.


    Partitioned shifter for single instruction stream multiple data stream (SIMD) operations
    50.
    Granted patent
    Status: In force

    Publication No.: US07099910B2

    Publication date: 2006-08-29

    Application No.: US10408132

    Filing date: 2003-04-07

    IPC classification: G06F7/485

    Abstract: A method of enabling a single instruction stream multiple data stream operation and a double precision floating point operation within a single floating point execution unit, which includes providing a floating point unit with a two way aligner and a two way normalizer, selectively aligning a value based upon whether a single instruction stream multiple data stream operation is to be performed or a double precision operation is to be performed, and selectively normalizing a value based upon whether a single instruction stream multiple data stream operation is to be performed or a double precision operation is to be performed.
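The selective behaviour of a partitioned shifter can be illustrated with a right shift that treats a 64-bit word either as one value or as two independent 32-bit lanes. This is a sketch under assumptions: the lane widths, the `simd` flag, and the per-lane shift amounts are illustrative choices, and the same partitioning idea would apply to the left shift used for normalization.

```python
MASK32 = (1 << 32) - 1
MASK64 = (1 << 64) - 1


def partitioned_shift_right(value, amounts, simd):
    """Shift right as one 64-bit value, or as two 32-bit lanes when simd is True.

    amounts is (shift,) in 64-bit mode and (high_lane_shift, low_lane_shift)
    in SIMD mode.
    """
    if not simd:
        return (value & MASK64) >> amounts[0]
    high = (value >> 32) & MASK32
    low = value & MASK32
    # Each lane shifts independently; bits never cross the lane boundary.
    return ((high >> amounts[0]) << 32) | (low >> amounts[1])
```

In SIMD mode, bits shifted out of the upper lane are dropped rather than spilling into the lower lane, which is the difference from the single 64-bit shift.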
