Method and apparatus for efficient programmable cyclic redundancy check (CRC)
    43.
    Granted patent
    Method and apparatus for efficient programmable cyclic redundancy check (CRC) (in force)

    Publication No.: US09052985B2

    Publication date: 2015-06-09

    Application No.: US11963147

    Filing date: 2007-12-21

    IPC classification: G06F7/72 H03M13/09

    CPC classification: G06F7/724 G06F7/72 H03M13/09

    Abstract: A method and apparatus to optimize each of the plurality of reduction stages in a Cyclic Redundancy Check (CRC) circuit to produce a residue for a block of data decreases the area used to perform the reduction while maintaining the same delay through the plurality of stages of the reduction logic. A hybrid mix of the Karatsuba algorithm, classical multiplications and serial division in various stages in the CRC reduction circuit results in about a twenty percent reduction in area on average with no decrease in critical path delay.

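
As a software point of reference for the "residue" this abstract refers to, the following minimal sketch computes a CRC remainder by bit-serial polynomial division over GF(2). It does not model the patented hardware (the Karatsuba and classical multiplier stages are not shown). The function name `crc_residue` and its parameters are illustrative assumptions, and no initial value, bit reflection, or final XOR is applied.

```python
def crc_residue(data: bytes, poly: int, width: int) -> int:
    """Return data(x) * x^width mod G(x) over GF(2), processed MSB first.

    `poly` is the generator polynomial without its leading x^width term.
    This sketch assumes width >= 8.
    """
    top_bit = 1 << (width - 1)
    mask = (1 << width) - 1
    residue = 0
    for byte in data:
        # Bring the next 8 message bits into the top of the register.
        residue ^= byte << (width - 8)
        for _ in range(8):
            # Serial division: shift out one bit, subtract (XOR) G(x) if it was set.
            if residue & top_bit:
                residue = ((residue << 1) ^ poly) & mask
            else:
                residue = (residue << 1) & mask
    return residue


if __name__ == "__main__":
    # Parameters of the CRC-16/XMODEM variant (poly 0x1021, zero initial value).
    print(hex(crc_residue(b"123456789", 0x1021, 16)))
```

With a zero initial value and no reflection, appending the residue to the message yields a residue of zero, which is the usual receiver-side check.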

    INSTRUCTION FOR ACCELERATING SNOW 3G WIRELESS SECURITY ALGORITHM
    44.
    Patent application
    INSTRUCTION FOR ACCELERATING SNOW 3G WIRELESS SECURITY ALGORITHM (in force)

    Publication No.: US20140189289A1

    Publication date: 2014-07-03

    Application No.: US13730216

    Filing date: 2012-12-28

    IPC classification: G06F15/80

    Abstract: Vector instructions for performing SNOW 3G wireless security operations are received and executed by the execution circuitry of a processor. The execution circuitry receives a first operand of the first instruction specifying a first vector register that stores a current state of a finite state machine (FSM). The execution circuitry also receives a second operand of the first instruction specifying a second vector register that stores data elements of a linear feedback shift register (LFSR) that are needed for updating the FSM. The execution circuitry executes the first instruction to produce an updated state of the FSM and an output of the FSM in a destination operand of the first instruction.


    Bitstream processing using coalesced buffers and delayed matching and enhanced memory writes
    45.
    Granted patent
    Bitstream processing using coalesced buffers and delayed matching and enhanced memory writes (in force)

    Publication No.: US09203887B2

    Publication date: 2015-12-01

    Application No.: US13994129

    Filing date: 2011-12-23

    IPC classification: H04L29/06 H04W28/06 H03M7/30

    Abstract: Methods and apparatus for processing bitstreams and byte streams. According to one aspect, bitstream data is compressed using coalesced string match tokens with delayed matching. A matcher is employed to perform search string match operations using a shortened maximum string length search criteria, resulting in generation of a token stream having data and literal data. A distance match operation is performed on sequentially adjacent tokens to determine if they contain the same distance data. If they do, the len values of the tokens are added through use of a coalesce buffer. Upon detection of a distance non-match, a final coalesced length of a matching string is calculated and output along with the prior matching distance as a coalesced token. Also disclosed is a scheme for writing variable-length tokens into a bitstream under which token data is input into a bit accumulator and written to memory (or cache to be subsequently written to memory) as each token is processed in a manner that eliminates branch mispredict operations associated with detecting whether the bit accumulator is full or close to full.

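
The coalescing step described in this abstract can be sketched roughly as follows: adjacent match tokens that carry the same distance have their lengths summed, and the merged token is emitted once a different distance (or a literal) arrives. The `Literal` and `Match` types and the `coalesce` function are hypothetical names chosen for illustration. The matcher itself, and any condition on when merging is valid (for example, that the earlier match reached the shortened maximum length), are not modeled.

```python
from typing import Iterable, Iterator, NamedTuple, Optional, Union


class Literal(NamedTuple):
    byte: int


class Match(NamedTuple):
    length: int
    distance: int


Token = Union[Literal, Match]


def coalesce(tokens: Iterable[Token]) -> Iterator[Token]:
    """Merge runs of adjacent Match tokens that share the same distance."""
    pending: Optional[Match] = None  # plays the role of the coalesce buffer
    for token in tokens:
        if (isinstance(token, Match) and pending is not None
                and token.distance == pending.distance):
            # Same distance as the buffered match: add the lengths.
            pending = Match(pending.length + token.length, pending.distance)
            continue
        if pending is not None:
            # Distance non-match (or a literal): emit the coalesced token.
            yield pending
            pending = None
        if isinstance(token, Match):
            pending = token
        else:
            yield token
    if pending is not None:
        yield pending


if __name__ == "__main__":
    stream = [Literal(97), Match(8, 5), Match(8, 5), Match(3, 5), Match(4, 9)]
    print(list(coalesce(stream)))
```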

    Generating Multiple Secure Hashes from a Single Data Buffer
    47.
    Patent application
    Generating Multiple Secure Hashes from a Single Data Buffer (in force)

    Publication No.: US20150098563A1

    Publication date: 2015-04-09

    Application No.: US14050326

    Filing date: 2013-10-09

    IPC classification: H04L9/06

    Abstract: One embodiment provides an apparatus. The apparatus includes a single instruction multiple data (SIMD) hash module configured to apportion at least a first portion of a message of length L to a number (S) of segments, the message including a plurality of sequences of data elements, each sequence including S data elements, a respective data element in each sequence apportioned to a respective segment, each segment including a number N of blocks of data elements and to hash the S segments in parallel, resulting in S segment digests, the S hash digests based, at least in part, on an initial value and to store the S hash digests; a padding module configured to pad a remainder, the remainder corresponding to a second portion of the message, the second portion related to the length L of the message, the number of segments and a block size; and a non-SIMD hash module configured to hash the padded remainder, resulting in an additional hash digest and to store the additional hash digest.

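
The apportioning scheme in this abstract can be illustrated with a sequential sketch: data elements are dealt round-robin to S segments, each segment is hashed on its own, and the leftover tail is hashed separately. Everything here is an assumption made for illustration: the function name `multi_hash`, the choice of SHA-256, the 4-byte element size, and the 64-byte block size. The lanes are hashed one after another rather than with SIMD, and `hashlib` applies its own standard padding to every digest, which may differ from the padding handling the patent describes.

```python
import hashlib
from typing import List


def multi_hash(message: bytes, segments: int = 4,
               element_size: int = 4, block_size: int = 64) -> List[bytes]:
    """Return S segment digests followed by one digest of the remainder.

    Assumes block_size is a multiple of element_size.
    """
    stride = segments * block_size            # bytes consumed per block of every segment
    blocks_per_segment = len(message) // stride   # N whole blocks per segment
    head_len = blocks_per_segment * stride
    head, remainder = message[:head_len], message[head_len:]

    hashers = [hashlib.sha256() for _ in range(segments)]
    for offset in range(0, head_len, element_size):
        # Element k of each S-element sequence goes to segment k.
        lane = (offset // element_size) % segments
        hashers[lane].update(head[offset:offset + element_size])

    digests = [h.digest() for h in hashers]
    # The tail that does not fill a whole block for every segment is hashed separately.
    digests.append(hashlib.sha256(remainder).digest())
    return digests


if __name__ == "__main__":
    for digest in multi_hash(bytes(range(256)) * 2 + b"tail-bytes"):
        print(digest.hex())
```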

    METHOD AND APPARATUS TO PROCESS 4-OPERAND SIMD INTEGER MULTIPLY-ACCUMULATE INSTRUCTION
    48.
    Patent application
    METHOD AND APPARATUS TO PROCESS 4-OPERAND SIMD INTEGER MULTIPLY-ACCUMULATE INSTRUCTION (in force)

    Publication No.: US20140082328A1

    Publication date: 2014-03-20

    Application No.: US13617021

    Filing date: 2012-09-14

    IPC classification: G06F9/302 G06F9/30

    Abstract: According to one embodiment, a processor includes an instruction decoder to receive an instruction to process a multiply-accumulate operation, the instruction having a first operand, a second operand, a third operand, and a fourth operand. The first operand is to specify a first storage location to store an accumulated value; the second operand is to specify a second storage location to store a first value and a second value; and the third operand is to specify a third storage location to store a third value. The processor further includes an execution unit coupled to the instruction decoder to perform the multiply-accumulate operation to multiply the first value with the second value to generate a multiply result and to accumulate the multiply result and at least a portion of a third value to an accumulated value based on the fourth operand.

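
One possible reading of the four-operand multiply-accumulate in this abstract is modeled below for a single 64-bit lane. The abstract does not give operand widths or an encoding, so every detail is an assumption: the name `mac4`, the packing of the first and second values as the low and high 32-bit halves of the second operand, and the use of bit 0 of the fourth operand to choose which 32-bit half of the third operand is added.

```python
from typing import List

MASK32 = 0xFFFFFFFF
MASK64 = (1 << 64) - 1


def mac4(acc: int, src2: int, src3: int, imm: int) -> int:
    """acc + (first * second) + selected portion of src3, modulo 2^64."""
    first = src2 & MASK32                 # assumed: low half of operand 2
    second = (src2 >> 32) & MASK32        # assumed: high half of operand 2
    # Assumed: bit 0 of the fourth operand picks the low or high half of operand 3.
    portion = (src3 >> (32 * (imm & 1))) & MASK32
    return (acc + first * second + portion) & MASK64


def simd_mac4(accs: List[int], src2s: List[int],
              src3s: List[int], imm: int) -> List[int]:
    """Apply mac4 independently to each lane, as a SIMD unit would."""
    return [mac4(a, b, c, imm) for a, b, c in zip(accs, src2s, src3s)]


if __name__ == "__main__":
    packed = (3 << 32) | 7                # second value 3, first value 7
    third = (5 << 32) | 2
    print(mac4(10, packed, third, 0), mac4(10, packed, third, 1))
```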

    Method and apparatus to process 4-operand SIMD integer multiply-accumulate instruction
    49.
    Granted patent
    Method and apparatus to process 4-operand SIMD integer multiply-accumulate instruction (in force)

    Publication No.: US09292297B2

    Publication date: 2016-03-22

    Application No.: US13617021

    Filing date: 2012-09-14

    IPC classification: G06F9/00 G06F9/38 G06F9/30

    Abstract: According to one embodiment, a processor includes an instruction decoder to receive an instruction to process a multiply-accumulate operation, the instruction having a first operand, a second operand, a third operand, and a fourth operand. The first operand is to specify a first storage location to store an accumulated value; the second operand is to specify a second storage location to store a first value and a second value; and the third operand is to specify a third storage location to store a third value. The processor further includes an execution unit coupled to the instruction decoder to perform the multiply-accumulate operation to multiply the first value with the second value to generate a multiply result and to accumulate the multiply result and at least a portion of a third value to an accumulated value based on the fourth operand.


    INSTRUCTION FOR FAST ZUC ALGORITHM PROCESSING
    50.
    Patent application
    INSTRUCTION FOR FAST ZUC ALGORITHM PROCESSING (in force)

    Publication No.: US20140189290A1

    Publication date: 2014-07-03

    Application No.: US13730230

    Filing date: 2012-12-28

    IPC classification: G06F15/76

    Abstract: Vector instructions for performing ZUC stream cipher operations are received and executed by the execution circuitry of a processor. The execution circuitry receives a first vector instruction to perform an update to a linear feedback shift register (LFSR), and receives a second vector instruction to perform an update to a state of a finite state machine (FSM), where the FSM receives inputs from re-ordered bits of the LFSR. The execution circuitry executes the first vector instruction and the second vector instruction in a single-instruction multiple data (SIMD) pipeline.

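
For orientation, the two operations this abstract mentions (an LFSR update and the re-ordering of LFSR bits that feeds the FSM) can be sketched in scalar form. The feedback formula and bit selections below are written from memory of the public ZUC specification and should be checked against it before any use. The function names are illustrative, the FSM and its S-boxes are omitted, and nothing here reflects the patented vector instructions.

```python
from typing import List, Tuple

MOD = (1 << 31) - 1  # ZUC LFSR cells are elements of GF(2^31 - 1)


def lfsr_work_step(state: List[int]) -> List[int]:
    """One LFSR update in work mode; `state` holds 16 cells s0..s15."""
    s = state
    s16 = ((s[15] << 15) + (s[13] << 17) + (s[10] << 21)
           + (s[4] << 20) + s[0] * (1 + (1 << 8))) % MOD
    if s16 == 0:
        s16 = MOD  # zero is represented as 2^31 - 1
    return s[1:] + [s16]


def bit_reorganization(state: List[int]) -> Tuple[int, int, int, int]:
    """Build four 32-bit words from 16-bit halves of selected LFSR cells."""
    def high(v: int) -> int:
        return (v >> 15) & 0xFFFF  # bits 30..15 of a 31-bit cell

    def low(v: int) -> int:
        return v & 0xFFFF          # bits 15..0

    s = state
    x0 = (high(s[15]) << 16) | low(s[14])
    x1 = (low(s[11]) << 16) | high(s[9])
    x2 = (low(s[7]) << 16) | high(s[5])
    x3 = (low(s[2]) << 16) | high(s[0])
    return x0, x1, x2, x3


if __name__ == "__main__":
    state = list(range(1, 17))
    print(bit_reorganization(state))
    print(lfsr_work_step(state)[-1])
```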