摘要:
Instruction sets for variable length integer (varint) coding and associated methods and apparatus. The instructions sets include instructions for encoding and decoding varints, and may be included as a part of an instruction set architecture (ISA) for processors architectures such as x86 and Arm-based architectures, as well as other ISAs. In one aspect, the instructions include, a varint size encode instruction to encode a size of a varint, a varint encode instruction to encode a varint, a varint size decode instruction to decode a size of an encoded varint, and a varint decode instruction to decode an encoded varint. Varint encode size and encode instructions may be combined in a single instructions. Similarly, varint decode size and decode instructions may be combined in a single instruction. In one aspect, the instructions use a variable-length quantity (VLQ) encoding scheme under which varints are encoded into one or more VLQ octets.
摘要:
A processor includes an instruction decoder to receive a first instruction to process a secure hash algorithm 2 (SHA-2) hash algorithm, the first instruction having a first operand associated with a first storage location to store a SHA-2 state and a second operand associated with a second storage location to store a plurality of messages and round constants. The processor further includes an execution unit coupled to the instruction decoder to perform one or more iterations of the SHA-2 hash algorithm on the SHA-2 state specified by the first operand and the plurality of messages and round constants specified by the second operand, in response to the first instruction.
摘要:
In one embodiment, circuitry may generate digests to be combined to produce a hash value. The digests may include at least one digest and at least one other digest generated based at least in part upon at least one CRC value and at least one other CRC value. The circuitry may include cyclical redundancy check (CRC) generator circuitry to generate the at least one CRC value based at least in part upon at least one input string. The CRC generator circuitry also may generate the at least one other CRC value based least in part upon at least one other input string. The at least one other input string resulting at least in part from at least one pseudorandom operation involving, at least in part, the at least one input string. Many modifications, variations, and alternatives are possible without departing from this embodiment.
摘要:
In one embodiment, the present disclosure provides a method that includes segmenting an n-bit exponent e into a first segment et and a number t of k-bit segments ei in response to a request to determine a modular exponentiation result R, wherein R is a modular exponentiation of a generator base g for the exponent e and a q-bit modulus m, wherein the generator base g equals two and k is based at least in part on a processor configured to determine the result R; iteratively determining a respective intermediate modular exponentiation result for each segment ei, wherein the determining comprises multiplication, exponentiation and a modular reduction of at least one of a multiplication result and an exponentiation result; and generating the modular exponentiation result R=ge mod m based on, at least in part, at least one respective intermediate modular exponentiation result.
摘要翻译:在一个实施例中,本公开提供了一种方法,其包括响应于确定模幂运算结果R的请求,将n位指数e分割成第一段et和数目t的k比特段ei,其中R是 指数e的发生器基数g和q位模数m的模幂运算,其中发生器基g等于2,并且k至少部分地基于被配置为确定结果R的处理器; 迭代地确定每个段ei的相应的中间模幂运算结果,其中所述确定包括相乘结果和求幂结果中的至少一个的乘法,乘法和模块化减少; 并且至少部分地基于至少一个相应的中间模幂运算结果来产生模幂运算结果R = ge mod m。
摘要:
In one embodiment, the present disclosure provides a method that includes segmenting an n-bit exponent e into a first segment et and a number t of k-bit segments ei in response to a request to determine a modular exponentiation result R, wherein R is a modular exponentiation of a generator base g for the exponent e and a q-bit modulus m, wherein the generator base g equals two and k is based at least in part on a processor configured to determine the result R; iteratively determining a respective intermediate modular exponentiation result for each segment ei, wherein the determining comprises multiplication, exponentiation and a modular reduction of at least one of a multiplication result and an exponentiation result; and generating the modular exponentiation result R=ge mod m based on, at least in part, at least one respective intermediate modular exponentiation result.
摘要翻译:在一个实施例中,本公开提供了一种方法,其包括响应于确定模幂运算结果R的请求,将n位指数e分割成第一段et和数目t的k比特段ei,其中R是 指数e的发生器基数g和q位模数m的模幂运算,其中发生器基g等于2,并且k至少部分地基于被配置为确定结果R的处理器; 迭代地确定每个段ei的相应的中间模幂运算结果,其中所述确定包括相乘结果和求幂结果中的至少一个的乘法,乘法和模块化减少; 并且至少部分地基于至少一个相应的中间模幂运算结果来产生模幂运算结果R = ge mod m。
摘要:
A multiply-and-accumulate (MAC) instruction allows efficient execution of unsigned integer multiplications. The MAC instruction indicates a first vector register as a first operand, a second vector register as a second operand, and a third vector register as a destination. The first vector register stores a first factor, and the second vector register stores a partial sum. The MAC instruction is executed to multiply the first factor with an implicit second factor to generate a product, and to add the partial sum to the product to generate a result. The first factor, the implicit second factor and the partial sum have a same data width and the product has twice the data width. The most significant half of the result is stored in the third vector register, and the least significant half of the result is stored in the second vector register.
摘要:
Methods and apparatus are disclosed to reduce processor demands during encryption. A disclosed example method includes detecting a request for the processor to execute an encryption cipher determining whether the encryption cipher is associated with a byte reflection operation, preventing the byte reflection operation when a buffer associated with the encryption cipher will not cause a carryover condition, and incrementing the buffer via a shift operation before executing the encryption cipher.
摘要:
A processor includes a first execution unit to receive and execute a first instruction to process a first part of secure hash algorithm 256 (SHA256) message scheduling operations, the first instruction having a first operand associated with a first storage location to store a first set of message inputs and a second operand associated with a second storage location to store a second set of message inputs. The processor further includes a second execution unit to receive and execute a second instruction to process a second part of the SHA256 message scheduling operations, the second instruction having a third operand associated with a third storage location to store an intermediate result of the first part and a third set of message inputs and a fourth operand associated with a fourth storage location to store a fourth set of message inputs.
摘要:
Technologies for executing a serial data processing algorithm on a single variable-length data buffer includes padding data segments of the buffer, streaming the data segments into a data register and executing the serial data processing algorithm on each of the segments in parallel.
摘要:
A method is described. The method includes iteratively performing for each position in a result matrix stored in a third register, multiplying a value at a matrix position stored in a first register with a value at a matrix position stored in a second register to obtain a first multiplicative value, where the positions in the first register and the second register are determined by the position in the result matrix and performing an exclusive or (XOR) operation with the first multiplicative value and a value stored at a result matrix position stored in the third register to obtain a result value.