摘要:
A processor includes an instruction decoder to receive a first instruction to process a secure hash algorithm 2 (SHA-2) hash algorithm, the first instruction having a first operand associated with a first storage location to store a SHA-2 state and a second operand associated with a second storage location to store a plurality of messages and round constants. The processor further includes an execution unit coupled to the instruction decoder to perform one or more iterations of the SHA-2 hash algorithm on the SHA-2 state specified by the first operand and the plurality of messages and round constants specified by the second operand, in response to the first instruction.
摘要:
Detailed herein are embodiments of systems, methods, and apparatuses for decompression using hardware and software. For example, in embodiment a hardware apparatus comprises an input buffer to store incoming data from a compressed stream, a selector to select at least one byte stored in the input buffer, a decoder to decode the selected at least one byte and determine if the decoded at least one byte is a literal or a symbol, an overlap condition, a size of a record from the decoded stream, a length value of the data to be retrieved from the decoded stream, and an offset value for the decoded data, and a token format converter to convert the decoded data and data from source and destination offset base registers into a fixed-length token.
摘要:
Provided are a memory system, memory controller, and method for using a memory address to form a tweak key to use to encrypt and decrypt data. A base tweak co is generated as a function of an address of a block of data in the memory storage. For each sub-block of the block, performing: processing the base tweak to determine a sub-block tweak; combining the sub-block tweak with the sub-block to produce a modified sub-block; and performing an encryption operation comprising one of encryption or decryption on the modified sub-block to produce sub-block output comprising one of encrypted data and unencrypted data for the sub-block.
摘要:
According to one embodiment, a processor includes an instruction decoder to receive a first instruction to perform first SKEIN256 MIX-PERMUTE operations, the first instruction having a first operand associated with a first storage location to store a plurality of odd words, a second operand associated with a second storage location to store a plurality of even words, and a third operand. The processor further includes a first execution unit coupled to the instruction decoder, in response to the first instruction, to perform multiple rounds of the first SKEIN256 MIX-PERMUTE operations based on the odd words and even words using a first rotate value obtained from a third storage location indicated by the third operand, and to store new odd words in the first storage location indicated by the first operand.
摘要:
A multiply-and-accumulate (MAC) instruction allows efficient execution of unsigned integer multiplications. The MAC instruction indicates a first vector register as a first operand, a second vector register as a second operand, and a third vector register as a destination. The first vector register stores a first factor, and the second vector register stores a partial sum. The MAC instruction is executed to multiply the first factor with an implicit second factor to generate a product, and to add the partial sum to the product to generate a result. The first factor, the implicit second factor and the partial sum have a same data width and the product has twice the data width. The most significant half of the result is stored in the third vector register, and the least significant half of the result is stored in the second vector register.
摘要:
Vector instructions for performing ZUC stream cipher operations are received and executed by the execution circuitry of a processor. The execution circuitry receives a first vector instruction to perform an update to a liner feedback shift register (LFSR), and receives a second vector instruction to perform an update to a state of a finite state machine (FSM), where the FSM receives inputs from re-ordered bits of the LFSR. The execution circuitry executes the first vector instruction and the second vector instruction in a single-instruction multiple data (SIMD) pipeline.
摘要:
A processor includes a first execution unit to receive and execute a first instruction to process a first part of secure hash algorithm 256 (SHA256) message scheduling operations, the first instruction having a first operand associated with a first storage location to store a first set of message inputs and a second operand associated with a second storage location to store a second set of message inputs. The processor further includes a second execution unit to receive and execute a second instruction to process a second part of the SHA256 message scheduling operations, the second instruction having a third operand associated with a third storage location to store an intermediate result of the first part and a third set of message inputs and a fourth operand associated with a fourth storage location to store a fourth set of message inputs.
摘要:
According to one embodiment, a processor includes an instruction decoder to receive a first instruction to perform first SKEIN256 MIX-PERMUTE operations, the first instruction having a first operand associated with a first storage location to store a plurality of odd words, a second operand associated with a second storage location to store a plurality of even words, and a third operand. The processor further includes a first execution unit coupled to the instruction decoder, in response to the first instruction, to perform multiple rounds of the first SKEIN256 MIX-PERMUTE operations based on the odd words and even words using a first rotate value obtained from a third storage location indicated by the third operand, and to store new odd words in the first storage location indicated by the first operand.
摘要:
Technologies for executing a serial data processing algorithm on a single variable-length data buffer includes padding data segments of the buffer, streaming the data segments into a data register and executing the serial data processing algorithm on each of the segments in parallel.
摘要:
A method is described. The method includes iteratively performing for each position in a result matrix stored in a third register, multiplying a value at a matrix position stored in a first register with a value at a matrix position stored in a second register to obtain a first multiplicative value, where the positions in the first register and the second register are determined by the position in the result matrix and performing an exclusive or (XOR) operation with the first multiplicative value and a value stored at a result matrix position stored in the third register to obtain a result value.