Abstract:
Methods and devices for use with the Advanced Encryption Standard (AES) are presented, including a processor comprising a decode unit to decode a single round encryption instruction to perform an AES single round encryption operation, wherein the single round encryption instruction specifies a destination register to store 128-bit input data and a source register to store a 128-bit round key; and an execution unit to execute micro-operations based on the single round encryption instruction, wherein the execution unit is to receive the 128-bit input data and the 128-bit round key, and wherein the execution unit is to perform the AES single round encryption operation on the 128-bit input data using the round key and to store 128-bit result data in the destination register.
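As an illustration of what a single AES encryption round computes, the following Python sketch applies SubBytes, ShiftRows, MixColumns, and AddRoundKey to a 16-byte state as defined in FIPS-197. The function name `aes_encrypt_round` and the column-major byte layout are assumptions made for this example; the sketch models only the arithmetic, not the decode unit, micro-operations, or register handling described in the abstract.

```python
def gf_mul(a: int, b: int) -> int:
    """Multiply two bytes in GF(2^8) modulo the AES polynomial 0x11B."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result


def build_sbox() -> list:
    """Derive the AES S-box: field inverse followed by the affine map."""
    sbox = []
    for x in range(256):
        inv = 0
        if x:
            inv = 1
            for _ in range(254):  # x^254 is the inverse of x in GF(2^8)
                inv = gf_mul(inv, x)
        s = inv
        for shift in range(1, 5):  # XOR with the four left rotations
            s ^= ((inv << shift) | (inv >> (8 - shift))) & 0xFF
        sbox.append(s ^ 0x63)
    return sbox


SBOX = build_sbox()


def aes_encrypt_round(state: bytes, round_key: bytes) -> bytes:
    """One full AES round on a 16-byte state (hypothetical helper).

    Byte r + 4*c holds row r, column c, as in FIPS-197.
    """
    # SubBytes: substitute every byte through the S-box.
    sub = [SBOX[b] for b in state]
    # ShiftRows: row r is rotated left by r columns.
    shifted = [sub[r + 4 * ((c + r) % 4)] for c in range(4) for r in range(4)]
    # MixColumns: multiply each column by the fixed matrix over GF(2^8).
    mixed = []
    for c in range(4):
        a = shifted[4 * c:4 * c + 4]
        mixed += [
            gf_mul(a[0], 2) ^ gf_mul(a[1], 3) ^ a[2] ^ a[3],
            a[0] ^ gf_mul(a[1], 2) ^ gf_mul(a[2], 3) ^ a[3],
            a[0] ^ a[1] ^ gf_mul(a[2], 2) ^ gf_mul(a[3], 3),
            gf_mul(a[0], 3) ^ a[1] ^ a[2] ^ gf_mul(a[3], 2),
        ]
    # AddRoundKey: XOR the 128-bit round key into the state.
    return bytes(x ^ k for x, k in zip(mixed, round_key))
```

Called with the round-1 state and round key from the FIPS-197 Appendix B example, the function should reproduce the state at the start of round 2. A full cipher would also need the initial AddRoundKey and a final round without MixColumns, which this sketch omits.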
Abstract:
A system for an agnostic runtime architecture is disclosed. The system includes a system emulation/virtualization converter, an application code converter, and a system converter, wherein the system emulation/virtualization converter and the application code converter implement a system emulation process, and wherein the system converter implements a system conversion process for executing code from a guest image. The system converter further comprises an instruction fetch component for fetching an incoming macroinstruction sequence, a decoding component coupled to the instruction fetch component to receive the fetched macroinstruction sequence and decode it into a microinstruction sequence, and an allocation and issue stage coupled to the decoding component to receive the microinstruction sequence and perform optimization processing by reordering the microinstruction sequence into an optimized microinstruction sequence comprising a plurality of dependent code groups. A microprocessor pipeline is coupled to the allocation and issue stage to receive and execute the optimized microinstruction sequence. A sequence cache is coupled to the allocation and issue stage to receive and store a copy of the optimized microinstruction sequence for subsequent use upon a subsequent hit on the optimized microinstruction sequence, and a hardware component is coupled for moving instructions in the incoming microinstruction sequence.
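The reordering into dependent code groups and the reuse through a sequence cache can be pictured with a small software model. The Python sketch below is only an illustration under stated assumptions: the `MicroOp` type, the functions `group_dependent` and `optimize`, and the dictionary used as a cache are all hypothetical, only read-after-write dependencies are tracked, and register renaming is assumed to have removed other hazards. It is not the hardware mechanism the abstract describes.

```python
from typing import NamedTuple


class MicroOp(NamedTuple):
    """Hypothetical micro-operation: one destination, any number of sources."""
    dest: str
    srcs: tuple


def group_dependent(seq):
    """Split a sequence into groups connected by read-after-write dependencies.

    Program order is preserved inside each group, and groups are emitted in
    the order of their first instruction.
    """
    parent = list(range(len(seq)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    last_writer = {}
    for i, op in enumerate(seq):
        for src in op.srcs:
            if src in last_writer:
                a, b = find(last_writer[src]), find(i)
                parent[max(a, b)] = min(a, b)  # root is the earliest member
        last_writer[op.dest] = i

    groups = {}
    for i, op in enumerate(seq):
        groups.setdefault(find(i), []).append(op)
    return list(groups.values())


# Hypothetical sequence cache: stores the optimized form for later hits.
sequence_cache = {}


def optimize(seq):
    key = tuple(seq)
    if key not in sequence_cache:
        sequence_cache[key] = group_dependent(seq)
    return sequence_cache[key]
```

For example, a sequence that interleaves two independent chains (one rooted at `r1`, one at `r2`) comes back as two groups, and a second call with the same sequence returns the cached result instead of regrouping.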
Abstract:
A flexible AES instruction set for a general-purpose processor is provided. The instruction set includes instructions to perform a 'one round' pass for AES encryption or decryption and also includes instructions to perform key generation. An immediate may be used to indicate the round number and key size for key generation for 128-, 192-, or 256-bit keys. The flexible AES instruction set enables full use of pipelining capabilities because it does not require tracking of implicit registers.
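To make the key generation step concrete, here is a minimal Python sketch that derives the next 128-bit round key from the previous one, following the FIPS-197 key expansion. The function name `next_round_key_128` and its `round_number` parameter are assumptions for illustration; `round_number` stands in for the immediate the abstract mentions, and only the 128-bit key size is covered.

```python
def gf_mul(a: int, b: int) -> int:
    """Multiply two bytes in GF(2^8) modulo the AES polynomial 0x11B."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result


def sbox_byte(x: int) -> int:
    """AES S-box entry: field inverse followed by the affine map."""
    inv = 0
    if x:
        inv = 1
        for _ in range(254):  # x^254 is the inverse of x in GF(2^8)
            inv = gf_mul(inv, x)
    s = inv
    for shift in range(1, 5):
        s ^= ((inv << shift) | (inv >> (8 - shift))) & 0xFF
    return s ^ 0x63


SBOX = [sbox_byte(x) for x in range(256)]


def next_round_key_128(prev_key: bytes, round_number: int) -> bytes:
    """Derive round key `round_number` (1..10) from the previous round key."""
    # Round constant: 2^(round_number - 1) in GF(2^8).
    rcon = 1
    for _ in range(round_number - 1):
        rcon = gf_mul(rcon, 2)
    words = [prev_key[i:i + 4] for i in range(0, 16, 4)]
    # RotWord, SubWord, then XOR the round constant into the first byte.
    rotated = words[3][1:] + words[3][:1]
    temp = bytes(SBOX[b] for b in rotated)
    temp = bytes([temp[0] ^ rcon]) + temp[1:]
    # Each new word is the matching old word XOR the previously produced word.
    out = []
    prev = temp
    for word in words:
        prev = bytes(a ^ b for a, b in zip(word, prev))
        out.append(prev)
    return b"".join(out)
```

Because the round number is an explicit argument rather than hidden state, each call depends only on its inputs, which mirrors the abstract's point about avoiding implicit registers.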
Abstract:
Technologies for data decompression include a computing device that reads a symbol tag byte from an input stream. The computing device determines whether the symbol can be decoded using a fast-path routine, and if not, executes a slow-path routine to decompress the symbol. The slow-path routine may include data-dependent branch instructions that may be unpredictable using branch prediction hardware. For the fast-path routine, the computing device determines a next symbol increment value, a literal increment value, a data length, and an offset based on the tag byte, without executing an unpredictable branch instruction. The computing device sets a source pointer to either literal data or reference data as a function of the tag byte, without executing an unpredictable branch instruction. The computing device may set the source pointer using a conditional move instruction. The computing device copies the data and processes remaining symbols. Other embodiments are described and claimed.
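The fast-path idea can be sketched as a table lookup on the tag byte followed by an arithmetic select of the source. The Python example below is a minimal sketch under loud assumptions: the abstract does not name a format, so the tag layout is borrowed from the Snappy format's short literal and 1-byte-offset copy elements, and `TAG_TABLE`, `build_tag_table`, and `decompress` are hypothetical names. Python cannot guarantee branch-free machine code, so the sketch only shows the data flow that a conditional move would implement in native code.

```python
def build_tag_table():
    """Precompute per-tag fields so the decode loop needs no data-dependent branch.

    Each entry: (fast_ok, is_literal, length, tag_bytes, literal_inc,
                 offset_high, offset_mask)
    """
    table = []
    for tag in range(256):
        kind = tag & 3
        if kind == 0 and (tag >> 2) < 60:
            # Short literal: the length lives in the tag, data follows the tag.
            length = (tag >> 2) + 1
            table.append((True, 1, length, 1, length, 0, 0x00))
        elif kind == 1:
            # Copy with 1-byte offset: 3 high offset bits live in the tag.
            length = ((tag >> 2) & 7) + 4
            table.append((True, 0, length, 2, 0, (tag >> 5) << 8, 0xFF))
        else:
            table.append((False, 0, 0, 0, 0, 0, 0))
    return table


TAG_TABLE = build_tag_table()


def decompress(data: bytes) -> bytes:
    out = bytearray()
    pos = 0
    while pos < len(data):
        (fast_ok, is_literal, length, tag_bytes,
         literal_inc, off_hi, off_mask) = TAG_TABLE[data[pos]]
        if not fast_ok:
            # Rare, predictable branch to the slow path (omitted in this sketch).
            raise NotImplementedError("slow path omitted from this sketch")
        # The mask zeroes the offset for literals, so no branch is needed here.
        offset = off_hi | (data[pos + 1] & off_mask)
        # Select the source by indexing and arithmetic instead of an if/else.
        src_buf = (out, data)[is_literal]
        src_pos = is_literal * (pos + 1) + (1 - is_literal) * (len(out) - offset)
        if src_pos < 0:
            raise ValueError("copy offset reaches before start of output")
        # Byte-by-byte copy so overlapping references behave correctly.
        for i in range(length):
            out.append(src_buf[src_pos + i])
        # Next-symbol increment plus literal increment.
        pos += tag_bytes + literal_inc
    return bytes(out)
```

As a usage example, the input `bytes([8]) + b"abc" + bytes([9, 3])` encodes a 3-byte literal followed by a 6-byte copy at offset 3, which expands to `b"abcabcabc"`.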
Abstract:
An apparatus includes a decode unit to decode a permute instruction and a vector conflict instruction. A vector execution unit is coupled with the decode unit and includes a fully-connected interconnect. The fully-connected interconnect has at least four inputs to receive at least four corresponding data elements of at least one source vector. The fully-connected interconnect has at least four outputs. Each of the at least four inputs is coupled with each of the at least four outputs. The execution unit also includes permute instruction execution logic coupled with the at least four outputs and operable to store a first vector result in response to the permute instruction. The execution unit also includes vector conflict instruction execution logic coupled with the at least four outputs and operable to store a second vector result in a destination storage location in response to the vector conflict instruction.
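The reason one interconnect can serve both instructions is that each needs every output lane to see every input lane. The Python sketch below models that sharing. The names `crossbar`, `permute`, and `vector_conflict` are hypothetical, and the conflict result format (a bitmask of earlier lanes holding an equal value, similar to the AVX-512 `VPCONFLICTD` instruction) is an assumption, since the abstract does not define it.

```python
def crossbar(source):
    """Model a fully-connected interconnect: each output lane sees all inputs."""
    return [list(source) for _ in source]


def permute(source, indices):
    """Each output lane selects the input element named by its index.

    Assumes `indices` has one entry per lane of `source`.
    """
    lanes = crossbar(source)
    return [lanes[out][idx] for out, idx in enumerate(indices)]


def vector_conflict(source):
    """Each output lane compares its own element with all earlier elements.

    Bit j of lane i is set when element j (j < i) equals element i.
    """
    lanes = crossbar(source)
    result = []
    for out, visible in enumerate(lanes):
        mask = 0
        for j in range(out):
            if visible[j] == visible[out]:
                mask |= 1 << j
        result.append(mask)
    return result
```

Both functions consume the same `crossbar` output and differ only in what each lane does with it: a permute lane picks one input, while a conflict lane compares against the inputs before it.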