摘要:
A method of one aspect may include receiving a rotate instruction. The rotate instruction may indicate a source operand and a rotate amount. A result may be stored in a destination operand indicated by the rotate instruction. The result may have the source operand rotated by the rotate amount. Execution of the rotate instruction may complete without reading a carry flag.
摘要:
An embodiment may include circuitry that may be capable of performing compression-related operations that may include: (a) indicating, at least in part, in a data structure at least one position of at least one subset of characters that are to be encoded as a symbol, (b) comparing, at least in part, at least one pair of multi-byte data words that are of identical predetermined fixed size, (c) maintaining, at least in part, an array of pointers to potentially matching strings that are to be compared with at least one currently examined string, and/or (d) allocating, at least in part, a first buffer portion to store at least one portion of uncompressed data from an application buffer that is to be input for compression to produce a compressed data stream. Other embodiments are described and claimed.
摘要:
Erasure code syndrome computation based on Reed Solomon (RS) operations in a Galois field to permit reconstruction of data of more than 2 failed storage units. Syndrome computation may be performed with coefficient exponents that consist of −1, 0, and 1. A product xD of a syndrome is computed as a left-shift of data byte D, and selective compensation based on the most significant bit of D. A product x−1D of a syndrome is computed as a right-shift of data byte D, and selective compensation based on the most significant bit of D. Compensation may include bit-wise XORing shift results with a constant derived from an irreducible polynomial associated with the Galois field. A set of erasure code syndromes may be computed for each of multiple nested arrays of independent storage units. Data reconstruction includes solving coefficients of the syndromes as a Vandermonde matrix.
摘要:
Methods, systems, and apparatuses are disclosed for implementing fast large-integer arithmetic within an integrated circuit, such as on IA (Intel Architecture) processors, in which such means include receiving a 512-bit value for squaring, the 512-bit value having eight sub-elements each of 64-bits and performing a 512-bit squaring algorithm by: (i) multiplying every one of the eight sub-elements by itself to yield a square of each of the eight sub-elements, the eight squared sub-elements collectively identified as T1, (ii) multiplying every one of the eight sub-elements by the other remaining seven of the eight sub-elements to yield an asymmetric intermediate result having seven diagonals therein, wherein each of the seven diagonals are of a different length, (iii) reorganizing the asymmetric intermediate result having the seven diagonals therein into a symmetric intermediate result having four diagonals each of 7×1 sub-elements of the 64-bits in length arranged across a plurality of columns, (iv) adding all sub-elements within their respective columns, the added sub-elements collectively identified as T2, and (v) yielding a final 512-bit squared result of the 512-bit value by adding the value of T2 twice with the value of T1 once. Other related embodiments are disclosed.
摘要:
A unified integer/Galois-Field 2m multiplier performs multiply operations for public-key systems such as Rivert, Shamir, Aldeman (RSA), Diffie-Hellman key exchange (DH) and Elliptic Curve Cryptosystem (ECC). The multiply operations may be performed on prime fields and different composite binary fields in independent multipliers in an interleaved fashion.
摘要:
A flexible aes instruction set for a general purpose processor is provided. The instruction set includes instructions to perform a “one round” pass for aes encryption or decryption and also includes instructions to perform key generation. An immediate may be used to indicate round number and key size for key generation for 128/192/256 bit keys. The flexible aes instruction set enables full use of pipelining capabilities because it does not require tracking of implicit registers.
摘要:
Disclosed is an integrated circuit including a memory device including a first portion and a second portion. The first portion is a first type of content addressable memory (CAM) with a first set of cells and the second portion is a second type of CAM with a second set of cells. The first set of cells is smaller than the second set of cells. The integrated circuit further includes a decompression accelerator coupled to the memory device, the decompression accelerator to generate a plurality of length codes. Each of the plurality of length codes include at least one bit. The plurality of length codes are generated using a symbol received from an encoded data stream that includes a plurality of symbols. The decompression accelerator further to store the plurality of length codes in the first portion of the memory device in an order according to their respective number of bits.
摘要:
A method and apparatus to perform Cyclic Redundancy Check (CRC) operations on a data block using a plurality of different n-bit polynomials is provided. A flexible CRC instruction performs a CRC operation using a programmable n-bit polynomial. The n-bit polynomial is provided to the CRC instruction by storing the n-bit polynomial in one of two operands.
摘要:
Methods and apparatus to perform string matching for network packet inspection are disclosed. In some embodiments there is a set of string matching slice circuits, each slice circuit of the set being configured to perform string matching steps in parallel with other slice circuits. Each slice circuit may include an input window storing some number of bytes of data from an input data steam. The input window of data may be padded if necessary, and then multiplied by a polynomial modulo an irreducible Galois-field polynomial to generate a hash index. A storage location of a memory corresponding to the hash index may be accessed to generate a slice-hit signal of a set of H slice-hit signals. The slice-hit signal may be provided to an AND-OR logic array where the set of H slice-hit signals is logically combined into a match result.
摘要:
Methods and apparatuses relating to high-performance authenticated encryption are described. A hardware accelerator may include a vector register to store an input vector of a round of an encryption operation; a circuit including a first data path including a first modular adder coupled to a first input from the vector register and a second input from the vector register, and a second modular adder coupled to the first modular adder and a second data path from the vector register, and the second data path including a first logical XOR circuit coupled to the second input and a third data path from the vector register, a first rotate circuit coupled to the first logical XOR circuit, a second logical XOR circuit coupled to the first rotate circuit and the third data path, and a second rotate circuit coupled to the second logical XOR circuit; and a control circuit to cause the first modular adder and the second modular adder of the first data path and the first logical XOR circuit, the second logical XOR circuit, the first rotate circuit, and the second rotate circuit of the second data path to perform a portion of the round according to one or more control values, and store a first result from the first data path for the portion and a second result from the second data path for the portion into the vector register.