Abstract:
Compression and decompression technology within a solid-state device (SSD) is disclosed that provides a good compression ratio while taking up less on-chip area. An input interface receives an input stream to be compressed. An output interface provides a compressed stream. A history buffer has a fixed size that is a fraction of the size of a data buffer. Processing logic encodes element types, literals, and pointers into the compressed stream; the pointers reference copies of data found elsewhere within the history buffer during compression. The history buffer may be multiple banks in width, where the data is loaded from the input stream sequentially across rows of the banks. The decompression side may be similarly designed, optionally with a different number of banks. The pointers may be a fixed two bytes, including four bits for the length and eleven bits for the offset of the back reference to a copy (or another combination).
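The fixed two-byte pointer described above can be illustrated with a minimal sketch. The bit layout (length in the bits above the offset, top bit left unused, big-endian byte order) and the names `pack_pointer` and `unpack_pointer` are assumptions for illustration; the abstract only states the field widths.

```python
LENGTH_BITS = 4    # from the abstract: four bits for length
OFFSET_BITS = 11   # from the abstract: eleven bits for offset


def pack_pointer(length, offset):
    """Pack a back reference into two bytes (assumed layout, top bit unused)."""
    if not 0 <= length < (1 << LENGTH_BITS):
        raise ValueError("length does not fit in 4 bits")
    if not 0 <= offset < (1 << OFFSET_BITS):
        raise ValueError("offset does not fit in 11 bits")
    value = (length << OFFSET_BITS) | offset
    return value.to_bytes(2, "big")


def unpack_pointer(data):
    """Recover (length, offset) from a two-byte pointer."""
    value = int.from_bytes(data, "big")
    offset = value & ((1 << OFFSET_BITS) - 1)
    length = (value >> OFFSET_BITS) & ((1 << LENGTH_BITS) - 1)
    return length, offset
```

With eleven offset bits, a pointer can reach back at most 2047 positions, which is consistent with a small, fixed-size history buffer.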
Abstract:
Apparatus and method for efficient compression block decoding using a content-addressable structure for header processing. For example, one embodiment of an apparatus comprises: a header parser to extract a sequence of tokens and corresponding length values from a header of a compression block, the tokens and corresponding length values associated with a type of compression used to compress a payload of the compression block; and a content-addressable data structure builder to construct a content-addressable data structure based on the tokens and length values, the content-addressable data structure builder to write an entry in the content-addressable data structure comprising a length value and a count value, the count value indicating a number of times the length value was previously written to an entry in the content-addressable data structure.
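The entry format described above (a length value paired with a count of earlier entries carrying the same length) can be sketched in software as follows. This is only a model of the bookkeeping, assuming the header has already been parsed into `(token, length)` pairs; the function name `build_entries` and the choice to keep the token in each entry are hypothetical.

```python
from collections import defaultdict


def build_entries(header):
    """Return one (length, count, token) entry per parsed header pair.

    count is the number of times the same length value was written
    by earlier entries, as the abstract describes.
    """
    times_written = defaultdict(int)
    entries = []
    for token, length in header:
        entries.append((length, times_written[length], token))
        times_written[length] += 1
    return entries
```

Because each `(length, count)` pair is unique, a lookup keyed on both values identifies exactly one token, which is the property a content-addressable search would rely on.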
Abstract:
Methods and apparatuses relating to offload operations are described. In one embodiment, a hardware processor includes a core to execute a thread and offload an operation; and a first and second hardware accelerator to execute the operation, wherein the first and second hardware accelerator are coupled to shared buffers to store output data from the first hardware accelerator and provide the output data as input data to the second hardware accelerator, an input buffer descriptor array of the second hardware accelerator with an entry for each respective shared buffer, an input buffer response descriptor array of the second hardware accelerator with a corresponding response entry for each respective shared buffer, an output buffer descriptor array of the first hardware accelerator with an entry for each respective shared buffer, and an output buffer response descriptor array of the first hardware accelerator with a corresponding response entry for each respective shared buffer.
Abstract:
An apparatus and method for performing parallel decoding of prefix codes such as Huffman codes. For example, one embodiment of an apparatus comprises: a first decompression module to perform a non-speculative decompression of a first portion of a prefix code payload comprising a first plurality of symbols; and a second decompression module to perform speculative decompression of a second portion of the prefix code payload comprising a second plurality of symbols concurrently with the non-speculative decompression performed by the first decompression module.
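The idea of pairing a non-speculative decoder with a speculative one can be modeled sequentially, as in the sketch below. It assumes the payload is a string of '0' and '1' characters, that `table` maps code words to symbols, and that a mis-speculated result is simply discarded and re-decoded; real hardware would run the two decoders concurrently and may recover differently. All names are hypothetical.

```python
def decode(bits, table, start, stop):
    """Decode prefix-code symbols beginning at start until position >= stop."""
    out, pos = [], start
    while pos < stop:
        end = pos + 1
        while end <= len(bits) and bits[pos:end] not in table:
            end += 1
        if end > len(bits):
            break  # incomplete trailing code word
        out.append(table[bits[pos:end]])
        pos = end
    return out, pos


def speculative_decode(bits, table, split):
    """Return (symbols, speculation_was_correct)."""
    # Non-speculative decoder: starts at a known code-word boundary.
    first, first_end = decode(bits, table, 0, split)
    # Speculative decoder: guesses that `split` is a code-word boundary.
    second, _ = decode(bits, table, split, len(bits))
    if first_end == split:
        return first + second, True
    # Guess was wrong: discard speculative output, decode from true boundary.
    rest, _ = decode(bits, table, first_end, len(bits))
    return first + rest, False
```

For example, with `table = {"0": "a", "10": "b", "11": "c"}` and `bits = "01011010"`, a split at bit 3 lands on a boundary and the speculative output is kept, while a split at bit 4 does not and the second portion is decoded again.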
Abstract:
A processing system includes a memory and a cryptographic accelerator module operatively coupled to the memory, the cryptographic accelerator module employed to implement a byte substitute operation by performing: a first mapped affine transformation of an input bit sequence to produce a first intermediate bit sequence, an inverse transformation of the first intermediate bit sequence to produce a second intermediate bit sequence, and a second mapped affine transformation of the second intermediate bit sequence to produce an output bit sequence.
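For reference, the conventional AES byte substitution can be computed as a multiplicative inverse in GF(2^8) followed by a single affine transformation. The sketch below shows that simpler two-stage form, assuming the byte substitute operation in question is the AES one; it does not reproduce the mapped transformations the abstract describes, and the function names are hypothetical.

```python
def gf_mul(a, b):
    """Multiply in GF(2^8) modulo the AES polynomial x^8 + x^4 + x^3 + x + 1."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result


def gf_inv(a):
    """Multiplicative inverse via a^254; zero maps to zero by convention."""
    result, base, exp = 1, a, 254
    while exp:
        if exp & 1:
            result = gf_mul(result, base)
        base = gf_mul(base, base)
        exp >>= 1
    return result if a else 0


def affine(x):
    """AES affine transformation: XOR of four left rotations plus 0x63."""
    def rotl(v, n):
        return ((v << n) | (v >> (8 - n))) & 0xFF
    return x ^ rotl(x, 1) ^ rotl(x, 2) ^ rotl(x, 3) ^ rotl(x, 4) ^ 0x63


def sub_byte(x):
    """Inverse transformation followed by the affine transformation."""
    return affine(gf_inv(x))
```

A mapped design would move the inversion into a smaller composite field, where it is cheaper in hardware, and fold the field conversions into the transformations on either side.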
Abstract:
Vector instructions for performing SNOW 3G wireless security operations are received and executed by the execution circuitry of a processor. The execution circuitry receives a first operand of the first instruction specifying a first vector register that stores a current state of a finite state machine (FSM). The execution circuitry also receives a second operand of the first instruction specifying a second vector register that stores data elements of a linear feedback shift register (LFSR) that are needed for updating the FSM. The execution circuitry executes the first instruction to produce an updated state of the FSM and an output of the FSM in a destination operand of the first instruction.
Abstract:
Systems, methods, and apparatuses for low-latency page efficient chained decryption and decompression acceleration are described. In one embodiment, a processor comprises a hardware processor core, and an accelerator circuit coupled to the hardware processor core, the accelerator circuit to: in response to a descriptor, comprising an indication of a hash key and encrypted data to be decrypted, from the hardware processor core, perform a determination that the encrypted data is to be read in an encrypted order or a reverse order from the encrypted order, in response to the determination that the encrypted data is to be read in the reverse order, generate a resultant authentication tag in the reverse order for the encrypted data based at least in part on the hash key without reordering the encrypted data in the reverse order into the encrypted order, and, in response to the determination that the encrypted data is to be read in the encrypted order, generate the resultant authentication tag in the encrypted order for the encrypted data based at least in part on the hash key.
Abstract:
An apparatus and method for loading and storing multiple sets of packed data elements. For example, one embodiment of a processor comprises: a decoder to decode a multiple load instruction to generate a decoded multiple load instruction comprising a plurality of operations, the multiple load instruction including an opcode, source operands, and at least one destination operand; a first source register to store N packed index values; a second source register to store a base address value; execution circuitry to execute the operations of the decoded multiple load instruction, the execution circuitry comprising: parallel address generation circuitry to combine the base address from the second source register with each of the N packed index values to generate N system memory addresses; data load circuitry to cause N sets of data elements to be retrieved from the N system memory addresses, the data load circuitry to store the N sets of data elements in N vector destination registers identified by the at least one destination operand.
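The address generation and load steps described above can be modeled in a few lines. This sketch assumes memory is a flat Python list, that each index is an unscaled element offset added to the base, and that every set has the same number of elements; `multi_load` and `elems_per_set` are hypothetical names, not part of any instruction set.

```python
def multi_load(memory, base, indices, elems_per_set):
    """Model a multiple load: N indices produce N destination 'registers'."""
    # Address generation: combine the base with each packed index.
    addresses = [base + index for index in indices]
    # Data load: fetch one set of elements per generated address.
    return [memory[addr:addr + elems_per_set] for addr in addresses]
```

For instance, with `memory = list(range(100))`, the call `multi_load(memory, 10, [0, 8, 32], 4)` yields three sets starting at addresses 10, 18, and 42.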
Abstract:
Techniques and apparatus for discrete compression and decompression processes are described. In one embodiment, for example, an apparatus may include at least one memory and logic, at least a portion of the logic comprised in hardware coupled to the at least one memory, the logic to determine a compression configuration to compress source data, generate discrete compressed data comprising at least one high-level block comprising a header and at least one discrete block based on the compression configuration, and generate index information for accessing the at least one discrete block. Other embodiments are described and claimed.
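One way to picture discrete blocks with index information is to compress fixed-size chunks independently and record where each compressed chunk sits in the output. The sketch below does this with Python's `zlib`; the header layout (block count and block size as two little-endian 32-bit integers), the default `block_size`, and the function names are assumptions for illustration, not the format the abstract claims.

```python
import struct
import zlib


def compress_discrete(data, block_size=4096, level=6):
    """Return (container, index); index holds (offset, length) per block."""
    blocks = [
        zlib.compress(data[i:i + block_size], level)
        for i in range(0, len(data), block_size)
    ]
    header = struct.pack("<II", len(blocks), block_size)
    index = []
    offset = len(header)
    for block in blocks:
        index.append((offset, len(block)))
        offset += len(block)
    return header + b"".join(blocks), index


def read_block(container, index, n):
    """Decompress only block n, using the index to locate it."""
    offset, length = index[n]
    return zlib.decompress(container[offset:offset + length])
```

Because each block is compressed on its own, `read_block` can return the middle of the source data without touching the blocks before it, at the cost of a somewhat lower compression ratio than a single stream.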