-
公开(公告)号:US12175246B2
公开(公告)日:2024-12-24
申请号:US18460497
申请日:2023-09-01
Applicant: Intel Corporation
Inventor: Dan Baum , Michael Espig , James Guilford , Wajdi K. Feghali , Raanan Sade , Christopher J. Hughes , Robert Valentine , Bret Toll , Elmoustapha Ould-Ahmed-Vall , Mark J. Charney , Vinodh Gopal , Ronen Zohar , Alexander F. Heinecke
Abstract: Disclosed embodiments relate to matrix compress/decompress instructions. In one example, a processor includes fetch circuitry to fetch a compress instruction having a format with fields to specify an opcode and locations of decompressed source and compressed destination matrices, decode circuitry to decode the fetched compress instructions, and execution circuitry, responsive to the decoded compress instruction, to: generate a compressed result according to a compress algorithm by compressing the specified decompressed source matrix by either packing non-zero-valued elements together and storing the matrix position of each non-zero-valued element in a header, or using fewer bits to represent one or more elements and using the header to identify matrix elements being represented by fewer bits; and store the compressed result to the specified compressed destination matrix.
-
12.
公开(公告)号:US12021551B2
公开(公告)日:2024-06-25
申请号:US17133609
申请日:2020-12-23
Applicant: Intel Corporation
Inventor: James Guilford , Vinodh Gopal , Daniel Cutter , Kirk Yap
CPC classification number: H03M7/40 , G06F3/0608 , G06F3/0638 , G06F3/0673 , G11C15/00 , H03M7/3086
Abstract: Apparatus and method for efficient compression block decoding using content-addressable structure for header processing. For example, one embodiment of an apparatus comprises: a header parser to extract a sequence of tokens and corresponding length values from a header of a compression block, the tokens and corresponding length values associated with a type of compression used to compress a payload of the compression block; and a content-addressable data structure builder to construct a content-addressable data structure based on the tokens and length values, the content-addressable data structure builder to write an entry in the content-addressable data structure comprising a length value and a count value, the count value indicating a number of times the length value was previously written to an entry in the content-addressable data structure.
-
13.
公开(公告)号:US11955995B2
公开(公告)日:2024-04-09
申请号:US16872144
申请日:2020-05-11
Applicant: Intel Corporation
Inventor: James Guilford , Vinodh Gopal , Daniel Cutter , Kirk Yap , Wajdi Feghali , George Powley
CPC classification number: H03M7/607 , H03M7/3066 , H03M7/3084 , H03M7/46
Abstract: A lossless data compressor of an aspect includes a first lossless data compressor circuitry coupled to receive input data. The first lossless data compressor circuitry is to apply a first lossless data compression approach to compress the input data to generate intermediate compressed data. The apparatus also includes a second lossless data compressor circuitry coupled with the first lossless data compressor circuitry to receive the intermediate compressed data. The second lossless data compressor circuitry is to apply a second lossless data compression approach to compress at least some of the intermediate compressed data to generate compressed data. The second lossless data compression approach different than the first lossless data compression approach. Lossless data decompressors are also disclosed, as are methods of lossless data compression and decompression.
-
14.
公开(公告)号:US11516013B2
公开(公告)日:2022-11-29
申请号:US16022619
申请日:2018-06-28
Applicant: Intel Corporation
Inventor: James Guilford , Vinodh Gopal , Kirk Yap
Abstract: Disclosed embodiments relate to encrypting or decrypting confidential data with additional authentication data by an accelerator and a processor. In one example, a processor includes processor circuitry to compute a first hash of a first block of data stored in a memory, store the first hash in the memory, and generate an authentication tag based in part on a second hash. The processor further includes accelerator circuitry to obtain the first hash from the memory, decrypt a second block of data using the first hash, and compute the second hash based in part on the first hash and the second block of data.
-
公开(公告)号:US10719323B2
公开(公告)日:2020-07-21
申请号:US16144902
申请日:2018-09-27
Applicant: Intel Corporation
Inventor: Dan Baum , Michael Espig , James Guilford , Wajdi K. Feghali , Raanan Sade , Christopher J. Hughes , Robert Valentine , Bret Toll , Elmoustapha Ould-Ahmed-Vall , Mark J. Charney , Vinodh Gopal , Ronen Zohar , Alexander F. Heinecke
Abstract: Disclosed embodiments relate to matrix compress/decompress instructions. In one example, a processor includes fetch circuitry to fetch a compress instruction having a format with fields to specify an opcode and locations of decompressed source and compressed destination matrices, decode circuitry to decode the fetched compress instructions, and execution circuitry, responsive to the decoded compress instruction, to: generate a compressed result according to a compress algorithm by compressing the specified decompressed source matrix by either packing non-zero-valued elements together and storing the matrix position of each non-zero-valued element in a header, or using fewer bits to represent one or more elements and using the header to identify matrix elements being represented by fewer bits; and store the compressed result to the specified compressed destination matrix.
-
16.
公开(公告)号:US11989582B2
公开(公告)日:2024-05-21
申请号:US17033760
申请日:2020-09-26
Applicant: Intel Corporation
Inventor: James Guilford , George Powley , Vinodh Gopal , Wajdi Feghali
CPC classification number: G06F9/4881 , G06F9/3887 , G06F2209/483
Abstract: Apparatus and method for performing low-latency multi-job submission via a single job descriptor is described herein. An apparatus embodiment includes a plurality of descriptor queues to stores job descriptors describing work to be performed and enqueue circuitry to receive a first job descriptor which includes a first field to store a Single Instruction Multiple Data (SIMD) width. If the SIMD width indicates that the first job descriptor is an SIMD job descriptor and open slots are available in the descriptor queues to store new job descriptors, then the enqueue circuitry is to generate a plurality of job descriptors based on fields of the first job descriptor and to store them in the open slots of the descriptor queues. The generated job descriptors are processed by processing pipelines to perform the work described. At least some of the generated job descriptors are processed concurrently or in parallel by different processing pipelines.
-
公开(公告)号:US11748103B2
公开(公告)日:2023-09-05
申请号:US17672253
申请日:2022-02-15
Applicant: Intel Corporation
Inventor: Dan Baum , Michael Espig , James Guilford , Wajdi K. Feghali , Raanan Sade , Christopher J. Hughes , Robert Valentine , Bret Toll , Elmoustapha Ould-Ahmed-Vall , Mark J. Charney , Vinodh Gopal , Ronen Zohar , Alexander F. Heinecke
CPC classification number: G06F9/30178 , G06F9/3013 , G06F9/30036 , G06F9/30145 , G06F9/3802
Abstract: Disclosed embodiments relate to matrix compress/decompress instructions. In one example, a processor includes fetch circuitry to fetch a compress instruction having a format with fields to specify an opcode and locations of decompressed source and compressed destination matrices, decode circuitry to decode the fetched compress instructions, and execution circuitry, responsive to the decoded compress instruction, to: generate a compressed result according to a compress algorithm by compressing the specified decompressed source matrix by either packing non-zero-valued elements together and storing the matrix position of each non-zero-valued element in a header, or using fewer bits to represent one or more elements and using the header to identify matrix elements being represented by fewer bits; and store the compressed result to the specified compressed destination matrix.
-
18.
公开(公告)号:US10922079B2
公开(公告)日:2021-02-16
申请号:US15856245
申请日:2017-12-28
Applicant: Intel Corporation
Inventor: Vinodh Gopal , Kirk S. Yap , James Guilford , Simon N. Peffers
IPC: G06F9/00 , G06F9/30 , G06F16/2455 , G06F16/2453 , G06F16/245
Abstract: Data element filter logic (“hardware accelerator”) in a processor that offloads computation for an in-memory database select/extract operation from a Central Processing Unit (CPU) core in the processor is provided. The Data element filter logic provides a balanced performance across an entire range of widths (number of bits) of data elements in a column-oriented Database Management System.
-
公开(公告)号:US20190391869A1
公开(公告)日:2019-12-26
申请号:US16013710
申请日:2018-06-20
Applicant: Intel Corporation
Inventor: Vinodh Gopal , James Guilford , Daniel Cutter , Kirk Yap
Abstract: A processing device comprising compression circuitry to: determine a compression configuration to compress source data; generate a checksum of the source data in an uncompressed state; compress the source data into at least one block based on the compression configuration, wherein the at least one block comprises: a plurality of sub-blocks, wherein the plurality of sub-block includes a predetermined size; a block header corresponding to the plurality of sub-blocks; and decompression circuitry coupled to the compression circuitry, wherein the decompression circuitry to: while not outputting a decompressed data stream of the source data: generate index information corresponding to the plurality of sub-blocks; in response to generating the index information, generate a checksum of the compressed source data associated with the plurality of sub-blocks; and determine whether the checksum of the source data in the uncompressed format matches the checksum of the compressed source data.
-
公开(公告)号:US20190042611A1
公开(公告)日:2019-02-07
申请号:US15868594
申请日:2018-01-11
Applicant: Intel Corporation
Inventor: Kirk Yap , James Guilford , Daniel Cutter , Vinodh Gopal
IPC: G06F17/30
Abstract: Technologies for determining unique values include a computing device that further includes one or more accelerator devices. Each accelerator device is to receive input data and query configuration data, the input data including a packed array of unsigned integers of column data from a database and the query configuration data including an element width of the input data, and generate, in response to receiving the query configuration data, a bit-map output table based on the query configuration data, generate a write request for each element of the input data to set a corresponding bit-map output bit of the bit-map output table which corresponds to an element value of the corresponding element. Subsequently, the accelerator device is further to set the corresponding bit-map output bit to indicate a presence of the corresponding element and output the bit-map output table indicative of unique elements that are present in the input data.
-
-
-
-
-
-
-
-
-