-
11.
Publication number: US10248488B2
Publication date: 2019-04-02
Application number: US14983026
Filing date: 2015-12-29
Applicant: Intel Corporation
Inventor: Elmoustapha Ould-Ahmed-Vall, Suleyman Sair, Kshitij A. Doshi, Charles R. Yount
Abstract: Systems, methods, and apparatuses for fault tolerance and detection are described. For example, an apparatus is described that includes circuitry to replicate input sources of an instruction; arithmetic logic unit (ALU) circuitry to execute the instruction with the replicated input sources using single instruction, multiple data (SIMD) hardware to produce a packed data result; and comparison circuitry coupled to the ALU circuitry to evaluate the packed data result and output a singular data result into a destination of the instruction.
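The replicate, execute, and compare flow in this abstract can be illustrated with a small software model. This is only a sketch under stated assumptions: the abstract does not say how the comparison circuitry reduces the packed result, so majority voting over three lanes is assumed here, and the names `replicate`, `simd_add`, `compare_lanes`, and the `fault_lane` test hook are hypothetical.

```python
from collections import Counter


def replicate(value, lanes=3):
    """Copy one scalar input source into every SIMD lane."""
    return [value] * lanes


def simd_add(a_lanes, b_lanes, fault_lane=None):
    """Lane-wise 32-bit add; optionally flip one bit in one lane to model a fault."""
    out = [(a + b) & 0xFFFFFFFF for a, b in zip(a_lanes, b_lanes)]
    if fault_lane is not None:
        out[fault_lane] ^= 1  # hypothetical injected single-bit error
    return out


def compare_lanes(packed):
    """Reduce the packed result to one value (assumed policy: strict majority)."""
    value, votes = Counter(packed).most_common(1)[0]
    if votes * 2 <= len(packed):
        raise RuntimeError("no majority among lanes; fault is not correctable")
    fault_detected = votes != len(packed)
    return value, fault_detected


def fault_tolerant_add(a, b, lanes=3, fault_lane=None):
    """Replicate the inputs, execute on all lanes, then compare the lanes."""
    packed = simd_add(replicate(a, lanes), replicate(b, lanes), fault_lane)
    return compare_lanes(packed)
```

With three lanes, a fault confined to one lane is both detected and masked; with two lanes the same comparison could only detect a mismatch.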
-
Publication number: US09792119B2
Publication date: 2017-10-17
Application number: US15267668
Filing date: 2016-09-16
Applicant: Intel Corporation
Inventor: Elmoustapha Ould-Ahmed-Vall, Kshitij A. Doshi, Suleyman Sair, Charles R. Yount
CPC classification number: G06F9/30036, G06F7/22, G06F7/544, G06F9/30018, G06F9/30021, G06F9/30101, G06F9/30145, G06F9/3016, G06F11/1048, G06F11/1479
Abstract: Instructions and logic provide vector horizontal majority voting functionality. Some embodiments, responsive to an instruction specifying a destination operand, a size of the vector elements, a source operand, and a mask corresponding to a portion of the vector element data fields in the source operand, read a number of values from data fields of the specified size in the source operand, corresponding to the mask specified by the instruction, and store a result value to that number of corresponding data fields in the destination operand, the result value computed from the majority of values read from the number of data fields of the source operand.
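A scalar model may help make the masked horizontal vote concrete. This is a sketch, not the instruction's defined semantics: the function name, the list-based operands, the element-wise (rather than bitwise) vote, and the tie-breaking rule (the first value encountered wins) are all assumptions.

```python
from collections import Counter


def vector_horizontal_majority(src, mask, dest):
    """Vote across the masked elements of src and write the winner to the
    same masked positions of a copy of dest; unmasked positions are kept."""
    selected = [i for i in range(len(src)) if (mask >> i) & 1]
    out = list(dest)
    if not selected:
        return out  # empty mask: nothing is read or written
    # Assumed tie rule: Counter keeps insertion order among equal counts.
    winner, _ = Counter(src[i] for i in selected).most_common(1)[0]
    for i in selected:
        out[i] = winner
    return out
```

For example, with `src = [7, 7, 9, 7, 1, 1, 1, 1]` and `mask = 0b00001111`, only the low four elements take part in the vote, so the value 7 is written to the low four positions of the destination.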
-
13.
Publication number: US20170185132A1
Publication date: 2017-06-29
Application number: US14757903
Filing date: 2015-12-23
Applicant: Intel Corporation
Inventor: Devadatta Bodas, Meenakshi Arunachalam, Ilya Sharapov, Charles R. Yount, Scott B. Huck, Ramakrishna Huggahalli, Justin J. Song, Brian J. Griffith, Muralidhar Rajappa, Lingdan (Linda) Zeng
CPC classification number: G06F1/3206, G06F1/324, G06F11/3428, Y02D10/126
Abstract: A method of assessing energy efficiency of a high-performance computing (HPC) system is shown, including: selecting a plurality of HPC workloads to run on a system under test (SUT) with one or more power constraints, wherein the SUT includes a plurality of HPC nodes in the HPC system; executing the plurality of HPC workloads on the SUT; and generating a benchmark metric for the SUT based on a baseline configuration for each selected HPC workload and a plurality of measured performance per power values for each executed workload at each selected power constraint.
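One way such a benchmark metric could be combined is sketched below. The abstract does not give the aggregation formula, so everything here is an assumption: each measured performance-per-power value is normalized by the workload's baseline performance per watt, and the normalized ratios are combined with a geometric mean. The function name and the data layout are hypothetical.

```python
from math import prod


def benchmark_metric(measurements, baselines):
    """measurements: {workload: {power_cap_watts: (performance, measured_watts)}}
    baselines:    {workload: baseline performance per watt}
    Returns the geometric mean of the normalized performance-per-watt ratios
    (an assumed aggregation, not one taken from the abstract)."""
    ratios = []
    for workload, by_cap in measurements.items():
        for _cap, (performance, watts) in by_cap.items():
            ratios.append((performance / watts) / baselines[workload])
    return prod(ratios) ** (1.0 / len(ratios))
```

A geometric mean is chosen in this sketch because it treats ratios symmetrically, so one workload with a large absolute score does not dominate the result.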
-
Publication number: US20170177349A1
Publication date: 2017-06-22
Application number: US14977356
Filing date: 2015-12-21
Applicant: Intel Corporation
CPC classification number: G06F9/30029, G06F9/30, G06F9/3802, G06F9/3818, G06F15/8007
Abstract: A processor includes an execution unit to execute instructions to load indices from an array of indices, optionally perform a gather, and prefetch (to a specified cache) elements for a future gather from arbitrary locations in memory. The execution unit includes logic to load, for each element to be gathered or prefetched, an index value to be used in computing the address in memory for the element. The index value may be retrieved from an array of indices that is identified for the instruction. The execution unit includes logic to compute the address based on the sum of a base address that is specified for the instruction and the index value that was retrieved for the data element, with or without scaling. The execution unit includes logic to store gathered data elements in contiguous locations in a destination vector register that is specified for the instruction.
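The combined gather and prefetch described above can be modeled in a few lines. This is a sketch under assumptions: memory is a Python list addressed by element, the cache is a plain dictionary, and the `prefetch_distance` parameter (how far ahead in the index array the prefetch looks) is hypothetical, since the abstract does not say how the future elements are selected.

```python
def load_indices_gather_prefetch(memory, base, indices, start, lanes,
                                 scale=1, prefetch_distance=None, cache=None):
    """Gather `lanes` elements into contiguous destination slots and,
    optionally, prefetch the elements a future gather would touch."""
    dest = []
    for lane in range(lanes):
        index = indices[start + lane]               # load index from the index array
        dest.append(memory[base + index * scale])   # address = base + index * scale
    if prefetch_distance is not None and cache is not None:
        for lane in range(lanes):
            pos = start + prefetch_distance + lane
            if pos < len(indices):                  # do not read past the index array
                addr = base + indices[pos] * scale
                cache[addr] = memory[addr]          # model "bring the line into cache"
    return dest
```

Calling it in a loop with `start` advancing by `lanes` and `prefetch_distance` set to one or more iterations ahead mirrors the usual software-prefetch pattern for indirect accesses.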
-
Publication number: US20170177346A1
Publication date: 2017-06-22
Application number: US14975809
Filing date: 2015-12-20
Applicant: Intel Corporation
CPC classification number: G06F12/0875, G06F9/30032, G06F9/30036, G06F9/30043, G06F9/30047, G06F9/345, G06F9/35, G06F9/3555, G06F9/383, G06F12/0862, G06F2212/1016, G06F2212/452, G06F2212/6028
Abstract: A processor includes an execution unit to execute instructions to load indices from an array of indices, optionally perform scatters, and prefetch (to a specified cache) contents of target locations for future scatters from arbitrary locations in memory. The execution unit includes logic to load, for each target location of a scatter or prefetch operation, an index value to be used in computing the address in memory for the operation. The index value may be retrieved from an array of indices identified for the instruction. The execution unit includes logic to compute the addresses based on the sum of a base address specified for the instruction, the index value retrieved for the location, and a prefetch offset (for prefetch operations), with optional scaling. The execution unit includes logic to retrieve data elements from contiguous locations in a source vector register specified for the instruction to be scattered to the memory.
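The scatter variant can be sketched the same way. The assumptions are: memory is a mutable list addressed by element, the cache is modeled as a set of prefetched addresses, and the prefetch offset is added unscaled to the scatter address (the abstract only says the address is the sum of base, index, and prefetch offset, with optional scaling). The function name is hypothetical.

```python
def load_indices_scatter_prefetch(memory, base, indices, src,
                                  scale=1, prefetch_offset=None, cache=None):
    """Scatter contiguous source-vector elements to memory[base + index * scale]
    and optionally record prefetch targets for future scatters."""
    for lane, value in enumerate(src):              # contiguous lanes of the source
        addr = base + indices[lane] * scale         # scatter address
        memory[addr] = value
        if prefetch_offset is not None and cache is not None:
            target = addr + prefetch_offset         # base + index * scale + offset
            if 0 <= target < len(memory):
                cache.add(target)                   # model prefetching the target location
```

Prefetching the target locations ahead of the store matters for scatters because a write to an uncached line typically needs the line brought in first.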
-
Publication number: US10540177B2
Publication date: 2020-01-21
Application number: US15438712
Filing date: 2017-02-21
Applicant: Intel Corporation
Inventor: Elmoustapha Ould-Ahmed-Vall, Suleyman Sair, Kshitij A. Doshi, Charles R. Yount, Bret L. Toll
Abstract: A processor core includes a hardware decode unit to decode vector instructions for decompressing a run length encoded (RLE) set of source data elements and an execution unit to execute the decoded instructions. The execution unit generates a first mask by comparing the set of source data elements with a set of zeros and then counts the trailing zeros in the mask. A second mask is made based on the count of trailing zeros. The execution unit then copies the set of source data elements to a buffer using the second mask and then reads the number of RLE zeros from the set of source data elements. The buffer is shifted and copied to a result, and the set of source data elements is shifted to the right. If more valid data elements remain in the set of source data elements, this is repeated until all valid data is processed.
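The mask-based loop in this abstract can be followed step by step in a scalar model. The sketch below assumes a specific encoding that the abstract does not spell out: nonzero elements are literals, and a zero element is a marker followed by one element holding the length of the zero run. Lane 0 is treated as the lowest lane, so "shifting right" is modeled by dropping leading list elements. The function name is hypothetical.

```python
def rle_decompress(src):
    """Expand zero runs encoded as (0, run_length) pairs (assumed encoding)."""
    src = list(src)
    result = []
    while src:
        # First mask: bit i is set where src[i] equals zero.
        zero_mask = 0
        for i, value in enumerate(src):
            if value == 0:
                zero_mask |= 1 << i
        # Trailing-zero count of the mask = number of literals before the marker.
        if zero_mask == 0:
            literals = len(src)
        else:
            literals = (zero_mask & -zero_mask).bit_length() - 1
        # Second mask: selects the low `literals` lanes to copy into the buffer.
        copy_mask = (1 << literals) - 1
        buffer = [v for i, v in enumerate(src) if (copy_mask >> i) & 1]
        result.extend(buffer)
        if zero_mask == 0:
            break  # no marker left, so all valid data has been processed
        if literals + 1 >= len(src):
            raise ValueError("zero marker without a run length")
        run = src[literals + 1]          # read the number of RLE zeros
        result.extend([0] * run)
        src = src[literals + 2:]         # shift out literals, marker, and count
    return result
```

For the input `[5, 3, 0, 4, 7, 0, 2, 9]` the first pass copies `5, 3` and expands four zeros, the second copies `7` and expands two zeros, and the last copies `9`.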
-
Publication number: US20170177363A1
Publication date: 2017-06-22
Application number: US14979231
Filing date: 2015-12-22
Applicant: Intel Corporation
CPC classification number: G06F9/30036, G06F9/30043, G06F9/30101, G06F9/3016, G06F9/345, G06F9/3555, G06F12/0862, G06F12/0875, G06F2212/1016, G06F2212/452, G06F2212/6028
Abstract: A processor includes an execution unit to execute instructions to load indices from an array of indices and gather elements from random locations or locations in sparse memory based on those indices. The execution unit includes logic to load, for each data element to be gathered by the instruction, as needed, an index value to be used in computing the address in memory of a particular data element to be gathered. The index value may be retrieved from an array of indices that is identified for the instruction. The execution unit includes logic to compute the address as the sum of a base address that is specified for the instruction and the index value that was retrieved for the data element, with or without scaling. The execution unit includes logic to store the gathered data elements in contiguous locations in a destination vector register that is specified for the instruction.
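The address computation in this abstract (base address plus index value, with or without scaling) is easiest to see against byte-addressed memory. The sketch below assumes 32-bit little-endian elements and a `bytes` buffer standing in for memory; the function name and parameters are hypothetical.

```python
import struct


def load_indices_and_gather(memory, base, indices, scale=4, lanes=4):
    """Gather 32-bit elements from byte address base + index * scale
    into contiguous destination lanes."""
    dest = []
    for lane in range(lanes):
        addr = base + indices[lane] * scale          # scaled byte address
        (value,) = struct.unpack_from("<I", memory, addr)
        dest.append(value)                           # contiguous destination lanes
    return dest
```

Setting `scale` to the element size lets the index array hold element numbers rather than byte offsets; with `scale=1` the indices would be used as raw byte offsets.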