-
Publication No.: US12130740B2
Publication Date: 2024-10-29
Application No.: US17712632
Filing Date: 2022-04-04
Applicant: Intel Corporation
Inventors: Jason W. Brandt, Robert S. Chappell, Jesus Corbal, Edward T. Grochowski, Stephen H. Gunther, Buford M. Guy, Thomas R. Huff, Christopher J. Hughes, Elmoustapha Ould-Ahmed-Vall, Ronak Singhal, Seyed Yahya Sotoudeh, Bret L. Toll, Lihu Rappoport, David B. Papworth, James D. Allen
IPC Classification: G06F12/0831, G06F9/30, G06F9/38, G06F12/1009, G06F12/1027
CPC Classification: G06F12/0831, G06F9/30043, G06F9/384, G06F12/1009, G06F12/1027, G06F2212/1016, G06F2212/621, G06F2212/68
Abstract: Embodiments of a processor architecture are disclosed. In an embodiment, a processor includes a decoder, an execution unit, a coherent cache, and an interconnect. The decoder is to decode an instruction to zero a cache line. The execution unit is to issue a write command to initiate a cache line sized write of zeros. The coherent cache is to receive the write command, to determine whether there is a hit in the coherent cache and whether a cache coherency protocol state of the hit cache line is a modified state or an exclusive state, to configure a cache line to indicate all zeros, and to issue the write command toward the interconnect. The interconnect is to, responsive to receipt of the write command, issue a snoop to each of a plurality of other coherent caches for which it must be determined if there is a hit.
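The flow in this abstract can be pictured with a small software model. This is only a sketch under stated assumptions, not the patented design: the 64-byte line size, the `Cache` class, the `zero_cache_line` helper, the choice to skip snoops on a modified or exclusive hit, and the final "M" state of the zeroed line are all illustrative choices that the abstract does not spell out.

```python
from dataclasses import dataclass, field

LINE_SIZE = 64  # assumed cache line size in bytes


@dataclass
class Cache:
    # address -> (coherency state, line data); states use MESI letters
    lines: dict = field(default_factory=dict)

    def snoop_invalidate(self, address: int) -> bool:
        """Drop the line if present; return True on a snoop hit."""
        return self.lines.pop(address, None) is not None


def zero_cache_line(issuer: Cache, others: list, address: int) -> int:
    """Zero one line in `issuer`; return how many other caches were snooped."""
    entry = issuer.lines.get(address)
    owned = entry is not None and entry[0] in ("M", "E")
    snoops = 0
    if not owned:
        # Assumed path: the write goes toward the interconnect, which snoops
        # every other coherent cache that might hold the line.
        for other in others:
            other.snoop_invalidate(address)
            snoops += 1
    # Assumed outcome: the issuer holds the all-zero line as modified.
    issuer.lines[address] = ("M", bytes(LINE_SIZE))
    return snoops
```

In this model a hit in the modified or exclusive state is zeroed in place with no snoops, while any other case snoops each of the other caches first.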
-
Publication No.: US11900108B2
Publication Date: 2024-02-13
Application No.: US17461949
Filing Date: 2021-08-30
Applicant: Intel Corporation
Inventors: Vinodh Gopal, James D. Guilford, Gilbert M. Wolrich, Wajdi K. Feghali, Erdinc Ozturk, Martin G. Dixon, Sean P. Mirkes, Bret L. Toll, Maxim Loktyukhin, Mark C. Davis, Alexandre J. Farcy
IPC Classification: G06F9/30
CPC Classification: G06F9/30032, G06F9/30094, G06F9/30098
Abstract: A method of one aspect may include receiving a rotate instruction. The rotate instruction may indicate a source operand and a rotate amount. A result may be stored in a destination operand indicated by the rotate instruction. The result may have the source operand rotated by the rotate amount. Execution of the rotate instruction may complete without reading a carry flag.
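A rotate that depends only on its source operand and rotate amount can be sketched as a pure function. The name `rotate_right`, the right-rotation direction, and the 32-bit default width are assumptions made for illustration; the abstract does not fix any of them. The point of the sketch is that no carry value appears among the inputs.

```python
def rotate_right(value: int, amount: int, width: int = 32) -> int:
    """Rotate `value` right by `amount` bits within `width` bits.

    The inputs are the source operand and the rotate amount only;
    no carry flag is read or written.
    """
    mask = (1 << width) - 1
    amount %= width          # rotating by the full width is a no-op
    value &= mask
    return ((value >> amount) | (value << (width - amount))) & mask
```

For example, `rotate_right(0x12345678, 8)` moves the low byte to the top of the word.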
-
Publication No.: US11709961B2
Publication Date: 2023-07-25
Application No.: US17677958
Filing Date: 2022-02-22
Applicant: Intel Corporation
CPC Classification: G06F21/6227, G06F9/30018, G06F9/30032, G06F9/30036, G06F9/30101, G06F9/3802, G06F16/27, G06F21/6254, G06F21/70
Abstract: An apparatus is described that includes an execution unit to execute a first instruction and a second instruction. The execution unit includes input register space to store a first data structure to be replicated when executing the first instruction and to store a second data structure to be replicated when executing the second instruction. The first and second data structures are both packed data structures. Data values of the first packed data structure are twice as large as data values of the second packed data structure. The execution unit also includes replication logic circuitry to replicate the first data structure when executing the first instruction to create a first replication data structure, and to replicate the second data structure when executing the second instruction to create a second replication data structure. The execution unit also includes masking logic circuitry to mask the first replication data structure at a first granularity and to mask the second replication data structure at a second granularity. The second granularity is twice as fine as the first granularity.
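The replicate-then-mask behavior can be illustrated with plain lists, one list entry per data value and one mask bit per entry. The helper `broadcast_masked` is hypothetical, and zeroing the masked-off entries is an assumption; the abstract only says the replicated structure is masked.

```python
from typing import List, Sequence


def broadcast_masked(group: Sequence[int], copies: int,
                     mask: Sequence[int]) -> List[int]:
    """Replicate `group` `copies` times, then keep entries whose mask bit is 1.

    Masked-off entries are set to zero (an assumed policy).
    """
    replicated = list(group) * copies
    if len(mask) != len(replicated):
        raise ValueError("mask needs one bit per replicated data value")
    return [value if bit else 0 for value, bit in zip(replicated, mask)]


# Same total vector width, different data value sizes:
# two 64-bit values replicated twice need a 4-bit mask ...
wide = broadcast_masked([1, 2], copies=2, mask=[1, 0, 1, 1])
# ... while two 32-bit values replicated four times need an 8-bit mask.
narrow = broadcast_masked([7, 8], copies=4, mask=[1, 1, 0, 0, 1, 0, 0, 1])
```

Because the second case has data values half the size, its mask carries twice as many bits over the same vector width, which is the "twice as fine" granularity the abstract describes.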
-
Publication No.: US11360770B2
Publication Date: 2022-06-14
Application No.: US16487784
Filing Date: 2017-07-01
Applicant: Intel Corporation
Inventors: Robert Valentine, Menachem Adelman, Zeev Sperber, Mark J. Charney, Bret L. Toll, Jesus Corbal, Alexander F. Heinecke, Barukh Ziv, Elmoustapha Ould-Ahmed-Vall, Stanislav Shwartsman
Abstract: Embodiments detailed herein relate to matrix operations, in particular to zeroing a matrix in response to a single instruction. For example, a processor is detailed that includes decode circuitry to decode an instruction having fields for an opcode and a source/destination matrix operand identifier, and execution circuitry to execute the decoded instruction to zero each data element of the identified source/destination matrix.
-
Publication No.: US11275583B2
Publication Date: 2022-03-15
Application No.: US15668461
Filing Date: 2017-08-03
Applicant: Intel Corporation
Inventors: Elmoustapha Ould-Ahmed-Vall, Robert Valentine, Jesus Corbal, Bret L. Toll, Mark J. Charney, Zeev Sperber, Amit Gradstein
Abstract: An apparatus is described having instruction execution logic circuitry to execute first, second, third and fourth instructions. Both the first instruction and the second instruction insert a first group of input vector elements into one of multiple first non-overlapping sections of respective first and second resultant vectors. The first group has a first bit width. Each of the multiple first non-overlapping sections has the same bit width as the first group. Both the third instruction and the fourth instruction insert a second group of input vector elements into one of multiple second non-overlapping sections of respective third and fourth resultant vectors. The second group has a second bit width that is larger than the first bit width. Each of the multiple second non-overlapping sections has the same bit width as the second group. The apparatus also includes masking layer circuitry to mask the first and third instructions at a first resultant vector granularity, and to mask the second and fourth instructions at a second resultant vector granularity.
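The insert operation can be sketched on a list of vector elements, where the resultant vector is divided into non-overlapping sections that are each as wide as the inserted group. The function `insert_group` and its zeroing mask policy are assumptions for illustration only; the abstract does not say how masked-off elements are treated.

```python
from typing import List, Optional, Sequence


def insert_group(vector: Sequence[int], group: Sequence[int], section: int,
                 mask: Optional[Sequence[int]] = None) -> List[int]:
    """Copy `vector`, overwrite section number `section` with `group`,
    then optionally zero every element whose mask bit is 0."""
    width = len(group)
    if width == 0 or len(vector) % width != 0:
        raise ValueError("vector must divide evenly into group-sized sections")
    if not 0 <= section < len(vector) // width:
        raise ValueError("section index out of range")
    result = list(vector)
    result[section * width:(section + 1) * width] = group
    if mask is not None:
        if len(mask) != len(result):
            raise ValueError("mask needs one bit per resultant element")
        result = [value if bit else 0 for value, bit in zip(result, mask)]
    return result
```

A wider group simply means fewer, larger sections in the same vector, while the mask keeps operating per resultant element.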
-
Publication No.: US11163565B2
Publication Date: 2021-11-02
Application No.: US16486960
Filing Date: 2017-07-01
Applicant: Intel Corporation
Inventors: Robert Valentine, Dan Baum, Zeev Sperber, Jesus Corbal, Elmoustapha Ould-Ahmed-Vall, Bret L. Toll, Mark J. Charney, Menachem Adelman, Barukh Ziv, Alexander Heinecke, Simon Rubanovich
Abstract: Embodiments detailed herein relate to matrix operations. For example, embodiments of instruction support for matrix (tile) dot product operations are detailed. Exemplary instructions include computing a dot product of signed words and accumulating into a doubleword with saturation; computing a dot product of bytes and accumulating into a doubleword with saturation, where the input bytes can be signed or unsigned and the doubleword accumulation has output saturation; etc.
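Accumulating a dot product into a doubleword with output saturation can be sketched as follows. The helper names, the plain row-by-column tile layout, and the choice to saturate once on the final sum are assumptions made for illustration; real hardware may pair and order the elements differently.

```python
from typing import List

INT32_MIN = -(1 << 31)
INT32_MAX = (1 << 31) - 1


def saturate_int32(value: int) -> int:
    """Clamp an arbitrary integer into the signed 32-bit range."""
    return max(INT32_MIN, min(INT32_MAX, value))


def tile_dot_product(acc: List[List[int]], a: List[List[int]],
                     b: List[List[int]]) -> List[List[int]]:
    """Return acc + a @ b with each result element saturated to 32 bits.

    `a` is rows x inner, `b` is inner x cols, `acc` is rows x cols.
    Inputs are assumed to already fit the narrow source type (word or byte).
    """
    inner = len(b)
    return [
        [
            saturate_int32(acc[i][j] + sum(a[i][k] * b[k][j] for k in range(inner)))
            for j in range(len(b[0]))
        ]
        for i in range(len(a))
    ]
```

With an accumulator already at `INT32_MAX`, adding a positive dot product leaves the element pinned at `INT32_MAX` instead of wrapping around.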
-
Publication No.: US11048507B2
Publication Date: 2021-06-29
Application No.: US16155028
Filing Date: 2018-10-09
Applicant: Intel Corporation
Inventors: Robert Valentine, Doron Orenstein, Bret L. Toll
Abstract: A technique for decoding an instruction in a variable-length instruction set is described. In one embodiment, an instruction encoding is described in which legacy, present, and future instruction set extensions are supported and increased functionality is provided without expanding the code size and, in some cases, while reducing the code size.
-
Publication No.: US10430193B2
Publication Date: 2019-10-01
Application No.: US15995736
Filing Date: 2018-06-01
Applicant: Intel Corporation
Inventors: Bret L. Toll, Buford M. Guy, Ronak Singhal, Mishali Naik
IPC Classification: G06F9/30
Abstract: A processor includes a first mode where the processor is not to use packed data operation masking, and a second mode where the processor is to use packed data operation masking. A decode unit is to decode an unmasked packed data instruction for a given packed data operation in the first mode, and to decode a masked packed data instruction for a masked version of the given packed data operation in the second mode. The instructions have the same instruction length. The masked instruction has bit(s) to specify a mask. Execution unit(s) are coupled with the decode unit. The execution unit(s), in response to the decode unit decoding the unmasked instruction in the first mode, are to perform the given packed data operation. The execution unit(s), in response to the decode unit decoding the masked instruction in the second mode, are to perform the masked version of the given packed data operation.
-
Publication No.: US10223227B2
Publication Date: 2019-03-05
Application No.: US14998052
Filing Date: 2015-12-24
Applicant: Intel Corporation
Inventors: Ravi Rajwar, Bret L. Toll, Konrad K. Lai, Matthew C. Merten, Martin G. Dixon
IPC Classification: G06F15/00, G06F7/38, G06F9/00, G06F9/44, G06F11/28, G06F9/46, G06F9/30, G06F12/0811, G06F9/38, G11C7/10, G06F11/22, G06F11/263, G06F12/0897, G06F12/0817, G06F12/084, G06F12/0862, G06F12/0875, G06F11/14, G06F11/25
Abstract: Novel instructions, logic, methods and apparatus are disclosed to test transactional execution status. Embodiments include decoding a first instruction to start a transactional region. Responsive to the first instruction, a checkpoint for a set of architecture state registers is generated and memory accesses from a processing element in the transactional region associated with the first instruction are tracked. A second instruction to detect transactional execution of the transactional region is then decoded. An operation is executed, responsive to decoding the second instruction, to determine if an execution context of the second instruction is within the transactional region. Then, responsive to the second instruction, a first flag is updated. In some embodiments, a register may optionally be updated and/or a second flag may optionally be updated responsive to the second instruction.
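The two instructions in this abstract can be mimicked by a toy software model. Everything named here (`Core`, `begin_transaction`, `test_transaction`, `end_transaction`, the single `flag`) is hypothetical, the model ignores nesting and aborts, and the "end" step is an added assumption that the abstract does not describe.

```python
class Core:
    """Toy model of one processing element with a transactional region."""

    def __init__(self) -> None:
        self.registers = {"r0": 0}
        self.checkpoint = None      # saved register state while in a region
        self.tracked = set()        # addresses accessed inside the region
        self.flag = False           # the "first flag" updated by the test

    def begin_transaction(self) -> None:
        """First instruction: checkpoint registers and start tracking."""
        self.checkpoint = dict(self.registers)
        self.tracked = set()

    def access(self, address: int) -> None:
        """Record a memory access if it happens inside the region."""
        if self.checkpoint is not None:
            self.tracked.add(address)

    def test_transaction(self) -> bool:
        """Second instruction: set the flag if executing inside a region."""
        self.flag = self.checkpoint is not None
        return self.flag

    def end_transaction(self) -> None:
        """Assumed commit step: leave the region and stop tracking."""
        self.checkpoint = None
        self.tracked = set()
```

Code that can run both inside and outside a transactional region could branch on the flag set by `test_transaction` to pick the appropriate path.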
-
Publication No.: US10210066B2
Publication Date: 2019-02-19
Application No.: US14998047
Filing Date: 2015-12-24
Applicant: Intel Corporation
Inventors: Ravi Rajwar, Bret L. Toll, Konrad K. Lai, Matthew C. Merten, Martin G. Dixon
IPC Classification: G06F15/00, G06F7/38, G06F9/00, G06F9/44, G06F11/28, G06F9/46, G06F9/30, G06F12/0811, G06F9/38, G11C7/10, G06F11/22, G06F12/0897, G06F12/0817, G06F12/084, G06F12/0862, G06F12/0875, G06F11/14, G06F11/25
Abstract: Novel instructions, logic, methods and apparatus are disclosed to test transactional execution status. Embodiments include decoding a first instruction to start a transactional region. Responsive to the first instruction, a checkpoint for a set of architecture state registers is generated and memory accesses from a processing element in the transactional region associated with the first instruction are tracked. A second instruction to detect transactional execution of the transactional region is then decoded. An operation is executed, responsive to decoding the second instruction, to determine if an execution context of the second instruction is within the transactional region. Then, responsive to the second instruction, a first flag is updated. In some embodiments, a register may optionally be updated and/or a second flag may optionally be updated responsive to the second instruction.