-
Publication No.: EP3506087A1
Publication date: 2019-07-03
Application No.: EP18209326.0
Filing date: 2018-11-29
Applicant: INTEL Corporation
Inventors: HUGHES, Christopher J.; NUZMAN, Joseph; SVENNEBRING, Jonas; JAYASIMHA, Doddaballapur N.; SURY, Samantika S.; KOUFATY, David A.; MCDONNELL, Niall D.; LIU, Yen-Cheng; VAN DOREN, Stephen R.; ROBINSON, Stephen J.
Abstract: Disclosed embodiments relate to spatial and temporal merging of remote atomic operations. In one example, a system includes an RAO instruction queue stored in a memory and having entries grouped by destination cache line, each entry to enqueue an RAO instruction including an opcode, a destination identifier, and source data, optimization circuitry to receive an incoming RAO instruction, scan the RAO instruction queue to detect a matching enqueued RAO instruction identifying a same destination cache line as the incoming RAO instruction, the optimization circuitry further to, responsive to no matching enqueued RAO instruction being detected, enqueue the incoming RAO instruction; and, responsive to a matching enqueued RAO instruction being detected, determine whether the incoming and matching RAO instructions have a same opcode to non-overlapping cache line elements, and, if so, spatially combine the incoming and matching RAO instructions by enqueuing both RAO instructions in a same group of cache line queue entries at different offsets.
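The enqueue-or-combine decision described in this abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the claimed circuitry: the names `Rao`, `RaoQueue`, and `submit`, the 64-byte line size, and the 8-byte element size are all hypothetical, and the case of overlapping elements or differing opcodes is left unhandled because the abstract does not describe it.

```python
from dataclasses import dataclass, field

LINE_BYTES = 64  # assumed cache line size (not stated in the abstract)
ELEM_BYTES = 8   # assumed element size (not stated in the abstract)


@dataclass
class Rao:
    """Hypothetical model of one RAO instruction."""
    opcode: str
    address: int  # destination byte address
    data: int     # source data

    @property
    def line(self) -> int:
        # Destination cache line index.
        return self.address // LINE_BYTES

    @property
    def offset(self) -> int:
        # Element offset within the destination cache line.
        return (self.address % LINE_BYTES) // ELEM_BYTES


@dataclass
class RaoQueue:
    """Queue whose entries are grouped by destination cache line."""
    groups: dict = field(default_factory=dict)  # line -> {offset: Rao}

    def submit(self, rao: Rao) -> str:
        group = self.groups.get(rao.line)
        if group is None:
            # No enqueued instruction targets this line: plain enqueue.
            self.groups[rao.line] = {rao.offset: rao}
            return "enqueued"
        same_opcode = all(e.opcode == rao.opcode for e in group.values())
        if same_opcode and rao.offset not in group:
            # Same opcode, non-overlapping element: spatially combine by
            # placing the instruction in the same group at its own offset.
            group[rao.offset] = rao
            return "spatially combined"
        # Overlap or opcode mismatch: not covered by this sketch.
        return "not combined"
```

For example, two `add` operations to addresses `0x1000` and `0x1008` fall in the same assumed line at offsets 0 and 1, so the second is combined into the first one's group.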
-
Publication No.: EP3238043A1
Publication date: 2017-11-01
Application No.: EP15873965.6
Filing date: 2015-11-23
Applicant: Intel Corporation
CPC classification: G06F9/30036, G06F9/30018, G06F9/30021, G06F9/30047, G06F9/30112, G06F9/3834, G06F9/3838
Abstract: An apparatus and method are described for performing conflict detection operations. For example, one embodiment of a processor comprises: a first source vector register to store a first set of data elements; a second source vector register to store a second set of data elements; conflict detection logic to perform a specified comparison operation comparing each of the first set of data elements with specified data elements from the second set and generating a set of comparison results, the comparison operation to be selected from a group consisting of a greater than comparison, a less than comparison, a greater than or equal to comparison, a less than or equal to comparison, and a not equal to comparison.
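A scalar model can make the selectable comparison concrete. The abstract does not say which elements of the second set are the "specified" ones, so this sketch assumes each element of the first source is compared against the lower-indexed elements of the second source and that each result is a bitmask. The function name `conflict_detect` and the comparison keys are hypothetical.

```python
import operator

# The five comparisons named in the abstract, keyed by hypothetical mnemonics.
COMPARISONS = {
    "gt": operator.gt,
    "lt": operator.lt,
    "ge": operator.ge,
    "le": operator.le,
    "ne": operator.ne,
}


def conflict_detect(src1, src2, comparison):
    """Return one bitmask per element of src1.

    Bit j of result i is set when compare(src1[i], src2[j]) holds for j < i.
    Comparing only against lower-indexed elements is an assumption.
    """
    compare = COMPARISONS[comparison]
    results = []
    for i, a in enumerate(src1):
        mask = 0
        for j in range(i):
            if compare(a, src2[j]):
                mask |= 1 << j
        results.append(mask)
    return results
```

With `src1 = src2 = [3, 1, 4, 1]` and the `"gt"` comparison, only the third element (4) exceeds any earlier element, so its mask has bits 0 and 1 set.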
-
Publication No.: EP3889787A1
Publication date: 2021-10-06
Application No.: EP21168711.6
Filing date: 2016-12-12
Applicant: Intel Corporation
Inventors: BRANDT, Jason W.; CHAPPELL, Robert S.; CORBAL, Jesus; GROCHOWSKI, Edward T.; GUNTHER, Stephen H.; GUY, Buford M.; HUFF, Thomas R.; HUGHES, Christopher J.; OULD-AHMED-VALL, Elmoustapha; SINGHAL, Ronak; SOTOUDEH, Seyed Yahya; TOLL, Bret L.; RAPPOPORT, Lihu; PAPWORTH, David; ALLEN, James D.
IPC classification: G06F12/0808, G06F12/0817, G06F12/0831
Abstract: Embodiments of an invention a processor architecture are disclosed. In an embodiment, a processor includes a decoder, an execution unit, a coherent cache, and an interconnect. The decoder is to decode an instruction to zero a cache line. The execution unit is to issue a write command to initiate a cache line sized write of zeros. The coherent cache is to receive the write command, to determine whether there is a hit in the coherent cache and whether a cache coherency protocol state of the hit cache line is a modified state or an exclusive state, to configure a cache line to indicate all zeros, and to issue the write command toward the interconnect. The interconnect is to, responsive to receipt of the write command, issue a snoop to each of a plurality of other coherent caches for which it must be determined if there is a hit.
-
Publication No.: EP3623940A2
Publication date: 2020-03-18
Application No.: EP19183497.7
Filing date: 2019-06-28
Applicant: Intel Corporation
Inventors: HUGHES, Christopher J.; TOLL, Bret; BAUM, Dan; OULD-AHMED-VALL, Elmoustapha; SADE, Raanan; VALENTINE, Robert; CHARNEY, Mark J.; HEINECKE, Alexander F.
Abstract: Disclosed embodiments relate to systems and methods for performing instructions specifying horizontal tile operations. In one example, a processor includes fetch circuitry to fetch an instruction specifying a horizontal tile operation, a location of a M by N source matrix comprising K groups of elements, and locations of K destinations, wherein each of the K groups of elements comprises the same number of elements, decode circuitry to decode the fetched instruction, and execution circuitry to respond to the decoded instruction by generating K results, each result being generated by performing the specified horizontal tile operation across every element of a corresponding group of the K groups, and writing each generated result to a corresponding location of the K specified destination locations.
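As a rough illustration of the behavior in this abstract, a horizontal tile operation can be modeled as a reduction over each of the K groups. The sketch below assumes the groups are either the rows or the columns of the matrix and that the operation is a binary reduction such as addition or maximum. The function name `horizontal_tile_op` and its `by` parameter are hypothetical.

```python
from functools import reduce
import operator


def horizontal_tile_op(tile, op, by="row"):
    """Reduce each group of a 2D tile with the binary operation `op`.

    tile: list of M rows, each a list of N elements.
    by:   "row" treats each row as a group (K = M);
          "column" treats each column as a group (K = N).
    Returns a list of K results, one per group.
    """
    if by == "row":
        groups = tile
    elif by == "column":
        groups = [list(col) for col in zip(*tile)]
    else:
        raise ValueError("by must be 'row' or 'column'")
    return [reduce(op, group) for group in groups]
```

For a 2 by 3 tile `[[1, 2, 3], [4, 5, 6]]`, a row-wise add yields two results and a column-wise maximum yields three.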
-
Publication No.: EP3394724A1
Publication date: 2018-10-31
Application No.: EP16879673.8
Filing date: 2016-11-18
Applicant: Intel Corporation
CPC classification: G06F9/30043, G06F9/30101, G06F9/3016, G06F9/467, G06F12/0842, G06F12/0891, G06F12/1009, G06F12/1027
Abstract: A processor of an aspect includes a decode unit to decode a transaction end plus commit to persistence instruction. The processor also includes an execution unit coupled with the decode unit. The execution unit, in response to the instruction, is to atomically ensure that all prior store to memory operations made to a persistent memory, which are to have been accepted to memory when performance of the instruction begins, but which are not necessarily to have been stored in the persistent memory when the performance of the instruction begins, are to be stored in the persistent memory before the instruction becomes globally visible. The execution unit, in response to the instruction, is also to atomically end a transactional memory transaction before the instruction becomes globally visible.
-
Publication No.: EP4449262A1
Publication date: 2024-10-23
Application No.: EP21967617.8
Filing date: 2021-12-15
Applicant: INTEL Corporation
Inventors: WU, Keqiang; XIANG, Lingxiang; PAN, Heidi; HUGHES, Christopher J.; WANG, Zhe
IPC classification: G06F12/08
-
Publication No.: EP4141661A1
Publication date: 2023-03-01
Application No.: EP22200756.9
Filing date: 2019-06-26
Applicant: Intel Corporation
Inventors: TOLL, Bret; HUGHES, Christopher J.; BAUM, Dan; OULD-AHMED-VALL, Elmoustapha; SADE, Raanan; VALENTINE, Robert; CHARNEY, Mark J.; HEINECKE, Alexander F.
IPC classification: G06F9/30
Abstract: Disclosed embodiments relate to systems for performing instructions to quickly convert and use matrices (tiles) as one-dimensional vectors. In one example, an apparatus comprises a configuration storage to store configuration information for a two-dimensional (2D) matrix storage, the configuration information to include a first value indicative of a number of rows of the 2D matrix storage and a second value indicative of a number of columns of the 2D matrix storage, fetch circuitry to fetch an instruction, the instruction to specify the 2D matrix storage, a row of the 2D matrix storage, and a 512-bit vector register, decode circuitry, coupled with the fetch circuitry, to decode the instruction, and execution circuitry, coupled with the decode circuitry, to perform operations corresponding to the instruction, including to store the row of the 2D matrix storage to the 512-bit vector register.
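The row-to-vector move in this abstract can be sketched in software as follows. The 32-bit element width, the zero padding of unused lanes, and the name `tile_row_to_vector` are assumptions made for illustration. The abstract specifies only the configured row and column counts and the 512-bit destination register.

```python
VECTOR_BITS = 512                 # register width stated in the abstract
ELEM_BITS = 32                    # assumed element width
LANES = VECTOR_BITS // ELEM_BITS  # 16 lanes under that assumption


def tile_row_to_vector(tile, rows, cols, row):
    """Copy one configured row of a 2D tile into a 16-lane vector.

    rows, cols: configured dimensions of the 2D matrix storage.
    Lanes beyond `cols` are zero-filled (an assumption of this sketch).
    """
    if not 0 <= row < rows:
        raise IndexError("row is outside the configured tile")
    if cols > LANES:
        raise ValueError("row does not fit in one 512-bit register")
    values = list(tile[row][:cols])
    return values + [0] * (LANES - cols)
```

Calling `tile_row_to_vector(tile, 2, 4, 1)` on a 2 by 4 tile returns the four elements of row 1 followed by twelve zero lanes.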
-
Publication No.: EP3629154A3
Publication date: 2020-05-06
Application No.: EP19182737.7
Filing date: 2019-06-26
Applicant: INTEL Corporation
Inventors: TOLL, Bret; HUGHES, Christopher J.; BAUM, Dan; OULD-AHMED-VALL, Elmoustapha; SADE, Raanan; VALENTINE, Robert; CHARNEY, Mark J.; HEINECKE, Alexander F.
IPC classification: G06F9/30
Abstract: Disclosed embodiments relate to systems for performing instructions to quickly convert and use matrices (tiles) as one-dimensional vectors. In one example, a processor includes fetch circuitry to fetch an instruction having fields to specify an opcode, locations of a two-dimensional (2D) matrix and a one-dimensional (1D) vector, and a group of elements comprising one of a row, part of a row, multiple rows, a column, part of a column, multiple columns, and a rectangular sub-tile of the specified 2D matrix, and wherein the opcode is to indicate a move of the specified group between the 2D matrix and the 1D vector, decode circuitry to decode the fetched instruction; and execution circuitry, responsive to the decoded instruction, when the opcode specifies a move from 1D, to move contents of the specified 1D vector to the specified group of elements.
-
Publication No.: EP3552108A1
Publication date: 2019-10-16
Application No.: EP16923787.2
Filing date: 2016-12-12
Applicant: Intel Corporation
Inventors: BRANDT, Jason W.; CHAPPELL, Robert S.; CORBAL, Jesus; GROCHOWSKI, Edward T.; GUNTHER, Stephen H.; GUY, Buford M.; HUFF, Thomas R.; HUGHES, Christopher J.; OULD-AHMED-VALL, Elmoustapha; SINGHAL, Ronak; SOTOUDEH, Seyed Yahya; TOLL, Bret L.; RAPPOPORT, Lihu; PAPWORTH, David; ALLEN, James D.
IPC classification: G06F12/0817
-
Publication No.: EP3547120A1
Publication date: 2019-10-02
Application No.: EP19157043.1
Filing date: 2019-02-13
Applicant: INTEL Corporation
Inventors: HUGHES, Christopher J.; HEINECKE, Alexander F.; VALENTINE, Robert; TOLL, Bret; CORBAL, Jesus; OULD-AHMED-VALL, Elmoustapha
Abstract: Disclosed embodiments relate to systems and methods for implementing chained tile operations. In one example, a processor includes fetch circuitry to fetch one or more instructions until a plurality of instructions has been fetched, each instruction to specify source and destination tile operands, decode circuitry to decode the fetched instructions, and execution circuitry, responsive to the decoded instructions, to: identify first and second decoded instructions belonging to a chain of instructions, dynamically select and configure a SIMD path comprising first and second processing engines (PE) to execute the first and second decoded instructions, and set aside the specified destination of the first decoded instruction, and instead route a result of the first decoded instruction from the first PE to be used by the second PE to perform the second decoded instruction.
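The forwarding idea in this abstract, where the first instruction's destination is set aside and its result is routed straight to the second processing engine, can be modeled in a few lines. This is a software analogy only. The names `run_chain`, `tile_add`, and `tile_mul`, the tuple encoding of an instruction, and the use of a dictionary as the tile register file are all hypothetical.

```python
def tile_add(a, b):
    # Element-wise add of two equally sized 2D tiles.
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def tile_mul(a, b):
    # Element-wise multiply of two equally sized 2D tiles.
    return [[x * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def run_chain(tiles, first, second):
    """Execute two chained tile instructions.

    Each instruction is an (op, source_names, destination_name) tuple.
    The first instruction's destination is never written; its result is
    forwarded to the second instruction wherever that destination is
    named as a source.
    """
    op1, srcs1, dst1 = first
    op2, srcs2, dst2 = second
    forwarded = op1(*(tiles[s] for s in srcs1))  # stands in for the first PE
    operands = [forwarded if s == dst1 else tiles[s] for s in srcs2]
    tiles[dst2] = op2(*operands)                 # stands in for the second PE
    return tiles
```

Chaining `T = A + B` into `D = T * C` this way produces `D` while leaving the stored contents of `T` unchanged, which is the effect of setting the first destination aside.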