-
Publication No.: US10402168B2
Publication Date: 2019-09-03
Application No.: US15283295
Filing Date: 2016-10-01
Applicant: Intel Corporation
Abstract: A floating point multiply-add unit having inputs coupled to receive a floating point multiplier data element, a floating point multiplicand data element, and a floating point addend data element. The multiply-add unit including a mantissa multiplier to multiply a mantissa of the multiplier data element and a mantissa of the multiplicand data element to calculate a mantissa product. The mantissa multiplier including a most significant bit portion to calculate most significant bits of the mantissa product, and a least significant bit portion to calculate least significant bits of the mantissa product. The mantissa multiplier has a plurality of different possible sizes of the least significant bit portion. Energy consumption reduction logic to selectively reduce energy consumption of the least significant bit portion, but not the most significant bit portion, to cause the least significant bit portion to not calculate the least significant bits of the mantissa product.
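As a rough illustration of the idea in this abstract, the sketch below models a mantissa multiplier in software as a sum of partial-product bits and skips every product column below a configurable cutoff. The function name `mantissa_product`, the 24-bit default width, and the column-based split are assumptions made for illustration only; the abstract does not say how the least significant bit portion is organized in hardware.

```python
def mantissa_product(a: int, b: int, width: int = 24, gated_lsb_bits: int = 0) -> int:
    """Multiply two unsigned mantissas, skipping low-order product columns.

    gated_lsb_bits is a hypothetical knob standing in for the selectable
    size of the least significant bit portion: partial-product bits whose
    weight is below 2**gated_lsb_bits are never computed.
    """
    if not (0 <= a < (1 << width) and 0 <= b < (1 << width)):
        raise ValueError("mantissas must fit in `width` bits")
    total = 0
    for i in range(width):
        if not (a >> i) & 1:
            continue
        for j in range(width):
            # Keep only the columns that belong to the always-active MSB portion.
            if (b >> j) & 1 and i + j >= gated_lsb_bits:
                total += 1 << (i + j)
    return total


a, b = 0xABCDEF, 0x123456
exact = mantissa_product(a, b)                       # nothing gated
approx = mantissa_product(a, b, gated_lsb_bits=16)   # low 16 columns gated
print(exact - approx)                                # truncation error
```

With `gated_lsb_bits=0` the result equals the exact product. With a cutoff of k columns, the result is never larger than the exact product and the error stays below k * 2**k, which is why a larger gated portion trades accuracy for skipped work.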
-
Publication No.: US11200186B2
Publication Date: 2021-12-14
Application No.: US16024854
Filing Date: 2018-06-30
Applicant: Intel Corporation
Inventors: Kermin E. Fleming, Jr., Simon C. Steely, Jr., Kent D. Glossop, Mitchell Diamond, Benjamin Keen, Dennis Bradford, Fabrizio Petrini, Barry Tannenbaum, Yongzhi Zhang
Abstract: Systems, methods, and apparatuses relating to operations in a configurable spatial accelerator are described. In one embodiment, a configurable spatial accelerator includes a first processing element that includes a configuration register within the first processing element to store a configuration value that causes the first processing element to perform an operation according to the configuration value, a plurality of input queues, an input controller to control enqueue and dequeue of values into the plurality of input queues according to the configuration value, a plurality of output queues, and an output controller to control enqueue and dequeue of values into the plurality of output queues according to the configuration value.
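The processing element described in this abstract can be pictured with a small software model: a stored configuration value selects the operation, bounded input queues accept operands, and a bounded output queue holds results. Everything named here (`ProcessingElement`, the string-valued configuration, the two input ports, the queue depth of 2) is a hypothetical simplification, not the encoding or structure used in the patent.

```python
from collections import deque
import operator


class ProcessingElement:
    """Toy model of a configured processing element with bounded queues."""

    # Hypothetical configuration encoding: a string that names the operation.
    _OPS = {"add": operator.add, "mul": operator.mul}

    def __init__(self, config: str, queue_depth: int = 2):
        self.op = self._OPS[config]          # plays the role of the configuration register
        self.inputs = [deque(), deque()]     # two input queues
        self.output = deque()                # one output queue
        self.depth = queue_depth

    def enqueue(self, port: int, value) -> bool:
        """Input controller: refuse the value when the queue is full."""
        if len(self.inputs[port]) >= self.depth:
            return False
        self.inputs[port].append(value)
        return True

    def step(self) -> bool:
        """Run one operation if both operands are present and the output has room."""
        if any(not q for q in self.inputs) or len(self.output) >= self.depth:
            return False
        lhs, rhs = (q.popleft() for q in self.inputs)
        self.output.append(self.op(lhs, rhs))
        return True


pe = ProcessingElement("add")
pe.enqueue(0, 3)
pe.enqueue(1, 4)
pe.step()
print(pe.output.popleft())
```

Returning `False` from `enqueue` and `step` is one simple way to model backpressure: a producer retries later instead of overwriting queued values.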
-
Publication No.: US10963022B2
Publication Date: 2021-03-30
Application No.: US16862263
Filing Date: 2020-04-29
Applicant: Intel Corporation
Inventors: Simon C. Steely, Jr., Richard Dischler, David Bach, Olivier Franza, William J. Butera, Christian Karl, Benjamin Keen, Brian Leung
IPC Classification: H05K1/18, G06F1/18, H01L23/538, G06F9/50, G06F15/76, H01L25/065
Abstract: Embodiments herein may present an integrated circuit or a computing system having an integrated circuit, where the integrated circuit includes a physical network layer, a physical computing layer, and a physical memory layer, each having a set of dies, and a die including multiple tiles. The physical network layer further includes one or more signal pathways dynamically configurable between multiple pre-defined interconnect topologies for the multiple tiles, where each topology of the multiple pre-defined interconnect topologies corresponds to a communication pattern related to a workload. At least a tile in the physical computing layer is further arranged to move data to another tile in the physical computing layer or a storage cell of the physical memory layer through the one or more signal pathways in the physical network layer. Other embodiments may be described and/or claimed.
-
Publication No.: US10445234B2
Publication Date: 2019-10-15
Application No.: US15640533
Filing Date: 2017-07-01
Applicant: Intel Corporation
IPC Classification: G06F12/0802, H03K19/177, G06F17/50, G11C7/10, G06F15/78, G06F15/80, G11C8/12
Abstract: Systems, methods, and apparatuses relating to a configurable spatial accelerator are described. In an embodiment, a processor includes a plurality of processing elements; and an interconnect network between the plurality of processing elements to receive an input of a dataflow graph comprising a plurality of nodes, wherein the dataflow graph is to be overlaid into the interconnect network and the plurality of processing elements with each node represented as a dataflow operator in the plurality of processing elements, and the plurality of processing elements are to perform an atomic operation when an incoming operand set arrives at the plurality of processing elements.
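The firing rule mentioned in this abstract, where a dataflow operator executes once its incoming operand set has arrived, can be sketched in a few lines. The function `run_dataflow`, the dictionary-based graph description, and the one-token-per-port simplification are assumptions for illustration; the sketch says nothing about how the patent maps a graph onto hardware or what the atomic operation is.

```python
import operator


def run_dataflow(nodes, edges, initial_tokens):
    """Fire each node as soon as all of its operand ports hold a value.

    nodes:          name -> (function, arity)
    edges:          name -> list of (destination node, destination port)
    initial_tokens: list of (destination node, destination port, value)
    """
    pending = {name: {} for name in nodes}   # operands that have arrived so far
    results = {}
    worklist = list(initial_tokens)
    while worklist:
        dest, port, value = worklist.pop()
        pending[dest][port] = value
        fn, arity = nodes[dest]
        if len(pending[dest]) == arity:      # the full operand set has arrived
            operands = [pending[dest][p] for p in range(arity)]
            pending[dest] = {}
            out = fn(*operands)
            results[dest] = out
            # Forward the result along every outgoing edge.
            for nxt, nxt_port in edges.get(dest, []):
                worklist.append((nxt, nxt_port, out))
    return results


# (2 + 3) * 4 expressed as a two-node graph.
nodes = {"sum": (operator.add, 2), "prod": (operator.mul, 2)}
edges = {"sum": [("prod", 0)]}
tokens = [("sum", 0, 2), ("sum", 1, 3), ("prod", 1, 4)]
print(run_dataflow(nodes, edges, tokens))
```

No node is scheduled explicitly: the order of execution falls out of operand arrival, so "prod" runs only after "sum" has produced its result.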
-
Publication No.: US10416999B2
Publication Date: 2019-09-17
Application No.: US15396395
Filing Date: 2016-12-30
Applicant: Intel Corporation
Abstract: Systems, methods, and apparatuses relating to a configurable spatial accelerator are described. In one embodiment, a processor includes a core with a decoder to decode an instruction into a decoded instruction and an execution unit to execute the decoded instruction to perform a first operation; a plurality of processing elements; and an interconnect network between the plurality of processing elements to receive an input of a dataflow graph comprising a plurality of nodes, wherein the dataflow graph is to be overlaid into the interconnect network and the plurality of processing elements with each node represented as a dataflow operator in the plurality of processing elements, and the plurality of processing elements are to perform a second operation by a respective, incoming operand set arriving at each of the dataflow operators of the plurality of processing elements.
-
Publication No.: US10146690B2
Publication Date: 2018-12-04
Application No.: US15180351
Filing Date: 2016-06-13
Applicant: Intel Corporation
IPC Classification: G06F12/0831
Abstract: In an embodiment, a processor includes a plurality of cores and synchronization logic. The synchronization logic includes circuitry to: receive a first memory request and a second memory request; determine whether the second memory request is in contention with the first memory request; and in response to a determination that the second memory request is in contention with the first memory request, process the second memory request using a non-blocking cache coherence protocol. Other embodiments are described and claimed.
-
Publication No.: US09934146B2
Publication Date: 2018-04-03
Application No.: US14498946
Filing Date: 2014-09-26
Applicant: INTEL CORPORATION
IPC Classification: G06F12/08, G06F12/0817, G06F12/0811
CPC Classification: G06F12/0824, G06F12/0811, G06F2212/1024, G06F2212/1048, G06F2212/2542
Abstract: Methods and apparatuses to control cache line coherency are described. A processor may include a first core having a cache to store a cache line, a second core to send a request for the cache line from the first core, moving logic to cause a move of the cache line between the first core and a memory and to update a tag directory of the move, and cache line coherency logic to create a chain home in the tag directory from the request to cause the cache line to be sent from the tag directory to the second core. A method to control cache line coherency may include creating a chain home in a tag directory from a request for a cache line in a first processor core from a second processor core to cause the cache line to be sent from the tag directory to the second processor core.
-
Publication No.: US09734069B2
Publication Date: 2017-08-15
Application No.: US14567026
Filing Date: 2014-12-11
Applicant: Intel Corporation
IPC Classification: G06F12/08, G06F12/084, G06F12/0815, G06F12/0817
CPC Classification: G06F12/084, G06F12/0815, G06F12/0822, G06F2212/1021, G06F2212/281, Y02D10/13
Abstract: Systems and methods for multicast tree-based data distribution in a distributed shared cache. An example processing system comprises: a plurality of processing cores, each processing core communicatively coupled to a cache; a tag directory associated with caches of the plurality of processing cores; a shared cache associated with the tag directory; a processing logic configured, responsive to receiving an invalidate request with respect to a certain cache entry, to: allocate, within the shared cache, a shared cache entry corresponding to the certain cache entry; transmit, to at least one of: a tag directory or a processing core that last accessed the certain entry, an update read request with respect to the certain cache entry; and responsive to receiving an update of the certain cache entry, broadcast the update to at least one of: one or more tag directories or one or more processing cores identified by a tag corresponding to the certain cache entry.
-
Publication No.: US11593295B2
Publication Date: 2023-02-28
Application No.: US17550875
Filing Date: 2021-12-14
Applicant: Intel Corporation
Inventors: Kermin E. Fleming, Jr., Simon C. Steely, Jr., Kent D. Glossop, Mitchell Diamond, Benjamin Keen, Dennis Bradford, Fabrizio Petrini, Barry Tannenbaum, Yongzhi Zhang
Abstract: Systems, methods, and apparatuses relating to operations in a configurable spatial accelerator are described. In one embodiment, a configurable spatial accelerator includes a first processing element that includes a configuration register within the first processing element to store a configuration value that causes the first processing element to perform an operation according to the configuration value, a plurality of input queues, an input controller to control enqueue and dequeue of values into the plurality of input queues according to the configuration value, a plurality of output queues, and an output controller to control enqueue and dequeue of values into the plurality of output queues according to the configuration value.
-
Publication No.: US10558575B2
Publication Date: 2020-02-11
Application No.: US15396402
Filing Date: 2016-12-30
Applicant: INTEL CORPORATION
IPC Classification: G06F12/08, G06F9/30, G06F9/38, G06F12/0862, G06F12/0842, G06F12/0875
Abstract: Systems, methods, and apparatuses relating to a configurable spatial accelerator are described. In one embodiment, a processor includes a core with a decoder to decode an instruction into a decoded instruction and an execution unit to execute the decoded instruction to perform a first operation; a plurality of processing elements; and an interconnect network between the plurality of processing elements to receive an input of a dataflow graph comprising a plurality of nodes, wherein the dataflow graph is to be overlaid into the interconnect network and the plurality of processing elements with each node represented as a dataflow operator in the plurality of processing elements, and the plurality of processing elements is to perform a second operation when an incoming operand set arrives at the plurality of processing elements.