-
1.
Publication No.: US10983793B2
Publication Date: 2021-04-20
Application No.: US16369846
Application Date: 2019-03-29
Applicant: INTEL CORPORATION
Inventors: Joshua Fryman, Ankit More, Jason Howard, Robert Pawlowski, Yigit Demir, Nick Pepperling, Fabrizio Petrini, Sriram Aananthakrishnan, Shaden Smith
Abstract: The present disclosure is directed to systems and methods of performing one or more broadcast or reduction operations using direct memory access (DMA) control circuitry. The DMA control circuitry executes a modified instruction set architecture (ISA) that facilitates the broadcast distribution of data to a plurality of destination addresses in system memory circuitry. The broadcast instruction may include broadcast of a single data value to each destination address. The broadcast instruction may include broadcast of a data array to each destination address. The DMA control circuitry may also execute a reduction instruction that facilitates the retrieval of data from a plurality of source addresses in system memory and performing one or more operations using the retrieved data. Since the DMA control circuitry, rather than the processor circuitry, performs the broadcast and reduction operations, system speed and efficiency are beneficially enhanced.
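The broadcast and reduction behavior described in this abstract can be illustrated with a minimal software sketch. It is not the patented ISA: memory is modeled as a plain Python list, and the function names `dma_broadcast_value`, `dma_broadcast_array`, and `dma_reduce` are hypothetical names chosen for illustration.

```python
from functools import reduce
from operator import add
from typing import Callable, List, Sequence


def dma_broadcast_value(memory: List[int], value: int, dests: Sequence[int]) -> None:
    """Write a single value to every destination address."""
    for addr in dests:
        memory[addr] = value


def dma_broadcast_array(memory: List[int], data: Sequence[int], dests: Sequence[int]) -> None:
    """Copy the same array to each destination base address."""
    for base in dests:
        memory[base:base + len(data)] = list(data)


def dma_reduce(memory: List[int], sources: Sequence[int],
               op: Callable[[int, int], int] = add) -> int:
    """Read the values at the source addresses and combine them with op.

    Assumes at least one source address (an assumption of this sketch).
    """
    return reduce(op, (memory[addr] for addr in sources))


# Toy system memory of 16 words (the size is arbitrary).
memory = [0] * 16
dma_broadcast_value(memory, 7, [1, 4, 9])
dma_broadcast_array(memory, [1, 2], [10, 12])
total = dma_reduce(memory, [1, 4, 9])
```

In this model the caller issues one request and the helper touches every address, which mirrors the idea of offloading the loop from the processor to the DMA circuitry.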
-
2.
Publication No.: US20240119015A1
Publication Date: 2024-04-11
Application No.: US18458462
Application Date: 2023-08-30
Applicant: Intel Corporation
Inventors: Shruti Sharma, Robert Pawlowski
CPC Classification: G06F13/1673, G06F9/526
Abstract: Systems, apparatuses and methods may provide for technology that detects a condition in which a plurality of atomic instructions target a common address and different bit positions in a mask, generates a combined read-lock request for the plurality of atomic instructions in response to the condition, and sends the combined read-lock request to a lock buffer coupled to a memory device associated with the common address.
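The combining condition in this abstract (same address, different mask bit positions) can be sketched as follows. The types `AtomicRequest` and `ReadLockRequest` and the function `combine_read_locks` are hypothetical, and the handling of overlapping masks (issuing a separate lock) is an assumption of this sketch, not something the abstract states.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class AtomicRequest:
    address: int
    mask: int  # bit positions this atomic instruction touches


@dataclass
class ReadLockRequest:
    address: int
    mask: int    # union of the merged masks
    merged: int  # number of atomic requests folded into this lock


def combine_read_locks(requests: Iterable[AtomicRequest]) -> List[ReadLockRequest]:
    """Merge atomics that share an address and touch disjoint mask bits."""
    pending: Dict[int, List[ReadLockRequest]] = defaultdict(list)
    for req in requests:
        for lock in pending[req.address]:
            if lock.mask & req.mask == 0:  # different bit positions: merge
                lock.mask |= req.mask
                lock.merged += 1
                break
        else:
            # No compatible lock yet (assumed policy: start a new one).
            pending[req.address].append(ReadLockRequest(req.address, req.mask, 1))
    return [lock for locks in pending.values() for lock in locks]


locks = combine_read_locks([
    AtomicRequest(0x100, 0b0001),
    AtomicRequest(0x100, 0b0100),
    AtomicRequest(0x100, 0b0001),  # overlaps the first request
    AtomicRequest(0x200, 0b0010),
])
```

Here the first two requests share one combined read-lock, while the overlapping third request and the request to a different address each get their own.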
-
3.
Publication No.: US20240020253A1
Publication Date: 2024-01-18
Application No.: US18477787
Application Date: 2023-09-29
Applicant: Intel Corporation
IPC Classification: G06F13/28
CPC Classification: G06F13/28, G06F2213/28
Abstract: Systems, apparatuses and methods may provide for technology that detects a plurality of sub-instruction requests from a first memory engine in a plurality of memory engines, wherein the plurality of sub-instruction requests are associated with a direct memory access (DMA) data type conversion request from a first pipeline, wherein each sub-instruction request corresponds to a data element in the DMA data type conversion request, and wherein the first memory engine is to correspond to the first pipeline, decodes the plurality of sub-instruction requests to identify one or more arguments, loads a source array from a dynamic random access memory (DRAM) in a plurality of DRAMs, wherein the operation engine is to correspond to the DRAM, and conducts a conversion of the source array from a first data type to a second data type in accordance with the one or more arguments.
-
4.
Publication No.: US10795819B1
Publication Date: 2020-10-06
Application No.: US16453670
Application Date: 2019-06-26
Applicant: Intel Corporation
Inventors: Robert Pawlowski, Bharadwaj Krishnamurthy, Vincent Cave, Jason M. Howard, Ankit More, Joshua B. Fryman
IPC Classification: G06F12/00, G06F12/0817, G06F12/0811, G06F9/38, G06F9/30, G06F12/0891
Abstract: Disclosed embodiments relate to a system with configurable cache sub-domains and cross-die memory coherency. In one example, a system includes R racks, each rack housing N nodes, each node incorporating D dies, each die containing C cores and a die shadow tag, each core including P pipelines and a core shadow tag, each pipeline associated with a data cache and data cache tags and being either non-coherent or coherent and one of X coherency domains, wherein each pipeline, when needing to read a cache line, issues a read request to its associated data cache, then, if need be, issues a read request to its associated core-level cache, then, if need be, issues a read request to its associated die-level cache, then, if need be, issues a no-cache remote read request to a target die being mapped to hold the cache line.
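The read path in this abstract (data cache, then core-level cache, then die-level cache, then a remote die) can be sketched as a simple fall-through lookup. Caches are modeled as dictionaries, and `read_line` is a hypothetical helper. Treating "no-cache" as "do not install the line locally" is an assumption of this sketch, and shadow tags and coherency domains are not modeled.

```python
from typing import Dict, Tuple

Cache = Dict[int, int]  # address -> cache line contents


def read_line(addr: int, data_cache: Cache, core_cache: Cache,
              die_cache: Cache, remote_die: Cache) -> Tuple[int, str]:
    """Return the line and the name of the level that served it."""
    levels = (("data", data_cache), ("core", core_cache), ("die", die_cache))
    for name, cache in levels:
        if addr in cache:
            return cache[addr], name
    # Remote read: the line is returned but not installed in any local level
    # (assumed meaning of "no-cache" in this sketch).
    return remote_die[addr], "remote"


data_cache = {0x10: 111}
core_cache = {0x20: 222}
die_cache = {0x30: 333}
remote_die = {0x40: 444}

hit_core = read_line(0x20, data_cache, core_cache, die_cache, remote_die)
hit_remote = read_line(0x40, data_cache, core_cache, die_cache, remote_die)
```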
-
5.
Publication No.: US20230333998A1
Publication Date: 2023-10-19
Application No.: US18312752
Application Date: 2023-05-05
Applicant: Intel Corporation
IPC Classification: G06F13/28
CPC Classification: G06F13/28
Abstract: Systems, apparatuses and methods may provide for technology that includes a plurality of memory engines corresponding to a plurality of pipelines, wherein each memory engine in the plurality of memory engines is adjacent to a pipeline in the plurality of pipelines, and wherein a first memory engine is to request one or more direct memory access (DMA) operations associated with a first pipeline, and a plurality of operation engines corresponding to a plurality of dynamic random access memories (DRAMs), wherein each operation engine in the plurality of operation engines is adjacent to a DRAM in the plurality of DRAMs, and wherein one or more of the plurality of operation engines is to conduct the one or more DMA operations based on one or more bitmaps.
-
Publication No.: US11630691B2
Publication Date: 2023-04-18
Application No.: US17410818
Application Date: 2021-08-24
Applicant: Intel Corporation
Inventors: Robert Pawlowski, Ankit More, Jason M. Howard, Joshua B. Fryman, Tina C. Zhong, Shaden Smith, Sowmya Pitchaimoorthy, Samkit Jain, Vincent Cave, Sriram Aananthakrishnan, Bharadwaj Krishnamurthy
Abstract: Disclosed embodiments relate to an improved memory system architecture for multi-threaded processors. In one example, a system includes a multi-threaded processor core (MTPC), the MTPC comprising: P pipelines, each to concurrently process T threads; a crossbar to communicatively couple the P pipelines; a memory for use by the P pipelines; a scheduler to optimize reduction operations by assigning multiple threads to generate results of commutative arithmetic operations, and then accumulate the generated results; and a memory controller (MC) to connect with external storage and other MTPCs, the MC further comprising at least one optimization selected from: an instruction set architecture including a dual-memory operation; a direct memory access (DMA) engine; a buffer to store multiple pending instruction cache requests; multiple channels across which to stripe memory requests; and a shadow-tag coherency management unit.
-
Publication No.: US11360809B2
Publication Date: 2022-06-14
Application No.: US16024343
Application Date: 2018-06-29
Applicant: Intel Corporation
Inventors: William Paul Griffin, Joshua Fryman, Jason Howard, Sang Phill Park, Robert Pawlowski, Michael Abbott, Scott Cline, Samkit Jain, Ankit More, Vincent Cave, Fabrizio Petrini, Ivan Ganev
Abstract: Embodiments of apparatuses, methods, and systems for scheduling tasks to hardware threads are described. In an embodiment, a processor includes multiple hardware threads and a task manager. The task manager is to issue a task to a hardware thread. The task manager includes a hardware task queue to store a descriptor for the task. The descriptor is to include a field to store a value to indicate whether the task is a single task, a collection of iterative tasks, or a linked list of tasks.
-
Publication No.: US10476492B2
Publication Date: 2019-11-12
Application No.: US16201915
Application Date: 2018-11-27
Applicant: Intel Corporation
IPC Classification: H03K17/00, G11C7/10, H03K19/173
Abstract: Embodiments herein may present an integrated circuit including a switch, where the switch together with other switches forms a network of switches to perform a sequence of operations according to a structure of a collective tree. The switch includes a first number of input ports, a second number of output ports, a configurable crossbar to selectively couple the first number of input ports to the second number of output ports, and a computation engine coupled to the first number of input ports, the second number of output ports, and the crossbar. The computation engine of the switch performs an operation corresponding to an operation represented by a node of the collective tree. The switch further includes one or more registers to selectively configure the first number of input ports and the configurable crossbar. Other embodiments may be described and/or claimed.
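The idea of each switch performing the operation of one node in a collective tree can be illustrated with a small software model. The class `SwitchNode` and its fields are hypothetical. Ports, the configurable crossbar, and configuration registers are not modeled; each node simply combines the values arriving on its inputs.

```python
from dataclasses import dataclass, field
from operator import add
from typing import Callable, List, Union


@dataclass
class SwitchNode:
    """One node of a collective tree: applies op across its inputs."""
    op: Callable[[int, int], int]
    inputs: List[Union["SwitchNode", int]] = field(default_factory=list)

    def evaluate(self) -> int:
        # Resolve child switches first, then fold the values with this node's op.
        # Assumes every node has at least one input (an assumption of this sketch).
        values = [i.evaluate() if isinstance(i, SwitchNode) else i
                  for i in self.inputs]
        result = values[0]
        for value in values[1:]:
            result = self.op(result, value)
        return result


# Two leaf switches sum their inputs; the root takes the maximum of the sums.
tree = SwitchNode(max, [SwitchNode(add, [1, 2]), SwitchNode(add, [3, 4])])
root_value = tree.evaluate()
```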
-
Publication No.: US20240241645A1
Publication Date: 2024-07-18
Application No.: US18621437
Application Date: 2024-03-29
Applicant: Intel Corporation
Inventors: Robert Pawlowski, Shruti Sharma, Fabio Checconi, Sriram Aananthakrishnan, Jesmin Jahan Tithi, Jordi Wolfson-Pou, Joshua B. Fryman
IPC Classification: G06F3/06
CPC Classification: G06F3/0613, G06F3/0656, G06F3/0673
Abstract: Systems, apparatuses and methods may provide for technology that includes a plurality of hash management buffers corresponding to a plurality of pipelines, wherein each hash management buffer in the plurality of hash management buffers is adjacent to a pipeline in the plurality of pipelines, and wherein a first hash management buffer is to issue one or more hash packets associated with one or more hash operations on a hash table. The technology may also include a plurality of hash engines corresponding to a plurality of dynamic random access memories (DRAMs), wherein each hash engine in the plurality of hash engines is adjacent to a DRAM in the plurality of DRAMs, and wherein one or more of the hash engines is to initialize a target memory destination associated with the hash table and conduct the one or more hash operations in response to the one or more hash packets.
-
Publication No.: US11960922B2
Publication Date: 2024-04-16
Application No.: US17030999
Application Date: 2020-09-24
Applicant: Intel Corporation
IPC Classification: G06F9/46, G06F9/30, G06F9/38, G06Q10/101
CPC Classification: G06F9/466, G06F9/3004, G06F9/30043, G06F9/3834, G06F2212/452, G06Q10/101
Abstract: In an embodiment, a processor comprises: an execution circuit to execute instructions; at least one cache memory coupled to the execution circuit; and a table storage element coupled to the at least one cache memory, the table storage element to store a plurality of entries each to store object metadata of an object used in a code sequence. The processor is to use the object metadata to provide user space multi-object transactional atomic operation of the code sequence. Other embodiments are described and claimed.