-
Publication No.: US12153932B2
Publication Date: 2024-11-26
Application No.: US17129555
Filing Date: 2020-12-21
Applicant: Intel Corporation
Inventor: Ankit More, Fabrizio Petrini, Robert Pawlowski, Shruti Sharma, Sowmya Pitchaimoorthy
IPC: G06F9/4401, G06F13/40
Abstract: Examples include techniques for an in-network acceleration of a parallel prefix-scan operation. Examples include configuring registers of a node included in a plurality of nodes on a same semiconductor package. The registers to be configured responsive to receiving an instruction that indicates a logical tree to map to a network topology that includes the node. The instruction associated with a prefix-scan operation to be executed by at least a portion of the plurality of nodes.
-
Publication No.: US11630691B2
Publication Date: 2023-04-18
Application No.: US17410818
Filing Date: 2021-08-24
Applicant: Intel Corporation
Inventor: Robert Pawlowski, Ankit More, Jason M. Howard, Joshua B. Fryman, Tina C. Zhong, Shaden Smith, Sowmya Pitchaimoorthy, Samkit Jain, Vincent Cave, Sriram Aananthakrishnan, Bharadwaj Krishnamurthy
Abstract: Disclosed embodiments relate to an improved memory system architecture for multi-threaded processors. In one example, a system includes a system comprising a multi-threaded processor core (MTPC), the MTPC comprising: P pipelines, each to concurrently process T threads; a crossbar to communicatively couple the P pipelines; a memory for use by the P pipelines, a scheduler to optimize reduction operations by assigning multiple threads to generate results of commutative arithmetic operations, and then accumulate the generated results, and a memory controller (MC) to connect with external storage and other MTPCs, the MC further comprising at least one optimization selected from: an instruction set architecture including a dual-memory operation; a direct memory access (DMA) engine; a buffer to store multiple pending instruction cache requests; multiple channels across which to stripe memory requests; and a shadow-tag coherency management unit.
-
Publication No.: US11106494B2
Publication Date: 2021-08-31
Application No.: US16147302
Filing Date: 2018-09-28
Applicant: Intel Corporation
Inventor: Robert Pawlowski, Ankit More, Jason M. Howard, Joshua B. Fryman, Tina C. Zhong, Shaden Smith, Sowmya Pitchaimoorthy, Samkit Jain, Vincent Cave, Sriram Aananthakrishnan, Bharadwaj Krishnamurthy
Abstract: Disclosed embodiments relate to an improved memory system architecture for multi-threaded processors. In one example, a system includes a system comprising a multi-threaded processor core (MTPC), the MTPC comprising: P pipelines, each to concurrently process T threads; a crossbar to communicatively couple the P pipelines; a memory for use by the P pipelines, a scheduler to optimize reduction operations by assigning multiple threads to generate results of commutative arithmetic operations, and then accumulate the generated results, and a memory controller (MC) to connect with external storage and other MTPCs, the MC further comprising at least one optimization selected from: an instruction set architecture including a dual-memory operation; a direct memory access (DMA) engine; a buffer to store multiple pending instruction cache requests; multiple channels across which to stripe memory requests; and a shadow-tag coherency management unit.
-
Publication No.: US11061742B2
Publication Date: 2021-07-13
Application No.: US16019685
Filing Date: 2018-06-27
Applicant: Intel Corporation
Inventor: Robert Pawlowski, Ankit More, Shaden Smith, Sowmya Pitchaimoorthy, Samkit Jain, Vincent Cavé, Sriram Aananthakrishnan, Jason M. Howard, Joshua B. Fryman
Abstract: In one embodiment, a first processor core includes: a plurality of execution pipelines each to execute instructions of one or more threads; a plurality of pipeline barrier circuits coupled to the plurality of execution pipelines, each of the plurality of pipeline barrier circuits associated with one of the plurality of execution pipelines to maintain status information for a plurality of barrier groups, each of the plurality of barrier groups formed of at least two threads; and a core barrier circuit to control operation of the plurality of pipeline barrier circuits and to inform the plurality of pipeline barrier circuits when a first barrier has been reached by a first barrier group of the plurality of barrier groups. Other embodiments are described and claimed.
-
Publication No.: US20200004602A1
Publication Date: 2020-01-02
Application No.: US16019685
Filing Date: 2018-06-27
Applicant: Intel Corporation
Inventor: Robert Pawlowski, Ankit More, Shaden Smith, Sowmya Pitchaimoorthy, Samkit Jain, Vincent Cavé, Sriram Aananthakrishnan, Jason M. Howard, Joshua B. Fryman
Abstract: In one embodiment, a first processor core includes: a plurality of execution pipelines each to execute instructions of one or more threads; a plurality of pipeline barrier circuits coupled to the plurality of execution pipelines, each of the plurality of pipeline barrier circuits associated with one of the plurality of execution pipelines to maintain status information for a plurality of barrier groups, each of the plurality of barrier groups formed of at least two threads; and a core barrier circuit to control operation of the plurality of pipeline barrier circuits and to inform the plurality of pipeline barrier circuits when a first barrier has been reached by a first barrier group of the plurality of barrier groups. Other embodiments are described and claimed.