-
Publication No.: US20200210175A1
Publication Date: 2020-07-02
Application No.: US16277022
Filing Date: 2019-02-15
Applicant: Graphcore Limited
IPC: G06F9/30
Abstract: A processor comprising a barrel-threaded execution unit for executing concurrent threads, and one or more register files comprising a respective set of context registers for each concurrent thread. One of the register files further comprises a set of shared weights registers common to some or all of the concurrent threads. The types of instruction defined in the instruction set of the processor include an arithmetic instruction having operands specifying a source and a destination from amongst a respective set of arithmetic registers of the thread in which the arithmetic instruction is executed. The execution unit is configured so as, in response to the opcode of the arithmetic instruction, to perform an operation comprising multiplying an input from the source by at least one of the weights from at least one of the shared weights registers, and to place a result in the destination.
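The arithmetic instruction described in this abstract can be illustrated with a small software model. This is only a sketch under stated assumptions: the class names, the four-register context size, and the `mul_weight` method are hypothetical and do not come from the patent. It shows each thread reading and writing its own arithmetic registers while the multiplier comes from a weights register file shared by all threads.

```python
from dataclasses import dataclass, field


@dataclass
class ThreadContext:
    """Private arithmetic registers of one thread (hypothetical size of 4)."""
    regs: list = field(default_factory=lambda: [0.0] * 4)


class Processor:
    """Toy model: one context per thread plus one shared weights register file."""

    def __init__(self, num_threads, weights):
        self.contexts = [ThreadContext() for _ in range(num_threads)]
        self.shared_weights = list(weights)  # common to all threads

    def mul_weight(self, thread_id, src, dst, weight_idx):
        """Hypothetical arithmetic instruction: dst = src * shared weight.

        src and dst index the executing thread's own registers; the weight
        is read from the shared register file.
        """
        ctx = self.contexts[thread_id]
        ctx.regs[dst] = ctx.regs[src] * self.shared_weights[weight_idx]


proc = Processor(num_threads=2, weights=[0.5, 2.0])
proc.contexts[0].regs[0] = 3.0
proc.contexts[1].regs[0] = 4.0
# Both threads use the same shared weight but keep separate inputs and results.
proc.mul_weight(0, src=0, dst=1, weight_idx=1)
proc.mul_weight(1, src=0, dst=1, weight_idx=1)
```

In this model the weights are stored once, while each thread's source and destination remain private to that thread.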
-
Publication No.: US10606641B2
Publication Date: 2020-03-31
Application No.: US16166000
Filing Date: 2018-10-19
Applicant: Graphcore Limited
Inventor: Simon Christian Knowles
Abstract: A processor comprising: an execution unit for executing a respective thread in each of a repeating sequence of time slots; and a plurality of context register sets, each comprising a respective set of registers for representing a state of a respective thread. The context register sets comprise a respective worker context register set for each of the number of time slots the execution unit is operable to interleave, and at least one extra context register set. The worker context register sets represent the respective states of worker threads and the extra context register set represents the state of a supervisor thread. The processor is configured to begin running the supervisor thread in each of the time slots, and to enable the supervisor thread to then individually relinquish each of the time slots in which it is running to a respective one of the worker threads.
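The slot-relinquishing scheme in this abstract can be sketched as a toy scheduler. The names `BarrelScheduler`, `relinquish`, and `schedule` are hypothetical, and the model tracks only which thread owns each time slot, not the contents of the context register sets.

```python
class BarrelScheduler:
    """Toy model of a repeating sequence of time slots.

    Every slot starts out owned by the supervisor thread, which can hand
    individual slots over to worker threads.
    """

    SUPERVISOR = "supervisor"

    def __init__(self, num_slots):
        # The supervisor begins running in every time slot.
        self.slots = [self.SUPERVISOR] * num_slots

    def relinquish(self, slot, worker):
        """Supervisor gives up one slot it is running in to a worker."""
        if self.slots[slot] != self.SUPERVISOR:
            raise ValueError(f"slot {slot} is not held by the supervisor")
        self.slots[slot] = worker

    def schedule(self, cycles):
        """Return which thread issues in each cycle (round-robin over slots)."""
        return [self.slots[i % len(self.slots)] for i in range(cycles)]


sched = BarrelScheduler(num_slots=4)
sched.relinquish(1, "worker_a")
sched.relinquish(3, "worker_b")
order = sched.schedule(8)
```

Slots 0 and 2 stay with the supervisor here, so it keeps running in the slots it has not handed over.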
-
Publication No.: US10579585B2
Publication Date: 2020-03-03
Application No.: US15886138
Filing Date: 2018-02-01
Applicant: Graphcore Limited
Inventor: Daniel John Pelham Wilkinson, Stephen Felix, Richard Luke Southwell Osborne, Simon Christian Knowles, Alan Graham Alexander, Ian James Quinn
IPC: G06F15/80, G06F9/52, G06F15/173
Abstract: A method of operating a system comprising multiple processor tiles divided into a plurality of domains wherein within each domain the tiles are connected to one another via a respective instance of a time-deterministic interconnect and between domains the tiles are connected to one another via a non-time-deterministic interconnect. The method comprises: performing a compute stage, then performing a respective internal barrier synchronization within each domain, then performing an internal exchange phase within each domain, then performing an external barrier synchronization to synchronize between different domains, then performing an external exchange phase between the domains.
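The ordering of stages in this method can be written out as a short sketch. The function `run_superstep` and the domain names are hypothetical; the model only records the order of stages and does not simulate tiles or either interconnect.

```python
def run_superstep(domains):
    """Return the ordered list of (stage, scope) steps for one pass.

    The internal barrier and internal exchange happen per domain; the
    external barrier and external exchange span all domains.
    """
    log = []
    for name in domains:
        log.append(("compute", name))
    for name in domains:
        log.append(("internal_sync", name))      # barrier within one domain
    for name in domains:
        log.append(("internal_exchange", name))  # time-deterministic interconnect
    log.append(("external_sync", "all"))         # barrier between domains
    log.append(("external_exchange", "all"))     # non-time-deterministic interconnect
    return log


steps = run_superstep(["domain_0", "domain_1"])
stage_order = []
for stage, _scope in steps:
    if stage not in stage_order:
        stage_order.append(stage)
```

The domains are modelled sequentially for simplicity; in the described system each domain would carry out its own internal stages independently of the others.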
-
Publication No.: US10564970B2
Publication Date: 2020-02-18
Application No.: US15886099
Filing Date: 2018-02-01
Applicant: Graphcore Limited
Inventor: Simon Christian Knowles, Alan Graham Alexander
Abstract: A processing system comprising multiple tiles and an interconnect between the tiles. The interconnect is used to communicate between a group of some or all of the tiles according to a bulk synchronous parallel scheme, whereby each tile in the group performs an on-tile compute phase followed by an inter-tile exchange phase with the exchange phase being held back until all tiles in the group have completed the compute phase. Each tile in the group has a local exit state upon completion of the compute phase. The instruction set comprises a synchronization instruction for execution by each tile upon completion of its compute phase to signal a sync request to logic in the interconnect. In response to receiving the sync request from all the tiles in the group, the logic releases the next exchange phase and also makes available an aggregated state of all the tiles in the group.
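The barrier behaviour in this abstract can be sketched as follows. The abstract does not say how the local exit states are combined, so this sketch assumes boolean exit states aggregated with a logical AND; that choice, along with the names `SyncLogic` and `sync_request`, is an assumption made for illustration.

```python
class SyncLogic:
    """Toy model of interconnect logic that gates the exchange phase."""

    def __init__(self, num_tiles):
        self.num_tiles = num_tiles
        self.exit_states = {}        # tile id -> local exit state
        self.exchange_released = False
        self.aggregate = None        # unavailable until all tiles have synced

    def sync_request(self, tile_id, exit_state):
        """Called by a tile when its compute phase completes.

        Returns True once every tile in the group has requested sync.
        """
        self.exit_states[tile_id] = bool(exit_state)
        if len(self.exit_states) == self.num_tiles:
            # Assumed aggregation rule: logical AND of all local exit states.
            self.aggregate = all(self.exit_states.values())
            self.exchange_released = True
        return self.exchange_released


logic = SyncLogic(num_tiles=3)
released_early = logic.sync_request(0, True)   # only one tile has finished
logic.sync_request(1, False)
released_final = logic.sync_request(2, True)   # last tile arrives
```

The exchange phase stays held back until the last tile signals, and the aggregate becomes readable at that same point.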
-
Publication No.: US20190121641A1
Publication Date: 2019-04-25
Application No.: US15886099
Filing Date: 2018-02-01
Applicant: Graphcore Limited
Inventor: Simon Christian Knowles, Alan Graham Alexander
Abstract: A processing system comprising multiple tiles and an interconnect between the tiles. The interconnect is used to communicate between a group of some or all of the tiles according to a bulk synchronous parallel scheme, whereby each tile in the group performs an on-tile compute phase followed by an inter-tile exchange phase with the exchange phase being held back until all tiles in the group have completed the compute phase. Each tile in the group has a local exit state upon completion of the compute phase. The instruction set comprises a synchronization instruction for execution by each tile upon completion of its compute phase to signal a sync request to logic in the interconnect. In response to receiving the sync request from all the tiles in the group, the logic releases the next exchange phase and also makes available an aggregated state of all the tiles in the group.