-
Publication No.: US20220019257A1
Publication date: 2022-01-20
Application No.: US17349488
Filing date: 2021-06-16
Applicant: Graphcore Limited
Inventor: Simon Douglas CHAMBERS, Stephen FELIX, Ian Malcolm KING
Abstract: Two clocks, a fast clock and a slow clock, are provided for clocking a processing unit. A plurality of frequency settings, referred to as gears, are defined for the two clocks. Each of these gears indicates a maximum frequency for the fast clock and a minimum frequency for the slow clock, such that the gap between the two frequencies may be kept to a manageable level so as to reduce transients upon switching between the two clocks. The system switches between the gears as required. In response to a determination to increase the frequency of the clock signal, a higher gear is selected at which the maximum and minimum frequencies defined for that gear are higher than those of the previously selected gear.
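Illustration (not part of the patent record): the gear scheme in this abstract can be sketched as a small lookup table. This is a minimal sketch under stated assumptions: the `Gear` type, the `GEARS` table, the `select_gear` function and every frequency value are hypothetical and are not taken from the patent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gear:
    slow_min_mhz: int  # minimum frequency allowed for the slow clock
    fast_max_mhz: int  # maximum frequency allowed for the fast clock


# Hypothetical gear table: both limits rise with the gear number, while
# the gap between fast and slow clock stays bounded (400 MHz here).
GEARS = [
    Gear(slow_min_mhz=400, fast_max_mhz=800),
    Gear(slow_min_mhz=700, fast_max_mhz=1100),
    Gear(slow_min_mhz=1000, fast_max_mhz=1400),
]


def select_gear(target_mhz: int) -> int:
    """Return the index of the lowest gear whose fast-clock maximum
    covers the requested frequency (the top gear if none does)."""
    for index, gear in enumerate(GEARS):
        if target_mhz <= gear.fast_max_mhz:
            return index
    return len(GEARS) - 1
```

Requesting a higher target frequency moves the selection to a higher gear, in which both the fast-clock maximum and the slow-clock minimum are higher than in the previous gear.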
-
Publication No.: US20210373637A1
Publication date: 2021-12-02
Application No.: US17445219
Filing date: 2021-08-17
Applicant: Graphcore Limited
Inventor: Stephen FELIX, Mrudula GORE
Abstract: There is disclosed a method of controlling the frequency of a clock signal in a processor. The method selects a first clock generator to provide a processor clock signal for executing an application. If a threshold event is detected, a second clock generator is selected. The method reduces the frequency of a clock signal generated by the first clock generator while a processor clock signal is being provided for execution of an application from the second clock generator. The second clock generator generates a clock at a lower speed than the first clock generator. After a predetermined time, the first clock generator is reselected to provide the processor clock signal. The threshold detection is repeated until an optimum clock frequency is discovered.
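Illustration (not part of the patent record): the search loop described in this abstract might look roughly as follows. This is a sketch under stated assumptions: `find_operating_frequency`, its parameters and the `threshold_event` callback are hypothetical names, the fixed step size is an assumption, and the predetermined wait is only marked by a comment.

```python
def find_operating_frequency(start_mhz, step_mhz, slow_mhz, threshold_event):
    """Lower the first (fast) generator's frequency until no threshold
    event is observed.

    threshold_event(freq_mhz) is a hypothetical callback returning True
    when running at freq_mhz triggers the threshold event.
    Returns the settled frequency and a trace of generator selections.
    """
    fast_mhz = start_mhz
    trace = []
    while fast_mhz > slow_mhz and threshold_event(fast_mhz):
        # Threshold event: run from the second (slower) generator.
        trace.append(("second", slow_mhz))
        # Retune the first generator while it is not driving the processor.
        fast_mhz = max(fast_mhz - step_mhz, slow_mhz)
        # (The predetermined wait would happen here.)
        # Reselect the first generator at its reduced frequency.
        trace.append(("first", fast_mhz))
    return fast_mhz, trace
```

For example, with `threshold_event = lambda f: f > 1200`, a start of 1600 MHz and a step of 100 MHz, the loop settles at 1200 MHz after four switch-and-retune rounds.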
-
Publication No.: US20210303510A1
Publication date: 2021-09-30
Application No.: US17345290
Filing date: 2021-06-11
Applicant: Graphcore Limited
Inventor: Stephen FELIX, Jonathan MANGNALL
IPC: G06F15/80
Abstract: A method of recording tile identifiers in each of a plurality of tiles of a multitile processor is described. Tiles are arranged in columns, each column having a plurality of processing circuits, each processing circuit comprising one or more tiles, wherein a base processing circuit in each column is connected to a set of processing circuit identifier wires. A base value is generated on each of the set of processing circuit identifier wires for the base processing circuit in each column. At the base processing circuit, the base value on the set of processing circuit identifier wires is read and incremented by one. The incremented value is propagated to a next processing circuit in the column, and at the next processing circuit a unique identifier is recorded by concatenating an identifier of the column and the incremented value.
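Illustration (not part of the patent record): the increment-and-propagate numbering in this abstract can be modelled in software. This is a minimal sketch under stated assumptions: `assign_ids`, the 4-bit index width and the choice to let the base circuit record the base value as its own index are all hypothetical, since the abstract does not specify them.

```python
def assign_ids(num_columns, circuits_per_column, base_value=0, index_bits=4):
    """Model the identifier wires of each column.

    Each processing circuit records (column id concatenated with the
    value it reads), then passes value + 1 to the next circuit.
    Returns a dict mapping (column, position) to the recorded identifier.
    """
    if base_value + circuits_per_column > (1 << index_bits):
        raise ValueError("index_bits too small for this column height")
    ids = {}
    for column in range(num_columns):
        value = base_value  # value driven onto the wires at the base circuit
        for position in range(circuits_per_column):
            # Concatenate: column id in the high bits, index in the low bits.
            ids[(column, position)] = (column << index_bits) | value
            value += 1  # incremented value propagates up the column
    return ids
```

Because the column identifier occupies the high bits and the propagated value the low bits, every circuit ends up with a distinct identifier as long as the index fits in `index_bits`.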
-
Publication No.: US11119559B2
Publication date: 2021-09-14
Application No.: US16428797
Filing date: 2019-05-31
Applicant: Graphcore Limited
Inventor: Stephen Felix, Mrudula Gore
IPC: G06F1/324, G06F1/06, G06F1/08, G06F1/3206
Abstract: There is disclosed a method of controlling the frequency of a clock signal in a processor. The method selects a first clock generator to provide a processor clock signal for executing an application. If a threshold event is detected, a second clock generator is selected. The method reduces the frequency of a clock signal generated by the first clock generator while a processor clock signal is being provided for execution of an application from the second clock generator. The second clock generator generates a clock at a lower speed than the first clock generator. After a predetermined time the first clock generator is reselected to provide the processor clock signal. The threshold detection is repeated until an optimum clock frequency is discovered.
-
Publication No.: US20210216321A1
Publication date: 2021-07-15
Application No.: US16844314
Filing date: 2020-04-09
Applicant: Graphcore Limited
Inventor: Lars Paul HUSE
Abstract: A data processing system comprising a plurality of processors, wherein each of the processors is configured to perform data transfer operations to transfer outgoing data to one or more others of the processors during a first of the exchange stages; receive incoming data from the one or more others of the processors during the first of the exchange stages; determine further outgoing data in dependence upon at least part of the incoming data; count an amount of at least part of the incoming data received during the first of the exchange stages from the one or more others of the processors; and in response to determining that the amount of the at least part of the incoming data received has reached a predefined amount, perform data transfer operations to transfer the further outgoing data to the one or more others of the processors during a second of the exchange stages.
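Illustration (not part of the patent record): the counting condition that gates the second exchange stage could be sketched as below. This is a sketch under stated assumptions: `ExchangeCounter`, counting in bytes and the `receive` method are hypothetical; the abstract only says that an amount of incoming data is counted against a predefined amount.

```python
class ExchangeCounter:
    """Counts data received in the first exchange stage and reports when
    the predefined amount has arrived."""

    def __init__(self, expected_bytes):
        self.expected_bytes = expected_bytes  # the predefined amount
        self.received_bytes = 0
        self.buffer = []

    def receive(self, payload: bytes) -> bool:
        """Store one incoming payload; return True once enough data has
        arrived for the second exchange stage to start."""
        self.buffer.append(payload)
        self.received_bytes += len(payload)
        return self.received_bytes >= self.expected_bytes
```

A processor in this model would keep calling `receive` during the first stage and begin sending its further outgoing data as soon as the call returns `True`.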
-
Publication No.: US20210200602A1
Publication date: 2021-07-01
Application No.: US17125249
Filing date: 2020-12-17
Applicant: Graphcore Limited
Inventor: Brian MANULA, Daniel John Pelham WILKINSON
Abstract: A gateway implementing multiple independent sync networks. The independent sync networks can be used to allow for synchronisation between different synchronisation groups of accelerators. The independent sync networks allow synchronisations to be carried out asynchronously and simultaneously. The gateway has sync propagation circuitry that receives a first synchronisation request for an upcoming exchange phase and propagates this sync request through a first sync network. The first synchronisation request is a request for synchronisation between subsystems of a first synchronisation group. The sync propagation circuitry of the gateway also receives a second synchronisation request for a different exchange phase and propagates this sync request through the second sync network. The second synchronisation request is a request for synchronisation between subsystems of a second synchronisation group. The two exchange phases overlap in time. Therefore, the syncs are simultaneous and asynchronous.
-
Publication No.: US20210191731A1
Publication date: 2021-06-24
Application No.: US16840988
Filing date: 2020-04-06
Applicant: Graphcore Limited
Inventor: Richard OSBORNE, Matthew FYLES
Abstract: A computer comprising a plurality of processors, each of which is configured to perform operations on data during a compute phase for the computer and, following a pre-compiled synchronisation barrier, exchange data with at least one other of the processors during an exchange phase for the computer, wherein each of the processors in the computer is indexed and the data exchange operations carried out by each processor in the exchange phase depend upon its index value.
-
Publication No.: US20210191488A1
Publication date: 2021-06-24
Application No.: US16842859
Filing date: 2020-04-08
Applicant: Graphcore Limited
Inventor: Stephen FELIX, Daniel WILKINSON
IPC: G06F1/30, G06F1/3206, G06F1/08
Abstract: During normal operation of a processor, voltage droop is likely to occur and there is, therefore, a need for techniques for rapidly addressing this droop so as to reduce the probability of circuit timing failures. This problem is addressed by providing an apparatus that is configured to detect the droop and react to mitigate the droop. The apparatus includes a frequency divider that is configured to receive an output of a clock signal generator (e.g. a phase locked loop) and produce an output signal in which a predefined fraction of the clock pulses in the output of the clock signal generator are removed from the output signal. By reducing the frequency of the clock signal in this way (as may be understood by examining equation 3) VDD is increased, hence mitigating the voltage droop. This technique provides a fast throttling mechanism that prevents excessive VDD droop across the processor.
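Illustration (not part of the patent record): removing a predefined fraction of clock pulses can be modelled with a simple accumulator. This is a minimal sketch under stated assumptions: `swallow_pulses` and its accumulator scheme are hypothetical and only one of several ways such a divider could be built; the abstract does not describe the circuit at this level.

```python
def swallow_pulses(num_pulses, remove, out_of):
    """Return one boolean per input clock pulse: True if the pulse is
    passed to the output, False if it is removed.

    On average, `remove` pulses are dropped out of every `out_of`.
    """
    if out_of <= 0 or not 0 <= remove <= out_of:
        raise ValueError("need 0 <= remove <= out_of and out_of > 0")
    passed = []
    accumulator = 0
    for _ in range(num_pulses):
        accumulator += remove
        if accumulator >= out_of:
            accumulator -= out_of
            passed.append(False)  # this pulse is removed
        else:
            passed.append(True)   # this pulse reaches the output
    return passed
```

With `remove=1` and `out_of=4`, every fourth pulse is dropped, so the effective output frequency is three quarters of the generator frequency without the generator itself being retuned.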
-
Publication No.: US11023290B2
Publication date: 2021-06-01
Application No.: US15885972
Filing date: 2018-02-01
Applicant: Graphcore Limited
Inventor: Daniel John Pelham Wilkinson, Simon Christian Knowles, Matthew David Fyles, Alan Graham Alexander, Stephen Felix
Abstract: A processing system comprising an arrangement of tiles and an interconnect between the tiles. The interconnect comprises synchronization logic for coordinating a barrier synchronization to be performed between a group of the tiles. The instruction set comprises a synchronization instruction taking an operand which selects one of a plurality of available modes each specifying a different membership of the group. Execution of the synchronization instruction causes a synchronization request to be transmitted from the respective tile to the synchronization logic, and instruction issue to be suspended on the respective tile pending a synchronization acknowledgement being received back from the synchronization logic. In response to receiving the synchronization request from all the tiles in the group as specified by the operand of the synchronization instruction, the synchronization logic returns the synchronization acknowledgement to the tiles in the specified group.
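Illustration (not part of the patent record): the request/acknowledge behaviour of the synchronization logic can be modelled as follows. This is a sketch under stated assumptions: `SyncLogic`, the mode names and the representation of a group as a set of tile numbers are hypothetical; in the patent the mode is an operand of a hardware instruction.

```python
class SyncLogic:
    """Collects sync requests per mode and acknowledges the whole group
    once every member tile has requested."""

    def __init__(self, groups):
        # groups: mode name -> set of tile ids belonging to that group
        self.groups = {mode: set(tiles) for mode, tiles in groups.items()}
        self.pending = {mode: set() for mode in groups}

    def request(self, tile, mode):
        """Register a sync request from `tile` for the group selected by
        `mode`. Returns the set of tiles to acknowledge, which is empty
        while the barrier is still waiting for other members."""
        if tile not in self.groups[mode]:
            raise ValueError("tile is not a member of this group")
        self.pending[mode].add(tile)
        if self.pending[mode] == self.groups[mode]:
            self.pending[mode] = set()  # barrier released; reset for reuse
            return set(self.groups[mode])
        return set()
```

A tile that has issued a request would stay suspended until its own id appears in a returned acknowledgement set.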
-
Publication No.: US10963003B2
Publication date: 2021-03-30
Application No.: US16165978
Filing date: 2018-10-19
Applicant: Graphcore Limited
Inventor: Simon Christian Knowles, Daniel John Pelham Wilkinson, Richard Luke Southwell Osborne, Alan Graham Alexander, Stephen Felix, Jonathan Mangnall, David Lacey
Abstract: The invention relates to a computer comprising: a plurality of processing units each having instruction storage holding a local program, an execution unit executing the local program, data storage for holding data; an input interface with a set of input wires, and an output interface with a set of output wires; a switching fabric connected to each of the processing units by the respective set of output wires and connectable to each of the processing units by the respective input wires via switching circuitry controllable by each processing unit; a synchronisation module operable to generate a synchronisation signal to control the computer to switch between a compute phase and an exchange phase, wherein the processing units are configured to execute their local programs according to a common clock, the local programs being such that in the exchange phase at least one processing unit executes a send instruction from its local program to transmit at a transmit time a data packet onto its output set of connection wires, the data packet being destined for at least one recipient processing unit but having no destination identifier, and at a predetermined switch time the recipient processing unit executes a switch control instruction from its local program to control its switching circuitry to connect its input set of wires to the switching fabric to receive the data packet at a receive time, the transmit time, switch time and receive time being governed by the common clock with respect to the synchronisation signal.