Patent search ap:("Graphcore Limited") AND inv:"Richard Luke Southwell Osborne" Page 3

21.

Invention application
SENDING DATA OFF-CHIP [Under examination, published]

Publication (announcement) number: US20190155768A1

Publication (announcement) date: 2019-05-23

Application number: US16165607

Application date: 2018-10-19

Applicant: Graphcore Limited

Inventor: Daniel John Pelham Wilkinson, Richard Luke Southwell Osborne, Stephen Felix, Graham Bernard Cunningham, Alan Graham Alexander

IPC: G06F13/20, G06F13/42

Abstract: A processor comprising multiple tiles on the same chip, and an external interconnect for communicating data off-chip in the form of packets. The external interconnect comprises an external exchange block configured to provide flow control and queuing of the packets. One of the tiles is nominated by the compiler to send an external exchange request message to the exchange block on behalf of others with data to send externally. The exchange sends an exchange-on message to a first of these tiles, to cause the first tile to start sending packets via the external interconnect. Then, once this tile has sent its last data packet, the exchange block sends an exchange-off control packet to this tile to cause it to stop sending packets, and sends another exchange-on message to the next tile with data to send, and so forth.
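The hand-over described in this abstract can be sketched as a small simulation. This is only an illustration under stated assumptions: the function name `run_external_exchange` and the message labels `XREQ`, `XON`, `XOFF` and `DATA` are hypothetical, and the order in which tiles are served is simply the order of the input mapping.

```python
def run_external_exchange(nominated_tile, pending):
    """Simulate one external exchange and return the ordered message log.

    nominated_tile: id of the tile chosen (by the compiler, per the abstract)
                    to send the exchange request on behalf of the others.
    pending:        dict mapping tile id -> list of packets to send off-chip;
                    insertion order is assumed to be the service order.
    """
    log = []
    if not pending:
        return log

    # A single request is sent on behalf of every tile with data to send.
    log.append(("XREQ", nominated_tile))

    for tile, packets in pending.items():
        # The exchange block tells this tile to start sending.
        log.append(("XON", tile))
        for packet in packets:
            log.append(("DATA", tile, packet))
        # After the tile's last packet, the exchange block stops it
        # before moving on to the next tile.
        log.append(("XOFF", tile))
    return log


if __name__ == "__main__":
    for event in run_external_exchange(0, {1: ["a", "b"], 2: ["c"]}):
        print(event)
```

Only one tile is ever between its `XON` and `XOFF` at a time, which is the property the exchange block's flow control is described as providing.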

22.

Invention application
COMPILER METHOD [Under examination, published]

Publication (announcement) number: US20190121388A1

Publication (announcement) date: 2019-04-25

Application number: US15886053

Application date: 2018-02-01

Applicant: Graphcore Limited

Inventor: Simon Christian Knowles, Daniel John Pelham Wilkinson, Richard Luke Southwell Osborne, Alan Graham Alexander, Stephen Felix, Jonathan Mangnall, David Lacey

IPC: G06F1/12, G06F9/52, G06F9/30

Abstract: The invention relates to a computer implemented method of generating multiple programs to deliver a computerised function, each program to be executed in a processing unit of a computer comprising a plurality of processing units each having instruction storage for holding a local program, an execution unit for executing the local program and data storage for holding data, a switching fabric connected to an output interface of each processing unit and connectable to an input interface of each processing unit by switching circuitry controllable by each processing unit, and a synchronisation module operable to generate a synchronisation signal, the method comprising: generating a local program for each processing unit comprising a sequence of executable instructions; determining for each processing unit a relative time of execution of instructions of each local program whereby a local program allocated to one processing unit is scheduled to execute with a predetermined delay relative to a synchronisation signal a send instruction to transmit at least one data packet at a predetermined transmit time, relative to the synchronisation signal, destined for a recipient processing unit but having no destination identifier, and a local program allocated to the recipient processing unit is scheduled to execute at a predetermined switch time a switch control instruction to control the switching circuitry to connect its processing unit wire to the switching fabric to receive the data packet at a receive time.
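The timing relationship in this abstract (a send at a predetermined transmit time, matched by a switch control instruction on the recipient at a predetermined switch time, with no destination identifier in the packet) can be sketched as follows. This is a rough illustration, not the patented method: the function `build_schedule`, the `latency` table and the `switch_setup` parameter are all hypothetical, and times are cycle counts relative to the synchronisation signal.

```python
def build_schedule(transfers, latency, switch_setup=1):
    """Return a per-tile list of (time, instruction) pairs.

    transfers:    iterable of (sender, recipient, transmit_time) tuples.
    latency:      dict mapping (sender, recipient) -> cycles the packet
                  takes to cross the switching fabric (assumed to be known
                  at compile time).
    switch_setup: cycles before the receive time at which the recipient
                  must execute its switch control instruction (assumption).
    """
    programs = {}
    listening = set()  # (recipient, receive_time) slots already in use

    for sender, recipient, transmit_time in transfers:
        receive_time = transmit_time + latency[(sender, recipient)]
        switch_time = receive_time - switch_setup
        if switch_time < 0:
            raise ValueError("switch would have to happen before the sync signal")
        if (recipient, receive_time) in listening:
            raise ValueError("recipient cannot listen to two senders at once")
        listening.add((recipient, receive_time))

        # The packet carries no destination: delivery relies purely on the
        # recipient connecting its input to the right sender at the right time.
        programs.setdefault(sender, []).append((transmit_time, ("SEND",)))
        programs.setdefault(recipient, []).append((switch_time, ("SWITCH", sender)))

    for instructions in programs.values():
        instructions.sort(key=lambda entry: entry[0])
    return programs


if __name__ == "__main__":
    print(build_schedule([(0, 1, 5)], {(0, 1): 3}))
```

Because every time is fixed before execution, the schedule can be checked for clashes ahead of time, which is what the two `ValueError` cases stand in for here.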

23.

Granted invention patent
Instruction set [In force]

Publication (announcement) number: US12141092B2

Publication (announcement) date: 2024-11-12

Application number: US17658124

Application date: 2022-04-06

Applicant: Graphcore Limited

Inventor: Simon Christian Knowles, Daniel John Pelham Wilkinson, Richard Luke Southwell Osborne, Alan Graham Alexander, Stephen Felix, Jonathan Mangnall, David Lacey

IPC: G06F15/173, G06F8/41, G06F9/38, G06F15/80

Abstract: The invention relates to a computer program comprising a sequence of instructions for execution on a processing unit having instruction storage for holding the computer program, an execution unit for executing the computer program and data storage for holding data, the computer program comprising one or more computer executable instructions which, when executed, implement: a send function which causes a data packet destined for a recipient processing unit to be transmitted on a set of connection wires connected to the processing unit, the data packet having no destination identifier but being transmitted at a predetermined transmit time; and a switch control function which causes the processing unit to control switching circuitry to connect a set of connection wires of the processing unit to a switching fabric to receive a data packet at a predetermined receive time.

24.

Granted invention patent
Sharing data structures [In force]

Publication (announcement) number: US11675572B2

Publication (announcement) date: 2023-06-13

Application number: US17375406

Application date: 2021-07-14

Applicant: Graphcore Limited

Inventor: Richard Luke Southwell Osborne

IPC: G06F9/44, G06F8/41, G06F16/901

CPC classification number: G06F8/41, G06F16/9024

Abstract: In a computer comprising multiple processing units, a method of exchanging read only elements between the processing units is described. The read only elements may be code or data, such as vector or matrix data for an AI graph. A master processing unit is identified. At compile time, at least one shareable read only element is allocated to the master processing unit. The at least one shareable read only element is stored in the local memory of the master processing unit. At compile time a transmitting exchange code sequence designated to be executed at the execution stage of the master processing unit is also allocated to the master processing unit. At a time point determined at compile time, the transmitting exchange code sequence causes the processing unit to identify the shareable read only element and to generate a message to be transmitted for reception by another processing unit, the message comprising the shareable read only data element.

25.

Granted invention patent
Compiler method [In force]

Publication (announcement) number: US11262787B2

Publication (announcement) date: 2022-03-01

Application number: US16744249

Application date: 2020-01-16

Applicant: Graphcore Limited

Inventor: Simon Christian Knowles, Daniel John Pelham Wilkinson, Richard Luke Southwell Osborne, Alan Graham Alexander, Stephen Felix, Jonathan Mangnall, David Lacey

IPC: G06F1/12, G06F9/52, G06F9/30, G06F9/54, G06F9/38, G06F15/173, G06N20/00

Abstract: The invention relates to a computer implemented method of generating multiple programs to deliver a computerised function, each program to be executed in a processing unit of a computer comprising a plurality of processing units each having instruction storage for holding a local program, an execution unit for executing the local program and data storage for holding data, a switching fabric connected to an output interface of each processing unit and connectable to an input interface of each processing unit by switching circuitry controllable by each processing unit, and a synchronisation module operable to generate a synchronisation signal, the method comprising: generating a local program for each processing unit comprising a sequence of executable instructions; determining for each processing unit a relative time of execution of instructions of each local program whereby a local program allocated to one processing unit is scheduled to execute with a predetermined delay relative to a synchronisation signal a send instruction to transmit at least one data packet at a predetermined transmit time, relative to the synchronisation signal, destined for a recipient processing unit but having no destination identifier, and a local program allocated to the recipient processing unit is scheduled to execute at a predetermined switch time a switch control instruction to control the switching circuitry to connect its processing unit wire to the switching fabric to receive the data packet at a receive time.

26.

Granted invention patent
Streaming engine [In force]

Publication (announcement) number: US11237882B2

Publication (announcement) date: 2022-02-01

Application number: US16235515

Application date: 2018-12-28

Applicant: Graphcore Limited

Inventor: Ola Tørudbakken, Daniel John Pelham Wilkinson, Richard Luke Southwell Osborne, Brian Manula, Harald Høeg

IPC: G06F9/46, G06F9/52, G06N20/00, G06F3/06, G06N5/02

Abstract: A gateway for interfacing a host with a subsystem for acting as a work accelerator to the host. The gateway enables the transfer of batches of data to the subsystem at precompiled data exchange synchronisation points. The gateway comprises a streaming engine having a data mover engine and a memory management engine, the data mover engine and memory management engine being configured to execute instructions in coordination from work descriptors. The memory management engine is configured to execute instructions from the work descriptor to transfer data between external storage and the local memory associated with the gateway. The data mover engine is configured to execute instructions from the work descriptor to transfer data between the local memory associated with the gateway and the subsystem.

27.

Granted invention patent
Host proxy on gateway [In force]

Publication (announcement) number: US10970131B2

Publication (announcement) date: 2021-04-06

Application number: US16235265

Application date: 2018-12-28

Applicant: Graphcore Limited

Inventor: Ola Tørudbakken, Daniel John Pelham Wilkinson, Richard Luke Southwell Osborne, Stephen Felix, Matthew David Fyles, Brian Manula, Harald Høeg

IPC: G06F9/52, G06F16/901, G06F9/30, G06F9/38, G06F9/54, G06F15/167, G06F15/173, H04L12/801, H04L29/06, H04L29/08, G06F9/48

Abstract: A gateway for interfacing a host with a subsystem for acting as a work accelerator to the host, the gateway enabling the transfer of batches of data to and from the subsystem at pre-compiled data exchange synchronisation points attained by the subsystem. The gateway is configured to: receive from a storage system data determined by the host to be processed by the subsystem; store a number of credits indicating the availability of data for transfer to the subsystem at each pre-compiled data exchange synchronisation point; receive a synchronisation request from the subsystem when it attains a data exchange synchronisation point; and in response to determining that the number of credits comprises a non-zero number of credits: transmit a synchronisation acknowledgment to the subsystem; and cause the received data to be transferred to the subsystem.
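The credit check in this abstract can be sketched as a small class. This is a minimal illustration under assumptions, not the gateway's actual interface: the class name `CreditGateway`, its method names and the `SYNC_ACK` label are hypothetical, and consuming one credit per acknowledged synchronisation point is an assumption that the abstract does not state.

```python
from collections import deque


class CreditGateway:
    """Toy model of a gateway that acknowledges sync requests only when
    it holds credits, i.e. when data is available for the subsystem."""

    def __init__(self):
        self.credits = 0
        self.buffered = deque()

    def receive_from_storage(self, batch):
        # Data arriving from storage makes one more sync point serviceable.
        self.buffered.append(batch)
        self.credits += 1

    def on_sync_request(self):
        """Handle a sync request raised by the subsystem at a sync point.

        Returns ("SYNC_ACK", batch) when credits are available, else None,
        in which case the subsystem is assumed to wait at the sync point.
        """
        if self.credits == 0:
            return None
        self.credits -= 1  # assumption: one credit is used per sync point
        return ("SYNC_ACK", self.buffered.popleft())


if __name__ == "__main__":
    gateway = CreditGateway()
    print(gateway.on_sync_request())   # no data yet, so no acknowledgment
    gateway.receive_from_storage("batch-0")
    print(gateway.on_sync_request())   # acknowledged, batch handed over
```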

28.

Granted invention patent
Synchronization in a multi-tile processing array [In force]

Publication (announcement) number: US10936008B2

Publication (announcement) date: 2021-03-02

Application number: US15886009

Application date: 2018-02-01

Applicant: Graphcore Limited

Inventor: Simon Christian Knowles, Daniel John Pelham Wilkinson, Richard Luke Southwell Osborne, Alan Graham Alexander, Stephen Felix, Jonathan Mangnall, David Lacey

IPC: G06F1/12, G06F3/06, G06F13/40

Abstract: The invention relates to a computer comprising: a plurality of processing units each having instruction storage holding a local program, an execution unit executing the local program, data storage for holding data; an input interface with a set of input wires, and an output interface with a set of output wires; a switching fabric connected to each of the processing units by the respective set of output wires and connectable to each of the processing units by the respective input wires via switching circuitry controllable by each processing unit; a synchronisation module operable to generate a synchronisation signal to control the computer to switch between a compute phase and an exchange phase, wherein the processing units are configured to execute their local programs according to a common clock, the local programs being such that in the exchange phase at least one processing unit executes a send instruction from its local program to transmit at a transmit time a data packet onto its output set of connection wires, the data packet being destined for at least one recipient processing unit but having no destination identifier, and at a predetermined switch time the recipient processing unit executes a switch control instruction from its local program to control its switching circuitry to connect its input set of wires to the switching fabric to receive the data packet at a receive time, the transmit time, switch time and receive time being governed by the common clock with respect to the synchronisation signal.

29.

Granted invention patent
Synchronization in a multi-tile, multi-chip processing arrangement [In force]

Publication (announcement) number: US10579585B2

Publication (announcement) date: 2020-03-03

Application number: US15886138

Application date: 2018-02-01

Applicant: Graphcore Limited

Inventor: Daniel John Pelham Wilkinson, Stephen Felix, Richard Luke Southwell Osborne, Simon Christian Knowles, Alan Graham Alexander, Ian James Quinn

IPC: G06F15/80, G06F9/52, G06F15/173

Abstract: A method of operating a system comprising multiple processor tiles divided into a plurality of domains wherein within each domain the tiles are connected to one another via a respective instance of a time-deterministic interconnect and between domains the tiles are connected to one another via a non-time-deterministic interconnect. The method comprises: performing a compute stage, then performing a respective internal barrier synchronization within each domain, then performing an internal exchange phase within each domain, then performing an external barrier synchronization to synchronize between different domains, then performing an external exchange phase between the domains.
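The ordering of stages in this abstract can be written out as a short sketch. It only models the sequence, not the interconnects themselves; the function name `superstep_plan` and the stage labels are hypothetical.

```python
def superstep_plan(domains):
    """Return the ordered stages of one step for the given domains.

    domains: list of domain identifiers (for example, one per chip).
    Stages within a domain use the time-deterministic interconnect;
    the final two stages span all domains.
    """
    plan = []
    # Compute first, then synchronise and exchange inside each domain.
    plan += [("compute", d) for d in domains]
    plan += [("internal_barrier", d) for d in domains]
    plan += [("internal_exchange", d) for d in domains]
    # Only after that do the domains synchronise and exchange with each other.
    plan.append(("external_barrier", tuple(domains)))
    plan.append(("external_exchange", tuple(domains)))
    return plan


if __name__ == "__main__":
    for stage in superstep_plan(["chip0", "chip1"]):
        print(stage)
```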

30.

Granted invention patent
Sending data off-chip [In force]

Publication (announcement) number: US10558595B2

Publication (announcement) date: 2020-02-11

Application number: US16165607

Application date: 2018-10-19

Applicant: Graphcore Limited

Inventor: Daniel John Pelham Wilkinson, Richard Luke Southwell Osborne, Stephen Felix, Graham Bernard Cunningham, Alan Graham Alexander

IPC: G06F13/20, G06F13/42, G06F15/16, G06F15/163

Abstract: A processor comprising multiple tiles on the same chip, and an external interconnect for communicating data off-chip in the form of packets. The external interconnect comprises an external exchange block configured to provide flow control and queuing of the packets. One of the tiles is nominated by the compiler to send an external exchange request message to the exchange block on behalf of others with data to send externally. The exchange sends an exchange-on message to a first of these tiles, to cause the first tile to start sending packets via the external interconnect. Then, once this tile has sent its last data packet, the exchange block sends an exchange-off control packet to this tile to cause it to stop sending packets, and sends another exchange-on message to the next tile with data to send, and so forth.
