-
Publication number: US20240296010A1
Publication date: 2024-09-05
Application number: US18591349
Application date: 2024-02-29
Applicant: Graphcore Limited
Inventor: Thomas BROWN
CPC classification number: G06F7/49915 , G06F7/523 , G06F7/556
Abstract: A processing unit is provided with circuitry enabling quick evaluation of an exponential function. A multiplier circuit is used to multiply the input operand by log2(e), such that a result for the exponential function may be determined by evaluating 2^(i+f), where i is an integer part of a fixed-point number and f is a fractional part of the fixed-point number. A lookup table is used for providing an estimate for 2^f based on the l MSBs of f. The lookup entries are provided according to a function such that the estimates for 2^f are provided without bias towards either zero or infinity in the result. In other words, the maximum multiplicative error for each entry of the lookup table is the same in both negative and positive directions. In this way, statistical errors in the evaluation of a large number of exponential functions may be avoided.
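The scheme in this abstract can be illustrated with a small software sketch. It is not the patented circuit: the bit widths `L_BITS` and `FRAC_BITS` are hypothetical, and the choice of table entry (the geometric mean of the interval endpoints, which makes the over- and under-estimate factors equal) is an assumption about what "no bias" means here.

```python
import math

L_BITS = 6      # hypothetical: number of MSBs of f used to index the table
FRAC_BITS = 16  # hypothetical: fractional bits of the fixed-point number

# Entry k covers f in [k/2^l, (k+1)/2^l). Assumption: use the midpoint in the
# exponent, i.e. the geometric mean of 2^f at the interval endpoints, so the
# worst-case ratio error is the same factor upward and downward.
TABLE = [2.0 ** ((k + 0.5) / (1 << L_BITS)) for k in range(1 << L_BITS)]


def fast_exp(x: float) -> float:
    """Approximate e^x as 2^(i+f), with 2^f taken from the lookup table."""
    # Multiply by log2(e) and convert to a fixed-point integer.
    y = math.floor(x * math.log2(math.e) * (1 << FRAC_BITS))
    i = y >> FRAC_BITS               # integer part (floors for negative y)
    f = y & ((1 << FRAC_BITS) - 1)   # fractional part, always non-negative
    k = f >> (FRAC_BITS - L_BITS)    # the l most significant bits of f
    return math.ldexp(TABLE[k], i)   # TABLE[k] * 2^i
```

With these widths the result stays within roughly 0.55% of `math.exp(x)`, and the error is spread evenly above and below the true value rather than always rounding one way.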
-
Publication number: US12056539B2
Publication date: 2024-08-06
Application number: US16928886
Application date: 2020-07-14
Applicant: Graphcore Limited
Inventor: Ola Torudbakken , Lorenzo Cevolani
CPC classification number: G06F9/52 , G06F9/522 , G06F9/542 , G06F15/17318 , G06F15/17325 , G06N3/02 , G06N3/04 , G06N3/08
Abstract: A data processing system comprising a plurality of processing nodes that are arranged to update a model in a parallel manner. Each of the processing nodes starts with a different set of updates to model parameters. Each of the processing nodes is configured to perform one or more reduce-scatter collectives so as to exchange and reduce the updates. Having done so, each processing node is configured to apply the reduced set of updates to obtain an updated set of model parameters. The processing nodes then exchange the updated model parameters using an all-gather so that each processing node ends up with the same model parameters at the end of the process.
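The sequence described in this abstract (reduce-scatter, local update, all-gather) can be simulated in a single process. This is a sketch only: the function names are hypothetical, the nodes are plain Python lists, the vector length is assumed to divide evenly by the node count, and "apply the update" is assumed to be simple addition.

```python
def reduce_scatter(updates):
    """Node r ends up with the element-wise sum of chunk r from every node."""
    n = len(updates)
    chunk = len(updates[0]) // n
    return [
        [sum(u[r * chunk + j] for u in updates) for j in range(chunk)]
        for r in range(n)
    ]


def all_gather(shards):
    """Every node ends up with the concatenation of all shards."""
    full = [v for shard in shards for v in shard]
    return [list(full) for _ in shards]


def parallel_update(params, updates):
    """params: shared starting parameters; updates: one update vector per node."""
    n = len(updates)
    chunk = len(params) // n
    reduced = reduce_scatter(updates)
    # Each node applies the reduced updates to its own shard only
    # (assumption: the update rule is plain addition).
    new_shards = [
        [params[r * chunk + j] + reduced[r][j] for j in range(chunk)]
        for r in range(n)
    ]
    return all_gather(new_shards)
```

Each simulated node updates only 1/n of the parameters before the all-gather, which is the point of splitting an all-reduce into its two halves.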
-
Publication number: US12047486B2
Publication date: 2024-07-23
Application number: US17359066
Application date: 2021-06-25
Applicant: Graphcore Limited
Inventor: Graham Cunningham
IPC: H04L9/06
CPC classification number: H04L9/0618 , H04L9/0643
Abstract: The device implements a processing pipeline having distinct circuitry for performing encryption/decryption operations and authentication operations and having state stores associated with the respective operations. The state stores store state associated with a given encryption frame, enabling the respective operations to be performed when blocks of data reach that stage in the pipeline. Due to the complexity of operations in a block cipher encryption scheme, the pipeline is deep, which provides the possibility of processing multiple data packets at any one time. The provision of the state stores at the stages in the pipeline at which they are required prevents stalling when a new data packet is received.
-
Publication number: US20240095103A1
Publication date: 2024-03-21
Application number: US18458327
Application date: 2023-08-30
Applicant: Graphcore Limited
Inventor: Lars Paul HUSE , Uberto GIROLA , Bjorn Dag JOHNSEN
CPC classification number: G06F9/544 , G06F9/3836
Abstract: A read and notify request is issued by a first processing unit to a lock manager on a different chip. The lock manager determines whether a condition specified by the request in relation to a variable for controlling access to a memory buffer is met. If the condition is not met, a notification request is registered until the variable changes. A second processing unit accesses the memory buffer and, when it has finished, updates the variable. If the variable then satisfies the condition specified by the read and notify request, the first processing unit is then notified by the lock manager and accesses the memory buffer. In this way, the first processing unit does not need to continually poll to determine when the variable has changed, but is notified when it is its turn to access the memory buffer.
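The read-and-notify idea can be sketched as a tiny single-process model. Everything here is hypothetical: the class name `LockManager`, its methods, the use of an equality test as the condition, and callbacks standing in for inter-chip notifications.

```python
class LockManager:
    """Toy model: holds one variable that guards a shared memory buffer."""

    def __init__(self, value):
        self.value = value
        self.pending = []  # registered (expected_value, callback) pairs

    def read_and_notify(self, expected, callback):
        # Condition already met: notify at once. Otherwise register the
        # request so the caller does not have to poll.
        if self.value == expected:
            callback(self.value)
        else:
            self.pending.append((expected, callback))

    def update(self, value):
        # Called by the unit that has finished with the buffer.
        self.value = value
        waiting, self.pending = self.pending, []
        for expected, callback in waiting:
            if value == expected:
                callback(value)          # condition now met: notify
            else:
                self.pending.append((expected, callback))


log = []
mgr = LockManager(value=2)   # assumption: value 2 means "unit 2 owns the buffer"
mgr.read_and_notify(1, lambda v: log.append("unit 1 may access the buffer"))
mgr.update(1)                # unit 2 finishes and hands the buffer over
```

The first unit issues one request and is called back once, instead of repeatedly reading the variable across the chip boundary.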
-
Publication number: US11907725B2
Publication date: 2024-02-20
Application number: US18164202
Application date: 2023-02-03
Applicant: Graphcore Limited
Inventor: Richard Osborne , Matthew Fyles
CPC classification number: G06F9/3885 , G06F9/3001 , G06F9/4881 , G06F9/522 , G06F15/80 , G06N3/084 , G06N20/00 , H04L45/00
Abstract: A computer comprising a plurality of processors, each of which is configured to perform operations on data during a compute phase for the computer and, following a pre-compiled synchronisation barrier, exchange data with at least one other of the processors during an exchange phase for the computer, wherein each of the processors in the computer is indexed and the data exchange operations carried out by each processor in the exchange phase depend upon its index value.
-
Publication number: US11907408B2
Publication date: 2024-02-20
Application number: US17215746
Application date: 2021-03-29
Applicant: Graphcore Limited
Inventor: Graham Cunningham , Daniel Wilkinson
CPC classification number: G06F21/72 , G06F3/0623 , G06F3/0659 , G06F3/0683 , G06F21/78 , H04L9/0618 , H04L9/0631 , H04L9/0894 , H04L9/14 , H04L9/3242
Abstract: A device comprising a processing unit having a plurality of processors is provided. At least one encryption unit is provided as part of the device for encrypting data written by the processors to external storage and decrypting data read from that storage. The processors are divided into different sets, with state information held in the encryption unit for performing encryption/decryption operations for requests for different sets of processors. This enables interleaved read completions or write requests from different sets of processors to be handled by the encryption unit, since associated state information for each set of processors is independently maintained.
-
Publication number: US11886934B2
Publication date: 2024-01-30
Application number: US16928782
Application date: 2020-07-14
Applicant: Graphcore Limited
Inventor: Lorenzo Cevolani , Fabian Tschopp , Ola Torudbakken
CPC classification number: G06F9/52 , G06F9/522 , G06F9/542 , G06F15/17318 , G06F15/17325 , G06N3/02 , G06N3/04 , G06N3/08
Abstract: A data processing system comprising a plurality of processing nodes, each comprising at least one memory configured to store an array of data items, wherein each of the plurality of processing nodes is configured to execute compute instructions during a compute phase and, following a precompiled synchronisation barrier, enter at least one exchange phase. During the at least one exchange phase, a series of collective operations is carried out. Each processing node is configured to perform a reduce-scatter collective in at least one first dimension. Using the results of the reduce-scatter collective, each processing node performs an allreduce in a second dimension. The processing nodes then perform an all-gather collective in the at least one first dimension using the results of the allreduce.
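The three-step collective in this abstract can be simulated for a two-dimensional grid of nodes. This is an illustrative sketch with assumed details: the function name `allreduce_2d` is hypothetical, the "first dimension" is taken to be the nodes within one row, the "second dimension" is taken to be the nodes within one column, the reduction is a sum, and each array length divides evenly by the row width.

```python
def allreduce_2d(grid):
    """grid[r][c] is the data array held by the node at row r, column c."""
    rows, cols = len(grid), len(grid[0])
    chunk = len(grid[0][0]) // cols

    # 1. Reduce-scatter within each row: node (r, c) holds the row-sum of chunk c.
    partial = [
        [
            [sum(grid[r][k][c * chunk + j] for k in range(cols)) for j in range(chunk)]
            for c in range(cols)
        ]
        for r in range(rows)
    ]

    # 2. Allreduce down each column: every node in column c holds the
    #    grid-wide sum of chunk c.
    col_total = [
        [sum(partial[r][c][j] for r in range(rows)) for j in range(chunk)]
        for c in range(cols)
    ]

    # 3. All-gather within each row: every node reassembles the full array.
    full = [v for c in range(cols) for v in col_total[c]]
    return [[list(full) for _ in range(cols)] for _ in range(rows)]
```

Because step 1 shrinks the data held per node by a factor of `cols`, the column-wise allreduce in step 2 moves only a fraction of the original array.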
-
Publication number: US11847455B2
Publication date: 2023-12-19
Application number: US17345186
Application date: 2021-06-11
Applicant: Graphcore Limited
Inventor: Jonathan Louis Ferguson
CPC classification number: G06F9/30141 , G06F9/3016 , G06F15/7807
Abstract: A processing unit having a register file includes: a plurality of registers each having a write enable input configured to receive a write enable signal and a write data input connected to a write data path of the processing unit and configured to write data values from the write data path for storage in a register when the write enable signal is asserted; write circuitry configured in a normal mode of operation to assert the write enable signal of a respective one of the registers to cause operational data values to be written to that register from the write data path; and data cleansing circuitry configured to control a data cleansing mode in which write enable signals of all registers in the register file are simultaneously asserted to cause cleansing data values to be simultaneously written to all registers in the register file from the write data path.
-
Publication number: US11775415B2
Publication date: 2023-10-03
Application number: US16527454
Application date: 2019-07-31
Applicant: Graphcore Limited
Inventor: Alan Graham Alexander , Graham Bernard Cunningham
CPC classification number: G06F11/366 , G06F9/30101 , G06F9/30123 , G06F9/3802 , G06F9/3814 , G06F9/3851 , G06F9/3867 , G06F9/48
Abstract: A processor comprising at least one processing module, each processing module comprising: an execution pipeline; memory; an instruction fetch unit operable to switch between an operational mode and a debug mode, the instruction fetch unit being configured so as, when in the operational mode, to fetch machine code instructions from the memory into the execution pipeline to be executed; and a debug interface for connecting to a debug adapter. The debug interface comprises a debug instruction register enabling the debug adapter to write a machine code instruction to the debug instruction register, and wherein the instruction fetch unit is configured so as, when in the debug mode, to fetch instructions from the debug instruction register into the pipeline instead of from the memory.
-
Publication number: US20230281144A1
Publication date: 2023-09-07
Application number: US17658944
Application date: 2022-04-12
Applicant: Graphcore Limited
Inventor: Daniel WILKINSON , Stephen FELIX , Simon KNOWLES , Graham CUNNINGHAM , David LACEY
CPC classification number: G06F13/4022 , G06F9/30079 , G06F9/522 , G06F13/4027
Abstract: A processing device has a plurality of interfaces and a plurality of processors. During different phases of execution of a computer program, different processors are associated with different interfaces, such that the connectivity between processors and interfaces for the sending of egress data and the receiving of ingress data may change during execution of that computer program. The change in this connectivity is directed by the compiled code running on the processors. The compiled code selects the buses (each associated with an interface) to which given processors are to connect for receipt of ingress data. Furthermore, the compiled code causes control messages to be sent to circuitry associated with the interfaces, so as to control the buses (each associated with a processor) to which given interfaces are to connect.