-
Publication No.: US11169777B2
Publication Date: 2021-11-09
Application No.: US16395328
Filing Date: 2019-04-26
Applicant: Graphcore Limited
Inventor: Alan Graham Alexander, Edward Andrews, Stephen Felix, Mrudula Chidambar Gore
Abstract: A method and apparatus for handling overflow conditions resulting from arithmetic operations involving floating point numbers. An indication is stored as part of a thread's context indicating one of two possible modes for handling overflow conditions. In a first mode, a result of an arithmetic operation is set to the limit representable in the floating point format. In a second mode, a result of an arithmetic operation is set to a NaN.
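The two overflow modes in this abstract can be illustrated with a minimal sketch. It assumes IEEE 754 half precision as the target format, treats any magnitude above the largest finite value as an overflow (ignoring rounding), and models the thread context as a plain dictionary. The names `SATURATE`, `NAN_ON_OVERFLOW`, `handle_overflow` and `thread_context` are hypothetical and do not come from the patent.

```python
import math

# Largest finite magnitude in IEEE 754 half precision (binary16).
# Choosing binary16 is an assumption; the abstract names no format.
FP16_MAX = 65504.0

# Hypothetical mode names; the abstract only speaks of a first and second mode.
SATURATE = "saturate"
NAN_ON_OVERFLOW = "nan"


def handle_overflow(result, mode, limit=FP16_MAX):
    """Apply an overflow mode to a result computed at higher precision."""
    if math.isnan(result) or abs(result) <= limit:
        return result  # representable, or already NaN: nothing to do
    if mode == SATURATE:
        # First mode: clamp to the signed limit of the format.
        return math.copysign(limit, result)
    if mode == NAN_ON_OVERFLOW:
        # Second mode: signal the overflow with a NaN.
        return math.nan
    raise ValueError(f"unknown overflow mode: {mode!r}")


# The mode indication lives in per-thread state (modelled here as a dict).
thread_context = {"overflow_mode": SATURATE}
clamped = handle_overflow(60000.0 * 2, thread_context["overflow_mode"])
```

Keeping the mode in the context rather than in each instruction means two threads can run the same code with different overflow behaviour.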
-
Publication No.: US20200210192A1
Publication Date: 2020-07-02
Application No.: US16276895
Filing Date: 2019-02-15
Applicant: Graphcore Limited
Inventor: Alan Graham Alexander, Simon Christian Knowles, Mrudula Chidambar Gore, Jonathan Louis Ferguson
IPC: G06F9/38, G06F9/30, G06F12/0875
Abstract: A processor comprising: a barrel-threaded execution unit for executing concurrent threads, and a repeat cache shared between the concurrent threads. The processor's instruction set includes a repeat instruction which takes a repeat count operand. When the repeat cache is not claimed and the repeat instruction is executed in a first thread, a portion of code is cached from the first thread into the repeat cache, the state of the repeat cache is changed to record it as claimed, and the cached code is executed a number of times. When the repeat instruction is then executed in a further thread, then the already-cached portion of code is again executed a respective number of times, each time from the repeat cache. For each of the first and further instructions, the repeat count operand in the respective instruction specifies the number of times to execute the cached code.
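As a rough software model of the claim-then-reuse behaviour described in this abstract, the sketch below treats the cached code as a list of Python callables and each thread's state as a dictionary. The class `RepeatCache`, its `repeat` method and the helper operations are hypothetical names chosen for illustration; the real mechanism is a hardware cache, not a Python object.

```python
class RepeatCache:
    """Toy model of a repeat cache shared between threads."""

    def __init__(self):
        self.claimed = False  # recorded state of the cache
        self.code = None      # the cached portion of code

    def repeat(self, thread_code, count, state):
        """Model one thread executing a repeat instruction."""
        if not self.claimed:
            # First thread: cache its code and record the cache as claimed.
            self.code = list(thread_code)
            self.claimed = True
        # Any thread: run the already-cached code `count` times.
        for _ in range(count):
            for op in self.code:
                op(state)
        return state


def add_one(state):
    state["acc"] += 1


def add_two(state):
    state["acc"] += 2


cache = RepeatCache()
first = cache.repeat([add_one], 3, {"acc": 0})     # caches add_one, runs it 3 times
further = cache.repeat([add_two], 5, {"acc": 10})  # reuses add_one, runs it 5 times
```

In this model the further thread's own code is ignored once the cache is claimed, which mirrors the abstract's statement that the already-cached portion is executed again, with only the repeat count differing per instruction.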
-
Publication No.: US11567768B2
Publication Date: 2023-01-31
Application No.: US16276895
Filing Date: 2019-02-15
Applicant: Graphcore Limited
Inventor: Alan Graham Alexander, Simon Christian Knowles, Mrudula Chidambar Gore, Jonathan Louis Ferguson
IPC: G06F9/30, G06F9/38, G06F12/0875
Abstract: A processor is disclosed including: a barrel-threaded execution unit for executing concurrent threads, and a repeat cache shared between the concurrent threads. The processor's instruction set includes a repeat instruction which takes a repeat count operand. When the repeat cache is not claimed and the repeat instruction is executed in a first thread, a portion of code is cached from the first thread into the repeat cache, the state of the repeat cache is changed to record it as claimed, and the cached code is executed a number of times. When the repeat instruction is then executed in a further thread, then the already-cached portion of code is again executed a respective number of times, each time from the repeat cache. For each of the first and further instructions, the repeat count operand in the respective instruction specifies the number of times to execute the cached code.
-
Publication No.: US20200233670A1
Publication Date: 2020-07-23
Application No.: US16389682
Filing Date: 2019-04-19
Applicant: Graphcore Limited
Abstract: A processor comprising an execution unit, memory and one or more register files. The execution unit is configured to execute instances of machine code instructions from an instruction set. The types of instruction defined in the instruction set include a double-load instruction for loading from the memory to at least one of the one or more register files. The execution unit is configured so as, when the load instruction is executed, to perform a first load operation strided by a fixed stride, and a second load operation strided by a variable stride, the variable stride being specified in a variable stride register in one of the one or more register files.
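A double-load of this kind might be modelled as follows. The sketch assumes that memory is a flat Python list, that both addresses are advanced after the loads, and that the fixed stride is one element; none of these details is stated in the abstract. `FIXED_STRIDE` and `double_load` are hypothetical names.

```python
FIXED_STRIDE = 1  # assumed value; the abstract only says the stride is fixed


def double_load(memory, addrs, stride_reg):
    """Perform two loads, then advance each address by its own stride.

    addrs      -- (first_addr, second_addr) element indices into memory
    stride_reg -- current value of the variable stride register
    """
    first_addr, second_addr = addrs
    loaded = (memory[first_addr], memory[second_addr])
    next_addrs = (first_addr + FIXED_STRIDE, second_addr + stride_reg)
    return loaded, next_addrs


memory = list(range(100, 120))
values_a, addrs = double_load(memory, (0, 8), stride_reg=3)
values_b, addrs = double_load(memory, addrs, stride_reg=3)
```

Because the second stride is read from a register, it can change between executions without re-encoding the instruction.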
-
Publication No.: US20200210187A1
Publication Date: 2020-07-02
Application No.: US16276872
Filing Date: 2019-02-15
Applicant: Graphcore Limited
Abstract: A processor having an instruction set including a load-store instruction having operands specifying, from amongst the registers in at least one register file, a respective destination of each of two load operations, a respective source of a store operation, and a pair of address registers arranged to hold three memory addresses, the three memory addresses being a respective load address for each of the two load operations and a respective store address for the store operation. The load-store instruction further includes three immediate stride operands each specifying a respective stride value for each of the two load addresses and one store address, wherein at least some possible values of each immediate stride operand specify the respective stride value by specifying one of a plurality of fields within a stride register in one of the one or more register files, each field holding a different stride value.
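The stride selection described here can be sketched in a few lines. The encoding below is an assumption made for illustration: operand value 0 means a unit stride, and any other value k selects field k-1 of the stride register. The abstract only says that at least some operand values select a field. The sketch also assumes both loads happen before the store and that all three addresses are advanced afterwards. `resolve_stride` and `load_store` are hypothetical names.

```python
def resolve_stride(operand, stride_fields):
    """Map an immediate stride operand to a stride value (assumed encoding)."""
    if operand == 0:
        return 1  # assumed default: unit stride
    return stride_fields[operand - 1]  # operand k selects field k-1


def load_store(memory, addrs, store_value, stride_operands, stride_fields):
    """Two loads and one store, then advance all three addresses.

    addrs           -- (load0_addr, load1_addr, store_addr)
    stride_operands -- one immediate per address, in the same order
    stride_fields   -- the fields held in the stride register
    """
    load0, load1, store = addrs
    loaded = (memory[load0], memory[load1])
    memory[store] = store_value
    strides = [resolve_stride(op, stride_fields) for op in stride_operands]
    next_addrs = tuple(a + s for a, s in zip(addrs, strides))
    return loaded, next_addrs


memory = list(range(16))
stride_fields = (2, 4)  # two fields, each holding a different stride
loaded, next_addrs = load_store(memory, (0, 4, 8), 99, (0, 1, 2), stride_fields)
```

Selecting a field by a small immediate keeps the instruction encoding short while still allowing several different strides to be in play at once.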
-
Publication No.: US11467833B2
Publication Date: 2022-10-11
Application No.: US16276872
Filing Date: 2019-02-15
Applicant: Graphcore Limited
Abstract: A processor having an instruction set including a load-store instruction having operands specifying, from amongst the registers in at least one register file, a respective destination of each of two load operations, a respective source of a store operation, and a pair of address registers arranged to hold three memory addresses, the three memory addresses being a respective load address for each of the two load operations and a respective store address for the store operation. The load-store instruction further includes three stride operands each specifying a respective stride value for each of the two load addresses and one store address, wherein at least some possible values of each stride operand specify the respective stride value by specifying one of a plurality of fields within a stride register in one of the one or more register files, each field holding a different stride value.
-
Publication No.: US11061679B2
Publication Date: 2021-07-13
Application No.: US16389682
Filing Date: 2019-04-19
Applicant: Graphcore Limited
Abstract: A processor comprising an execution unit, memory and one or more register files. The execution unit is configured to execute instances of machine code instructions from an instruction set. The types of instruction defined in the instruction set include a double-load instruction for loading from the memory to at least one of the one or more register files. The execution unit is configured so as, when the load instruction is executed, to perform a first load operation strided by a fixed stride, and a second load operation strided by a variable stride, the variable stride being specified in a variable stride register in one of the one or more register files.
-
Publication No.: US11023239B2
Publication Date: 2021-06-01
Application No.: US16389682
Filing Date: 2019-04-19
Applicant: Graphcore Limited
Abstract: A processor comprising an execution unit, memory and one or more register files. The execution unit is configured to execute instances of machine code instructions from an instruction set. The types of instruction defined in the instruction set include a double-load instruction for loading from the memory to at least one of the one or more register files. The execution unit is configured so as, when the load instruction is executed, to perform a first load operation strided by a fixed stride, and a second load operation strided by a variable stride, the variable stride being specified in a variable stride register in one of the one or more register files.
-
Publication No.: US20200210175A1
Publication Date: 2020-07-02
Application No.: US16277022
Filing Date: 2019-02-15
Applicant: Graphcore Limited
IPC: G06F9/30
Abstract: A processor comprising a barrel-threaded execution unit for executing concurrent threads, and one or more register files comprising a respective set of context registers for each concurrent thread. One of the register files further comprises a set of shared weights registers common to some or all of the concurrent threads. The types of instruction defined in the instruction set of the processor include an arithmetic instruction having operands specifying a source and a destination from amongst a respective set of arithmetic registers of the thread in which the arithmetic instruction is executed. The execution unit is configured so as, in response to the opcode of the arithmetic instruction, to perform an operation comprising multiplying an input from the source by at least one of the weights from at least one of the shared weights registers, and to place a result in the destination.
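The split between per-thread context registers and shared weights registers can be pictured with a small model. It assumes four arithmetic registers per thread and a two-entry weights register file; `ThreadContext`, `mul_by_weight` and `shared_weights` are hypothetical names, and the operation shown is the simplest case of multiplying one input by one weight.

```python
class ThreadContext:
    """Per-thread arithmetic registers (register count is an assumption)."""

    def __init__(self, num_regs=4):
        self.regs = [0.0] * num_regs


# One set of weights registers, common to all threads in this model.
shared_weights = [0.5, 2.0]


def mul_by_weight(ctx, dst, src, weight_index, weights=shared_weights):
    """Multiply the thread's source register by a shared weight."""
    ctx.regs[dst] = ctx.regs[src] * weights[weight_index]


t0, t1 = ThreadContext(), ThreadContext()
t0.regs[0] = 3.0
t1.regs[0] = 10.0
mul_by_weight(t0, dst=1, src=0, weight_index=1)  # t0.regs[1] = 3.0 * 2.0
mul_by_weight(t1, dst=1, src=0, weight_index=1)  # t1.regs[1] = 10.0 * 2.0
```

Both threads read the same weight while keeping their inputs and results in their own registers, so the weights need to be loaded only once.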