-
Publication No.: US20240248764A1
Publication Date: 2024-07-25
Application No.: US18316602
Application Date: 2023-05-12
Applicant: Arm Limited
Inventor: Rune HOLM, Jens OLSON, Elliot Maurice Simon ROSEMARINE, Jared SMOLENS
IPC: G06F9/50
CPC classification number: G06F9/5038, G06F9/505, G06F2209/5021
Abstract: A memory unit is configured for handling task data, the task data describing a task to be executed as a directed acyclic graph of operations, wherein each operation maps to a corresponding execution unit, and wherein each connection between operations in the acyclic graph maps to a corresponding storage element of the execution unit. The task data defines an operation space representing the dimensions of a multi-dimensional arrangement of the connected operations to be executed, represented by one or more data blocks. The memory unit is configured to receive a sequence of processing requests comprising the one or more data blocks, with each data block assigned a priority value and comprising a block command. The memory unit is configured to arbitrate between the data blocks based upon the priority value and block command to prioritize the sequence of processing requests, wherein the processing requests include writing data to, or reading data from, storage.
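Illustrative sketch (not taken from the patent): arbitration on a priority value and a block command can be pictured as a priority queue. The sketch assumes that a lower number means higher priority, that writes win ties over reads, and that arrival order breaks any remaining tie; the abstract specifies none of these, and the names `COMMAND_RANK` and `arbitrate` are hypothetical.

```python
import heapq
from itertools import count

# Hypothetical tie-break between block commands; the abstract does not define one.
COMMAND_RANK = {"write": 0, "read": 1}


def arbitrate(requests):
    """Return block ids in the order they would be serviced.

    requests: iterable of (priority, command, block_id) tuples.
    Assumption: a lower priority number is serviced first.
    """
    arrival = count()  # preserves arrival order among otherwise equal requests
    heap = []
    for priority, command, block_id in requests:
        key = (priority, COMMAND_RANK[command], next(arrival), block_id)
        heapq.heappush(heap, key)
    order = []
    while heap:
        order.append(heapq.heappop(heap)[3])
    return order


serviced = arbitrate([
    (2, "read", "a"),
    (1, "read", "b"),
    (1, "write", "c"),
    (2, "write", "d"),
])
print(serviced)
```

With these assumed rules, the two priority-1 blocks are serviced before the priority-2 blocks, and within each priority level the write precedes the read.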
-
Publication No.: US20240248755A1
Publication Date: 2024-07-25
Application No.: US18099595
Application Date: 2023-01-20
Applicant: Arm Limited
Inventor: Rune HOLM, Jens OLSON, Jared Corey SMOLENS, Dominic Hugo SYMES, Elliot Maurice Simon ROSEMARINE
CPC classification number: G06F9/4881, G06F9/3555
Abstract: A processor comprising a handling unit and a plurality of components, each configured to execute a function. The handling unit can receive a task comprising operations on data in a coordinate space having N dimensions, and receive a data structure describing execution of the task and comprising a partially ordered set of data items, each associated with instructions usable by the plurality of components when executing the task. Each data item is associated with a component among the plurality of components. Each data item indicates dimensions of the coordinate space for which changes of coordinate cause the function of the associated component to execute, and dimensions of the coordinate space for which changes of coordinate cause the function of the associated component to store data ready to be used by another component. The handling unit iterates over the coordinate space and executes the task using the partially ordered set of data items.
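Illustrative sketch (one possible reading of the abstract, not the patented design): the handling unit walks the coordinate space, and each data item fires when a coordinate changes in one of the dimensions it names. The function `iterate`, the item tuples, and the component names `load` and `mac` are hypothetical. The sketch also assumes that the first coordinate counts as a change in every dimension and that the items are already listed in an order consistent with their partial order.

```python
from itertools import product


def iterate(space, items):
    """Walk an N-dimensional coordinate space and log which items fire.

    space: tuple of extents, one per dimension.
    items: list of (name, execute_dims, store_dims), listed producers first
           (assumed to respect the partial order).
    """
    dims = range(len(space))
    log = []
    previous = None
    for coord in product(*(range(extent) for extent in space)):
        if previous is None:
            changed = set(dims)  # assumption: the first step changes everything
        else:
            changed = {d for d in dims if coord[d] != previous[d]}
        for name, execute_dims, store_dims in items:
            if changed & execute_dims:
                log.append((coord, name, "execute"))
            if changed & store_dims:
                log.append((coord, name, "store"))
        previous = coord
    return log


# "load" runs only when dimension 0 changes; "mac" runs on every step
# and stores its result whenever dimension 0 changes.
log = iterate((2, 2), [("load", {0}, set()), ("mac", {0, 1}, {0})])
for entry in log:
    print(entry)
```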
-
Publication No.: US20240248754A1
Publication Date: 2024-07-25
Application No.: US18099594
Application Date: 2023-01-20
Applicant: Arm Limited
Inventor: Elliot Maurice Simon ROSEMARINE, Jared Corey SMOLENS, Rune HOLM, John Wakefield BROTHERS, III, Jens OLSON
IPC: G06F9/48
CPC classification number: G06F9/4881
Abstract: A processor to generate position data indicative of a position within a compressed data stream, wherein, previously, in executing a task, data of the compressed data stream ending at the position has been read by the processor from storage storing the compressed data stream. After reading the data, the processor reads further data of the compressed data stream from the storage, in executing the task, the further data located beyond the position within the compressed data stream. After reading the further data, the processor reads, based on the position data, a portion of the compressed data stream from the storage, in executing the task, starting from the position within the compressed data stream. The processor decompresses the portion of the compressed data stream to generate decompressed data, in executing the task.
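Illustrative sketch (assumptions labeled): the read, record position, read further, then re-read from the position sequence can be mimicked with a stream of length-prefixed blocks. The sketch assumes each block is compressed independently with `zlib` so that decompression can restart at a recorded offset; the abstract does not state the stream format, and `write_stream` and `read_block` are hypothetical helpers.

```python
import io
import struct
import zlib


def write_stream(chunks):
    """Build a stream of length-prefixed, independently compressed blocks (assumed format)."""
    buf = io.BytesIO()
    for chunk in chunks:
        payload = zlib.compress(chunk)
        buf.write(struct.pack("<I", len(payload)))  # 4-byte little-endian length
        buf.write(payload)
    return buf.getvalue()


def read_block(storage):
    """Read and decompress the block starting at the current storage offset."""
    (length,) = struct.unpack("<I", storage.read(4))
    return zlib.decompress(storage.read(length))


storage = io.BytesIO(write_stream([b"weights-0", b"weights-1", b"weights-2"]))

first = read_block(storage)    # data ending at the position
position = storage.tell()      # position data: how far reading has progressed
second = read_block(storage)   # further data beyond the position
third = read_block(storage)

storage.seek(position)         # later: go back using the recorded position
replayed = read_block(storage) # decompress the portion starting at the position
```

Here `replayed` holds the same bytes as `second`, because the saved offset points at the start of the second block.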
-
Publication No.: US20240193089A1
Publication Date: 2024-06-13
Application No.: US18063478
Application Date: 2022-12-08
Applicant: Arm Limited
Inventor: Jens OLSON, Jared Corey SMOLENS
IPC: G06F12/0875
CPC classification number: G06F12/0875, G06F2212/1024, G06F2212/221, G06F2212/452
Abstract: A processor comprising a first storage managed as a circular buffer to store a plurality of data structures. Each data structure comprises: an identifier, a size indicator and first data associated with instructions for execution of a task. The processor is configured for searching for a data structure in the first storage. A data structure subsequent to the tail data structure can be located using the storage address in the first storage of the tail data structure and the size indicators of all data structures preceding that subsequent data structure among the plurality of data structures. When a data structure is found, the task may be executed based at least in part on the first data of the found data structure.
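Illustrative sketch (a software analogy, not the hardware described): variable-size records in a circular buffer can be searched by starting at the tail address and adding each record's size to reach the next one. The class `RingStore`, the 4-byte header layout in `HEADER`, and the byte-wise wrap-around are all assumptions made for the example.

```python
import struct

# Hypothetical record header: 16-bit identifier, 16-bit total record size in bytes.
HEADER = struct.Struct("<HH")


class RingStore:
    def __init__(self, capacity):
        self.buf = bytearray(capacity)
        self.capacity = capacity
        self.tail = 0  # address of the oldest record
        self.used = 0  # bytes currently occupied

    def _write(self, addr, data):
        for i, byte in enumerate(data):
            self.buf[(addr + i) % self.capacity] = byte  # wrap around the end

    def _read(self, addr, n):
        return bytes(self.buf[(addr + i) % self.capacity] for i in range(n))

    def push(self, ident, payload):
        record = HEADER.pack(ident, HEADER.size + len(payload)) + payload
        if self.used + len(record) > self.capacity:
            raise MemoryError("ring store full")
        self._write((self.tail + self.used) % self.capacity, record)
        self.used += len(record)

    def pop_tail(self):
        _, size = HEADER.unpack(self._read(self.tail, HEADER.size))
        self.tail = (self.tail + size) % self.capacity
        self.used -= size

    def find(self, ident):
        """Walk from the tail, using each size indicator to locate the next record."""
        offset = 0
        while offset < self.used:
            addr = (self.tail + offset) % self.capacity
            rec_id, size = HEADER.unpack(self._read(addr, HEADER.size))
            if rec_id == ident:
                return self._read(addr + HEADER.size, size - HEADER.size)
            offset += size
        return None


store = RingStore(32)
store.push(1, b"aaaa")
store.push(2, b"bbbbbbbb")
store.push(3, b"cc")
store.pop_tail()              # frees record 1
store.push(4, b"dddddddd")    # this record wraps past the end of the buffer
```

After these calls, `store.find(4)` returns the wrapped payload and `store.find(1)` returns `None`, since record 1 was released from the tail.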
-
Publication No.: US20240126602A1
Publication Date: 2024-04-18
Application No.: US17967297
Application Date: 2022-10-17
Applicant: Arm Limited
Inventor: Jens OLSON, John Wakefield BROTHERS, III
IPC: G06F9/50
CPC classification number: G06F9/5016
Abstract: A processor to execute a plurality of tasks comprising a first task and a second task. At least a part of the first task is to be executed simultaneously with at least a part of the second task. The processor comprises a handling unit to: determine an available portion of a storage available during execution of the part of the first task; determine a mapping between at least one logical address associated with data associated with the part of the second task and a corresponding at least one physical address of the storage corresponding to the available portion; and identify, based on the mapping, the at least one physical address corresponding to the at least one logical address associated with the data, for storing the data in the available portion of the storage.
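Illustrative sketch (assumptions labeled): the mapping step can be pictured as finding the storage ranges left free by the first task and assigning the second task's logical addresses to physical addresses inside them. The functions `free_ranges` and `build_mapping`, the unit-sized addresses, and the first-fit assignment are hypothetical choices, not details from the abstract.

```python
def free_ranges(storage_size, busy):
    """Return (start, length) ranges not covered by the busy (start, length) ranges."""
    free = []
    cursor = 0
    for start, length in sorted(busy):
        if start > cursor:
            free.append((cursor, start - cursor))
        cursor = max(cursor, start + length)
    if cursor < storage_size:
        free.append((cursor, storage_size - cursor))
    return free


def build_mapping(logical_addresses, free):
    """Map each logical address to the next free physical address (first-fit assumption)."""
    physical = (start + i for start, length in free for i in range(length))
    mapping = {}
    for logical in logical_addresses:
        try:
            mapping[logical] = next(physical)
        except StopIteration:
            raise MemoryError("available portion too small") from None
    return mapping


# The first task is assumed to occupy addresses 4..7 and 10..11 of a 16-entry storage.
available = free_ranges(16, [(4, 4), (10, 2)])
mapping = build_mapping(range(6), available)
print(available)
print(mapping)
```

The second task then stores data for logical address `n` at `mapping[n]`, which always falls inside the available portion.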
-
Publication No.: US20240248621A1
Publication Date: 2024-07-25
Application No.: US18099627
Application Date: 2023-01-20
Applicant: Arm Limited
CPC classification number: G06F3/0626, G06F3/0644, G06F3/0673, G06F7/52
Abstract: A processor to generate accumulated data comprising, for an operation cycle: performing an operation on a first bit range of a set of first input data to generate a set of operation data, which is accumulated with stored data within a first storage device. A lowest n bits of the accumulated data are accumulated with first further stored data within a first bit range of a second storage device, and are bit-shifted from the first storage device. Further accumulated data is generated, comprising, for an operation cycle: performing the operation on a second bit range of the set of first input data to generate a further set of operation data, which is accumulated with the stored data within the first storage device. A lowest m bits of the further accumulated data are accumulated with second further stored data within a second bit range of the second storage device.
-
Publication No.: US20210304378A1
Publication Date: 2021-09-30
Application No.: US16836440
Application Date: 2020-03-31
Applicant: Arm Limited
Inventor: Jens OLSON, Suraj SUDHIR
Abstract: A computer-implemented method of providing a filter (F) in a neural processing unit comprises: receiving input corresponding to target dimensions (XT, YT) of the filter; receiving input corresponding to sub-filter dimensions (X1 . . . n′, Y1 . . . n′) of each of a plurality of sub-filters (SF1 . . . n) implementable in the neural processing unit; and defining the filter (F) as a combination of the plurality of sub-filters (SF1 . . . n), the combination having dimensions that equate to the target dimensions (XT, YT), and wherein the sub-filter dimensions (X1 . . . n′, Y1 . . . n′) of at least two of the sub-filters in the combination are unequal.
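Illustrative sketch (one simple decomposition, not necessarily the patented one): a target filter can be tiled with sub-filters no larger than an assumed hardware maximum, which naturally produces sub-filters of unequal dimensions at the edges. The function `split_filter` and the 3 by 3 maximum are hypothetical.

```python
def split_filter(target_x, target_y, max_x, max_y):
    """Tile a target_x by target_y filter with sub-filters of at most max_x by max_y.

    Returns a list of (x_offset, y_offset, width, height) tuples.
    """
    tiles = []
    for y0 in range(0, target_y, max_y):
        for x0 in range(0, target_x, max_x):
            width = min(max_x, target_x - x0)
            height = min(max_y, target_y - y0)
            tiles.append((x0, y0, width, height))
    return tiles


# A 5x5 target filter built from sub-filters no larger than 3x3 (assumed limit).
tiles = split_filter(5, 5, 3, 3)
covered = sum(width * height for _, _, width, height in tiles)
print(tiles)
print(covered)
```

The four sub-filters (3x3, 2x3, 3x2 and 2x2) together cover all 25 taps of the target, and at least two of them have unequal dimensions.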
-
Publication No.: US20210303307A1
Publication Date: 2021-09-30
Application No.: US16834833
Application Date: 2020-03-30
Applicant: Arm Limited
Inventor: Jens OLSON, John Wakefield BROTHERS, III, Jared Corey SMOLENS, Chi-wen CHENG, Daren CROXFORD, Sharjeel SAEED, Dominic Hugo SYMES
Abstract: Herein described is a method of operating an accumulation process in a data processing apparatus. The accumulation process comprises a plurality of accumulations which output a respective plurality of accumulated values, each based on a stored value and a computed value generated by a data processing operation. The method comprises storing a first accumulated value, the first accumulated value being one of said plurality of accumulated values, into a first storage device comprising a plurality of single-bit storage elements; determining that a predetermined trigger has been satisfied with respect to the accumulation process; and in response to the determining, storing at least a portion of a second accumulated value, the second accumulated value being one of said plurality of accumulated values, into a second storage device.
-
Publication No.: US20210027148A1
Publication Date: 2021-01-28
Application No.: US16518444
Application Date: 2019-07-22
Applicant: Arm Limited
Inventor: Lingchuan MENG, John Wakefield BROTHERS, III, Jens OLSON, Jared Corey SMOLENS, Eric KUNZE, Ian Rudolf BRATT
Abstract: A processor arranged to compress neural network activation data comprises an input module for obtaining neural network activation data. The processor also comprises a block creation module arranged to split the neural network activation data into a plurality of blocks, and a metadata generation module for generating metadata associated with at least one of the plurality of blocks. Based on the generated metadata, a selection module selects a compression scheme for each of the plurality of blocks, and a compression module applies the selected compression scheme to the corresponding block to produce compressed neural network activation data. An output module is also provided for outputting the compressed neural network activation data.
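Illustrative sketch (assumptions labeled): the module pipeline (split into blocks, generate metadata, select a scheme per block, compress) can be mirrored in a few functions. The metadata (a zero count), the three schemes (`all_zero`, `sparse`, `raw`) and the 50% sparsity threshold are hypothetical; the abstract does not name the metadata or the schemes.

```python
def split_blocks(data, block_size):
    """Block creation: cut the activation data into fixed-size blocks."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def make_metadata(block):
    """Metadata generation: assumed here to be the block length and zero count."""
    return {"length": len(block), "zeros": block.count(0)}


def select_scheme(meta):
    """Selection: pick a scheme from the metadata (thresholds are assumptions)."""
    if meta["zeros"] == meta["length"]:
        return "all_zero"
    if meta["zeros"] * 2 >= meta["length"]:
        return "sparse"
    return "raw"


def compress_block(block, scheme):
    if scheme == "all_zero":
        return (scheme, len(block), None)          # only the length is kept
    if scheme == "sparse":
        mask = [value != 0 for value in block]     # which positions are non-zero
        return (scheme, mask, [v for v in block if v != 0])
    return (scheme, list(block), None)             # stored unchanged


def decompress_block(packed):
    scheme, first, second = packed
    if scheme == "all_zero":
        return [0] * first
    if scheme == "sparse":
        values = iter(second)
        return [next(values) if keep else 0 for keep in first]
    return list(first)


def compress(data, block_size=4):
    return [
        compress_block(block, select_scheme(make_metadata(block)))
        for block in split_blocks(data, block_size)
    ]


activations = [0, 0, 0, 0, 0, 3, 0, 1, 5, 6, 7, 0]
packed = compress(activations)
schemes = [entry[0] for entry in packed]
restored = [value for entry in packed for value in decompress_block(entry)]
```

Each block in this example gets a different scheme, and decompressing the packed blocks restores the original activations.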