Abstract:
Broadly speaking, embodiments of the present techniques provide a reconfigurable hardware-based artificial neural network, wherein weights for each neural network node of the artificial neural network are obtained via training performed external to the neural network.
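A rough behavioural sketch of the arrangement just described may help: weights are produced by training outside the device and then programmed into a reconfigurable array of hardware nodes. This is illustrative only, not the patented design, and every class and function name here is invented:

    def train_externally(samples):
        """Stand-in for offline training; returns one weight vector per node."""
        return [[0.5, -0.25], [0.125, 1.0]]      # hypothetical trained weights

    class HardwareNode:
        def __init__(self):
            self.weights = None                  # programmed, not learned on-chip

        def fire(self, inputs):
            return sum(w * x for w, x in zip(self.weights, inputs))

    class ReconfigurableANN:
        def __init__(self, node_count):
            self.nodes = [HardwareNode() for _ in range(node_count)]

        def load_weights(self, weight_sets):
            """Reconfigure the array with externally trained weights."""
            for node, weights in zip(self.nodes, weight_sets):
                node.weights = weights

    ann = ReconfigurableANN(node_count=2)
    ann.load_weights(train_externally(samples=None))
    print([node.fire([1.0, 2.0]) for node in ann.nodes])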
Abstract:
The present disclosure relates to a data processor for processing data, comprising: a plurality of execution units to execute one or more operations; and a plurality of storage elements to store data for the one or more operations, the data processor being configured to process at least one task, each task to be executed in the form of a directed acyclic graph of operations, wherein each of the operations maps to a corresponding execution unit and each connection between operations in the directed acyclic graph maps to a corresponding storage element, the data processor further comprising: a plurality of counters; and a control module to control the plurality of counters to: in a first mode, count an operation cycle number associated with each operation of the at least one task, the operation cycle number of an operation being the number of cycles required to complete the operation; and in a second mode, count a unit cycle number associated with one or more execution units, the unit cycle number of an execution unit being the cumulative number of cycles during which the execution unit is occupied during execution of the at least one task.
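The two counter modes can be pictured with a small software model. The abstract describes hardware counters; this is a sketch under assumed names (the mode constants, operation and unit labels, and cycle figures are all invented for illustration):

    OP_CYCLE_MODE, UNIT_CYCLE_MODE = 1, 2

    class CounterControl:
        def __init__(self, mode):
            self.mode = mode
            self.op_cycles = {}     # first mode: cycles to complete each operation
            self.unit_cycles = {}   # second mode: cumulative busy cycles per unit

        def record(self, op_name, unit_name, cycles):
            if self.mode == OP_CYCLE_MODE:
                self.op_cycles[op_name] = cycles
            else:
                self.unit_cycles[unit_name] = self.unit_cycles.get(unit_name, 0) + cycles

    # Task as a list of (operation, execution unit, cycles taken in this run);
    # the DAG edges (storage elements) are omitted for brevity.
    task = [("load", "lsu", 4), ("mul", "alu0", 3), ("add", "alu0", 1), ("store", "lsu", 4)]

    ctrl = CounterControl(UNIT_CYCLE_MODE)
    for op, unit, cycles in task:
        ctrl.record(op, unit, cycles)
    print(ctrl.unit_cycles)         # {'lsu': 8, 'alu0': 4}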
Abstract:
A processor to obtain mapping data indicative of at least one mapping parameter for a plurality of mapping blocks of a multi-dimensional tensor to be mapped. The at least one mapping parameter is for mapping corresponding elements of each mapping block to the same co-ordinate in at least one selected dimension of the multi-dimensional tensor, such that each mapping block corresponds to the same set of co-ordinates in the at least one selected dimension. A co-ordinate of an element of a block of the multi-dimensional tensor is determined, the element being comprised in a mapping block. A physical address in a storage is then determined based on the co-ordinate. The physical address is utilized in a process comprising an interaction between the block of the multi-dimensional tensor and the storage.
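One way to picture this: if the mapping parameter is taken to be the block extent in the selected dimension (an assumption made here for illustration, not stated in the abstract), folding the selected dimension with a modulo makes corresponding elements of every mapping block land on the same co-ordinate, and hence the same physical address. All names below are hypothetical:

    def mapped_coordinate(coord, selected_dim, block_extent):
        """Fold the selected dimension so every mapping block shares co-ordinates."""
        folded = list(coord)
        folded[selected_dim] %= block_extent
        return tuple(folded)

    def physical_address(coord, strides, base=0):
        """Row-major style address generation from a (folded) co-ordinate."""
        return base + sum(c * s for c, s in zip(coord, strides))

    # 3-D tensor; dimension 0 is the selected dimension, mapping blocks of extent 4.
    strides = (64, 8, 1)
    for d0 in (1, 5, 9):            # the same element of three mapping blocks
        coord = mapped_coordinate((d0, 2, 3), selected_dim=0, block_extent=4)
        print(coord, physical_address(coord, strides))   # identical address each time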
Abstract:
A single instruction multiple thread (SIMT) processor includes execution circuitry, prefetch circuitry and prefetch strategy selection circuitry. The prefetch strategy selection circuitry serves to detect one or more characteristics of a stream of program instructions being executed in order to identify whether or not a given data access instruction within a program will be executed a plurality of times. The prefetch strategy to use is selected from a plurality of selectable prefetch strategies in dependence upon such detected characteristics.
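A toy model of such a selector, with an invented set of strategies and a simple stride detector standing in for whatever characteristics the hardware actually monitors:

    from collections import Counter

    executions = Counter()
    last_addr, last_delta, stride_hits = {}, {}, Counter()

    def observe(pc, addr):
        """Toy stream monitor: count executions and consistent address strides."""
        executions[pc] += 1
        if pc in last_addr:
            delta = addr - last_addr[pc]
            if last_delta.get(pc) == delta:
                stride_hits[pc] += 1
            last_delta[pc] = delta
        last_addr[pc] = addr

    def select_strategy(pc):
        if executions[pc] < 2:
            return "no_prefetch"    # not seen to be re-executed: do not prefetch
        if stride_hits[pc] > 0:
            return "strided"        # re-executed with a regular address stride
        return "next_line"

    for addr in (0x1000, 0x1040, 0x1080, 0x10C0):
        observe(pc=0x40, addr=addr)
    print(select_strategy(0x40))    # 'strided'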
Abstract:
A data processing system includes: a processor; a data interface for communication with a control unit, the processor being on one side of the data interface; internal storage accessible by the processor, the internal storage being on the same side of the data interface as the processor; and a register array accessible by the processor and comprising a plurality of registers, each register having a plurality of vector lanes. The internal storage is arranged to store control data indicating an ordered selection of vector lanes of one or more of the registers. The processor is arranged to, in response to receiving instruction data from the control unit, perform a swizzle operation in which data is selected from one or more source registers in the register array and transferred to a destination register. The data is selected from vector lanes in accordance with the control data stored in the internal storage.
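A minimal sketch of the swizzle itself, assuming hypothetical register names and four lanes per register; the control data is modelled as an ordered list of (source register, lane) pairs, one per destination lane:

    registers = {
        "v0": [10, 11, 12, 13],     # four vector lanes per register
        "v1": [20, 21, 22, 23],
    }

    # Control data: ordered (source register, lane index) per destination lane.
    control = [("v1", 3), ("v0", 0), ("v1", 1), ("v0", 2)]

    def swizzle(reg_file, ctrl, dest):
        """Select lanes from source registers per the stored control data."""
        reg_file[dest] = [reg_file[src][lane] for src, lane in ctrl]

    swizzle(registers, control, dest="v2")
    print(registers["v2"])          # [23, 10, 21, 12]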
Abstract:
A neural network processor is disclosed that includes a combined convolution and pooling circuit that can perform both convolution and pooling operations. The circuit can perform a convolution operation by a multiply circuit determining products of corresponding input feature map and convolution kernel weight values, and an add circuit accumulating the products determined by the multiply circuit in storage. The circuit can perform an average pooling operation by the add circuit accumulating input feature map data values in the storage, a divisor circuit determining a divisor value, and a division circuit dividing the data value accumulated in the storage by the determined divisor value. The circuit can perform a maximum pooling operation by a maximum circuit determining a maximum value of input feature map data values, and storing the determined maximum value in the storage.
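The three operating modes of the combined circuit can be expressed functionally. This is a behavioural sketch over 1-D windows (the abstract concerns feature-map hardware; the function names are invented here):

    def convolve(window, kernel):
        acc = 0
        for x, w in zip(window, kernel):
            acc += x * w                # multiply circuit feeds the add circuit
        return acc

    def average_pool(window):
        acc = 0
        for x in window:
            acc += x                    # add circuit alone accumulates data values
        return acc / len(window)        # divisor circuit supplies the divisor

    def max_pool(window):
        best = window[0]
        for x in window[1:]:
            best = max(best, x)         # maximum circuit updates the stored value
        return best

    window = [1.0, 4.0, 2.0, 3.0]
    print(convolve(window, [0.25, 0.25, 0.25, 0.25]))   # 2.5
    print(average_pool(window), max_pool(window))       # 2.5 4.0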
Abstract:
An apparatus and method are provided for executing a plurality of threads. The apparatus has processing circuitry arranged to execute the plurality of threads, with each thread executing a program to perform processing operations on thread data. Each thread has a thread identifier, and the thread data includes a value which is dependent on the thread identifier. Value generator circuitry is provided to perform a computation using the thread identifier of a chosen thread in order to generate the above-mentioned value for that chosen thread, and to make that value available to the processing circuitry for use by the processing circuitry when executing the chosen thread. Such an arrangement can give rise to significant performance benefits when executing the plurality of threads on the apparatus.
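As a sketch of the idea, the value generator below computes a per-thread value from the thread identifier (a simple base + tid * stride here, which is an assumed computation, not one given in the abstract) and hands it to the thread before its program body runs:

    def value_generator(tid, base=0x1000, stride=16):
        """Compute the thread-identifier-dependent value for a chosen thread."""
        return base + tid * stride

    def run_thread(tid, program):
        value = value_generator(tid)    # made available to the processing circuitry
        return program(tid, value)

    def program(tid, value):
        return f"thread {tid} uses address {value:#x}"

    print([run_thread(tid, program) for tid in range(4)])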
Abstract:
A data processing apparatus 2 has processing circuitry 4 which can process multiple parallel threads of processing. A shared instruction decoder 30 decodes program instructions to generate micro-operations to be processed by the processing circuitry 4. The instructions include at least one complex instruction which has multiple micro-operations. Multiple fetch units 8 are provided for fetching the micro-operations generated by the decoder 30 for processing by the processing circuitry 4. Each fetch unit 8 is associated with at least one of the threads. The decoder 30 generates the micro-operations of a complex instruction individually in response to separate decode requests 24 triggered by a fetch unit 8, each decode request 24 identifying which micro-operation of the complex instruction is to be generated by the decoder 30 in response to the decode request 24.
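A simplified model of the per-micro-operation decode requests, with an invented micro-operation table standing in for the real decoder; the point is that the shared decoder emits exactly one micro-operation per request, identified by index:

    MICRO_OPS = {
        "load_multiple": ["uop_addr_gen", "uop_load_r0", "uop_load_r1"],
        "add":           ["uop_add"],
    }

    def decode_request(instruction, uop_index):
        """Shared decoder: generate only the requested micro-operation."""
        return MICRO_OPS[instruction][uop_index]

    class FetchUnit:
        """One fetch unit per thread; issues decode requests individually."""
        def __init__(self, thread_id):
            self.thread_id = thread_id

        def fetch_all(self, instruction):
            count = len(MICRO_OPS[instruction])
            return [decode_request(instruction, i) for i in range(count)]

    print(FetchUnit(thread_id=0).fetch_all("load_multiple"))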
Abstract:
A data processing device 100 comprises a plurality of storage circuits 130, 160, which store a plurality of data elements of b bits in an interleaved manner. The data processing device also comprises a consumer 110 with a number of lanes 120. The consumer is able to individually access each of the plurality of storage circuits 130, 160 in order to receive into the lanes 120 either a subset of the plurality of data elements or y bits of each of the plurality of data elements. The consumer 110 is also able to execute a common instruction on each of the plurality of lanes 120. The relationship between b and y is such that b is greater than y and is an integer multiple of y. Each of the plurality of storage circuits 130, 160 stores at most y bits of each of the data elements, and hence at most a fraction y/b of the bits of the plurality of data elements. By carrying out the interleaving in this manner, the plurality of storage circuits 130, 160 comprise no more than b/y storage circuits.
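A bit-level sketch with illustrative parameters b=32 and y=8 (chosen here, not given in the abstract): each b-bit element is split into b/y chunks of y bits, with chunk i of every element held in storage circuit i. The consumer can then read either a whole element (one chunk from every circuit) or y bits of every element (all chunks from a single circuit):

    B, Y = 32, 8
    N_CIRCUITS = B // Y                 # no more than b/y storage circuits

    def interleave(elements):
        circuits = [[] for _ in range(N_CIRCUITS)]
        for value in elements:
            for i in range(N_CIRCUITS):     # circuit i holds bits [i*y, (i+1)*y)
                circuits[i].append((value >> (i * Y)) & ((1 << Y) - 1))
        return circuits

    def read_element(circuits, index):
        """Reassemble one full b-bit element from all circuits."""
        return sum(circuits[i][index] << (i * Y) for i in range(N_CIRCUITS))

    circuits = interleave([0x11223344, 0xAABBCCDD])
    print([hex(c[0]) for c in circuits])    # y-bit slices of element 0
    print(hex(read_element(circuits, 1)))   # 0xaabbccdd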