Abstract:
A method for estimating a number of occupants in a region comprises receiving a time series of sensor values detected over a period of time by a motion sensor sensing motion in the region. A spread parameter indicative of the spread of the sensor values is determined. The number of occupants in the region is estimated based on the spread parameter.
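As a rough illustration of the idea, the following Python sketch uses the sample standard deviation as the spread parameter and a set of calibration thresholds to map spread to an occupant count. The threshold values and the choice of standard deviation are assumptions for illustration, not details taken from the abstract.

```python
import statistics

def estimate_occupants(sensor_values, thresholds=(0.5, 1.5, 3.0)):
    """Estimate an occupant count from the spread of motion-sensor readings.

    `thresholds` are hypothetical calibration points mapping the spread
    parameter (here, sample standard deviation) to 0, 1, 2, or 3+ occupants.
    """
    spread = statistics.stdev(sensor_values)   # spread parameter
    return sum(spread > t for t in thresholds)

# Example: a time series of motion-sensor values detected over a period
readings = [0.1, 0.4, 2.3, 1.9, 0.2, 3.1, 2.8, 0.3]
print(estimate_occupants(readings))
```

Any monotonic mapping from spread to count could stand in for the threshold comparison; the essential step is that more occupants produce more widely dispersed sensor values.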
Abstract:
A data processing device includes processing circuitry 20 for executing a first memory access instruction to a first address of a memory device 40 and a second memory access instruction to a second address of the memory device 40, the first address being different from the second address. The data processing device also includes prefetching circuitry 30 for prefetching data from the memory device 40 based on a stride length 70 and instruction analysis circuitry 50 for determining a difference between the first address and the second address. Stride refining circuitry 60 is also provided to refine the stride length based on factors of the stride length and factors of the difference calculated by the instruction analysis circuitry 50.
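One plausible reading of refining a stride "based on factors of the stride length and factors of the difference" is to take their common factors, i.e. the greatest common divisor, so that the refined stride divides both observed quantities. The sketch below assumes that interpretation; the function and the example addresses are hypothetical.

```python
from math import gcd

def refine_stride(stride, first_addr, second_addr):
    """Refine a prefetch stride using the common factors of the current
    stride length and the observed address difference (one plausible
    reading of the abstract, not a confirmed implementation)."""
    difference = abs(second_addr - first_addr)
    if difference == 0:
        return stride
    return gcd(stride, difference)   # largest stride dividing both

# Example: a 64-byte stride with accesses 48 bytes apart refines to 16
print(refine_stride(64, 0x1000, 0x1030))
```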
Abstract:
A data processing apparatus and method of data processing are disclosed. An instruction execution unit executes a sequence of program instructions, wherein execution of at least some of the program instructions initiates memory access requests to retrieve data values from a memory. A prefetch unit prefetches data values from the memory for storage in a cache unit before they are requested by the instruction execution unit. The prefetch unit is configured to perform a miss response, comprising increasing the number of future data values which it prefetches, when a memory access request specifies a pending data value which is already subject to prefetching but is not yet stored in the cache unit. The prefetch unit is also configured, in response to an inhibition condition being met, to temporarily inhibit the miss response for an inhibition period.
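A minimal behavioural model of the miss response and its inhibition might look like the following Python sketch. The prefetch distance, the inhibition period measured in cycles, and the single-step increment are all assumptions; the abstract does not specify them.

```python
class Prefetcher:
    """Sketch of the miss response and its temporary inhibition.

    `distance` models the number of future data values prefetched ahead.
    The inhibition condition, period length, and increment step are
    illustrative assumptions.
    """

    def __init__(self, distance=4, inhibition_period=100):
        self.distance = distance
        self.inhibition_period = inhibition_period
        self.inhibited_until = 0    # cycle at which inhibition expires
        self.cycle = 0

    def on_access(self, pending_but_not_cached):
        self.cycle += 1
        if pending_but_not_cached and self.cycle >= self.inhibited_until:
            self.distance += 1      # miss response: prefetch further ahead

    def inhibit(self):
        # Called when the inhibition condition is met.
        self.inhibited_until = self.cycle + self.inhibition_period
```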
Abstract:
The present disclosure advantageously provides a hybrid memory artificial neural network (ANN) hardware accelerator that includes a communication bus interface, a static memory, a non-refreshed dynamic memory, a controller and a computing engine. The static memory stores at least a portion of an ANN model, ANN basis weights, input data and output data. The ANN model includes an input layer, one or more hidden layers and an output layer. The non-refreshed dynamic memory is configured to store ANN custom weights for the input, hidden and output layers, and output data. For each layer or layer portion, the computing engine generates the ANN custom weights based on the ANN basis weights, stores the ANN custom weights in the non-refreshed dynamic memory, executes the layer or layer portion, based on inputs and the ANN custom weights, to generate layer output data, and stores the layer output data.
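The abstract does not say how custom weights are derived from basis weights; one simple possibility is a per-layer linear combination of shared basis matrices, as in this hypothetical Python sketch.

```python
import numpy as np

def generate_custom_weights(basis_weights, coefficients):
    """Hypothetical scheme: a layer's custom weights are a linear
    combination of shared basis weight matrices. The combination rule
    is an assumption; the abstract only says custom weights are
    generated 'based on the ANN basis weights'."""
    return sum(c * w for c, w in zip(coefficients, basis_weights))

def run_layer(inputs, custom_weights):
    # Execute one fully connected layer with the generated weights.
    return np.maximum(inputs @ custom_weights, 0.0)   # ReLU activation

basis = [np.random.randn(8, 8) for _ in range(3)]     # held in static memory
coeffs = [0.5, -0.2, 1.1]                             # per-layer coefficients
custom = generate_custom_weights(basis, coeffs)       # -> non-refreshed DRAM
layer_out = run_layer(np.random.randn(4, 8), custom)  # layer output data
```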
Abstract:
The present disclosure advantageously provides a system and a method for skipping recurrent neural network (RNN) state updates using a skip predictor. Sequential input data are received and divided into sequences of input data values, each input data value being associated with a different time step for a pre-trained RNN model. At each time step, the hidden state vector for a prior time step is received from the pre-trained RNN model, and a determination is made, based on the input data value and the hidden state vector for at least one prior time step, whether or not to provide the input data value associated with that time step to the pre-trained RNN model for processing. When the input data value is not provided, the pre-trained RNN model does not update its hidden state vector. Importantly, the skip predictor is trained without retraining the pre-trained RNN model.
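In pseudocode terms, inference with a skip predictor could be structured as below. The sketch assumes a PyTorch-style recurrent cell and treats the skip predictor as an opaque callable; both are stand-ins for whatever the actual system uses.

```python
import torch

def run_with_skips(rnn_cell, skip_predictor, inputs, h0):
    """Run a pre-trained RNN, consulting a skip predictor at each time
    step. Skipped steps leave the hidden state vector untouched."""
    h = h0
    for x_t in inputs:                   # one input data value per time step
        if skip_predictor(x_t, h):
            continue                     # skip: state update suppressed
        h = rnn_cell(x_t, h)             # normal RNN state update
    return h

# Hypothetical usage with a pre-trained GRU cell and a trivial predictor
cell = torch.nn.GRUCell(input_size=8, hidden_size=16)
inputs = torch.randn(20, 1, 8)           # 20 time steps, batch of 1
h = run_with_skips(cell, lambda x, h: x.abs().mean() < 0.5,
                   inputs, torch.zeros(1, 16))
```

Because skipped steps never touch the hidden state, the pre-trained cell itself needs no retraining; only the predictor is trained.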
Abstract:
The present disclosure advantageously provides a hardware accelerator for a natural language processing application including a first memory, a second memory, and a computing engine (CE). The first memory is configured to store a configurable natural language model (NLM) and a set of NLM fixed weights. The second memory is configured to store an artificial neural network (ANN) model, a set of ANN weights, a set of NLM delta weights, input data and output data. The set of NLM delta weights may be smaller than the set of NLM fixed weights, and each NLM delta weight corresponds to an NLM fixed weight. The CE is configured to execute the NLM, based on the input data, the set of NLM fixed weights and the set of NLM delta weights, to generate intermediate output data, and to execute the ANN model, based on the intermediate output data and the set of ANN weights, to generate the output data.
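A hypothetical way to realize the correspondence between delta weights and fixed weights is to store the deltas sparsely, indexed by the fixed weight each one adjusts, and add them at execution time; the sketch below assumes that representation.

```python
import numpy as np

def apply_delta_weights(fixed_weights, delta_weights):
    """Combine NLM fixed weights with a (typically much smaller) set of
    delta weights. Storing deltas as an {index: value} map is an
    assumption; the abstract only says each delta weight corresponds
    to a fixed weight."""
    weights = fixed_weights.copy()
    for idx, delta in delta_weights.items():
        weights[idx] += delta            # adjust only the weights that differ
    return weights

fixed = np.zeros((4, 4))                 # stored in the first memory
deltas = {(0, 1): 0.25, (3, 2): -0.5}    # stored in the second memory
effective = apply_delta_weights(fixed, deltas)
```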
Abstract:
A single instruction multiple thread (SIMT) processor includes execution circuitry, prefetch circuitry and prefetch strategy selection circuitry. The prefetch strategy selection circuitry serves to detect one or more characteristics of a stream of program instructions being executed in order to identify whether or not a given data access instruction within a program will be executed a plurality of times. The prefetch strategy to use is selected from a plurality of selectable prefetch strategies in dependence upon the detected characteristics.
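As a toy illustration, strategy selection might reduce to a check on how often the data access instruction is observed to execute. The threshold and the strategy names below are assumptions, not details from the abstract.

```python
def select_prefetch_strategy(times_executed, strategies):
    """Select a prefetch strategy based on whether a given data access
    instruction is expected to execute a plurality of times (e.g. it
    sits inside a loop). Threshold and strategy names are illustrative."""
    if times_executed > 1:
        return strategies["stride"]      # repeated access: stride prefetching
    return strategies["none"]            # single-shot access: no prefetching

strategies = {"stride": "stride-prefetch", "none": "no-prefetch"}
print(select_prefetch_strategy(16, strategies))   # -> stride-prefetch
```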
Abstract:
A data processing device 100 comprises a plurality of storage circuits 130, 160, which store a plurality of data elements of b bits in an interleaved manner. The data processing device also comprises a consumer 110 with a number of lanes 120. The consumer is able to individually access each of the plurality of storage circuits 130, 160 in order to receive into the lanes 120 either a subset of the plurality of data elements or y bits of each of the plurality of data elements. The consumer 110 is also able to execute a common instruction on each of the plurality of lanes 120. The relationship between b and y is such that b is greater than y and is an integer multiple of y. Each of the plurality of storage circuits 130, 160 stores at most y bits of each of the data elements. Furthermore, each of the storage circuits 130, 160 stores at most a fraction y/b of the plurality of data elements. By carrying out the interleaving in this manner, the plurality of storage circuits 130, 160 comprise no more than b/y storage circuits.
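The slicing itself can be illustrated in a few lines of Python: each b-bit element is cut into y-bit slices, and slice k of every element goes to storage circuit k. This is a simplified arrangement chosen for clarity; the actual interleaving order is not specified by the abstract.

```python
def interleave(elements, b, y):
    """Split each b-bit element into y-bit slices and distribute the
    slices across b // y storage circuits, so that circuit k holds bits
    [k*y, (k+1)*y) of every element. The slice ordering is an assumption."""
    assert b % y == 0 and b > y
    banks = [[] for _ in range(b // y)]
    mask = (1 << y) - 1
    for e in elements:
        for k, bank in enumerate(banks):
            bank.append((e >> (k * y)) & mask)   # y bits of this element
    return banks

# Example: four 8-bit elements sliced into 4-bit halves across 2 circuits
print(interleave([0xAB, 0xCD, 0x12, 0x34], b=8, y=4))
```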