-
Publication number: US20240037835A1
Publication date: 2024-02-01
Application number: US18362439
Filing date: 2023-07-31
Applicant: ARM Limited
Inventor: Daren CROXFORD , Sharjeel SAEED , Isidoros SIDERIS
CPC classification number: G06T15/005 , G06T1/60
Abstract: There is provided an apparatus configured to operate as a shader core, the shader core configured to perform a complex rendering process comprising a rendering process and a machine learning process, the shader core comprising: one or more tile buffers configured to store data locally to the shader core, wherein during the rendering process, the one or more tile buffers are configured to store rendered fragment data relating to a tile; and during the machine learning process, the one or more tile buffers are configured to store an input feature map, kernel weights or an output feature map relating to the machine learning process.
-
Publication number: US20220222569A1
Publication date: 2022-07-14
Application number: US17145804
Filing date: 2021-01-11
Applicant: Arm Limited
Inventor: Daren CROXFORD , Sharjeel SAEED , Rachel Jean TRIMBLE , Timothy Fawcett MILNER
Abstract: A processing unit is provided which comprises volatile storage for storing machine learning data in binary representation, and a data processing engine communicatively coupled to the volatile storage. The processing unit is configured to selectively invert the bit values in binary representations of portions of the machine learning data when performing storage operations using the volatile storage. A computer-implemented method, and non-transitory computer-readable storage medium comprising instructions for executing the method are also provided. The method comprises receiving a request to perform a storage operation on the volatile storage using the machine learning data and performing the storage operation, including, selecting a portion of the machine learning data and inverting bit values in a binary representation of the selected portion. A computer-implemented method comprising receiving a request to store machine learning data on volatile storage and storing the machine learning data is also provided. Storing the machine learning data includes operating on at least a portion of the machine learning data to prioritize one of two potential bit values.
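The selective inversion described in this abstract can be illustrated with a minimal sketch. It assumes, purely for illustration, that data is handled in fixed-width words, that the preferred bit value is 0, and that one flag per word records whether the word was inverted; the names `encode_word` and `decode_word` are hypothetical and do not come from the patent.

```python
def encode_word(word: int, width: int = 8) -> tuple[int, bool]:
    """Return (stored_word, inverted_flag), favouring zero bits in storage.

    Assumption: a word is inverted only when more than half its bits are 1.
    """
    mask = (1 << width) - 1
    word &= mask
    ones = bin(word).count("1")
    if ones > width // 2:
        # Inverting leaves fewer 1 bits in the stored representation.
        return (~word) & mask, True
    return word, False


def decode_word(stored: int, inverted: bool, width: int = 8) -> int:
    """Recover the original word from the stored word and its flag."""
    mask = (1 << width) - 1
    return (~stored) & mask if inverted else stored & mask


if __name__ == "__main__":
    for value in (0xFF, 0x0F, 0xF1):
        stored, flag = encode_word(value)
        print(hex(value), "->", hex(stored), flag, "->", hex(decode_word(stored, flag)))
```

The per-word flag is the cost of this scheme: whether it pays off depends on the word width and on how strongly the storage favours one bit value, neither of which the abstract specifies.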
-
Publication number: US20200175338A1
Publication date: 2020-06-04
Application number: US16209505
Filing date: 2018-12-04
Applicant: Apical Ltd , Arm Limited
Inventor: Daren CROXFORD , Sharjeel SAEED , Sean Tristram LeGuay ELLIS
Abstract: A method of processing input data using a computing system. The method comprises obtaining association data which relates a kernel in a convolutional neural network to one or more known data patterns; conducting analysis of input data for the convolutional neural network to identify whether a region of input data corresponds to at least one of the one or more known data patterns; and determining whether to process the region of input data with the kernel in the convolutional neural network based on the analysis and the association data.
-
Publication number: US20240036949A1
Publication date: 2024-02-01
Application number: US18362405
Filing date: 2023-07-31
Applicant: ARM Limited
Inventor: Daren CROXFORD , Sharjeel SAEED , Isidoros SIDERIS
IPC: G06F9/54 , G06F9/48 , G06F12/0842
CPC classification number: G06F9/542 , G06F9/4843 , G06F12/0842 , G06F2209/543 , G06F2212/62 , G06F2212/60
Abstract: There is provided a processor configured to transfer data to a plurality of processor circuits. The apparatus includes broadcast circuitry that broadcasts first machine learning data to at least a subset of the plurality of processor circuits.
-
Publication number: US20220129321A1
Publication date: 2022-04-28
Application number: US17082864
Filing date: 2020-10-28
Applicant: Apical Limited , Arm Limited
Inventor: Daren CROXFORD , Sharjeel SAEED , Jayavarapu Srinivasa RAO , Aaron DEBATTISTA
Abstract: An information processing apparatus is described for processing a workload. The information processing apparatus comprises a processor and a memory element connected to the processor via a data link. In advance of processing a workload, the information processing apparatus estimates an access time required to transfer an amount of the workload that is to be transferred from the external memory element to the processor, and estimates a processing time for the processor to process the workload. A processing rate characteristic of the processor and/or a data transfer rate between the memory and the processor is set in dependence upon the estimated processing time and estimated access time. Methods for varying a quality of service (QoS) value of requests to the external memory element are also described.
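The relationship between the two estimates can be sketched as simple arithmetic. The abstract only says the rate is set "in dependence upon" the estimated times; the specific policy below (slow the processor down when the workload is memory-bound) is an assumption made for illustration, and `plan_workload` and its parameters are hypothetical names.

```python
def plan_workload(bytes_to_transfer: float, ops: float,
                  bandwidth_bytes_per_s: float, max_ops_per_s: float):
    """Estimate access and processing times and pick a processing rate.

    Assumed policy: if the processor would finish before the data has
    arrived, lower its rate so processing ends when the transfer ends.
    """
    access_time = bytes_to_transfer / bandwidth_bytes_per_s
    processing_time = ops / max_ops_per_s
    if processing_time < access_time:
        # Memory-bound: running flat out would only leave the processor idle.
        rate = ops / access_time
    else:
        # Compute-bound: keep the processor at its maximum rate.
        rate = max_ops_per_s
    return access_time, processing_time, rate


if __name__ == "__main__":
    print(plan_workload(1000, 500, 100.0, 100.0))
```

A real implementation could equally adjust the data transfer rate or the QoS value of memory requests instead, as the abstract mentions; this sketch covers only the processing-rate side.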
-
Publication number: US20210303974A1
Publication date: 2021-09-30
Application number: US16834881
Filing date: 2020-03-30
Applicant: Arm Limited , Apical Limited
Inventor: Sharjeel SAEED , Aaron DEBATTISTA , Daren CROXFORD
Abstract: A method, apparatus and computer-readable medium for processing input data using a neural network comprising at least a first layer and a second layer. The method comprises the steps of applying a partitioning scheme to the input data, to partition the input data into a plurality of blocks, each block representing a portion of the input data. At the first layer of the neural network, the blocks of the input data are processed in a first order to generate intermediary data, wherein the intermediary data is partitioned into a plurality of intermediary blocks. At the second layer of the neural network, the intermediary blocks are processed in a second order, wherein the second order differs from the first order.
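One reason a different second order can help is sketched below. The abstract does not state which order is used or why; this toy model assumes a small local buffer with least-recently-used eviction and compares a second layer that repeats the first order with one that reverses it. The function `count_buffer_hits` and the LRU policy are illustrative assumptions, not details from the patent.

```python
from collections import OrderedDict


def count_buffer_hits(num_blocks: int, buffer_size: int, second_order) -> int:
    """Count intermediary blocks the second layer finds in the local buffer."""
    buffer = OrderedDict()
    # First layer: produce intermediary blocks 0..num_blocks-1 in order.
    for block in range(num_blocks):
        buffer[block] = True
        if len(buffer) > buffer_size:
            buffer.popitem(last=False)  # evict the least recently used block
    hits = 0
    # Second layer: consume intermediary blocks in the given order.
    for block in second_order:
        if block in buffer:
            hits += 1
            buffer.move_to_end(block)
        else:
            buffer[block] = True  # modelled as a refetch from external memory
            if len(buffer) > buffer_size:
                buffer.popitem(last=False)
    return hits


if __name__ == "__main__":
    n, size = 8, 3
    print("same order:    ", count_buffer_hits(n, size, range(n)))
    print("reversed order:", count_buffer_hits(n, size, reversed(range(n))))
```

In this model the reversed order starts with the blocks the first layer produced last, which are the ones still held in the buffer.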
-
Publication number: US20200090032A1
Publication date: 2020-03-19
Application number: US16132015
Filing date: 2018-09-14
Applicant: Apical Ltd , Arm Limited
Inventor: Daren CROXFORD , Jayavarapu Srinivasa RAO , Sharjeel SAEED
Abstract: A method of compressing kernels comprising detecting a plurality of replicated kernels. The plurality of replicated kernels comprise kernels. The method also comprises generating a composite kernel from the replicated kernels. The composite kernel comprises kernel data and metadata indicative of the rotations applied to the composite kernel data. The method also comprises storing the composite kernel.
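A minimal sketch of the idea follows, assuming that "replicated" kernels are square kernels that differ only by rotations in multiples of 90 degrees, and that the metadata is a count of quarter turns. Both assumptions, and the names `rotate90`, `compress_kernels` and `decompress_kernels`, are hypothetical and chosen only for illustration.

```python
def rotate90(kernel):
    """Rotate a square kernel (list of rows) 90 degrees clockwise."""
    return [list(row) for row in zip(*kernel[::-1])]


def compress_kernels(kernels):
    """Store each distinct kernel once, plus rotation metadata per kernel.

    Returns (composites, meta) where meta[i] = (composite_index, quarter_turns).
    """
    composites, meta = [], []
    for kernel in kernels:
        match = None
        for index, base in enumerate(composites):
            candidate = base
            for turns in range(4):
                if candidate == kernel:
                    match = (index, turns)
                    break
                candidate = rotate90(candidate)
            if match is not None:
                break
        if match is None:
            # No stored kernel matches under rotation: keep this one as-is.
            composites.append([list(row) for row in kernel])
            match = (len(composites) - 1, 0)
        meta.append(match)
    return composites, meta


def decompress_kernels(composites, meta):
    """Rebuild the original kernel list from composites and metadata."""
    result = []
    for index, turns in meta:
        kernel = composites[index]
        for _ in range(turns):
            kernel = rotate90(kernel)
        result.append(kernel)
    return result
```

With this layout, a set of four kernels that are rotations of one another is stored as one kernel plus four small metadata entries.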
-
Publication number: US20250077286A1
Publication date: 2025-03-06
Application number: US18456621
Filing date: 2023-08-28
Applicant: Arm Limited
IPC: G06F9/50
Abstract: An apparatus is provided for improving the use of multiple-issue operations in a data processor. A variable-issue operation can be recognised as being either a single-issue operation or a multiple-issue operation in dependence on the state of the program at runtime. If a variable-issue operation can be scheduled as a multiple-issue operation, then other operations can be scheduled for performance in the same cycle, when they would have otherwise had to be scheduled for a later cycle. As such, more operations can be performed in fewer cycles, thus improving code density and improving data processing performance.
-
Publication number: US20210303307A1
Publication date: 2021-09-30
Application number: US16834833
Filing date: 2020-03-30
Applicant: Arm Limited
Inventor: Jens OLSON , John Wakefield BROTHERS, III , Jared Corey SMOLENS , Chi-wen CHENG , Daren CROXFORD , Sharjeel SAEED , Dominic Hugo SYMES
Abstract: Herein described is a method of operating an accumulation process in a data processing apparatus. The accumulation process comprises a plurality of accumulations which output a respective plurality of accumulated values, each based on a stored value and a computed value generated by a data processing operation. The method comprises storing a first accumulated value, the first accumulated value being one of said plurality of accumulated values, into a first storage device comprising a plurality of single-bit storage elements; determining that a predetermined trigger has been satisfied with respect to the accumulation process; and in response to the determining, storing at least a portion of a second accumulated value, the second accumulated value being one of said plurality of accumulated values, into a second storage device.
-
Publication number: US20210064688A1
Publication date: 2021-03-04
Application number: US16552548
Filing date: 2019-08-27
Applicant: Arm Limited , Apical Limited
Inventor: Sharjeel SAEED , Daren CROXFORD , Davide MARANI , Jayavarapu Srinivasa RAO
Abstract: A computer implemented method for performing convolutions between subsets of an input data array and a kernel resulting in subsets of an output data array. The method may include receiving an input data array and using positional data indicating the position of elements of the input data array to determine subsets of the input data array which contains at least one non-zero value data element; performing convolutions between the subsets of the input data array containing at least one non-zero value data element and a kernel to produce output data array subsets; and combining the output data subsets with the positional data to generate output data indicative of a completed output data array.
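The zero-skipping idea in this abstract can be sketched in plain Python. The sketch assumes a 2D "valid" convolution (computed as cross-correlation, as is common in neural networks), output tiles of a fixed size, and positional data held as a set of coordinates of non-zero input elements; the function name `conv2d_skip_zero` and these representation choices are illustrative assumptions rather than details from the patent.

```python
def conv2d_skip_zero(inp, kernel, tile=2):
    """Convolve inp with kernel, skipping output tiles fed only by zeros.

    Returns (output, computed_tiles).
    """
    h, w = len(inp), len(inp[0])
    kh, kw = len(kernel), len(kernel[0])
    oh, ow = h - kh + 1, w - kw + 1
    # Positional data: coordinates of the non-zero input elements.
    nonzero = {(r, c) for r in range(h) for c in range(w) if inp[r][c] != 0}
    out = [[0] * ow for _ in range(oh)]
    computed_tiles = 0
    for r0 in range(0, oh, tile):
        for c0 in range(0, ow, tile):
            r1, c1 = min(r0 + tile, oh), min(c0 + tile, ow)
            # Input subset feeding this tile: rows r0..r1+kh-2, cols c0..c1+kw-2.
            touches_data = any(
                r0 <= r < r1 + kh - 1 and c0 <= c < c1 + kw - 1
                for r, c in nonzero
            )
            if not touches_data:
                continue  # the whole output tile stays zero
            computed_tiles += 1
            for r in range(r0, r1):
                for c in range(c0, c1):
                    out[r][c] = sum(
                        inp[r + i][c + j] * kernel[i][j]
                        for i in range(kh)
                        for j in range(kw)
                    )
    return out, computed_tiles
```

Skipped tiles are filled from the zero-initialised output, which plays the role of combining the computed subsets with the positional data to form the complete output array.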