-
Publication No.: US11030783B1
Publication date: 2021-06-08
Application No.: US16748712
Filing date: 2020-01-21
Applicant: Arm Limited
Abstract: A graphics processor that performs early depth tests for primitives in respect of patches of a render output, and depth tests for sampling positions of the render output, maintains a per patch depth buffer that stores depth values for patches for use by the patch early depth test and a per sample depth buffer. When processing of a render output is stopped before the render output is finished, the per sample depth values in the per sample depth buffer are written to storage so that those values can be restored, but the per patch depth value information in the per patch depth buffer is discarded. Then, when processing of the render output is resumed, the per sample depth buffer values are loaded into a per sample depth buffer, and the loaded per sample depth buffer values are also used to restore the per patch depth buffer.
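The restore step described in this abstract can be illustrated with a minimal sketch. The abstract does not say what the per patch depth values contain, so the sketch assumes each patch stores a (min, max) depth range and that the depth test is "less than"; the names `restore_patch_depths`, `patch_early_test_rejects`, and the 2x2 patch size are hypothetical.

```python
def restore_patch_depths(sample_depth, patch_size):
    """Rebuild an assumed per-patch (min, max) depth range from per-sample depths."""
    height = len(sample_depth)
    width = len(sample_depth[0])
    patches = []
    for py in range(0, height, patch_size):
        row = []
        for px in range(0, width, patch_size):
            # Gather every sample that falls inside this patch.
            values = [sample_depth[y][x]
                      for y in range(py, min(py + patch_size, height))
                      for x in range(px, min(px + patch_size, width))]
            row.append((min(values), max(values)))
        patches.append(row)
    return patches


def patch_early_test_rejects(primitive_min_depth, patch_range):
    """Assuming a 'less than' depth test: a primitive that lies at or behind
    the farthest stored sample in the patch cannot pass at any sample."""
    return primitive_min_depth >= patch_range[1]


# Per-sample depth buffer as it might be reloaded from storage (made-up values).
sample_depth = [
    [0.2, 0.3, 1.0, 1.0],
    [0.4, 0.1, 1.0, 1.0],
    [0.5, 0.5, 0.9, 0.8],
    [0.5, 0.5, 0.7, 0.6],
]
patch_depth = restore_patch_depths(sample_depth, patch_size=2)
```

Under these assumptions only the per-sample values need to be written out when rendering stops, since the per-patch ranges can be recomputed from them on resume.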
-
Publication No.: US20210158613A1
Publication date: 2021-05-27
Application No.: US16697942
Filing date: 2019-11-27
Applicant: Arm Limited
Abstract: When processing graphics primitives in a graphics processing system, the render output is divided into a plurality of regions for rendering, each region comprising a respective area of the render output. It is determined for which of the plurality of regions of the render output a primitive should be rendered for. Associated state data for rendering the primitive is stored in a “state data” data structure in memory. For each region of the render output it is determined the primitive should be rendered for, a reference to the associated state data for rendering the primitive is stored in a respective, different data structure for each different region of the render output it is determined the primitive should be rendered for.
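As a rough sketch of the arrangement in this abstract, the state data can be kept once in a shared table while each region's own list holds only a reference (here, an index) to it. The class name `TiledBinner`, the bounding-box binning rule, and the 16-pixel region size are assumptions made for illustration, not details from the patent.

```python
from dataclasses import dataclass, field


@dataclass
class TiledBinner:
    region_size: int
    # Shared "state data" structure: each state is stored once.
    state_table: list = field(default_factory=list)
    # One list per region, keyed by (region_x, region_y); entries are
    # (primitive_id, state_index) pairs, i.e. references into state_table.
    region_lists: dict = field(default_factory=dict)

    def add_state(self, state):
        """Store state data once and return a reference (index) to it."""
        self.state_table.append(state)
        return len(self.state_table) - 1

    def bin_primitive(self, primitive_id, bbox, state_index):
        """Record the primitive in every region its bounding box touches.
        bbox is (x0, y0, x1, y1) in pixels, inclusive (an assumption)."""
        x0, y0, x1, y1 = bbox
        for ry in range(y0 // self.region_size, y1 // self.region_size + 1):
            for rx in range(x0 // self.region_size, x1 // self.region_size + 1):
                self.region_lists.setdefault((rx, ry), []).append(
                    (primitive_id, state_index))


binner = TiledBinner(region_size=16)
state_ref = binner.add_state({"blend": "alpha", "depth_test": "less"})
binner.bin_primitive(primitive_id=7, bbox=(10, 10, 20, 12), state_index=state_ref)
```

The primitive spans two regions, so two region lists gain an entry, but the state dictionary itself is stored only once.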
-
Publication No.: US11449729B2
Publication date: 2022-09-20
Application No.: US16676757
Filing date: 2019-11-07
Applicant: Arm Limited
Inventor: Lingchuan Meng , Danny Daysang Loh , Ian Rudolf Bratt , Alexander Eugene Chalfin , Tianmu Li
IPC: G06N3/04 , G06N3/06 , G06N3/08 , G06N3/10 , G06F17/15 , G06F17/16 , G06F17/18 , G06F30/18 , G06F30/20 , G06F30/27 , G06F30/33 , G06F30/367
Abstract: The present disclosure advantageously provides a system and a method for convolving data in a quantized convolutional neural network (CNN). The method includes selecting a set of complex interpolation points, generating a set of complex transform matrices based, at least in part, on the set of complex interpolation points, receiving an input volume from a preceding layer of the quantized CNN, performing a complex Winograd convolution on the input volume and at least one filter, using the set of complex transform matrices, to generate an output volume, and sending the output volume to a subsequent layer of the quantized CNN.
-
Publication No.: US11210821B2
Publication date: 2021-12-28
Application No.: US16698030
Filing date: 2019-11-27
Applicant: Arm Limited
Inventor: Alexander Eugene Chalfin , Andreas Due Engh-Halstvedt , Olof Henrik Uhrenholt , Andreas Loeve Selvik
Abstract: When processing graphics primitives in a graphics processing system, the render output is divided into a plurality of regions for rendering, each region comprising a respective area of the render output. It is determined for which of the plurality of regions of the render output a primitive should be rendered for. For each region of the render output it is determined a primitive should be rendered for, geometry data for the primitive is stored in memory in a respective data structure for the region in a compressed form, such that the geometry data for the primitive to be rendered is stored in a compressed form, in a respective, different data structure for each different region of the render output it is determined the primitive should be rendered for.
-
Publication No.: US11127187B2
Publication date: 2021-09-21
Application No.: US16697984
Filing date: 2019-11-27
Applicant: Arm Limited
Inventor: Ian Rudolf Bratt , Andreas Due Engh-Halstvedt , Alexander Eugene Chalfin , Andreas Loeve Selvik , Olof Henrik Uhrenholt , Thomas J. Olson
Abstract: When processing graphics primitives in a graphics processing system, the render output is divided into a plurality of regions (40) for rendering, each region (40) comprising a respective area of the render output; and for sets of one or more primitives to be rendered, it is determined for which of the plurality of regions of the render output (40) the primitive(s) should be rendered; and for each region of the render output (40) it is determined the primitive(s) should be rendered for, geometry data for the primitive(s) is stored in memory in a respective data structure (42) along with an indication of state data that is to be used for rendering the primitive(s) for the region, such that the geometry data for the primitive(s) to be rendered is stored in a respective, different data structure (42) for each different region of the render output (40) it is determined the primitive(s) should be rendered for.
-
Publication No.: US20210158585A1
Publication date: 2021-05-27
Application No.: US16698030
Filing date: 2019-11-27
Applicant: Arm Limited
Inventor: Alexander Eugene Chalfin , Andreas Due Engh-Halstvedt , Olof Henrik Uhrenholt , Andreas Loeve Selvik
Abstract: When processing graphics primitives in a graphics processing system, the render output is divided into a plurality of regions for rendering, each region comprising a respective area of the render output. It is determined for which of the plurality of regions of the render output a primitive should be rendered for. For each region of the render output it is determined a primitive should be rendered for, geometry data for the primitive is stored in memory in a respective data structure for the region in a compressed form, such that the geometry data for the primitive to be rendered is stored in a compressed form, in a respective, different data structure for each different region of the render output it is determined the primitive should be rendered for.
-
Publication No.: US20210158584A1
Publication date: 2021-05-27
Application No.: US16697903
Filing date: 2019-11-27
Applicant: Arm Limited
Abstract: When processing graphics primitives in a graphics processing system, the render output is divided into a plurality of regions for rendering, each region comprising a respective area of the render output. It is determined for which of the plurality of regions of the render output a primitive should be rendered for. Primitive data for rendering the primitive is then stored either in a combined data structure in memory that is associated with a plurality of different regions of the render output, or is stored in a respective data structure for each region of the render output it is determined the primitive should be rendered for. Which manner the primitive data is stored is determined in dependence on a property, e.g. a coverage, of the primitive.
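The storage choice in this abstract might look like the following sketch. It assumes that "coverage" means the number of regions touched by the primitive's bounding box and that a fixed threshold decides between the two storage modes; the threshold of 4 regions and all function names are hypothetical.

```python
def regions_covered(bbox, region_size):
    """Return the (rx, ry) regions touched by an inclusive pixel bounding box."""
    x0, y0, x1, y1 = bbox
    return [(rx, ry)
            for ry in range(y0 // region_size, y1 // region_size + 1)
            for rx in range(x0 // region_size, x1 // region_size + 1)]


def store_primitive(primitive_id, bbox, region_size, per_region, combined,
                    max_regions=4):
    """Store a large-coverage primitive once in the combined structure,
    otherwise once per covered region. max_regions is an assumed threshold."""
    regions = regions_covered(bbox, region_size)
    if len(regions) > max_regions:
        combined.append((primitive_id, regions))
        return "combined"
    for region in regions:
        per_region.setdefault(region, []).append(primitive_id)
    return "per_region"


per_region, combined = {}, []
small = store_primitive(1, (0, 0, 5, 5), 16, per_region, combined)
large = store_primitive(2, (0, 0, 63, 63), 16, per_region, combined)
```

Here the small primitive lands in a single region list, while the large one, which touches 16 regions, is written once to the combined structure instead of being duplicated 16 times.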
-
Publication No.: US10599935B2
Publication date: 2020-03-24
Application No.: US15439284
Filing date: 2017-02-22
Applicant: ARM Limited
Inventor: Alexander Eugene Chalfin , Hardik Sharma , Thomas Jeremy Olson
Abstract: A data processing apparatus processes a set of weight values for an artificial neural network by representing the set of weight values in the form of an array of weight values and by using an image compression scheme to provide compressed weight data for the artificial neural network. The data processing apparatus uses an image decompression scheme to derive decompressed weight values from the compressed weight data and applies the decompressed weight values when producing a result from an input to the artificial neural network. The data processing apparatus can provide for efficient storage and processing of the weight values for the artificial neural network.
-
Publication No.: US20240036919A1
Publication date: 2024-02-01
Application No.: US18358995
Filing date: 2023-07-26
Applicant: Arm Limited
Inventor: Alexander Eugene Chalfin , John Wakefield Brothers, III , Rune Holm , Samuel James Edward Martin
CPC classification number: G06F9/4881 , G06T1/20
Abstract: A method and processor comprising a command processing unit to receive, from a host processor, a sequence of commands to be executed; and generate based on the sequence of commands a plurality of tasks. The processor also comprises a plurality of compute units each having a first processing module for executing tasks of a first task type, a second processing module for executing tasks of a second task type, different from the first task type, and a local cache shared by at least the first processing module and the second processing module. The command processing unit issues the plurality of tasks to at least one of the plurality of compute units, and wherein at least one of the plurality of compute units is to process at least one of the plurality of tasks.
-
Publication No.: US20230196093A1
Publication date: 2023-06-22
Application No.: US17559163
Filing date: 2021-12-22
Applicant: Arm Limited
Inventor: Kartikeya Bhardwaj , Naveen Suda , Lingchuan Meng , Alexander Eugene Chalfin , Danny Daysang Loh
IPC: G06N3/08
CPC classification number: G06N3/08
Abstract: Disclosed is a novel neural network architecture and methods for generating neural network-based models from such architecture. A first version of the neural network, that is used for training purposes, includes one or more blocks in a first format that can then be replaced with corresponding blocks in a second format for execution. An executable model can thus be provided comprising a second version of the neural network including the one or more blocks in the second format. This then allows the training to be performed in a first, e.g. expanded format, but with a second, e.g. reduced, format model then provided for execution.
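One way to picture an expanded training block being replaced by a reduced block for execution is to collapse two consecutive linear layers into a single one. The abstract does not state that the blocks are linear or how the replacement is computed, so this is only an illustrative assumption; `expanded_block`, `collapse`, and `reduced_block` are hypothetical names, and integer weights are used so that the comparison is exact.

```python
def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def expanded_block(x, w1, w2):
    """Training-time format (assumed): two linear layers, 2 -> 4 -> 2,
    with no nonlinearity in between."""
    return matmul(matmul(x, w1), w2)


def collapse(w1, w2):
    """Fold the two weight matrices into a single 2 -> 2 matrix."""
    return matmul(w1, w2)


def reduced_block(x, w):
    """Execution-time format: a single linear layer."""
    return matmul(x, w)


w1 = [[1, 0, 2, -1],
      [0, 1, 1, 3]]
w2 = [[1, 2],
      [0, 1],
      [-1, 0],
      [2, 1]]
x = [[1, 2]]

w_reduced = collapse(w1, w2)
```

Under this assumption the reduced block stores 4 weights instead of 16 and computes the same output as the expanded block for any input.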