-
Publication No.: US09953444B2
Publication date: 2018-04-24
Application No.: US14874829
Filing date: 2015-10-05
Applicant: ARM LIMITED
Inventor: Isidoros Sideris, Michel Patrick Gabriel Emil Iwaniec, Andrew Burdass, Nebojsa Makljenovic, Andreas Due Engh-Halstvedt
CPC classification number: G06T11/40, G06T1/20, G06T1/60, G06T15/00, G06T15/005, G06T15/40, G06T15/405, G06T2207/20021
Abstract: A graphics processing apparatus and method of performing graphics processing are provided. The graphics processing apparatus comprises a sequence of processing stages capable of performing graphics processing to generate a frame of display data. The graphics processing is performed on a tile-by-tile basis. The graphics processing apparatus is capable of determining if a current tile subject to the graphics processing is empty. At least one processing stage of the sequence of processing stages is omitted for graphics processing of the current tile in dependence on whether the current tile is empty.
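The empty-tile check described in this abstract can be pictured with a minimal sketch. The stage names, the `SKIPPABLE` set, and the idea that a tile is "empty" when no primitives are binned into it are all assumptions made for illustration; the abstract does not specify them.

```python
# Hypothetical stage names; the abstract only says "a sequence of processing stages".
STAGES = ["rasterise", "fragment_shade", "blend", "write_out"]
# Assumption: these are the stages that may be omitted for an empty tile.
SKIPPABLE = {"rasterise", "fragment_shade", "blend"}


def render_tile(tile_primitives, stages=STAGES, skippable=SKIPPABLE):
    """Return the names of the stages that run for one tile.

    A tile is treated as empty when no primitives were binned into it
    (an assumption for this sketch).
    """
    is_empty = len(tile_primitives) == 0
    executed = []
    for name in stages:
        if is_empty and name in skippable:
            continue  # omit this stage for the empty tile
        executed.append(name)
    return executed


print(render_tile(["tri0", "tri1"]))
print(render_tile([]))
```

In this sketch the write-out stage still runs for an empty tile, on the assumption that the frame buffer region must still receive a clear colour.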
-
Publication No.: US09753735B2
Publication date: 2017-09-05
Application No.: US14596948
Filing date: 2015-01-14
Applicant: ARM Limited
Inventor: Andreas Due Engh-Halstvedt, Ian Victor Devereux, David Bermingham, Jakob Axel Fries, Oskar Lars Flordal
IPC: G06F9/38, G06F12/08, G06F12/0855
CPC classification number: G06F9/3869, G06F9/38, G06F9/3816, G06F9/3855, G06F9/3867, G06F12/0855, G06F2212/455
Abstract: A data processing system includes a processing pipeline for the parallel execution of a plurality of threads. An issue controller issues threads to the processing pipeline. A stall manager controls the stalling and unstalling of threads when a cache miss occurs within a cache memory. The issue controller issues the threads to the processing pipeline in accordance with both a main sequence and a pilot sequence. The pilot sequence is followed such that threads within the pilot sequence are issued at least a given time ahead of their neighbors within a main sequence. The given time corresponds approximately to the latency associated with a cache miss. The threads may be arranged in groups corresponding to blocks of pixels for processing within a graphics processing unit.
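One way to picture the pilot and main sequences is as an issue schedule in which one thread per group runs ahead of its neighbours. The sketch below is an assumption-laden illustration: it treats thread 0 of each group as the pilot and measures the "given time" as a whole number of group slots (`lead`), neither of which is stated in the abstract.

```python
def issue_order(num_groups, group_size, lead):
    """Return the order in which (group, thread) pairs are issued.

    Assumptions for this sketch: thread 0 of each group is its pilot,
    and `lead` (in group slots) stands in for the cache-miss latency.
    """
    order = []
    for step in range(num_groups + lead):
        # Pilot sequence: runs `lead` slots ahead of the main sequence.
        if step < num_groups:
            order.append((step, 0))
        # Main sequence: the remaining threads of an earlier group.
        main_group = step - lead
        if main_group >= 0:
            order.extend((main_group, t) for t in range(1, group_size))
    return order


schedule = issue_order(num_groups=3, group_size=3, lead=1)
print(schedule)
```

The intent is that any cache miss triggered by a pilot thread has had time to resolve before its neighbours in the main sequence are issued.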
-
Publication No.: US12052508B2
Publication date: 2024-07-30
Application No.: US18323768
Filing date: 2023-05-25
Applicant: Arm Limited
Inventor: Daniel Fedai Larsen, Tord Kvestad Øygard, Frank Klaeboe Langtind, Andreas Due Engh-Halstvedt
CPC classification number: H04N23/73, G06T5/92, G06T2207/20172
Abstract: A method of processing data in a graphics processor when performing tile-based rendering in which a render output is sub-divided into a plurality of tiles for rendering. The rendering is performed as two separate processing passes: a first processing pass that sorts primitives into respective regions of the render output and a second processing pass that renders the tiles into which the render output is sub-divided for rendering. During the first processing pass, “tile elimination” data is generated indicative of which of the rendering tiles should be rendered during the second processing pass. The tile elimination data generated in the first processing pass can then be used to control the rendering of tiles during the second processing pass.
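The two-pass flow can be sketched as follows. This is only an illustration under stated assumptions: primitives are reduced to axis-aligned bounding boxes, and a tile is marked for rendering when at least one primitive touches it. The abstract does not say what criterion the tile elimination data actually encodes.

```python
def binning_pass(primitives, tiles_x, tiles_y, tile_size):
    """First pass: sort primitives into tiles and build the elimination data.

    `primitives` is a list of (x0, y0, x1, y1) bounding boxes in pixels
    (a simplification assumed for this sketch).
    """
    bins = {}
    for prim_id, (x0, y0, x1, y1) in enumerate(primitives):
        tx0 = max(0, x0 // tile_size)
        ty0 = max(0, y0 // tile_size)
        tx1 = min(tiles_x - 1, x1 // tile_size)
        ty1 = min(tiles_y - 1, y1 // tile_size)
        for ty in range(ty0, ty1 + 1):
            for tx in range(tx0, tx1 + 1):
                bins.setdefault((tx, ty), []).append(prim_id)
    # Assumed form of the "tile elimination" data: one flag per tile.
    render_mask = [[(tx, ty) in bins for tx in range(tiles_x)]
                   for ty in range(tiles_y)]
    return bins, render_mask


def rendering_pass(render_mask):
    """Second pass: visit only the tiles the first pass marked for rendering."""
    return [(tx, ty)
            for ty, row in enumerate(render_mask)
            for tx, needed in enumerate(row)
            if needed]


bins, mask = binning_pass([(0, 0, 10, 10), (40, 40, 50, 50)], 4, 4, 16)
print(rendering_pass(mask))
```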
-
Publication No.: US11790479B2
Publication date: 2023-10-17
Application No.: US17163289
Filing date: 2021-01-29
Applicant: Arm Limited
Inventor: Frank Klaeboe Langtind, Andreas Due Engh-Halstvedt
CPC classification number: G06T1/20, G06T1/60, G06T11/203
Abstract: When generating a graphics processing output, a sequence of one or more primitives to be processed when generating the output is assembled from a set of vertex indices provided for the output based on primitive configuration information provided for the output, each assembled primitive of the sequence of assembled primitives comprising an identifier for the primitive and a set of one or more vertex indices for the primitive. One or more attributes for vertices of the assembled primitives are then shaded and fetched based on the vertex indices of the assembled primitives. The assembled primitives including their shaded fetched vertex attribute(s) are then provided to later stages of the graphics processing pipeline for processing.
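Primitive assembly from an index list can be sketched as below. The topology names `triangle_list` and `triangle_strip` and the dictionary layout of an assembled primitive are assumptions chosen for illustration; the abstract only says that each primitive carries an identifier and a set of vertex indices.

```python
def assemble_primitives(indices, topology):
    """Assemble primitives from vertex indices for a given topology.

    Each primitive is a dict with an "id" and a tuple of "indices"
    (a hypothetical layout for this sketch).
    """
    prims = []
    if topology == "triangle_list":
        for i in range(0, len(indices) - 2, 3):
            prims.append({"id": len(prims), "indices": tuple(indices[i:i + 3])})
    elif topology == "triangle_strip":
        for i in range(len(indices) - 2):
            a, b, c = indices[i:i + 3]
            if i % 2:  # swap on odd triangles to keep a consistent winding
                a, b = b, a
            prims.append({"id": len(prims), "indices": (a, b, c)})
    else:
        raise ValueError(f"unsupported topology: {topology}")
    return prims


def vertices_to_shade(prims):
    """Collect the distinct vertex indices referenced by the assembled primitives."""
    return sorted({v for p in prims for v in p["indices"]})


strip = assemble_primitives([0, 1, 2, 3], "triangle_strip")
print(strip, vertices_to_shade(strip))
```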
-
Publication No.: US11189005B1
Publication date: 2021-11-30
Application No.: US17004797
Filing date: 2020-08-27
Applicant: Arm Limited
Abstract: A method of operating a graphics processor that is configured to execute a graphics processing pipeline is provided. The method comprises the graphics processor reading, from an index buffer in external memory, a block of data comprising plural sets of indices, each set of indices comprising a sequence of indices indexing a set of vertices that defines a primitive of a plurality of primitives to be processed by the graphics processing pipeline. The graphics processor compresses the block of data to form a compressed version of the block of data, and stores the compressed version of the block of data in an internal memory of the graphics processor.
-
Publication No.: US20210216455A1
Publication date: 2021-07-15
Application No.: US16742495
Filing date: 2020-01-14
Applicant: Arm Limited
Inventor: Olof Henrik Uhrenholt, Andreas Due Engh-Halstvedt
IPC: G06F12/0811, G06F3/06, G06F9/38, G06F9/30
Abstract: A data processing system includes a cache system configured to transfer data stored in the memory system to a processor and to transfer data from the processor to the memory system. The cache system comprises a cache and a data encoder associated with the cache that is configured to encode uncompressed data from the cache for storing in the memory system in a compressed format, and decode compressed data from the memory system for storing in the cache in an uncompressed format.
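A toy model of a cache with an attached encoder may help here. The sketch uses Python's `zlib` purely as a stand-in codec; the abstract does not name a compression format, and the class and method names are hypothetical.

```python
import zlib


class EncodingCache:
    """Toy cache: lines are held uncompressed, memory holds compressed data."""

    def __init__(self, memory):
        self.memory = memory  # dict: address -> compressed bytes
        self.lines = {}       # dict: address -> uncompressed bytes

    def read(self, addr):
        if addr not in self.lines:
            # Miss: decode compressed data from memory into the cache.
            self.lines[addr] = zlib.decompress(self.memory[addr])
        return self.lines[addr]

    def write(self, addr, data):
        self.lines[addr] = data  # processor writes uncompressed data

    def evict(self, addr):
        # Encode the uncompressed line before storing it in memory.
        self.memory[addr] = zlib.compress(self.lines.pop(addr))


memory = {}
cache = EncodingCache(memory)
cache.write(0x100, b"\x00" * 64)
cache.evict(0x100)
print(len(memory[0x100]), cache.read(0x100) == b"\x00" * 64)
```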
-
Publication No.: US10650577B2
Publication date: 2020-05-12
Application No.: US15246970
Filing date: 2016-08-25
Applicant: ARM Limited
Inventor: Andreas Due Engh-Halstvedt, Frank Langtind
Abstract: A tile-based graphics processing pipeline includes a back-facing determination and culling unit that is operable to cull back-facing triangles before the tiling stage. The back-facing determination and culling unit includes a triangle size estimator that estimates the size of a triangle being considered. If the size of the triangle is less than a selected size, then the area of the triangle is calculated using fixed point arithmetic and the result of that area calculation is used by a back-face culling unit to determine whether to cull the triangle or not. On the other hand, if the size estimator determines that the primitive is greater than the selected size, then the triangle bypasses the fixed point area calculation and back-face culling unit and is instead passed directly to the tiler.
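The size-gated culling decision could look roughly like the sketch below. The fixed-point precision, the size threshold, the bounding-box size estimate, and the convention that counter-clockwise triangles are front-facing are all assumptions; the abstract leaves these unspecified, including what happens when the size equals the threshold (treated as a bypass here).

```python
FRAC_BITS = 8      # hypothetical fixed-point precision
SIZE_LIMIT = 64.0  # hypothetical "selected size", in pixels


def to_fixed(v):
    """Convert a coordinate to fixed point with FRAC_BITS fraction bits."""
    return int(round(v * (1 << FRAC_BITS)))


def estimate_size(tri):
    """Cheap size estimate: the larger side of the bounding box (an assumption)."""
    xs = [p[0] for p in tri]
    ys = [p[1] for p in tri]
    return max(max(xs) - min(xs), max(ys) - min(ys))


def signed_area2_fixed(tri):
    """Twice the signed area, computed entirely in integer arithmetic."""
    (x0, y0), (x1, y1), (x2, y2) = [(to_fixed(x), to_fixed(y)) for x, y in tri]
    return (x1 - x0) * (y2 - y0) - (x2 - x0) * (y1 - y0)


def classify(tri):
    """Return "bypass", "cull", or "keep" for one triangle."""
    if estimate_size(tri) >= SIZE_LIMIT:
        return "bypass"  # too large: skip the area test, go straight to the tiler
    if signed_area2_fixed(tri) < 0:
        return "cull"    # assumed: negative area means back-facing
    return "keep"


print(classify([(0, 0), (4, 0), (0, 4)]))
print(classify([(0, 0), (0, 4), (4, 0)]))
print(classify([(0, 0), (0, 400), (400, 0)]))
```

Restricting the fixed-point calculation to small triangles bounds the width of the intermediate products, which is presumably why large triangles are routed around it.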
-
Publication No.: US10599584B2
Publication date: 2020-03-24
Application No.: US15806237
Filing date: 2017-11-07
Applicant: Arm Limited
Inventor: Andreas Due Engh-Halstvedt, Frank Langtind, Shareef Justin Jalloq
IPC: G06F12/10, G06F12/1045, G06F12/0804, G06F12/0891, G06F12/0875, G06F12/126, G06F12/0895
Abstract: When writing data to memory via a write buffer including a write cache containing a plurality of lines for storing data to be written to memory and an address-translation cache that stores a list of virtual address to physical address translations, a record of a set of lines of the write cache that are available to be evicted to the memory is maintained, and the evictable lines in the record of evictable lines are processed by requesting from the address-translation cache a respective physical address for each virtual address associated with an evictable line. The address-translation cache returns a hit or a miss status to the write buffer for each evictable line that is checked, and the write buffer writes out to memory at least one of the evictable lines for which a hit status was returned.
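The eviction loop can be illustrated with a small sketch. For simplicity it assumes the address-translation cache is a dictionary keyed by full virtual address (real translations work at page granularity) and that every line with a hit is written out, whereas the abstract only requires at least one. All names are hypothetical.

```python
def drain_evictable(evictable, translation_cache, memory):
    """Write out evictable lines whose translation hits; keep the rest.

    evictable:          list of (virtual_address, data) pairs
    translation_cache:  dict virtual_address -> physical_address (simplified)
    memory:             dict physical_address -> data
    """
    remaining = []
    for va, data in evictable:
        pa = translation_cache.get(va)  # None models a miss status
        if pa is None:
            remaining.append((va, data))  # stays in the write cache for now
        else:
            memory[pa] = data             # hit: write the line out to memory
    return remaining


memory = {}
left = drain_evictable([(0x1000, b"a"), (0x2000, b"b")], {0x1000: 0x8000}, memory)
print(memory, left)
```

Lines that miss are not stalled on; they simply remain in the record until a later pass finds their translation present.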
-
Publication No.: US10216479B2
Publication date: 2019-02-26
Application No.: US15370660
Filing date: 2016-12-06
Applicant: ARM Limited
Abstract: An apparatus and method are provided for performing arithmetic operations to accumulate floating-point numbers. The apparatus comprises execution circuitry to perform arithmetic operations, and decoder circuitry to decode a sequence of instructions. A convert and accumulate instruction is provided, and the decoder circuitry is responsive to decoding the convert and accumulate instruction to generate one or more control signals to control the execution circuitry to convert at least one floating-point operand identified by the convert and accumulate instruction into a corresponding N-bit fixed-point operand having M fraction bits, where M is less than N and M is dependent on a format of the floating-point operand. The execution circuitry accumulates each corresponding N bit fixed-point operand and a P bit fixed-point operand identified by the convert and accumulate instruction in order to generate a P bit fixed-point result value, where P is greater than N and also has M fraction bits.
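The arithmetic behind a convert-and-accumulate operation can be sketched in software. The widths below (M = 24 fraction bits, N = 48, P = 64) are hypothetical values chosen for this illustration; the abstract states only that M < N < P and that M depends on the floating-point format.

```python
M_FRACTION_BITS = 24  # hypothetical M
N_BITS = 48           # hypothetical N (converted operand width)
P_BITS = 64           # hypothetical P (accumulator width)


def to_fixed(value, n_bits=N_BITS, m=M_FRACTION_BITS):
    """Convert a float to a signed n_bits fixed-point integer with m fraction bits.

    Values with more than m fraction bits are truncated toward zero.
    """
    scaled = int(value * (1 << m))
    lo, hi = -(1 << (n_bits - 1)), (1 << (n_bits - 1)) - 1
    if not lo <= scaled <= hi:
        raise OverflowError("operand does not fit in the N-bit format")
    return scaled


def wrap_signed(x, bits):
    """Wrap an integer to a signed two's-complement value of the given width."""
    x &= (1 << bits) - 1
    return x - (1 << bits) if x >> (bits - 1) else x


def convert_and_accumulate(acc, operands):
    """Add each converted operand to a P-bit fixed-point accumulator."""
    for v in operands:
        acc = wrap_signed(acc + to_fixed(v), P_BITS)
    return acc


def to_float(acc):
    return acc / (1 << M_FRACTION_BITS)


total = convert_and_accumulate(0, [0.5, 0.25, -0.125])
print(total, to_float(total))
```

Because the accumulation is integer addition, the result does not depend on the order of the operands, unlike a floating-point sum.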
-
Publication No.: US10157132B1
Publication date: 2018-12-18
Application No.: US15661200
Filing date: 2017-07-27
Applicant: ARM Limited
Inventor: Edvard Fielding, Andreas Due Engh-Halstvedt, Jorn Nystad, Antonio Garcia Guirado, William Robert Stoye, Ian Rudolf Bratt
IPC: G06F12/00, G06F13/00, G06F13/28, G06F12/0811, G06F12/0875, G06F12/0862, G06F12/0846, G06F12/0868
Abstract: A method of operating a data processing system comprises maintaining a record of a set of processing passes to be performed by processing pass circuitry of the data processing system. The method comprises performing cycles of operation in which it is considered whether or not the data required for a subset of processing passes is stored in a local cache. The subset of processing passes that is considered in a subsequent scan of the record comprises at least one processing pass that was not considered in the previous scan of the record, regardless of whether or not the data considered in the previous scan is determined as being stored in the cache. The method provides an efficient way to identify processing passes that are ready to be performed.
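One simple policy that satisfies the scanning rule is a fixed-size window that always advances around the record. This is an assumed policy for illustration, not necessarily the patented one; the record layout and names are hypothetical, and the window is assumed to be smaller than the record so that each scan reaches at least one new entry.

```python
def scan(record, cache, start, window):
    """Consider `window` passes starting at `start`.

    record: list of dicts {"name": str, "needs": set of cache line ids}
    cache:  set of cache line ids currently present
    Returns (names of ready passes, start index for the next scan).
    """
    n = len(record)
    if n == 0:
        return [], 0
    count = min(window, n)
    considered = [record[(start + i) % n] for i in range(count)]
    # A pass is ready when everything it needs is already cached.
    ready = [p["name"] for p in considered if p["needs"] <= cache]
    # Always advance, whether or not the considered passes were ready.
    return ready, (start + count) % n


record = [
    {"name": "p0", "needs": {"a"}},
    {"name": "p1", "needs": {"b"}},
    {"name": "p2", "needs": {"c"}},
    {"name": "p3", "needs": {"d"}},
]
cache = {"a", "c"}
ready1, nxt = scan(record, cache, 0, 2)
ready2, nxt = scan(record, cache, nxt, 2)
print(ready1, ready2, nxt)
```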