CONTROLLING PRIORITY LEVELS OF PENDING THREADS AWAITING PROCESSING
    3.
    Invention application
    Status: granted

    Publication No.: US20130305255A1

    Publication date: 2013-11-14

    Application No.: US13942816

    Filing date: 2013-07-16

    Applicant: ARM LIMITED

    Abstract: A data processing apparatus comprises processing circuitry arranged to process processing threads using resources accessible to the processing circuitry. A pipeline is provided for handling at least two pending threads awaiting processing by the processing circuitry. The pipeline includes at least one resource-requesting pipeline stage for requesting access to resources for the pending threads. A priority controller controls priority levels of the pending threads. The priority levels define a priority with which pending threads are granted access to resources. When a pending thread reaches a final pipeline stage, if the requested resources are not yet available then the priority level of that thread is selectively raised and the thread is returned to a first pipeline stage of the pipeline. If the requested resources are available then the thread is forwarded from the pipeline.
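
The recirculation scheme described in this abstract can be sketched as a small simulation. This is only an illustration under stated assumptions, not the applicant's implementation: the `Thread` class, the `run_pipeline` function, the single pooled resource counted in units, the release of one unit per pass, and the policy of raising priority by one on every failed pass are all hypothetical.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Thread:
    name: str
    needs: int          # hypothetical: units of a shared resource the thread requires
    priority: int = 0   # higher value is served first when resources are contended
    passes: int = 0     # number of trips through the pipeline so far


def run_pipeline(threads, free_units, release_per_pass=1, max_passes=100):
    """Recirculate pending threads until each one is granted its resources.

    Each loop iteration models one trip of all pending threads through the pipeline.
    Returns thread names in the order they were forwarded.
    """
    pending = deque(threads)
    issued = []
    for _ in range(max_passes):
        if not pending:
            break
        # Resource-requesting stage: higher-priority threads are served first
        # (sorted() is stable, so equal priorities keep their arrival order).
        granted = set()
        for t in sorted(pending, key=lambda t: -t.priority):
            if t.needs <= free_units:
                free_units -= t.needs
                granted.add(t.name)
        # Final stage: forward granted threads, return the rest to the first stage.
        next_pending = deque()
        for t in pending:
            t.passes += 1
            if t.name in granted:
                issued.append(t.name)
            else:
                t.priority += 1   # assumed policy: raise priority on every failed pass
                next_pending.append(t)
        pending = next_pending
        free_units += release_per_pass   # assumed: resources free up over time
    return issued
```

With three threads that each need one unit and only one unit free per pass, a thread that starts with a higher priority is forwarded first, and the others gain priority each time they are sent round again.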

    DATA PROCESSING
    4.
    Published invention application
    Status: under examination (published)

    Publication No.: US20230305963A1

    Publication date: 2023-09-28

    Application No.: US18188147

    Filing date: 2023-03-22

    Applicant: Arm Limited

    CPC classification number: G06F12/0837 G06F12/122

    Abstract: A data processor, such as a graphics processor, is disclosed. The data processor includes a set of one or more counters, and a control circuit that maintains a cache-like pool of corresponding entries. In response to a request for a counter, the control circuit may allocate an entry of the cache-like pool to thereby allocate a counter of the set.
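
A cache-like pool of counter entries might behave roughly as follows. This is a minimal sketch, not the disclosed circuit: the `CounterPool` class, the use of an arbitrary tag to identify a request, the reset of a counter on allocation, and least-recently-used eviction when the pool is full are assumptions made for illustration.

```python
from collections import OrderedDict


class CounterPool:
    """Fixed set of counters handed out through cache-like tagged entries."""

    def __init__(self, num_counters):
        self.counters = [0] * num_counters
        self.entries = OrderedDict()            # tag -> counter index, oldest first
        self.free = list(range(num_counters))

    def request(self, tag):
        """Return the index of the counter allocated to `tag`."""
        if tag in self.entries:                 # hit: reuse the existing entry
            self.entries.move_to_end(tag)
            return self.entries[tag]
        if self.free:                           # miss while a counter is still spare
            index = self.free.pop(0)
        else:                                   # miss with a full pool: evict (assumed LRU)
            _, index = self.entries.popitem(last=False)
        self.counters[index] = 0                # assumed: a fresh allocation starts at zero
        self.entries[tag] = index
        return index

    def increment(self, tag, amount=1):
        """Add `amount` to the counter for `tag`, allocating it if needed."""
        index = self.request(tag)
        self.counters[index] += amount
        return self.counters[index]
```

Allocating an entry and allocating a counter are the same step here, which mirrors the abstract's wording that allocating an entry of the pool thereby allocates a counter of the set.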

    CLIPPING OF GRAPHICS PRIMITIVES
    5.
    Invention application
    Status: granted

    Publication No.: US20150161814A1

    Publication date: 2015-06-11

    Application No.: US14536070

    Filing date: 2014-11-07

    Applicant: ARM Limited

    CPC classification number: G06T15/30 G06T1/20 G06T1/60 G06T15/005 G06T2210/52

    Abstract: Techniques for performing clipping of graphics primitives 60 with respect to a clipping boundary 65 are described. The clipping step 10 may be performed separately for each tile of a graphics frame to be rendered, after a primitive list for the tile has been read from a primitive memory 38. Clipping may be performed only for larger primitives whose size exceeds a given threshold. Clipping of a primitive 60 to the clipping boundary 65 may be performed inexactly so that only a single clipped primitive is generated which may extend beyond the clipping boundary. A clipped primitive generated by clipping may be used for a depth function calculation of a primitive setup operation and not for an edge determination.
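
The idea of clipping only larger primitives, tile by tile, can be illustrated with a simple selection step. This sketch covers only the size test, not the inexact clipping itself; the function names, the use of a bounding-box extent as the measure of "size", and the threshold value are assumptions, since the abstract does not define them.

```python
def bounding_box(vertices):
    """Axis-aligned bounding box of a primitive given as (x, y) vertex pairs."""
    xs = [x for x, _ in vertices]
    ys = [y for _, y in vertices]
    return min(xs), min(ys), max(xs), max(ys)


def should_clip(vertices, size_threshold):
    """Clip only primitives whose bounding-box extent exceeds the threshold."""
    x0, y0, x1, y1 = bounding_box(vertices)
    return max(x1 - x0, y1 - y0) > size_threshold


def primitives_to_clip(tile_primitive_list, size_threshold):
    """Per-tile step: pick the primitives from a tile's list that need clipping."""
    return [p for p in tile_primitive_list if should_clip(p, size_threshold)]
```

Small primitives pass through untouched, so the cost of clipping is paid only where a primitive is large enough for it to matter.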

    DATA PROCESSING APPARATUS AND METHOD FOR PROCESSING A RECEIVED WORKLOAD IN ORDER TO GENERATE RESULT DATA
    6.
    Invention application
    Status: granted

    Publication No.: US20130332939A1

    Publication date: 2013-12-12

    Application No.: US13909149

    Filing date: 2013-06-04

    Applicant: ARM Limited

    Abstract: A data processing apparatus and method are provided for processing a received workload in order to generate result data. A thread group generator generates from the received workload a plurality of thread groups to be executed to process the received workload. Each thread group consists of a plurality of threads, and at least one thread group has an inter-thread dependency existing between the plurality of threads. Each thread may be either an active thread whose output is required to form the result data, or a dummy thread required to resolve the inter-thread dependency for one of the active threads but whose output is not required to form the result data. The thread group generator identifies for each thread group any dummy thread within that thread group. A thread execution unit then executes each thread within a thread group received from the thread group generator by executing a predetermined program comprising a plurality of program instructions. Execution flow modification circuitry is responsive to the received thread group having at least one dummy thread, to cause the thread execution unit to selectively omit at least part of the execution of at least one of the plurality of instructions when executing each dummy thread, in dependence on control information associated with the predetermined program. In one particular embodiment the received workload is a graphics rendering workload and the thread execution unit performs graphics rendering operations in order to generate as the result data pixel values and associated control values. Such an approach can yield significant improvements in performance, as well as reducing power consumption.
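
The handling of dummy threads can be pictured with a toy interpreter. This is a sketch under assumptions rather than the claimed circuitry: the `Instruction` class, the `skip_for_dummy` flag standing in for the "control information associated with the predetermined program", and the dictionary-based thread state are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Instruction:
    name: str
    op: Callable[[dict], None]
    skip_for_dummy: bool = False   # hypothetical per-instruction control information


def execute_group(program, threads):
    """Run `program` on every thread of a group; dummy threads omit flagged instructions.

    Each thread is a dict with a boolean "dummy" key plus its own state.
    Returns the number of instructions actually executed.
    """
    executed = 0
    for state in threads:
        for instr in program:
            if state["dummy"] and instr.skip_for_dummy:
                continue               # output of a dummy thread is not needed
            instr.op(state)
            executed += 1
    return executed
```

For example, a program could compute a value in every thread (because neighbouring active threads depend on it) but flag the final store as skippable, so a dummy thread does the computation and nothing more.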

    DATA PROCESSING SYSTEMS
    7.
    Invention application

    Publication No.: US20220164128A1

    Publication date: 2022-05-26

    Application No.: US17455601

    Filing date: 2021-11-18

    Applicant: Arm Limited

    Abstract: A data processing system includes an external memory system, a processor and an internal memory system. The internal memory system includes an internal memory that stores data for use by the processor when performing data processing operations. The internal memory system also includes a data encoder associated with the internal memory. The data encoder reads data from the external memory system to the data encoder and returns the data to the external memory system from the data encoder, without storing the data in the internal memory.

    APPARATUS, METHOD AND PROGRAM FOR CALCULATING THE RESULT OF A REPEATING ITERATIVE SUM
    8.
    Invention application
    Status: granted

    Publication No.: US20160124708A1

    Publication date: 2016-05-05

    Application No.: US14878562

    Filing date: 2015-10-08

    Applicant: ARM Limited

    CPC classification number: G06F7/506 G06F7/5272 G06F7/535 H03M7/24

    Abstract: An apparatus, method and program are provided for calculating a result value to a required precision of a repeating iterative sum, wherein the repeating iterative sum comprises multiple iterations of an addition using an input value. Addition is performed in a single iteration of addition as a sum operation using overlapping portions of the input value and a shifted version of the input value, wherein the shifted version of the input value has a partial overlap with the input value. At least one result portion is produced by incrementing an input derived from the input value using the output from the sum operation and the result value is constructed using the at least one result portion to give the result value to the required precision. The repeating iterative sum is thereby flattened into a flattened calculation which requires only a single iteration of addition using the input value, thus facilitating the calculation of the result value of the repeating iterative sum.
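
A familiar example of flattening a repeated sum of shifted copies into a single addition is division by 255. The abstract does not name this case, so it is offered only as a related illustration and not as the claimed method: since x/255 = x/256 + x/256^2 + ..., the series can be truncated to one addition of the input and a shifted copy of it, plus a carry-in, for a limited input range. The function name and the 16-bit range are assumptions.

```python
def div255_flat(x):
    """floor(x / 255) for 0 <= x <= 65534.

    Uses one addition of x and its shifted copy (x >> 8) with a carry-in of 1,
    instead of repeatedly accumulating x >> 8, x >> 16, ...
    """
    if not 0 <= x <= 65534:
        raise ValueError("identity only holds for 0 <= x <= 65534")
    return (x + (x >> 8) + 1) >> 8
```

The result matches integer division exactly over the stated range, which is the sense in which a bounded required precision lets the iteration collapse to one step.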

    THREAD ISSUE CONTROL
    9.
    Invention application
    Status: granted

    Publication No.: US20150227376A1

    Publication date: 2015-08-13

    Application No.: US14596948

    Filing date: 2015-01-14

    Applicant: ARM Limited

    Abstract: A data processing system includes a processing pipeline for the parallel execution of a plurality of threads. An issue controller issues threads to the processing pipeline. A stall manager controls the stalling and unstalling of threads when a cache miss occurs within a cache memory. The issue controller issues the threads to the processing pipeline in accordance with both a main sequence and a pilot sequence. The pilot sequence is followed such that threads within the pilot sequence are issued at least a given time ahead of their neighbours within a main sequence. The given time corresponds approximately to the latency associated with a cache miss. The threads may be arranged in groups corresponding to blocks of pixels for processing within a graphics processing unit.
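
The interplay of the main and pilot sequences can be sketched as an issue-order generator. This is a simplified model, not the disclosed issue controller: treating the first thread of each group as that group's pilot, and expressing the lead as a number of groups (a stand-in for the cache-miss latency), are assumptions.

```python
def issue_order(num_groups, group_size, lead):
    """Interleave a pilot sequence with the main sequence.

    Returns (group, thread) pairs in issue order. Thread 0 of each group is
    treated as the pilot (an assumption) and is issued `lead` groups ahead of
    the remaining threads of its group.
    """
    order = []
    for step in range(-lead, num_groups):
        pilot_group = step + lead
        if pilot_group < num_groups:
            order.append((pilot_group, 0))                          # pilot sequence
        if step >= 0:
            order.extend((step, t) for t in range(1, group_size))   # main sequence
    return order
```

If a pilot thread misses in the cache, the miss can be serviced while earlier groups are still being issued, so its neighbours are more likely to hit when they arrive.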

    GRAPHICS PROCESSORS
    10.
    Published invention application
    Status: under examination (published)

    Publication No.: US20240348935A1

    Publication date: 2024-10-17

    Application No.: US18754006

    Filing date: 2024-06-25

    Applicant: Arm Limited

    CPC classification number: H04N23/73 G06T5/92 G06T2207/20172

    Abstract: A method of processing data in a graphics processor when performing tile-based rendering in which a render output is sub-divided into a plurality of tiles for rendering. The rendering is performed as two separate processing passes: a first processing pass that sorts primitives into respective regions of the render output and a second processing pass that renders the tiles into which the render output is sub-divided for rendering. During the first processing pass, “tile elimination” data is generated indicative of which of the rendering tiles should be rendered during the second processing pass. The tile elimination data generated in the first processing pass can then be used to control the rendering of tiles during the second processing pass.
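
The two-pass flow with tile elimination data can be outlined as below. This is a sketch only: the abstract does not say what makes a tile eligible for elimination, so the rule used here (a tile with no primitives need not be rendered) is an assumption, as are the function names and the representation of primitives as integer bounding boxes.

```python
def binning_pass(primitives, tiles_x, tiles_y, tile_size):
    """First pass: sort primitives into tiles and produce tile elimination data.

    `primitives` is a list of integer bounding boxes (x0, y0, x1, y1).
    Returns (bins, render_tile), where render_tile maps each tile to True/False.
    """
    bins = {(tx, ty): [] for ty in range(tiles_y) for tx in range(tiles_x)}
    for prim_id, (x0, y0, x1, y1) in enumerate(primitives):
        ty_lo, ty_hi = max(0, y0 // tile_size), min(tiles_y - 1, y1 // tile_size)
        tx_lo, tx_hi = max(0, x0 // tile_size), min(tiles_x - 1, x1 // tile_size)
        for ty in range(ty_lo, ty_hi + 1):
            for tx in range(tx_lo, tx_hi + 1):
                bins[(tx, ty)].append(prim_id)
    # Assumed elimination rule: a tile with no primitives is not rendered.
    render_tile = {tile: bool(prims) for tile, prims in bins.items()}
    return bins, render_tile


def rendering_pass(bins, render_tile):
    """Second pass: visit only the tiles that the elimination data marks for rendering."""
    return [tile for tile in sorted(bins) if render_tile[tile]]
```

The point of the split is that the second pass consults data produced by the first, so work on eliminated tiles is never started.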
