Patent search ap:("ARM Limited") AND inv:"Andreas Due ENGH-HALSTVEDT" Page 1

1.

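Invention application
ATOMIC ADD WITH CARRY INSTRUCTION (under examination, published)

Publication (announcement) number: US20170315805A1

Publication (announcement) date: 2017-11-02

Application number: US15528924

Application date: 2015-11-03

Applicant: ARM LIMITED

Inventor: Andreas Due ENGH-HALSTVEDT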

IPC: G06F9/30 , G06F12/0875

CPC classification number: G06F9/3001 , G06F9/30014 , G06F9/3004 , G06F9/3009 , G06F9/30094 , G06F9/3016 , G06F9/3851 , G06F9/3887 , G06F12/0875 , G06F2212/452

Abstract: Processing circuitry performs processing operations specified by program instructions. An instruction decoder decodes an atomic-add-with-carry instruction AAD-DC to control the processing circuitry to perform an atomic operation of an add of an addend operand value and a data value stored in a memory to generate a result value stored in the memory and a carry value indicative of whether or not the add generated a carry out.
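Illustrative sketch (not taken from the patent): the behaviour described in the abstract can be modelled in software as an indivisible read-modify-write that returns both the stored sum and a carry-out flag. The 32-bit word width, the `Memory` class, the lock standing in for hardware atomicity, and the `add_wide` helper are all assumptions made for this example.

```python
import threading

WORD_BITS = 32  # assumed operand width; the abstract does not fix one
WORD_MASK = (1 << WORD_BITS) - 1


class Memory:
    """Toy word-addressed memory; a lock stands in for hardware atomicity."""

    def __init__(self):
        self._words = {}
        self._lock = threading.Lock()

    def load(self, addr):
        return self._words.get(addr, 0)

    def store(self, addr, value):
        self._words[addr] = value & WORD_MASK

    def atomic_add_with_carry(self, addr, addend):
        """Add `addend` to the word at `addr`; return (result, carry_out)."""
        with self._lock:  # the read-modify-write below is indivisible
            total = self._words.get(addr, 0) + (addend & WORD_MASK)
            result = total & WORD_MASK
            carry = total >> WORD_BITS  # 1 if the add overflowed the word
            self._words[addr] = result
        return result, carry


def add_wide(mem, base_addr, limbs):
    """Hypothetical use: add a multi-word value (least-significant word
    first) into memory, feeding each carry into the next word."""
    carry = 0
    for i, limb in enumerate(limbs):
        _, c1 = mem.atomic_add_with_carry(base_addr + i, limb)
        _, c2 = mem.atomic_add_with_carry(base_addr + i, carry)
        carry = c1 | c2  # at most one of the two adds can overflow
    return carry
```

Each single-word add is atomic in this model, but `add_wide` as a whole is not; it only shows why returning the carry to the caller is useful.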

2.

Invention application
A TILE BASED GRAPHICS PROCESSOR AND A METHOD OF PERFORMING GRAPHICS PROCESSING IN A TILE BASED GRAPHICS PROCESSOR (granted)

Publication (announcement) number: US20160110837A1

Publication (announcement) date: 2016-04-21

Application number: US14874829

Application date: 2015-10-05

Applicant: ARM LIMITED

Inventor: Isidoros SIDERIS , Michel Patrick Gabriel Emil IWANIEC , Andrew BURDASS , Nebojsa MAKLJENOVIC , Andreas Due ENGH-HALSTVEDT

IPC: G06T1/00 , G06T1/60

CPC classification number: G06T11/40 , G06T1/20 , G06T1/60 , G06T15/00 , G06T15/005 , G06T15/40 , G06T15/405 , G06T2207/20021

Abstract: A graphics processing apparatus and method of performing graphics processing are provided. The graphics processing apparatus comprises a sequence of processing stages capable of performing graphics processing to generate a frame of display data. The graphics processing is performed on a tile-by-tile basis. The graphics processing apparatus is capable of determining if a current tile subject to the graphics processing is empty. At least one processing stage of the sequence of processing stages is omitted for graphics processing of the current tile in dependence on whether the current tile is empty.
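Illustrative sketch (not taken from the patent): one way to picture stage omission for empty tiles is a loop over tiles that skips selected stages when a tile has no primitives. The stage names, the `skippable` set, and the "no primitives" test for emptiness are assumptions for this example.

```python
def render_frame(tiles, stages, skippable):
    """Run every tile through `stages` in order, omitting the stages named
    in `skippable` when the tile is empty. Returns the (tile, stage) pairs
    that actually ran."""
    executed = []
    for tile_id, primitives in tiles.items():
        empty = len(primitives) == 0  # assumed emptiness test
        for stage in stages:
            if empty and stage in skippable:
                continue  # stage omitted for an empty tile
            executed.append((tile_id, stage))
    return executed


tiles = {0: ["triangle"], 1: []}
stages = ["rasterise", "shade", "write_out"]
executed = render_frame(tiles, stages, skippable={"rasterise", "shade"})
```

Here tile 1 is empty, so only its `write_out` stage runs, while tile 0 passes through all three stages.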

3.

Invention application
CONTROLLING PRIORITY LEVELS OF PENDING THREADS AWAITING PROCESSING (granted)

Publication (announcement) number: US20130305255A1

Publication (announcement) date: 2013-11-14

Application number: US13942816

Application date: 2013-07-16

Applicant: ARM LIMITED

Inventor: Nebojsa MAKLJENOVIC , Edvard FIELDING , Andreas Due ENGH-HALSTVEDT

IPC: G06F9/54

CPC classification number: G06F9/54 , G06F9/3851 , G06F9/3859 , G06F9/5011 , G06F2209/5021 , G06F2209/507

Abstract: A data processing apparatus comprises processing circuitry arranged to process processing threads using resources accessible to the processing circuitry. A pipeline is provided for handling at least two pending threads awaiting processing by the processing circuitry. The pipeline includes at least one resource-requesting pipeline stage for requesting access to resources for the pending threads. A priority controller controls priority levels of the pending threads. The priority levels define a priority with which pending threads are granted access to resources. When a pending thread reaches a final pipeline stage, if the requested resources are not yet available then the priority level of that thread is raised selectively and the thread is returned to a first pipeline stage of the pipeline. If the requested resources are available then the thread is forwarded from the pipeline.
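Illustrative sketch (not taken from the patent): the recirculate-and-raise-priority behaviour can be simulated with a list of pending threads that is served in priority order on each pass. The `PendingThread` fields, the per-pass resource budget, and raising the priority by one on every failed pass are assumptions; the abstract only says the priority is raised selectively.

```python
from dataclasses import dataclass


@dataclass
class PendingThread:
    name: str
    needs: int           # resource units requested (hypothetical model)
    priority: int = 0
    granted: bool = False


def simulate(arrivals, units_per_pass):
    """arrivals[i] lists the threads entering on pass i; units_per_pass[i]
    is the number of resource units free on that pass. Returns thread
    names in the order they are forwarded from the pipeline."""
    pending, forwarded = [], []
    for new_threads, free in zip(arrivals, units_per_pass):
        pending = list(new_threads) + pending
        # Resource-requesting stage: higher priority is served first.
        for t in sorted(pending, key=lambda t: -t.priority):
            if t.needs <= free:
                free -= t.needs
                t.granted = True
        # Final stage: forward the thread, or raise its priority and
        # return it to the first stage.
        recirculated = []
        for t in pending:
            if t.granted:
                forwarded.append(t.name)
            else:
                t.priority += 1
                recirculated.append(t)
        pending = recirculated
    return forwarded


a = PendingThread("A", needs=2)
b = PendingThread("B", needs=2)
order = simulate([[a], [b], []], units_per_pass=[1, 2, 2])
```

Thread A fails on the first pass and is recirculated with a raised priority, so on the second pass it is served ahead of the newly arrived thread B even though B sits earlier in the pending list.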

4.

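Invention publication
DATA PROCESSING (under examination, published)

Publication (announcement) number: US20230305963A1

Publication (announcement) date: 2023-09-28

Application number: US18188147

Application date: 2023-03-22

Applicant: Arm Limited

Inventor: Andreas Due ENGH-HALSTVEDT , Philip Michael WATTS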

IPC: G06F12/0837 , G06F12/122

CPC classification number: G06F12/0837 , G06F12/122

Abstract: A data processor, such as a graphics processor, is disclosed. The data processor includes a set of one or more counters, and a control circuit that maintains a cache-like pool of corresponding entries. In response to a request for a counter, the control circuit may allocate an entry of the cache-like pool to thereby allocate a counter of the set.
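Illustrative sketch (not taken from the patent): a cache-like pool can be modelled as a tag-to-counter mapping in which a request either hits an existing entry or allocates one. The `CounterPool` class, the use of tags as request keys, resetting a counter on allocation, and least-recently-used eviction are assumptions; the abstract does not state a replacement policy.

```python
from collections import OrderedDict


class CounterPool:
    """Fixed set of counters handed out through cache-like entries."""

    def __init__(self, num_counters):
        self.counters = [0] * num_counters
        self._free = list(range(num_counters))
        self._entries = OrderedDict()  # tag -> counter index, oldest first

    def request(self, tag):
        """Return the index of the counter allocated to `tag`."""
        if tag in self._entries:  # hit: reuse the existing allocation
            self._entries.move_to_end(tag)
            return self._entries[tag]
        if self._free:  # miss with a free counter available
            index = self._free.pop(0)
        else:  # miss with a full pool: evict (assumed LRU policy)
            _, index = self._entries.popitem(last=False)
        self.counters[index] = 0  # assumed: counter starts from zero
        self._entries[tag] = index
        return index


pool = CounterPool(2)
first = pool.request("a")
second = pool.request("b")
again = pool.request("a")    # hit, same counter as before
third = pool.request("c")    # pool full, evicts the entry for "b"
```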

5.

Invention application
CLIPPING OF GRAPHICS PRIMITIVES (granted)

Publication (announcement) number: US20150161814A1

Publication (announcement) date: 2015-06-11

Application number: US14536070

Application date: 2014-11-07

Applicant: ARM Limited

Inventor: Andreas Due ENGH-HALSTVEDT , Frode Heggelund , Jørn Nystad

IPC: G06T15/30 , G06T15/83 , G06T17/10

CPC classification number: G06T15/30 , G06T1/20 , G06T1/60 , G06T15/005 , G06T2210/52

Abstract: Techniques for performing clipping of graphics primitives (60) with respect to a clipping boundary (65) are described. The clipping step (10) may be performed separately for each tile of a graphics frame to be rendered, after a primitive list for the tile has been read from a primitive memory (38). Clipping may be performed only for larger primitives whose size exceeds a given threshold. Clipping of a primitive (60) to the clipping boundary (65) may be performed inexactly so that only a single clipped primitive is generated which may extend beyond the clipping boundary. A clipped primitive generated by clipping may be used for a depth function calculation of a primitive setup operation and not for an edge determination.

6.

Invention application
DATA PROCESSING APPARATUS AND METHOD FOR PROCESSING A RECEIVED WORKLOAD IN ORDER TO GENERATE RESULT DATA (granted)

Publication (announcement) number: US20130332939A1

Publication (announcement) date: 2013-12-12

Application number: US13909149

Application date: 2013-06-04

Applicant: ARM Limited

Inventor: Andreas Due ENGH-HALSTVEDT , Jorn NYSTAD

IPC: G06F9/46

CPC classification number: G06F9/46 , G06F9/4881 , G06F2209/483 , G06F2209/484 , G06T1/20 , Y02D10/24

Abstract: A data processing apparatus and method are provided for processing a received workload in order to generate result data. A thread group generator generates from the received workload a plurality of thread groups to be executed to process the received workload. Each thread group consists of a plurality of threads, and at least one thread group has an inter-thread dependency existing between the plurality of threads. Each thread may be either an active thread whose output is required to form the result data, or a dummy thread required to resolve the inter-thread dependency for one of the active threads but whose output is not required to form the result data. The thread group generator identifies for each thread group any dummy thread within that thread group. A thread execution unit then executes each thread within a thread group received from the thread group generator by executing a predetermined program comprising a plurality of program instructions. Execution flow modification circuitry is responsive to the received thread group having at least one dummy thread, to cause the thread execution unit to selectively omit at least part of the execution of at least one of the plurality of instructions when executing each dummy thread, in dependence on control information associated with the predetermined program. In one particular embodiment the received workload is a graphics rendering workload and the thread execution unit performs graphics rendering operations in order to generate as the result data pixel values and associated control values. Such an approach can yield significant improvements in performance, as well as reducing power consumption.

7.

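Invention application
DATA PROCESSING SYSTEMS (granted)

Publication (announcement) number: US20220164128A1

Publication (announcement) date: 2022-05-26

Application number: US17455601

Application date: 2021-11-18

Applicant: Arm Limited

Inventor: Olof Henrik UHRENHOLT , Andreas Due ENGH-HALSTVEDT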

IPC: G06F3/06

Abstract: A data processing system includes an external memory system, a processor and an internal memory system. The internal memory system includes an internal memory that stores data for use by the processor when performing data processing operations. The internal memory system also includes a data encoder associated with the internal memory. The data encoder reads data from the external memory system to the data encoder and returns the data to the external memory system from the data encoder, without storing the data in the internal memory.

8.

Invention application
APPARATUS, METHOD AND PROGRAM FOR CALCULATING THE RESULT OF A REPEATING ITERATIVE SUM (granted)

Publication (announcement) number: US20160124708A1

Publication (announcement) date: 2016-05-05

Application number: US14878562

Application date: 2015-10-08

Applicant: ARM Limited

Inventor: Andreas Due ENGH-HALSTVEDT , Edvard FIELDING

IPC: G06F5/01

CPC classification number: G06F7/506 , G06F7/5272 , G06F7/535 , H03M7/24

Abstract: An apparatus, method and program are provided for calculating a result value to a required precision of a repeating iterative sum, wherein the repeating iterative sum comprises multiple iterations of an addition using an input value. Addition is performed in a single iteration of addition as a sum operation using overlapping portions of the input value and a shifted version of the input value, wherein the shifted version of the input value has a partial overlap with the input value. At least one result portion is produced by incrementing an input derived from the input value using the output from the sum operation and the result value is constructed using the at least one result portion to give the result value to the required precision. The repeating iterative sum is thereby flattened into a flattened calculation which requires only a single iteration of addition using the input value, thus facilitating the calculation of the result value of the repeating iterative sum.

9.

Invention application
THREAD ISSUE CONTROL (granted)

Publication (announcement) number: US20150227376A1

Publication (announcement) date: 2015-08-13

Application number: US14596948

Application date: 2015-01-14

Applicant: ARM Limited

Inventor: Andreas Due ENGH-HALSTVEDT , Ian Victor DEVEREUX , David BERMINGHAM , Jakob Alex FRIES , Oskar Lars FLORDAL

IPC: G06F9/38 , G06F12/08

CPC classification number: G06F9/3869 , G06F9/38 , G06F9/3816 , G06F9/3855 , G06F9/3867 , G06F12/0855 , G06F2212/455

Abstract: A data processing system includes a processing pipeline for the parallel execution of a plurality of threads. An issue controller issues threads to the processing pipeline. A stall manager controls the stalling and unstalling of threads when a cache miss occurs within a cache memory. The issue controller issues the threads to the processing pipeline in accordance with both a main sequence and a pilot sequence. The pilot sequence is followed such that threads within the pilot sequence are issued at least a given time ahead of their neighbours within a main sequence. The given time corresponds approximately to the latency associated with a cache miss. The threads may be arranged in groups corresponding to blocks of pixels for processing within a graphics processing unit.

10.

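Invention publication
GRAPHICS PROCESSORS (under examination, published)

Publication (announcement) number: US20240348935A1

Publication (announcement) date: 2024-10-17

Application number: US18754006

Application date: 2024-06-25

Applicant: Arm Limited

Inventor: Daniel Fedai LARSEN , Tord Kvestad ØYGARD , Frank Klaeboe LANGTIND , Andreas Due ENGH-HALSTVEDT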

IPC: H04N23/73 , G06T5/92

CPC classification number: H04N23/73 , G06T5/92 , G06T2207/20172

Abstract: A method of processing data in a graphics processor when performing tile-based rendering in which a render output is sub-divided into a plurality of tiles for rendering. The rendering is performed as two separate processing passes: a first processing pass that sorts primitives into respective regions of the render output and a second processing pass that renders the tiles into which the render output is sub-divided for rendering. During the first processing pass, “tile elimination” data is generated indicative of which of the rendering tiles should be rendered during the second processing pass. The tile elimination data generated in the first processing pass can then be used to control the rendering of tiles during the second processing pass.
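Illustrative sketch (not taken from the patent): the two-pass structure can be pictured as a binning pass that also produces a per-tile "render or skip" flag, followed by a rendering pass that consults those flags. Treating primitives as inclusive pixel bounding boxes and flagging a tile for elimination when no primitive touches it are assumptions; the abstract does not say what the tile elimination data is derived from.

```python
def binning_pass(primitives, tiles_x, tiles_y, tile_size):
    """First pass: sort primitives into tiles and build the (assumed)
    tile elimination data: True means the tile should be rendered."""
    bins = {(tx, ty): [] for ty in range(tiles_y) for tx in range(tiles_x)}
    for prim in primitives:
        x0, y0, x1, y1 = prim  # inclusive pixel bounding box
        first_tx = max(x0 // tile_size, 0)
        last_tx = min(x1 // tile_size, tiles_x - 1)
        first_ty = max(y0 // tile_size, 0)
        last_ty = min(y1 // tile_size, tiles_y - 1)
        for ty in range(first_ty, last_ty + 1):
            for tx in range(first_tx, last_tx + 1):
                bins[(tx, ty)].append(prim)
    render_tile = {tile: bool(prims) for tile, prims in bins.items()}
    return bins, render_tile


def rendering_pass(bins, render_tile):
    """Second pass: visit only the tiles not eliminated by the first pass.
    Returns the tiles that were rendered (actual shading is omitted)."""
    rendered = []
    for tile in bins:
        if not render_tile[tile]:
            continue  # tile eliminated, no rendering work issued
        rendered.append(tile)
    return rendered


bins, render_tile = binning_pass(
    [(0, 0, 10, 10), (20, 20, 30, 30)], tiles_x=2, tiles_y=2, tile_size=16
)
rendered = rendering_pass(bins, render_tile)
```

With a 2x2 grid of 16-pixel tiles, the two boxes land in tiles (0, 0) and (1, 1), so the other two tiles are skipped in the second pass.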
