Task execution on a graphics processor using indirect argument buffers

    Publication No.: US11094036B2

    Publication date: 2021-08-17

    Application No.: US16850101

    Filing date: 2020-04-16

    Applicant: Apple Inc.

    Abstract: The disclosure pertains to techniques for operation of graphics systems and task execution on a graphics processor. One such technique comprises a computer-implemented method for task execution on a graphics processor, the method comprising creating a data structure for grouping data resources, populating the data structure with two or more data resources for encoding into a graphics processing language by an encoding object, passing the data structure to a first programming interface command, the first programming interface command configured to access the data structure's data resources, triggering execution of a first function on a graphics processor in response to passing the data structure to the first programming interface command, passing the data structure to a second programming interface command, the second programming interface command configured to access the data structure's data resources, and triggering execution of a second function on the graphics processor in response to passing the data structure to the second programming interface command.
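
    Illustrative sketch (not part of the patent record): the flow in the abstract above can be modeled on the CPU in a few lines of Python. This is a toy model only, assuming a dictionary-backed grouping structure; `ArgumentBuffer`, `dispatch`, and both example functions are hypothetical names, not the patented implementation or any real graphics API.

```python
from dataclasses import dataclass, field


@dataclass
class ArgumentBuffer:
    """Hypothetical stand-in for the 'data structure for grouping data resources'."""
    resources: dict = field(default_factory=dict)

    def add(self, name, resource):
        # Populate the structure with one named data resource.
        self.resources[name] = resource


def dispatch(function, argument_buffer):
    """Hypothetical 'programming interface command': it accesses the grouped
    resources and triggers execution of the given function on them."""
    return function(**argument_buffer.resources)


def scale_vertices(vertices, scale):
    # First function: scales every vertex value.
    return [v * scale for v in vertices]


def sum_vertices(vertices, scale):
    # Second function: reduces the same resources to one value.
    return sum(vertices) * scale


# The structure is created and populated once with two data resources...
args = ArgumentBuffer()
args.add("vertices", [1.0, 2.0, 3.0])
args.add("scale", 2.0)

# ...and then passed to two separate commands, each triggering a different function.
first_result = dispatch(scale_vertices, args)
second_result = dispatch(sum_vertices, args)
```

    The point of the sketch is that the resources are bound once and reused across both calls rather than being re-specified per command.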

    Graphics Pipeline State Object And Model
    Type: Invention application; Status: Under examination (published)

    Publication No.: US20150348224A1

    Publication date: 2015-12-03

    Application No.: US14501933

    Filing date: 2014-09-30

    Applicant: Apple Inc.

    CPC classification number: G06T1/20 G06F3/14 G06F8/47 G06T15/80 G06T2200/28

    Abstract: An innovative GPU framework and related APIs present more accurate representations of the target hardware so that the distinctions between the fixed-function and programmable features of the GPU are perceived by a developer. This permits a program and/or a graphics object generated or manipulated by the program to be understood as not just code, but machine states that are associated with the code. When such an object is defined, the definitional components requiring programmable GPU features can be compiled only once and reused repeatedly as needed. Similarly, when a state change is made, the state changes correspond to the state changes made on the hardware. Additionally, the creation of these immutable objects prevents a developer from inadvertently changing portions of the program or object that cause it to behave differently than intended.
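
    Illustrative sketch (not part of the patent record): the idea of compiling the programmable parts once and freezing them together with their machine state can be modeled in Python with a frozen dataclass. This is a toy under stated assumptions; `compile_shader`, `PipelineState`, and `make_pipeline_state` are hypothetical names, and the "compilation" is only a string operation.

```python
from dataclasses import dataclass, FrozenInstanceError

compile_count = 0  # counts how often the (pretend) compiler runs


def compile_shader(source):
    """Hypothetical stand-in for compiling programmable GPU code."""
    global compile_count
    compile_count += 1
    return f"compiled<{source}>"


@dataclass(frozen=True)
class PipelineState:
    """Immutable bundle of compiled code plus the fixed-function state tied to it."""
    shader_binary: str
    blend_enabled: bool


def make_pipeline_state(shader_source, blend_enabled):
    # Compilation happens here, exactly once per state object.
    return PipelineState(compile_shader(shader_source), blend_enabled)


state = make_pipeline_state("fragment_main", blend_enabled=True)

# The same object is reused for several draws without recompiling.
draws = [f"draw {i} with {state.shader_binary}" for i in range(3)]

# Any attempt to change the object after creation is rejected.
try:
    state.blend_enabled = False
    mutated = True
except FrozenInstanceError:
    mutated = False
```

    Because the object cannot change after creation, a later draw cannot silently run with a different state than the one the code was compiled against.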

    Combining compute tasks for a graphics processing unit
    Type: Granted invention patent; Status: In force

    Publication No.: US09442706B2

    Publication date: 2016-09-13

    Application No.: US14448927

    Filing date: 2014-07-31

    Applicant: Apple Inc.

    CPC classification number: G06F8/4441 G06F9/445 G06F9/44505

    Abstract: Methods, systems and devices are disclosed to examine developer-supplied graphics code and attributes at run-time. The graphics code is designed for execution on a graphics processing unit (GPU) utilizing a coding language, such as OpenCL or OpenGL, that provides for run-time analysis by a driver, code generator, and compiler. Developer-supplied code and attributes can be analyzed and altered based on the execution capabilities and performance criteria of the GPU on which the code is about to be executed. In general, reducing the number of developer-defined work items or work groups can reduce the initialization cost of the GPU with respect to the work to be performed and result in an overall optimization of the machine code. Manipulation code can be added to adjust the supplied code in a manner similar to unrolling a loop to improve execution performance.
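
    Illustrative sketch (not part of the patent record): the loop-unrolling analogy in the abstract above can be shown with a toy Python model in which a generated wrapper runs the developer's kernel body several times per launched work item. All names (`kernel`, `combined_kernel`, `launch`, `factor`) are hypothetical, and the sequential loop only stands in for a GPU launch.

```python
def kernel(data, out, work_item_id):
    """Developer-supplied kernel body: one element per work item."""
    out[work_item_id] = data[work_item_id] * data[work_item_id]


def combined_kernel(data, out, work_item_id, factor):
    """Hypothetical 'manipulation code': one combined work item runs the
    original body for `factor` consecutive original work-item ids."""
    base = work_item_id * factor
    for offset in range(factor):
        original_id = base + offset
        if original_id < len(data):  # guard the tail when sizes do not divide evenly
            kernel(data, out, original_id)


def launch(data, factor):
    """Launch fewer work items than elements; returns the output and launch count."""
    out = [0] * len(data)
    num_items = (len(data) + factor - 1) // factor  # ceiling division
    for work_item_id in range(num_items):
        combined_kernel(data, out, work_item_id, factor)
    return out, num_items


data = list(range(10))
result, launched = launch(data, factor=4)
```

    With ten elements and a factor of four, three work items are launched instead of ten while the output stays the same as the one-item-per-element version.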

    System And Method For Unified Application Programming Interface And Model
    Type: Invention application; Status: Under examination (published)

    Publication No.: US20150348225A1

    Publication date: 2015-12-03

    Application No.: US14502073

    Filing date: 2014-09-30

    Applicant: Apple Inc.

    CPC classification number: G06T1/20 G06F9/30145 G06F9/545 G06T2200/28

    Abstract: Systems, computer-readable media, and methods for a unified programming interface and language are disclosed. In one embodiment, the unified programming interface and language helps program developers write multi-threaded programs that can perform both graphics and data-parallel compute processing on GPUs. The same GPU programming language model can be used to describe both graphics shaders and compute kernels, and the same data structures and resources may be used for both graphics and compute operations. Developers can use multithreading efficiently to create and submit command buffers in parallel.
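
    Illustrative sketch (not part of the patent record): building command buffers in parallel and then submitting them in a fixed order can be modeled with Python's standard thread pool. This is a toy CPU-side model; `CommandBuffer`, `build_command_buffer`, and the command tuples are hypothetical and do not correspond to any real graphics API.

```python
from concurrent.futures import ThreadPoolExecutor


class CommandBuffer:
    """Hypothetical container that records both graphics and compute commands."""

    def __init__(self, label):
        self.label = label
        self.commands = []

    def encode(self, command):
        self.commands.append(command)


def build_command_buffer(index):
    # Each worker thread fills its own buffer, so no locking is needed.
    buffer = CommandBuffer(f"buffer-{index}")
    buffer.encode(("render", f"draw scene part {index}"))
    buffer.encode(("compute", f"run kernel {index}"))
    return buffer


# Encode four command buffers on worker threads; map() returns them in input order.
with ThreadPoolExecutor(max_workers=4) as pool:
    buffers = list(pool.map(build_command_buffer, range(4)))

# "Submission" here is just recording the labels in a single ordered queue.
queue = [buffer.label for buffer in buffers]
```

    Giving every thread its own buffer keeps encoding free of contention, and ordering is decided only at submission time.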

    System and method for unified application programming interface and model

    Publication No.: US10346941B2

    Publication date: 2019-07-09

    Application No.: US14502073

    Filing date: 2014-09-30

    Applicant: Apple Inc.

    Abstract: Systems, computer-readable media, and methods for a unified programming interface and language are disclosed. In one embodiment, the unified programming interface and language helps program developers write multi-threaded programs that can perform both graphics and data-parallel compute processing on GPUs. The same GPU programming language model can be used to describe both graphics shaders and compute kernels, and the same data structures and resources may be used for both graphics and compute operations. Developers can use multithreading efficiently to create and submit command buffers in parallel.

    Combining Compute Tasks For A Graphics Processing Unit
    Type: Invention application; Status: In force

    Publication No.: US20150347105A1

    Publication date: 2015-12-03

    Application No.: US14448927

    Filing date: 2014-07-31

    Applicant: Apple Inc.

    CPC classification number: G06F8/4441 G06F9/445 G06F9/44505

    Abstract: Methods, systems and devices are disclosed to examine developer-supplied graphics code and attributes at run-time. The graphics code is designed for execution on a graphics processing unit (GPU) utilizing a coding language, such as OpenCL or OpenGL, that provides for run-time analysis by a driver, code generator, and compiler. Developer-supplied code and attributes can be analyzed and altered based on the execution capabilities and performance criteria of the GPU on which the code is about to be executed. In general, reducing the number of developer-defined work items or work groups can reduce the initialization cost of the GPU with respect to the work to be performed and result in an overall optimization of the machine code. Manipulation code can be added to adjust the supplied code in a manner similar to unrolling a loop to improve execution performance.

    System and method for unified application programming interface and model

    Publication No.: US10949944B2

    Publication date: 2021-03-16

    Application No.: US16390577

    Filing date: 2019-04-22

    Applicant: Apple Inc.

    Abstract: Systems, computer-readable media, and methods for a unified programming interface and language are disclosed. In one embodiment, the unified programming interface and language helps program developers write multi-threaded programs that can perform both graphics and data-parallel compute processing on GPUs. The same GPU programming language model can be used to describe both graphics shaders and compute kernels, and the same data structures and resources may be used for both graphics and compute operations. Developers can use multithreading efficiently to create and submit command buffers in parallel.

    Resource synchronization for graphics processing

    Publication No.: US10930047B2

    Publication date: 2021-02-23

    Application No.: US16707455

    Filing date: 2019-12-09

    Applicant: Apple Inc.

    Abstract: Techniques are disclosed relating to synchronizing access to pixel resources. Examples of pixel resources include color attachments, a stencil buffer, and a depth buffer. In some embodiments, hardware registers are used to track the status of assigned pixel resources, and pixel wait and pixel release instructions are used to synchronize access to the pixel resources. In some embodiments, other accesses to the pixel resources may occur out of program order. Relative to tracking and ordering pass groups, this weak ordering and explicit synchronization may improve performance and reduce power consumption. Disclosed techniques may also facilitate coordination between fragment rendering threads and auxiliary mid-render compute tasks.
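
    Illustrative sketch (not part of the patent record): explicit wait/release synchronization on a named pixel resource can be modeled in software with one event per resource. This is a simplified analogy, assuming a software tracker in place of the hardware registers; `PixelResourceTracker`, `pixel_wait`, `pixel_release`, and the two thread functions are hypothetical names, and real hardware ordering rules are not reproduced here.

```python
import threading


class PixelResourceTracker:
    """Hypothetical software stand-in for per-resource status registers."""

    def __init__(self, names):
        self._released = {name: threading.Event() for name in names}

    def pixel_release(self, name):
        # Mark the resource as safe for later accessors.
        self._released[name].set()

    def pixel_wait(self, name, timeout=5.0):
        # Block until the resource has been released, or fail loudly.
        if not self._released[name].wait(timeout):
            raise TimeoutError(f"{name} was never released")


tracker = PixelResourceTracker(["color0", "depth"])
framebuffer = {"color0": None, "depth": None}
log = []


def fragment_thread():
    # Writes the color attachment, then explicitly releases it.
    framebuffer["color0"] = 0.5
    tracker.pixel_release("color0")


def compute_task():
    # Mid-render compute work that must see the finished color value.
    tracker.pixel_wait("color0")
    log.append(framebuffer["color0"] * 2)


consumer = threading.Thread(target=compute_task)
producer = threading.Thread(target=fragment_thread)
consumer.start()
producer.start()
consumer.join()
producer.join()
```

    Only the access guarded by the wait is ordered; the untouched `depth` entry shows that unrelated resources impose no ordering in this model.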

    Resource Synchronization for Graphics Processing

    Publication No.: US20200167986A1

    Publication date: 2020-05-28

    Application No.: US16707455

    Filing date: 2019-12-09

    Applicant: Apple Inc.

    Abstract: Techniques are disclosed relating to synchronizing access to pixel resources. Examples of pixel resources include color attachments, a stencil buffer, and a depth buffer. In some embodiments, hardware registers are used to track the status of assigned pixel resources, and pixel wait and pixel release instructions are used to synchronize access to the pixel resources. In some embodiments, other accesses to the pixel resources may occur out of program order. Relative to tracking and ordering pass groups, this weak ordering and explicit synchronization may improve performance and reduce power consumption. Disclosed techniques may also facilitate coordination between fragment rendering threads and auxiliary mid-render compute tasks.
