GARBAGE COLLECTING WAVEFRONT
    41.
    Patent application

    Publication No.: US20230097115A1

    Publication date: 2023-03-30

    Application No.: US17485662

    Filing date: 2021-09-27

    Abstract: A processing system executes a specialized wavefront, referred to as a “garbage collecting wavefront” or GCWF, to identify and deallocate resources such as, for example, scalar registers, vector registers, and local data share space, that are no longer being used by wavefronts of a workgroup executing at the processing system (i.e., dead resources). In some embodiments, the GCWF is programmed to have compiler information regarding the resource requirements of the other wavefronts of the workgroup and specifies the program counter after which there will be a permanent drop in resource requirements for the other wavefronts. In other embodiments, the standard compute wavefronts signal the GCWF when they have completed using resources. The GCWF sends a command to deallocate the dead resources so the dead resources can be made available for additional wavefronts.
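The compiler-informed embodiment in this abstract can be pictured with a minimal sketch. Everything named here is an assumption made for illustration and is not taken from the patent: the `Wavefront` class, the `drop_points` hint format (a program counter paired with the register count needed after it), and the single counter standing in for the free register pool. Only vector registers are modeled.

```python
from dataclasses import dataclass


@dataclass
class Wavefront:
    """A compute wavefront with a program counter and a register allocation."""
    name: str
    pc: int
    allocated_vgprs: int
    # Hypothetical compiler hint: (drop_pc, vgprs) means that once the
    # wavefront's pc has passed drop_pc, it permanently needs only `vgprs`
    # vector registers.
    drop_points: tuple = ()


def required_vgprs(wf):
    """Registers still needed, given the drop points the wavefront has passed."""
    needed = wf.allocated_vgprs
    for drop_pc, vgprs in wf.drop_points:
        if wf.pc > drop_pc:
            needed = min(needed, vgprs)
    return needed


def garbage_collect(wavefronts, free_pool):
    """One GCWF pass: release dead registers and return the new free count."""
    for wf in wavefronts:
        needed = required_vgprs(wf)
        dead = wf.allocated_vgprs - needed
        if dead > 0:
            wf.allocated_vgprs = needed  # shrink the allocation
            free_pool += dead            # dead registers become available
    return free_pool


workgroup = [
    Wavefront("wf0", pc=120, allocated_vgprs=64, drop_points=((100, 32),)),
    Wavefront("wf1", pc=40, allocated_vgprs=64, drop_points=((100, 32),)),
]
free_vgprs = garbage_collect(workgroup, free_pool=0)
print(free_vgprs)  # prints 32: only wf0 has passed its drop point
```

In the other embodiment the abstract mentions, the compute wavefronts would signal the GCWF themselves, so the `drop_points` lookup would be replaced by an explicit notification.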

    CONDENSED COMMAND PACKET FOR HIGH THROUGHPUT AND LOW OVERHEAD KERNEL LAUNCH

    Publication No.: US20220197696A1

    Publication date: 2022-06-23

    Application No.: US17133574

    Filing date: 2020-12-23

    Abstract: Methods, devices, and systems for launching a compute kernel. A reference kernel dispatch packet is received by a kernel agent. The reference kernel dispatch packet is processed by the kernel agent to determine kernel dispatch information. The kernel dispatch information is stored by the kernel agent. A kernel is dispatched by the kernel agent, based on the kernel dispatch information. In some implementations, a condensed kernel dispatch packet is received by the kernel agent, the condensed kernel dispatch packet is processed by the kernel agent to retrieve the stored kernel dispatch information, and a kernel is dispatched by the kernel agent based on the retrieved kernel dispatch information.
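The reference/condensed packet flow described in this abstract might be sketched as follows. The `KernelAgent` class, the `ref_id` key used to look up stored information, and the fields inside the dispatch information are all hypothetical; the abstract does not say how a condensed packet identifies its reference packet or what the stored information contains.

```python
class KernelAgent:
    """Toy kernel agent that caches dispatch information from reference packets."""

    def __init__(self):
        self._stored = {}     # ref_id -> kernel dispatch information
        self.dispatched = []  # log of kernels launched so far

    def process_reference(self, ref_id, info):
        """Full packet: determine and store the dispatch info, then dispatch."""
        self._stored[ref_id] = dict(info)
        self._dispatch(self._stored[ref_id])

    def process_condensed(self, ref_id):
        """Condensed packet: carries only a key; reuse the stored dispatch info."""
        self._dispatch(self._stored[ref_id])

    def _dispatch(self, info):
        self.dispatched.append((info["kernel"], info["grid"], info["workgroup"]))


agent = KernelAgent()
agent.process_reference(0, {"kernel": "saxpy", "grid": 4096, "workgroup": 256})
agent.process_condensed(0)  # relaunch without resending the full packet
agent.process_condensed(0)
print(len(agent.dispatched))  # prints 3
```

The point of the design is that repeated launches of the same kernel send only the small key, which is where the lower per-launch overhead would come from.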

    SYSTEM PERFORMANCE MANAGEMENT USING PRIORITIZED COMPUTE UNITS
    44.
    Patent application
    Status: under examination (published)

    Publication No.: US20170004080A1

    Publication date: 2017-01-05

    Application No.: US14755401

    Filing date: 2015-06-30

    CPC classification number: G06F12/084 G06F2212/1021

    Abstract: Methods, devices, and systems for managing performance of a processor having multiple compute units. An effective number of the multiple compute units may be determined to designate as having priority. On a condition that the effective number is nonzero, the effective number of the multiple compute units may each be designated as a priority compute unit. Priority compute units may have access to a shared cache whereas non-priority compute units may not. Workgroups may be preferentially dispatched to priority compute units. Memory access requests from priority compute units may be served ahead of requests from non-priority compute units.
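A minimal sketch of two of the behaviors in this abstract, preferential workgroup dispatch and priority ordering of memory requests. The function names, the choice of the lowest-numbered compute units as the priority set, and the `(cu_id, tag)` request format are assumptions for illustration. How the effective number is determined is not stated in the abstract, so it is taken as an input, and shared-cache access is not modeled.

```python
def designate_priority(num_cus, effective_number):
    """Return the ids of the compute units designated as priority (may be empty)."""
    if effective_number <= 0:
        return set()
    return set(range(min(effective_number, num_cus)))


def pick_cu(free_slots, priority):
    """Choose a CU with a free slot, preferring priority CUs; None if all are full."""
    candidates = [cu for cu, slots in free_slots.items() if slots > 0]
    if not candidates:
        return None
    # False sorts before True, so priority CUs win; ties break on the lower id.
    return min(candidates, key=lambda cu: (cu not in priority, cu))


def order_requests(requests, priority):
    """Stable sort that serves requests from priority CUs first."""
    return sorted(requests, key=lambda req: req[0] not in priority)


priority = designate_priority(num_cus=4, effective_number=2)
chosen = pick_cu({0: 0, 1: 1, 2: 2, 3: 2}, priority)
ordered = order_requests([(3, "a"), (0, "b"), (2, "c"), (1, "d")], priority)
print(chosen)   # prints 1: CU 0 is full, CU 1 is the next priority CU
print(ordered)  # prints [(0, 'b'), (1, 'd'), (3, 'a'), (2, 'c')]
```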

    HETEROGENEOUS FUNCTION UNIT DISPATCH IN A GRAPHICS PROCESSING UNIT
    45.
    Patent application
    Status: under examination (published)

    Publication No.: US20160085551A1

    Publication date: 2016-03-24

    Application No.: US14490213

    Filing date: 2014-09-18

    CPC classification number: G06F9/3887 G06F9/3851

    Abstract: A compute unit configured to execute multiple threads in parallel is presented. The compute unit includes one or more single instruction multiple data (SIMD) units and a fetch and decode logic. The SIMD units have differing numbers of arithmetic logic units (ALUs), such that each SIMD unit can execute a different number of threads. The fetch and decode logic is in communication with each of the SIMD units, and is configured to assign the threads to the SIMD units for execution based on such differing numbers of ALUs.
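One way to picture assignment "based on such differing numbers of ALUs" is a best-fit rule: send a group of active threads to the narrowest SIMD unit that can still run them all at once. This rule, the function name `assign_simd`, and the example widths are assumptions for illustration; the abstract does not specify the assignment policy.

```python
def assign_simd(active_threads, simd_widths):
    """Return the index of the narrowest SIMD unit wide enough for the threads.

    If no unit is wide enough, fall back to the widest unit.
    """
    fitting = [(w, i) for i, w in enumerate(simd_widths) if w >= active_threads]
    if fitting:
        return min(fitting)[1]
    return max((w, i) for i, w in enumerate(simd_widths))[1]


simd_widths = [16, 8, 4, 2]  # hypothetical ALU counts of four SIMD units
choices = [assign_simd(n, simd_widths) for n in (3, 16, 64, 1)]
print(choices)  # prints [2, 0, 0, 3]
```

Under this policy a group with few active threads avoids occupying a wide unit whose extra ALUs would sit idle.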
