Flexibly deriving intended thread data exchange patterns

    Publication No.: US10664285B1

    Publication date: 2020-05-26

    Application No.: US16226411

    Filing date: 2018-12-19

    Abstract: A method of deriving intended thread data exchange patterns from source code includes identifying, based on a constant array, a pattern of data exchange between a plurality of threads in a wavefront. The constant array includes an array of source lane values identifying a thread location within the wavefront to read from for performing the pattern of data exchange. The pattern of data exchange is identified as a hardware-accelerated exchange pattern based on the constant array.
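
The abstract above describes recognizing an exchange pattern from a constant array of source lane values. A minimal sketch of that idea follows, assuming that entry i of the array names the lane that lane i reads from. The function name `classify_exchange` and the four pattern names (broadcast, reverse, rotate, XOR swap) are hypothetical choices for illustration; the abstract does not say which patterns the hardware accelerates.

```python
from typing import Optional, Sequence


def classify_exchange(src_lanes: Sequence[int]) -> Optional[str]:
    """Name the exchange pattern encoded by src_lanes, or return None.

    src_lanes[i] is assumed to be the lane that lane i reads from.
    The pattern names returned here are illustrative only.
    """
    n = len(src_lanes)
    if n == 0 or any(not 0 <= s < n for s in src_lanes):
        return None  # empty array or a source lane outside the wavefront

    # Broadcast: every lane reads from the same source lane.
    if all(s == src_lanes[0] for s in src_lanes):
        return f"broadcast(lane={src_lanes[0]})"

    # Reverse: lane i reads from the mirrored lane n - 1 - i.
    if all(s == n - 1 - i for i, s in enumerate(src_lanes)):
        return "reverse"

    # Rotate: lane i reads from lane (i + k) mod n for a fixed nonzero k.
    k = src_lanes[0]
    if k != 0 and all(s == (i + k) % n for i, s in enumerate(src_lanes)):
        return f"rotate(by={k})"

    # XOR swap (butterfly): lane i reads from lane i XOR mask.
    mask = src_lanes[0]
    if mask != 0 and all(s == (i ^ mask) for i, s in enumerate(src_lanes)):
        return f"xor_swap(mask={mask})"

    # No recognized pattern: a compiler would fall back to a generic exchange.
    return None
```

For a four-lane wavefront, `classify_exchange([1, 0, 3, 2])` returns `"xor_swap(mask=1)"`, while an irregular array such as `[0, 2, 1, 3]` returns `None`.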

    Method and apparatus of cross shader compilation

    Publication No.: US11080927B2

    Publication date: 2021-08-03

    Application No.: US15827909

    Filing date: 2017-11-30

    Abstract: A method and apparatus provides for compiling a plurality of shaders, each shader having a plurality of computer-readable statements, into a plurality of computer-executable instructions. In one example, the method and apparatus, using a computing device, receives the plurality of shaders used in a process pipeline for performing at least one shading function, determines a shader type of each of the plurality of shaders based on the at least one shading function, and compiles the plurality of shaders by generating the computer-executable instructions using data including a shader descriptor for each of the plurality of shaders, resulting in the shading functions of the plurality of shaders combined together.

    STREAMING WAVE COALESCER CIRCUIT

    Publication No.: US20250068429A1

    Publication date: 2025-02-27

    Application No.: US18536982

    Filing date: 2023-12-12

    Abstract: A Streaming Wave Coalescer (SWC) circuit stores a first set of state values associated with a first subset of threads of a first wave in a bin based on each of the first subset of threads including a first set of instructions to be executed. A second set of state values associated with a second subset of threads of a second wave is stored in the bin based on each of the second subset of threads including the first set of instructions to be executed and based on the first wave and the second wave both being associated with a hard key. A third wave is formed from the threads of the first subset and the second subset and is emitted for execution. As a result of reorganizing the threads and reconstituting a different wave, thread divergence of waves sent for execution is reduced.
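
The binning behavior in the abstract above can be modeled in software. The sketch below is a toy model under stated assumptions, not the circuit itself: the `Thread` and `WaveCoalescer` names, the use of a string label for "the set of instructions to be executed", and the policy of emitting a wave as soon as a bin fills are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Thread:
    thread_id: int
    state: int       # stand-in for the thread's stored state values
    next_block: str  # label for the instructions the thread will execute next


class WaveCoalescer:
    """Toy model: bins threads by (hard key, next instructions)."""

    def __init__(self, wave_size: int) -> None:
        self.wave_size = wave_size
        self.bins: Dict[Tuple[str, str], List[Thread]] = defaultdict(list)

    def submit(self, hard_key: str, wave: List[Thread]) -> List[List[Thread]]:
        """Store each thread's state in its bin; return any newly formed waves.

        Threads share a bin only if their waves have the same hard key and
        the threads are about to execute the same instructions.
        """
        emitted: List[List[Thread]] = []
        for thread in wave:
            key = (hard_key, thread.next_block)
            self.bins[key].append(thread)
            # Assumed policy: emit a reconstituted wave once the bin is full.
            if len(self.bins[key]) == self.wave_size:
                emitted.append(self.bins.pop(key))
        return emitted
```

Submitting two divergent waves whose threads alternate between two instruction labels yields, under this model, two reconstituted waves in which every thread executes the same instructions, which is the divergence reduction the abstract describes.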

    Register saving for function calling

    Publication No.: US11113061B2

    Publication date: 2021-09-07

    Application No.: US16584775

    Filing date: 2019-09-26

    Abstract: Described herein are techniques for saving registers in the event of a function call. The techniques include modifying a program including a block of code designated as a calling code that calls a function. The modifying includes modifying the calling code to set a register usage mask indicating which registers are in use at the time of the function call. The modifying also includes modifying the function to combine the information of the register usage mask with information indicating registers used by the function to generate registers to be saved and save the registers to be saved.
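
One plausible reading of "combine" in the abstract above is a bitwise AND: a register needs saving only if the caller has it in use and the function also uses it. The sketch below shows that reading. The AND is an assumption, as are the function name `registers_to_save` and the convention that bit r of a mask stands for register r.

```python
from typing import List


def registers_to_save(caller_in_use_mask: int, callee_used_mask: int) -> List[int]:
    """Return the indices of registers the function should save.

    Bit r of each mask stands for register r (an assumed encoding).
    A register is saved only when the caller has it in use AND the
    function uses it, so registers touched by only one side are skipped.
    """
    to_save = caller_in_use_mask & callee_used_mask
    return [r for r in range(to_save.bit_length()) if (to_save >> r) & 1]
```

With the caller using registers 0, 1, and 3 (`0b1011`) and the function using registers 1 and 2 (`0b0110`), only register 1 is saved: `registers_to_save(0b1011, 0b0110)` returns `[1]`.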

    MACHINE LEARNING-BASED TECHNIQUE FOR EXECUTION MODE SELECTION

    Publication No.: US20210065441A1

    Publication date: 2021-03-04

    Application No.: US16584750

    Filing date: 2019-09-26

    Abstract: Described herein are techniques for generating a compiled shader program. The techniques include identifying input features of a shader program, providing the identified input features of the shader program to a trained model for selecting compiler operation values for shader programs, receiving, as output from the trained model, a compiler operation value for the shader program, and generating a compiled shader program based on the compiler operation value for execution on one or more compute units.
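
The flow in the abstract above (extract features, query a model, compile with the returned value) can be sketched as follows. Everything concrete here is hypothetical: the two features, the hand-written threshold rule standing in for a trained model, and the values 32 and 64 as example compiler operation values. The abstract does not specify any of them.

```python
from typing import Callable, Dict

Features = Dict[str, float]


def extract_features(shader_source: str) -> Features:
    """Derive simple input features from shader text (illustrative only)."""
    lines = [ln.strip() for ln in shader_source.splitlines() if ln.strip()]
    return {
        "num_statements": float(len(lines)),
        "num_branches": float(sum(ln.startswith("if") for ln in lines)),
    }


def stand_in_model(features: Features) -> int:
    """Placeholder for a trained model; this is a fixed rule, not learned.

    Returns a hypothetical compiler operation value: 32 for branch-heavy
    shaders, 64 otherwise.
    """
    ratio = features["num_branches"] / max(features["num_statements"], 1.0)
    return 32 if ratio > 0.25 else 64


def compile_shader(shader_source: str, model: Callable[[Features], int]) -> dict:
    """Pick a compiler operation value via the model and record it.

    A real compiler would generate code here; this sketch only returns
    the chosen value alongside the source.
    """
    value = model(extract_features(shader_source))
    return {"source": shader_source, "compiler_operation_value": value}
```

Passing the model as a parameter keeps the selection step swappable, so the placeholder rule could be replaced by an actual trained predictor without changing `compile_shader`.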

    METHOD AND APPARATUS OF CROSS SHADER COMPILATION

    Publication No.: US20190164337A1

    Publication date: 2019-05-30

    Application No.: US15827909

    Filing date: 2017-11-30

    Abstract: A method and apparatus provides for compiling a plurality of shaders, each shader having a plurality of computer-readable statements, into a plurality of computer-executable instructions. In one example, the method and apparatus, using a computing device, receives the plurality of shaders used in a process pipeline for performing at least one shading function, determines a shader type of each of the plurality of shaders based on the at least one shading function, and compiles the plurality of shaders by generating the computer-executable instructions using data including a shader descriptor for each of the plurality of shaders, resulting in the shading functions of the plurality of shaders combined together.

    SOFTWARE-DEFINED COMPUTE UNIT RESOURCE ALLOCATION MODE

    Publication No.: US20240311199A1

    Publication date: 2024-09-19

    Application No.: US18120646

    Filing date: 2023-03-13

    CPC classification number: G06F9/505 G06F9/522

    Abstract: A program code executing on a processing system includes one or more instructions each identifying a workload that includes a plurality of waves and each identifying resource allocations for the plurality of waves of the workgroup. In response to receiving an instruction identifying a workload and resource allocations for the plurality of waves of the workgroup, a processor allocates a first set of processing resources to a compute unit of the processor based on the resource allocations for the plurality of waves. The compute unit then performs operations for the workgroup using the allocated set of processing resources.
