Methods and apparatus for constant data storage

    Publication number: US11657471B2

    Publication date: 2023-05-23

    Application number: US17356434

    Filing date: 2021-06-23

    CPC classification number: G06T1/20 G06T1/60

    Abstract: The present disclosure relates to methods and devices for graphics processing including an apparatus, e.g., a GPU. The apparatus may generate a table including a plurality of entries to store data associated with at least one of a constant value or an immediate value. The apparatus may also process, upon generating the table, first data including at least one of a constant value or an immediate value. Further, the apparatus may store, in the generated table, at least one of the constant value or the immediate value of the first data. The apparatus may also transmit, upon storing at least one of the constant value or the immediate value in the table, the table including the stored at least one of the constant value or the immediate value of the first data.
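The table described in this abstract can be pictured with a minimal sketch. Everything named here (`ConstantTable`, `store`, `process`, the fixed capacity, and the reuse of an existing entry for a repeated value) is a hypothetical illustration, not the patented design.

```python
from dataclasses import dataclass, field


@dataclass
class ConstantTable:
    """Hypothetical fixed-capacity table of constant/immediate values."""
    capacity: int
    entries: list = field(default_factory=list)

    def store(self, value):
        """Store a value and return its entry index.

        Reusing an existing entry for a repeated value is an assumption
        of this sketch; the abstract does not say whether entries are shared.
        """
        if value in self.entries:
            return self.entries.index(value)
        if len(self.entries) >= self.capacity:
            raise OverflowError("constant table is full")
        self.entries.append(value)
        return len(self.entries) - 1


def process(table, instructions):
    """Replace each constant/immediate operand with its table index."""
    return [(opcode, table.store(operand)) for opcode, operand in instructions]


table = ConstantTable(capacity=4)
program = [("mul", 0.5), ("add", 1.0), ("mul", 0.5)]
rewritten = process(table, program)
# After processing, `table` is what would be transmitted to the consumer.
```

In this model the two `mul` instructions share one entry, so the table holds two values for three instructions.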

    GENERAL PURPOSE REGISTER ALLOCATION IN STREAMING PROCESSOR

    Publication number: US20180165092A1

    Publication date: 2018-06-14

    Application number: US15379195

    Filing date: 2016-12-14

    Abstract: Systems and techniques are disclosed for dynamic allocation of general purpose registers based on latency associated with instructions in processor threads. A streaming processor can include a plurality of general purpose registers configured to store data associated with threads, and a thread scheduler configured to receive allocation information for the general purpose registers, the information describing general purpose registers that are to be assigned as persistent general purpose registers (pGPRs) and volatile general purpose registers (vGPRs). The plurality of general purpose registers can be allocated according to the received information. The streaming processor can include the general purpose registers allocated according to the received information, the allocation being based on execution latencies of instructions included in the threads.
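As a rough illustration of latency-based allocation, the sketch below splits registers into persistent and volatile sets. The rule used (a register written by any instruction at or above a latency threshold becomes persistent) and the names `classify_registers` and `latency_threshold` are assumptions made for this example; the abstract does not state the actual policy.

```python
def classify_registers(instructions, latency_threshold):
    """Split destination registers into (persistent, volatile) sets.

    instructions: list of (dest_register, latency_in_cycles) pairs.
    Assumed rule: a register written by a long-latency instruction must
    keep its value while the thread waits, so it is marked persistent.
    """
    persistent, volatile = set(), set()
    for dest, latency in instructions:
        if latency >= latency_threshold:
            persistent.add(dest)
            volatile.discard(dest)  # persistent wins over an earlier volatile mark
        elif dest not in persistent:
            volatile.add(dest)
    return persistent, volatile


# Hypothetical thread: r0 receives a long-latency result, r1 and r2 do not.
thread = [("r0", 200), ("r1", 4), ("r2", 4), ("r0", 4)]
pgprs, vgprs = classify_registers(thread, latency_threshold=100)
```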

    Runtime mechanism to optimize shader execution flow

    Publication number: US12229864B2

    Publication date: 2025-02-18

    Application number: US17817815

    Filing date: 2022-08-05

    Abstract: This disclosure provides systems, devices, apparatus, and methods, including computer programs encoded on storage media, for runtime optimization of the shader execution flow. A graphics processor may obtain instruction execution data associated with a graphics workload, the instruction execution data including graphics data for a set of shader operations. The graphics processor may configure, at a first iteration, at least one predication value based on the instruction execution data including the graphics data for the set of shader operations. The graphics processor may adjust, at a second iteration, an execution flow of the graphics workload based on the configured at least one predication value, the execution flow of the graphics workload including the set of shader operations. The graphics processor may execute or refrain from executing, at the second iteration, each of the set of shader operations based on the adjusted execution flow of the graphics workload.
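The two-iteration flow in this abstract can be sketched as follows. The uniform names (`fog_density`, `light_count`), the two shader operations, and the rule for deriving each predicate are hypothetical; the sketch only shows predicates being configured in one pass and consulted in the next.

```python
def configure_predicates(uniforms):
    """First iteration: derive one predication value per shader operation."""
    return {
        "fog": uniforms["fog_density"] > 0.0,
        "lighting": uniforms["light_count"] > 0,
    }


def shade(color, uniforms, predicates):
    """Second iteration: execute or skip each operation based on its predicate."""
    executed = []
    if predicates["lighting"]:
        color = color * uniforms["light_count"]
        executed.append("lighting")
    if predicates["fog"]:
        color = color * (1.0 - uniforms["fog_density"])
        executed.append("fog")
    return color, executed


uniforms = {"fog_density": 0.0, "light_count": 2}
predicates = configure_predicates(uniforms)
color, executed = shade(0.25, uniforms, predicates)
```

With a fog density of zero, the fog operation is skipped and only the lighting operation runs.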

    Run-time mechanism for optimal shader

    Publication number: US12067666B2

    Publication date: 2024-08-20

    Application number: US17664033

    Filing date: 2022-05-18

    CPC classification number: G06T15/005 G06T1/60

    Abstract: Aspects presented herein relate to methods and devices for graphics processing including an apparatus, e.g., a GPU. The apparatus may receive a set of draw call instructions corresponding to a graphics workload, where the set of draw call instructions is associated with at least one run-time parameter. The apparatus may also obtain a first shader program associated with storing data in a system memory and at least one second shader program associated with storing data in a constant memory. Further, the apparatus may execute the first shader program or the at least one second shader program based on whether the at least one run-time parameter is less than or equal to a size of the constant memory. The apparatus may also update or maintain a configuration of a shader processor or a streaming processor based on executing the first shader program or the at least one second shader program.
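The selection step in this abstract reduces to a size comparison. In the sketch below, the function name, the byte units, and the returned labels are hypothetical, and choosing the constant-memory variant when the data fits is an assumed reading of the abstract.

```python
def select_shader(runtime_param_bytes, constant_memory_bytes):
    """Pick a shader variant from a run-time parameter.

    Assumption: the run-time parameter is a data size, and the
    constant-memory variant is used whenever that data fits.
    """
    if runtime_param_bytes <= constant_memory_bytes:
        return "constant_memory_shader"
    return "system_memory_shader"


fits = select_shader(1024, constant_memory_bytes=4096)
exact = select_shader(4096, constant_memory_bytes=4096)
too_big = select_shader(8192, constant_memory_bytes=4096)
```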

    INTER-PROCESSOR COMMUNICATION TECHNIQUES IN A MULTIPLE-PROCESSOR COMPUTING PLATFORM
    Type: Invention application (status: in force)

    Publication number: US20150097849A1

    Publication date: 2015-04-09

    Application number: US14570974

    Filing date: 2014-12-15

    CPC classification number: G06F9/544 G06F9/54 G06F9/546 G06T1/20 G06T1/60

    Abstract: This disclosure describes communication techniques that may be used within a multiple-processor computing platform. The techniques may, in some examples, provide software interfaces that may be used to support message passing within a multiple-processor computing platform that initiates tasks using command queues. The techniques may, in additional examples, provide software interfaces that may be used for shared memory inter-processor communication within a multiple-processor computing platform. In further examples, the techniques may provide a graphics processing unit (GPU) that includes hardware for supporting message passing and/or shared memory communication between the GPU and a host CPU.

    General purpose register allocation in streaming processor

    Publication number: US10558460B2

    Publication date: 2020-02-11

    Application number: US15379195

    Filing date: 2016-12-14

    Abstract: Systems and techniques are disclosed for dynamic allocation of general purpose registers based on latency associated with instructions in processor threads. A streaming processor can include a plurality of general purpose registers configured to store data associated with threads, and a thread scheduler configured to receive allocation information for the general purpose registers, the information describing general purpose registers that are to be assigned as persistent general purpose registers (pGPRs) and volatile general purpose registers (vGPRs). The plurality of general purpose registers can be allocated according to the received information. The streaming processor can include the general purpose registers allocated according to the received information, the allocation being based on execution latencies of instructions included in the threads.

    Utilizing pipeline registers as intermediate storage

    Publication number: US09747104B2

    Publication date: 2017-08-29

    Application number: US14275047

    Filing date: 2014-05-12

    Abstract: In one example, a method includes, responsive to receiving, by a processing unit, one or more instructions requesting that a first value be moved from a first general purpose register (GPR) to a third GPR and that a second value be moved from a second GPR to a fourth GPR: copying, by an initial logic unit and during a first clock cycle, the first value to an initial pipeline register; copying, by the initial logic unit and during a second clock cycle, the second value to the initial pipeline register; copying, by a final logic unit and during a third clock cycle, the first value from a final pipeline register to the third GPR; and copying, by the final logic unit and during a fourth clock cycle, the second value from the final pipeline register to the fourth GPR.
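The four-cycle sequence in this abstract can be simulated with a small model. The sketch assumes a two-stage pipeline in which a value advances from the initial pipeline register to the final pipeline register on each clock cycle; the function name and the list-based register file are hypothetical.

```python
def move_via_pipeline(gprs, moves):
    """Carry out GPR-to-GPR moves through two pipeline registers.

    gprs:  list modelling the register file (modified in place).
    moves: list of (source_index, destination_index) pairs.
    Returns the number of clock cycles used.
    """
    initial = final = None  # each holds (value, destination) or None
    pending = list(moves)
    cycle = 0
    while pending or initial is not None or final is not None:
        cycle += 1
        if final is not None:       # final logic unit writes back to a GPR
            value, dst = final
            gprs[dst] = value
        final = initial             # the value advances one stage
        if pending:                 # initial logic unit reads a source GPR
            src, dst = pending.pop(0)
            initial = (gprs[src], dst)
        else:
            initial = None
    return cycle


regs = [10, 20, 0, 0]
cycles = move_via_pipeline(regs, [(0, 2), (1, 3)])

swapped = [10, 20]
swap_cycles = move_via_pipeline(swapped, [(0, 1), (1, 0)])
```

In this model both source values are read before either destination is written, so the second call swaps two registers without a temporary GPR.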

    SKIPPING OF DATA STORAGE
    Type: Invention application (status: in force)

    Publication number: US20160054998A1

    Publication date: 2016-02-25

    Application number: US14462932

    Filing date: 2014-08-19

    Abstract: Techniques are described in which an indication is included to indicate that a last use of an intermediate value, generated as part of determining a final value, is not to be stored in a general purpose register (GPR). A processing unit avoids storing the intermediate value in the GPR based on the indication because the intermediate value is no longer needed for determining the final value.
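A minimal sketch of the idea, assuming each instruction carries a hypothetical `skip_store` flag and that the most recent result can be forwarded directly to the next instruction under the operand name `"fwd"`. The instruction format is invented for this example.

```python
import operator


def run(program, gprs):
    """Execute instructions, skipping GPR writes for flagged intermediates.

    program: list of (dst, op, src_a, src_b, skip_store) tuples.
    The operand name "fwd" (hypothetical) refers to the previous result,
    which is forwarded without passing through a GPR.
    Returns the number of GPR writes performed.
    """
    forwarded = None
    writes = 0
    for dst, op, a, b, skip_store in program:
        x = forwarded if a == "fwd" else gprs[a]
        y = forwarded if b == "fwd" else gprs[b]
        result = op(x, y)
        forwarded = result
        if not skip_store:          # only values needed later reach a GPR
            gprs[dst] = result
            writes += 1
    return writes


# Compute r = (a + b) * c; the sum t is used once, so its store is skipped.
gprs = {"a": 2, "b": 3, "c": 4}
program = [
    ("t", operator.add, "a", "b", True),
    ("r", operator.mul, "fwd", "c", False),
]
writes = run(program, gprs)
```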
