Method and apparatus for nested control flow of instructions using context information and instructions having extra bits
    1.
    Invention Grant
    Method and apparatus for nested control flow of instructions using context information and instructions having extra bits (In force)

    Publication Number: US07281122B2

    Publication Date: 2007-10-09

    Application Number: US10756853

    Application Date: 2004-01-14

    CPC classification number: G06F9/325 G06F9/30072 G06F9/3885

    Abstract: A method and apparatus for nested control flow includes a processor having at least one context bit. The processor includes a plurality of arithmetic logic units for performing single instruction multiple data (SIMD) operations. The method and apparatus further includes a first memory device storing a plurality of instructions wherein each of the plurality of instructions includes a plurality of extra bits. The processor is operative to execute the instructions based on the extra bits and in conjunction with a context bit. The method and apparatus further includes a second memory device, such as a general purpose register operably coupled to the processor, the second memory device receiving an incrementing counter instruction upon the execution of one of the plurality of instructions. As such, the method and apparatus allows for nested control flow through a single context bit in conjunction with instructions having a plurality of extra bits.

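The nesting scheme this abstract describes — one context bit per lane plus a counter that tracks how many enclosing scopes a deactivated lane still has to skip — can be sketched as follows. This is a hypothetical Python model; class and method names are illustrative and not taken from the patent claims.

```python
# Sketch of nested SIMD control flow using a single context bit per lane
# plus a nesting counter (an assumed simplification of the patent's scheme).

class SimdLane:
    def __init__(self):
        self.active = True   # the single "context bit"
        self.skip_depth = 0  # counter: nested scopes still to skip

    def op_if(self, condition):
        if self.active:
            if not condition:
                self.active = False  # lane fails the test: go inactive
        else:
            self.skip_depth += 1     # already inactive: just count nesting

    def op_endif(self):
        if not self.active:
            if self.skip_depth == 0:
                self.active = True   # back at the scope that deactivated us
            else:
                self.skip_depth -= 1

def run(values):
    """Each lane executes: if v > 0: { if v > 10: v = 100; v += 1 }"""
    out = list(values)
    for i, lane in enumerate(SimdLane() for _ in values):
        lane.op_if(out[i] > 0)
        lane.op_if(out[i] > 10)
        if lane.active:
            out[i] = 100
        lane.op_endif()
        if lane.active:
            out[i] += 1
        lane.op_endif()
    return out
```

A lane that fails the outer `if` never re-enables inside the inner scope, because the counter keeps it inactive until the matching outer `endif` is reached.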

    Method and apparatus for nested control flow
    2.
    Invention Application
    Method and apparatus for nested control flow (In force)

    Publication Number: US20050154864A1

    Publication Date: 2005-07-14

    Application Number: US10756853

    Application Date: 2004-01-14

    CPC classification number: G06F9/325 G06F9/30072 G06F9/3885

    Abstract: A method and apparatus for nested control flow includes a processor having at least one context bit. The processor includes a plurality of arithmetic logic units for performing single instruction multiple data (SIMD) operations. The method and apparatus further includes a first memory device storing a plurality of instructions wherein each of the plurality of instructions includes a plurality of extra bits. The processor is operative to execute the instructions based on the extra bits and in conjunction with a context bit. The method and apparatus further includes a second memory device, such as a general purpose register operably coupled to the processor, the second memory device receiving an incrementing counter instruction upon the execution of one of the plurality of instructions. As such, the method and apparatus allows for nested control flow through a single context bit in conjunction with instructions having a plurality of extra bits.


    Techniques for reducing memory access bandwidth in a graphics processing system based on destination alpha values
    4.
    Invention Grant
    Techniques for reducing memory access bandwidth in a graphics processing system based on destination alpha values (In force)

    Publication Number: US09087409B2

    Publication Date: 2015-07-21

    Application Number: US13409993

    Application Date: 2012-03-01

    Applicant: Andrew Gruber

    Inventor: Andrew Gruber

    CPC classification number: G06T15/40 G06T15/04 G06T2200/28

    Abstract: This disclosure describes techniques for reducing memory access bandwidth in a graphics processing system based on destination alpha values. The techniques may include retrieving a destination alpha value from a bin buffer, the destination alpha value being generated in response to processing a first pixel associated with a first primitive. The techniques may further include determining, based on the destination alpha value, whether to perform an action that causes one or more texture values for a second pixel to not be retrieved from a texture buffer. In some examples, the action may include discarding the second pixel from a pixel processing pipeline prior to the second pixel arriving at a texture mapping stage of the pixel processing pipeline. The second pixel may be associated with a second primitive different than the first primitive.

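The destination-alpha test above can be sketched as a simple early-out before the texture mapping stage. This is a minimal hypothetical model, assuming an opaque-means-alpha-1.0 convention and dict-based buffers; the real pipeline stages and buffer formats are not specified here.

```python
# Sketch: if the pixel already in the bin buffer is opaque, a later pixel
# drawn behind it is discarded before its texture values are fetched,
# saving the memory bandwidth of the texture read.

def process_pixel(bin_buffer, texture_buffer, x, y, opaque_threshold=1.0):
    dest_alpha = bin_buffer.get((x, y), 0.0)
    if dest_alpha >= opaque_threshold:
        return None  # discard prior to the texture mapping stage: no fetch
    return texture_buffer[(x, y)]  # normal path: texture value is fetched
```

The saving comes from the `None` branch: the texture buffer is never indexed, so no texture-memory access occurs for the discarded pixel.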

    SELECTIVELY ACTIVATING A RESUME CHECK OPERATION IN A MULTI-THREADED PROCESSING SYSTEM
    5.
    Invention Application
    SELECTIVELY ACTIVATING A RESUME CHECK OPERATION IN A MULTI-THREADED PROCESSING SYSTEM (In force)

    Publication Number: US20140047223A1

    Publication Date: 2014-02-13

    Application Number: US13624657

    Application Date: 2012-09-21

    Abstract: This disclosure describes techniques for selectively activating a resume check operation in a single instruction, multiple data (SIMD) processing system. A processor is described that is configured to selectively enable or disable a resume check operation for a particular instruction based on information included in the instruction that indicates whether a resume check operation is to be performed for the instruction. A compiler is also described that is configured to generate compiled code which, when executed, causes a resume check operation to be selectively enabled or disabled for particular instructions. The compiled code may include one or more instructions that each specify whether a resume check operation is to be performed for the respective instruction. The techniques of this disclosure may be used to reduce the power consumption of and/or improve the performance of a SIMD system that utilizes a resume check operation to manage the reactivation of deactivated threads.

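The per-instruction resume check can be sketched as below: each instruction carries a flag, and only flagged instructions compare deactivated threads' resume points against the current program counter. A hypothetical Python model; the instruction encoding and thread-state names are illustrative assumptions.

```python
# Sketch: resume checks run only where the compiler set the flag, saving
# the per-instruction comparison everywhere else.

def execute(program, threads):
    """program: list of (needs_resume_check, fn) pairs.
    threads: dicts with 'active', 'resume_pc', and per-thread state.
    Returns the number of resume checks actually performed."""
    checks = 0
    for pc, (needs_check, fn) in enumerate(program):
        if needs_check:
            checks += 1
            for t in threads:
                if not t["active"] and t["resume_pc"] == pc:
                    t["active"] = True  # reactivate thread at its resume point
        for t in threads:
            if t["active"]:
                fn(t)
    return checks
```

The compiler's role in the abstract corresponds to choosing which entries get `needs_resume_check=True` — typically only instructions that can be reconvergence points.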

    DEFERRED PREEMPTION TECHNIQUES FOR SCHEDULING GRAPHICS PROCESSING UNIT COMMAND STREAMS
    6.
    Invention Application
    DEFERRED PREEMPTION TECHNIQUES FOR SCHEDULING GRAPHICS PROCESSING UNIT COMMAND STREAMS (In force)

    Publication Number: US20140022266A1

    Publication Date: 2014-01-23

    Application Number: US13554805

    Application Date: 2012-07-20

    CPC classification number: G06F9/4806 G06F9/461 G06F9/4812 G06T1/20

    Abstract: This disclosure is directed to deferred preemption techniques for scheduling graphics processing unit (GPU) command streams for execution on a GPU. A host CPU is described that is configured to control a GPU to perform deferred-preemption scheduling. For example, a host CPU may select one or more locations in a GPU command stream as being one or more locations at which preemption is allowed to occur in response to receiving a preemption notification, and may place one or more tokens in the GPU command stream based on the selected one or more locations. The tokens may indicate to the GPU that preemption is allowed to occur at the selected one or more locations. This disclosure further describes a GPU configured to preempt execution of a GPU command stream based on one or more tokens placed in a GPU command stream.

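The token mechanism above can be sketched as follows: the host places markers at safe points, and a preemption request takes effect only when execution reaches the next marker. A hypothetical model; the token value and command names are made up for illustration.

```python
# Sketch of deferred preemption: a request received mid-stream is held
# pending until the next host-placed token, rather than stopping the GPU
# at an arbitrary command.

PREEMPT_TOKEN = "PREEMPT_TOKEN"

def run_stream(stream, preempt_requested_at):
    """Execute commands; a preemption request arriving before index
    `preempt_requested_at` is honored only at the next token.
    Returns (executed_commands, resume_index)."""
    executed = []
    pending = False
    for i, cmd in enumerate(stream):
        if i == preempt_requested_at:
            pending = True
        if cmd == PREEMPT_TOKEN:
            if pending:
                return executed, i + 1  # preempt here; resume after the token
            continue  # no pending request: tokens cost nothing
        executed.append(cmd)
    return executed, len(stream)
```

Because the host chose the token locations, the saved state at `resume_index` is guaranteed to be at a boundary the scheduler considers safe.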

    COMPUTATIONAL RESOURCE PIPELINING IN GENERAL PURPOSE GRAPHICS PROCESSING UNIT
    7.
    Invention Application
    COMPUTATIONAL RESOURCE PIPELINING IN GENERAL PURPOSE GRAPHICS PROCESSING UNIT (In force)

    Publication Number: US20120185671A1

    Publication Date: 2012-07-19

    Application Number: US13007333

    Application Date: 2011-01-14

    CPC classification number: G06F15/17325

    Abstract: This disclosure describes techniques for extending the architecture of a general purpose graphics processing unit (GPGPU) with parallel processing units to allow efficient processing of pipeline-based applications. The techniques include configuring local memory buffers connected to parallel processing units operating as stages of a processing pipeline to hold data for transfer between the parallel processing units. The local memory buffers allow on-chip, low-power, direct data transfer between the parallel processing units. The local memory buffers may include hardware-based data flow control mechanisms to enable transfer of data between the parallel processing units. In this way, data may be passed directly from one parallel processing unit to the next parallel processing unit in the processing pipeline via the local memory buffers, in effect transforming the parallel processing units into a series of pipeline stages.

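The stage-to-stage local buffers can be sketched as bounded queues chained between processing units, so data flows unit-to-unit without a round trip through global memory. A hypothetical Python model; the capacity value and flow-control behavior (stall on full) are illustrative assumptions.

```python
# Sketch: parallel processing units become pipeline stages connected by
# small local buffers with simple flow control.

from collections import deque

class LocalBuffer:
    def __init__(self, capacity):
        self.capacity = capacity
        self.items = deque()

    def put(self, item):
        if len(self.items) >= self.capacity:
            raise RuntimeError("buffer full: producer must stall")
        self.items.append(item)

    def get(self):
        return self.items.popleft()

def run_pipeline(stages, inputs, capacity=4):
    """stages: list of functions, each standing in for a processing unit."""
    buf = deque(inputs)
    for stage in stages:
        nxt = LocalBuffer(capacity)
        while buf:
            nxt.put(stage(buf.popleft()))
        buf = nxt.items
    return list(buf)
```

The `RuntimeError` stands in for the hardware flow-control mechanism: a real producer unit would stall until the consumer drains the buffer rather than raise.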

    MULTI-THREAD GRAPHICS PROCESSING SYSTEM
    8.
    Invention Application
    MULTI-THREAD GRAPHICS PROCESSING SYSTEM (In force)

    Publication Number: US20070222787A1

    Publication Date: 2007-09-27

    Application Number: US11746453

    Application Date: 2007-05-09

    Abstract: A graphics processing system comprises at least one memory device storing a plurality of pixel command threads and a plurality of vertex command threads. An arbiter coupled to the at least one memory device is provided that selects a command thread from either the plurality of pixel or vertex command threads based on relative priorities of the plurality of pixel command threads and the plurality of vertex command threads. The selected command thread is provided to a command processing engine capable of processing pixel command threads and vertex command threads.

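The arbiter's selection by relative priority can be sketched with a priority queue that holds both pixel and vertex command threads and hands the highest-priority one to a unified engine. A hypothetical model; the priority ordering and tie-break rule are assumptions, not details from the patent.

```python
# Sketch: one arbiter over mixed pixel/vertex command threads, selecting
# by priority for a command engine that can process either kind.

import heapq

class Arbiter:
    def __init__(self):
        self._heap = []
        self._seq = 0  # tie-break so equal priorities keep submission order

    def submit(self, priority, kind, payload):
        heapq.heappush(self._heap, (-priority, self._seq, kind, payload))
        self._seq += 1

    def select(self):
        """Return (kind, payload) of the highest-priority command thread."""
        _, _, kind, payload = heapq.heappop(self._heap)
        return kind, payload
```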

    Apparatus to control memory accesses in a video system and method thereof
    9.
    Invention Grant
    Apparatus to control memory accesses in a video system and method thereof (In force)

    Publication Number: US06542159B1

    Publication Date: 2003-04-01

    Application Number: US09314209

    Application Date: 1999-05-19

    CPC classification number: G09G5/39 G09G2360/121

    Abstract: A method and apparatus for dynamic issuing of memory access instructions. In particular, a specific data access request that is about to be sent to a memory, such as a frame buffer, is dynamically chosen based upon pending requests within a pipeline. It is possible to optimize video data requests by dynamically selecting a memory access request at the time the request is made to the memory. In particular, if it is recognized that the memory about to be accessed will no longer be needed by subsequent memory requests, the request can be changed from a normal access request to an access request with an auto-close option. By using an auto close option, the memory bank being accessed is closed after the access, without issuing a separate memory close instruction.

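The auto-close decision above can be sketched as a look-ahead over the pending pipeline: if no later pending request touches the same memory bank, the access is issued with the auto-close option, eliminating the separate close command. A hypothetical model; the request representation (bare bank ids) is an illustrative simplification.

```python
# Sketch: dynamically choose normal vs auto-close access based on whether
# the bank is reused by any later pending request.

def issue(requests):
    """requests: bank ids of pending accesses, in pipeline order.
    Returns a list of (bank, auto_close) decisions."""
    issued = []
    for i, bank in enumerate(requests):
        reused_later = bank in requests[i + 1:]
        issued.append((bank, not reused_later))  # auto-close on last use
    return issued
```

In hardware the "look ahead" is bounded by the pipeline depth; this sketch scans the whole remaining list for clarity.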

    Synchronization of shader operation
    10.
    Invention Grant
    Synchronization of shader operation (In force)

    Publication Number: US09442780B2

    Publication Date: 2016-09-13

    Application Number: US13186236

    Application Date: 2011-07-19

    Applicant: Andrew Gruber

    Inventor: Andrew Gruber

    CPC classification number: G06F9/544 G06F9/52 G06T1/20 G06T1/60 G06T15/005

    Abstract: The example techniques described in this disclosure may be directed to synchronization between producer shaders and consumer shaders. For example, a graphics processing unit (GPU) may execute a producer shader to produce graphics data. After the completion of the production of graphics data, the producer shader may store a value indicative of the amount of produced graphics data. The GPU may execute one or more consumer shaders, after the storage of the value indicative of the amount of produced graphics data, to consume the produced graphics data.

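The producer/consumer ordering above — consumers run only after the producer has stored the count of produced data — can be sketched with threads standing in for shaders. A hypothetical model; the shared-state layout and the use of an event for the "count stored" signal are illustrative assumptions.

```python
# Sketch: the producer publishes its output count only after all data is
# written; consumers wait for the count, then consume exactly that much.

import threading

class SharedState:
    def __init__(self):
        self.data = []
        self.count = None
        self.count_ready = threading.Event()

def producer(state, items):
    for item in items:
        state.data.append(item)    # produce all graphics data first
    state.count = len(state.data)  # then store the amount produced
    state.count_ready.set()        # signal that the count is available

def consumer(state, results):
    state.count_ready.wait()       # consume only after the count is stored
    results.extend(state.data[: state.count])
```

Storing the count last is the synchronization point: a consumer that sees the count can rely on all of the counted data being present.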
