COMPUTE THREAD ARRAY GRANULARITY EXECUTION PREEMPTION
    2.
    Invention Application
    COMPUTE THREAD ARRAY GRANULARITY EXECUTION PREEMPTION (Pending, Published)

    Publication No.: US20130132711A1

    Publication Date: 2013-05-23

    Application No.: US13302962

    Filing Date: 2011-11-22

    IPC Classification: G06F9/38

    CPC Classification: G06F9/461

    Abstract: One embodiment of the present invention sets forth a technique for instruction-level and compute thread array granularity execution preemption. Preempting at the instruction level does not require any draining of the processing pipeline: no new instructions are issued and the context state is unloaded from the processing pipeline. When preemption is performed at a compute thread array boundary, the amount of context state to be stored is reduced because execution units within the processing pipeline complete execution of in-flight instructions and become idle. If the amount of time needed to complete execution of the in-flight instructions exceeds a threshold, then the preemption may dynamically change to be performed at the instruction level instead of at compute thread array granularity.

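    The dynamic choice the abstract describes can be sketched as a simple policy function. This is an illustrative model, not the patented implementation; the names `Granularity`, `choose_preemption`, and the time estimate are assumptions introduced here.

```python
from enum import Enum, auto

class Granularity(Enum):
    CTA_BOUNDARY = auto()   # let in-flight instructions drain; less state to save
    INSTRUCTION = auto()    # stop issuing immediately; unload full pipeline state

def choose_preemption(drain_time_estimate: float, threshold: float) -> Granularity:
    """Prefer CTA-boundary preemption, which reduces the context state to
    store, but fall back to instruction-level preemption when draining the
    in-flight instructions would exceed the latency threshold."""
    if drain_time_estimate <= threshold:
        return Granularity.CTA_BOUNDARY
    return Granularity.INSTRUCTION
```

    The point of the threshold is to bound preemption latency: CTA-boundary preemption is cheaper when work drains quickly, but an adversarially long-running CTA must not delay the context switch indefinitely.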

    Multi-Channel Time Slice Groups
    3.
    Invention Application
    Multi-Channel Time Slice Groups (Granted)

    Publication No.: US20130152093A1

    Publication Date: 2013-06-13

    Application No.: US13316334

    Filing Date: 2011-12-09

    IPC Classification: G06F9/46

    Abstract: A time slice group (TSG) is a grouping of different streams of work (referred to herein as "channels") that share the same context information. The set of channels belonging to a TSG is processed in a pre-determined order. However, when a channel stalls while processing, the next channel with independent work can be switched in to keep the parallel processing unit fully loaded. Importantly, because each channel in the TSG shares the same context information, a context switch operation is not needed when the processing of a particular channel in the TSG stops and the processing of the next channel in the TSG begins. Therefore, multiple independent streams of work are allowed to run concurrently within a single context, increasing utilization of parallel processing units.

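    The stall-skipping rotation within a TSG can be sketched as follows. This is a minimal model assuming a `Channel` with a `stalled` flag and a deque as the TSG's ordered channel set; the real hardware scheduler is not described at this level in the abstract.

```python
from collections import deque

class Channel:
    """One stream of work; all channels in a TSG share one context."""
    def __init__(self, name: str):
        self.name = name
        self.stalled = False

def next_runnable(tsg: deque) -> "Channel | None":
    """Walk the TSG's channels in their pre-determined order, skipping
    stalled ones. Because every channel shares the same context, moving
    to the next channel incurs no context-switch cost."""
    for _ in range(len(tsg)):
        ch = tsg[0]
        tsg.rotate(-1)          # advance the pre-determined order
        if not ch.stalled:
            return ch
    return None                 # all channels stalled; unit goes idle
```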

    COMPUTE TASK STATE ENCAPSULATION
    4.
    Invention Application
    COMPUTE TASK STATE ENCAPSULATION (Pending, Published)

    Publication No.: US20130117751A1

    Publication Date: 2013-05-09

    Application No.: US13292951

    Filing Date: 2011-11-09

    IPC Classification: G06F9/46

    Abstract: One embodiment of the present invention sets forth a technique for encapsulating compute task state that enables out-of-order scheduling and execution of the compute tasks. The scheduling circuitry organizes the compute tasks into groups based on priority levels. The compute tasks may then be selected for execution using different scheduling schemes. Each group is maintained as a linked list of pointers to compute tasks that are encoded as task metadata (TMD) stored in memory. A TMD encapsulates the state and parameters needed to initialize, schedule, and execute a compute task.

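    The structure the abstract describes, a TMD record plus per-priority linked lists, can be sketched like this. The field names (`entry_point`, `grid_dim`, `params`) are illustrative placeholders; the patent does not enumerate the TMD's actual contents in this abstract.

```python
from dataclasses import dataclass, field

@dataclass
class TMD:
    """Task metadata: encapsulates the state and parameters needed to
    initialize, schedule, and execute one compute task."""
    entry_point: int
    grid_dim: tuple
    priority: int
    params: dict = field(default_factory=dict)
    next: "TMD | None" = None   # linked-list pointer within a priority group

def enqueue(heads: dict, tails: dict, tmd: TMD) -> None:
    """Append a TMD to the linked list for its priority level, creating
    the list on first use. The scheduler can then pick from any priority
    group, enabling out-of-order selection across tasks."""
    p = tmd.priority
    if p not in heads:
        heads[p] = tails[p] = tmd
    else:
        tails[p].next = tmd
        tails[p] = tmd
```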

    CONTROLLING WORK DISTRIBUTION FOR PROCESSING TASKS
    5.
    Invention Application
    CONTROLLING WORK DISTRIBUTION FOR PROCESSING TASKS (Granted)

    Publication No.: US20130198759A1

    Publication Date: 2013-08-01

    Application No.: US13363350

    Filing Date: 2012-01-31

    IPC Classification: G06F9/46

    Abstract: A technique for controlling the distribution of compute task processing in a multi-threaded system encodes each processing task as task metadata (TMD) stored in memory. The TMD includes work distribution parameters specifying how the processing task should be distributed for processing. Scheduling circuitry selects a task for execution when entries of a work queue for the task have been written. The work distribution parameters may define the number of work queue entries needed before a cooperative thread array ("CTA") may be launched to process the work queue entries according to the compute task. The work distribution parameters may define the number of CTAs that are launched to process the same work queue entries. Finally, the work distribution parameters may define a step size that is used to update pointers to the work queue entries.

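    The three work distribution parameters (entries needed per launch, CTAs per launch, and pointer step size) can be combined in one sketch. The function name and the positive-step assumption are mine; the abstract does not fix how the parameters interact.

```python
def maybe_launch(entries_written: int, read_ptr: int,
                 coalesce_count: int, ctas_per_batch: int,
                 step_size: int):
    """Launch CTAs whenever at least `coalesce_count` unread work-queue
    entries have accumulated. Each launch starts `ctas_per_batch` CTAs
    over the same span of entries, then advances the read pointer by
    `step_size` (assumed positive). Returns (launches, new_read_ptr),
    where each launch is an (start, end) entry span."""
    launches = []
    while entries_written - read_ptr >= coalesce_count:
        launches.extend((read_ptr, read_ptr + coalesce_count)
                        for _ in range(ctas_per_batch))
        read_ptr += step_size
    return launches, read_ptr
```

    A step size smaller than the coalesce count gives overlapping spans (several CTAs see shared entries); equal values partition the queue cleanly.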

    SYSTEM AND METHOD FOR USING DOMAINS TO IDENTIFY DEPENDENT AND INDEPENDENT OPERATIONS
    6.
    Invention Application
    SYSTEM AND METHOD FOR USING DOMAINS TO IDENTIFY DEPENDENT AND INDEPENDENT OPERATIONS (Granted)

    Publication No.: US20130070760A1

    Publication Date: 2013-03-21

    Application No.: US13233927

    Filing Date: 2011-09-15

    IPC Classification: H04L12/56

    CPC Classification: H04L47/215 G06F9/5066

    Abstract: One embodiment of the present invention is a control unit for distributing packets of work to one or more consumers of work. The control unit is configured to: assign at least one processing domain from a set of processing domains to each of the one or more consumers; receive a plurality of packets of work from at least one producer of work, wherein each packet of work is associated with a processing domain from the set of processing domains, and a first packet of work associated with a first processing domain can be processed by the one or more consumers independently of a second packet of work associated with a second processing domain; identify a first consumer that has been assigned the first processing domain; and transmit the first packet of work to the first consumer for processing.

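    The control unit's two duties, assigning domains to consumers and routing each packet to a consumer holding its domain, can be sketched as below. The round-robin assignment policy is an assumption for illustration; the abstract only requires that each consumer receive at least one domain.

```python
def assign_domains(consumers: list, domains: list) -> dict:
    """Assign each processing domain to a consumer round-robin, so every
    consumer holds at least one domain when domains >= consumers."""
    assignment = {}
    for i, d in enumerate(domains):
        c = consumers[i % len(consumers)]
        assignment.setdefault(c, set()).add(d)
    return assignment

def route(packet_domain: str, assignment: dict):
    """Identify a consumer assigned the packet's domain. Packets in
    different domains are independent, so they may be routed to
    different consumers and processed concurrently."""
    for consumer, domains in assignment.items():
        if packet_domain in domains:
            return consumer
    raise LookupError(f"no consumer assigned domain {packet_domain!r}")
```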