APPROACH FOR A CONFIGURABLE PHASE-BASED PRIORITY SCHEDULER

    Publication No.: US20170192822A9

    Publication date: 2017-07-06

    Application No.: US14613339

    Filing date: 2015-02-03

    CPC classification number: G06F9/5038 G06F9/3851 G06F9/3887 G06F9/4881

    Abstract: A streaming multiprocessor (SM) in a parallel processing subsystem schedules priority among a plurality of threads. The SM retrieves a priority descriptor associated with a thread group, and determines whether the thread group and a second thread group are both operating in the same phase. If so, then the method determines whether the priority descriptor of the thread group indicates a higher priority than the priority descriptor of the second thread group. If so, the SM skews the thread group relative to the second thread group such that the thread groups operate in different phases; otherwise, the SM increases the priority of the thread group. If the thread groups are not operating in the same phase, then the SM increases the priority of the thread group. One advantage of the disclosed techniques is that thread groups execute with increased efficiency, resulting in improved processor performance.
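
The decision flow in this abstract can be sketched roughly as follows. This is an illustration only, not the patented implementation: the `ThreadGroup` fields, the `NUM_PHASES` constant, the convention that a larger descriptor value means higher priority, and modeling a "skew" as moving the group to the next phase index are all assumptions made for the example.

```python
from dataclasses import dataclass

NUM_PHASES = 4  # assumed number of phases; the abstract does not give one


@dataclass
class ThreadGroup:
    name: str
    phase: int         # hypothetical index of the phase the group is in
    descriptor: int    # hypothetical priority descriptor, larger = higher
    priority: int = 0  # current scheduling priority


def schedule_step(group: ThreadGroup, other: ThreadGroup) -> str:
    """Apply one scheduling decision to `group` relative to `other`."""
    if group.phase == other.phase and group.descriptor > other.descriptor:
        # Same phase and higher descriptor: skew the group so the two
        # groups no longer operate in the same phase (assumed mechanism).
        group.phase = (group.phase + 1) % NUM_PHASES
        return "skewed"
    # Same phase without a higher descriptor, or different phases:
    # increase the priority of the group.
    group.priority += 1
    return "boosted"
```

Calling `schedule_step(a, b)` with two groups in the same phase either separates their phases or raises the priority of `a`, matching the two branches the abstract describes.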

    HARDWARE SCHEDULING OF ORDERED CRITICAL CODE SECTIONS
    2.
    Invention application
    Status: Granted

    Publication No.: US20140123150A1

    Publication date: 2014-05-01

    Application No.: US13660741

    Filing date: 2012-10-25

    Abstract: One embodiment sets forth a technique for scheduling the execution of ordered critical code sections by multiple threads. A multithreaded processor includes an instruction scheduling unit that is configured to schedule threads to process ordered critical code sections. An ordered critical code section is preceded by a barrier instruction, and when all of the threads have reached the barrier instruction, the instruction scheduling unit controls the thread execution order by selecting each thread for execution based on logical identifiers associated with the threads. The logical identifiers are mapped to physical identifiers that are referenced by the multithreaded processor during execution of the threads. The logical identifiers are used by the instruction scheduling unit to control the order in which the threads execute the ordered critical code section.
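
A minimal software sketch of the ordering idea, under stated assumptions: the function name `run_ordered_section`, the dictionary mapping logical to physical identifiers, and the choice of ascending logical identifier as the execution order are hypothetical. The abstract only says that selection is based on the logical identifiers.

```python
def run_ordered_section(logical_to_physical, arrived, critical_section):
    """Run `critical_section` once per thread, in logical-identifier order.

    logical_to_physical: dict mapping logical id -> physical thread id
    arrived: set of physical thread ids that have reached the barrier
    critical_section: callable taking a physical thread id
    Returns the physical ids in the order they were executed.
    """
    # Barrier: nothing runs until every mapped thread has arrived.
    if set(arrived) != set(logical_to_physical.values()):
        return []

    order = []
    # Assumed policy: ascending logical id decides who runs next.
    for logical_id in sorted(logical_to_physical):
        physical_id = logical_to_physical[logical_id]
        critical_section(physical_id)
        order.append(physical_id)
    return order
```

The point of the indirection is that the execution order follows the logical numbering even when the physical identifiers are assigned in some unrelated order.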

    PROGRAMMABLE GRAPHICS PROCESSOR FOR MULTITHREADED EXECUTION OF PROGRAMS
    3.
    Invention application
    Status: Granted

    Publication No.: US20160300319A9

    Publication date: 2016-10-13

    Application No.: US13850175

    Filing date: 2013-03-25

    CPC classification number: G06T1/20 G06F9/38 G06F9/3851

    Abstract: A processing unit includes multiple execution pipelines, each of which is coupled to a first input section for receiving input data for pixel processing and a second input section for receiving input data for vertex processing and to a first output section for storing processed pixel data and a second output section for storing processed vertex data. The processed vertex data is rasterized and scan converted into pixel data that is used as the input data for pixel processing. The processed pixel data is output to a raster analyzer.

    APPROACH FOR A CONFIGURABLE PHASE-BASED PRIORITY SCHEDULER
    4.
    Invention application
    Status: Under examination (published)

    Publication No.: US20160224386A1

    Publication date: 2016-08-04

    Application No.: US14613339

    Filing date: 2015-02-03

    CPC classification number: G06F9/5038 G06F9/3851 G06F9/3887 G06F9/4881

    Abstract: A streaming multiprocessor (SM) in a parallel processing subsystem schedules priority among a plurality of threads. The SM retrieves a priority descriptor associated with a thread group, and determines whether the thread group and a second thread group are both operating in the same phase. If so, then the method determines whether the priority descriptor of the thread group indicates a higher priority than the priority descriptor of the second thread group. If so, the SM skews the thread group relative to the second thread group such that the thread groups operate in different phases; otherwise, the SM increases the priority of the thread group. If the thread groups are not operating in the same phase, then the SM increases the priority of the thread group. One advantage of the disclosed techniques is that thread groups execute with increased efficiency, resulting in improved processor performance.

    TREE-BASED THREAD MANAGEMENT
    6.
    Invention application

    Publication No.: US20150205607A1

    Publication date: 2015-07-23

    Application No.: US14160334

    Filing date: 2014-01-21

    CPC classification number: G06F9/3851 G06F9/3887 G06F9/528

    Abstract: In one embodiment of the present invention, a streaming multiprocessor (SM) uses a tree of nodes to manage threads. Each node specifies a set of active threads and a program counter. Upon encountering a conditional instruction that causes an execution path to diverge, the SM creates child nodes corresponding to each of the divergent execution paths. Based on the conditional instruction, the SM assigns each active thread included in the parent node to at most one child node, and the SM temporarily discontinues executing instructions specified by the parent node. Instead, the SM concurrently executes instructions specified by the child nodes. After all the divergent paths reconverge to the parent path, the SM resumes executing instructions specified by the parent node. Advantageously, the disclosed techniques enable the SM to execute divergent paths in parallel, thereby reducing undesirable program behavior associated with conventional techniques that serialize divergent paths across thread groups.
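
The node tree described here might be modeled along the following lines. This is a simplified sketch, not the disclosed hardware design: the `Node` class, the `diverge` and `reconverge` helpers, and the restriction to a two-way branch are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    active: set            # ids of the threads active at this node
    pc: int                # program counter for this node
    children: list = field(default_factory=list)
    suspended: bool = False


def diverge(parent, predicate, taken_pc, fallthrough_pc):
    """Split the parent's active threads across child nodes at a branch."""
    taken = {t for t in parent.active if predicate(t)}
    not_taken = parent.active - taken
    # Each active thread lands in at most one child; empty paths get no node.
    parent.children = [
        Node(threads, pc)
        for threads, pc in ((taken, taken_pc), (not_taken, fallthrough_pc))
        if threads
    ]
    # The parent stops issuing instructions while its children execute.
    parent.suspended = True
    return parent.children


def reconverge(parent, join_pc):
    """Resume the parent once all divergent paths have rejoined."""
    parent.children = []
    parent.pc = join_pc
    parent.suspended = False
```

In this model the child nodes are independent units a scheduler could issue from concurrently, which is the property the abstract contrasts with serializing divergent paths.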

    APPROACH FOR A CONFIGURABLE PHASE-BASED PRIORITY SCHEDULER
    7.
    Invention application
    Status: Granted

    Publication No.: US20140189698A1

    Publication date: 2014-07-03

    Application No.: US13728828

    Filing date: 2012-12-27

    CPC classification number: G06F9/5038 G06F9/3851 G06F9/3887 G06F9/4881

    Abstract: A streaming multiprocessor (SM) in a parallel processing subsystem schedules priority among a plurality of threads. The SM retrieves a priority descriptor associated with a thread group, and determines whether the thread group and a second thread group are both operating in the same phase. If so, then the method determines whether the priority descriptor of the thread group indicates a higher priority than the priority descriptor of the second thread group. If so, the SM skews the thread group relative to the second thread group such that the thread groups operate in different phases; otherwise, the SM increases the priority of the thread group. If the thread groups are not operating in the same phase, then the SM increases the priority of the thread group. One advantage of the disclosed techniques is that thread groups execute with increased efficiency, resulting in improved processor performance.

    TREE-BASED THREAD MANAGEMENT
    8.
    Invention application
    Status: Granted

    Publication No.: US20150205606A1

    Publication date: 2015-07-23

    Application No.: US14160329

    Filing date: 2014-01-21

    CPC classification number: G06F9/3851 G06F9/3887 G06F9/528

    Abstract: In one embodiment of the present invention, a streaming multiprocessor (SM) uses a tree of nodes to manage threads. Each node specifies a set of active threads and a program counter. Upon encountering a conditional instruction that causes an execution path to diverge, the SM creates child nodes corresponding to each of the divergent execution paths. Based on the conditional instruction, the SM assigns each active thread included in the parent node to at most one child node, and the SM temporarily discontinues executing instructions specified by the parent node. Instead, the SM concurrently executes instructions specified by the child nodes. After all the divergent paths reconverge to the parent path, the SM resumes executing instructions specified by the parent node. Advantageously, the disclosed techniques enable the SM to execute divergent paths in parallel, thereby reducing undesirable program behavior associated with conventional techniques that serialize divergent paths across thread groups.

    PROGRAMMABLE GRAPHICS PROCESSOR FOR MULTITHREADED EXECUTION OF PROGRAMS
    9.
    Invention application
    Status: Granted

    Publication No.: US20140285500A1

    Publication date: 2014-09-25

    Application No.: US13850175

    Filing date: 2013-03-25

    CPC classification number: G06T1/20 G06F9/38 G06F9/3851

    Abstract: A processing unit includes multiple execution pipelines, each of which is coupled to a first input section for receiving input data for pixel processing and a second input section for receiving input data for vertex processing and to a first output section for storing processed pixel data and a second output section for storing processed vertex data. The processed vertex data is rasterized and scan converted into pixel data that is used as the input data for pixel processing. The processed pixel data is output to a raster analyzer.
