APPROACH FOR A CONFIGURABLE PHASE-BASED PRIORITY SCHEDULER
    1.
    Invention application
    Status: under examination (published)

    Publication No.: US20160224386A1

    Publication date: 2016-08-04

    Application No.: US14613339

    Filing date: 2015-02-03

    CPC classification number: G06F9/5038 G06F9/3851 G06F9/3887 G06F9/4881

    Abstract: A streaming multiprocessor (SM) in a parallel processing subsystem schedules priority among a plurality of threads. The SM retrieves a priority descriptor associated with a thread group, and determines whether the thread group and a second thread group are both operating in the same phase. If so, then the method determines whether the priority descriptor of the thread group indicates a higher priority than the priority descriptor of the second thread group. If so, the SM skews the thread group relative to the second thread group such that the thread groups operate in different phases; otherwise, the SM increases the priority of the thread group. If the thread groups are not operating in the same phase, then the SM increases the priority of the thread group. One advantage of the disclosed techniques is that thread groups execute with increased efficiency, resulting in improved processor performance.
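
    The decision flow in this abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the patented implementation: the `ThreadGroup` fields, the integer encoding of phases and priorities (larger means higher), the constant `NUM_PHASES`, and the choice to model "skewing" as advancing the group to the next phase are all hypothetical.

```python
from dataclasses import dataclass

NUM_PHASES = 2  # assumption: phases are numbered 0 .. NUM_PHASES - 1


@dataclass
class ThreadGroup:
    name: str
    descriptor_priority: int  # value read from the priority descriptor (assumed: larger = higher)
    phase: int                # phase the group is currently operating in
    priority: int = 0         # scheduling priority that the SM adjusts


def schedule_step(group: ThreadGroup, other: ThreadGroup) -> str:
    """Apply the abstract's decision once for `group` relative to `other`."""
    if group.phase == other.phase:
        if group.descriptor_priority > other.descriptor_priority:
            # Same phase and higher descriptor priority: skew the group so
            # the two groups operate in different phases (modelled here as
            # moving `group` to the next phase, which is an assumption).
            group.phase = (group.phase + 1) % NUM_PHASES
            return "skewed"
        # Same phase, but not higher priority: raise the group's priority.
        group.priority += 1
        return "boosted"
    # Different phases: raise the group's priority.
    group.priority += 1
    return "boosted"
```

    For example, two groups that start in phase 0 with descriptor priorities 2 and 1 are first separated by a skew; a second call for the same pair then takes the different-phase branch and raises the priority instead.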

    TECHNIQUES FOR EFFICIENTLY SYNCHRONIZING MULTIPLE PROGRAM THREADS

    Publication No.: US20220391264A1

    Publication date: 2022-12-08

    Application No.: US17338377

    Filing date: 2021-06-03

    Abstract: Various embodiments include a parallel processing computer system that enables parallel instances of a program to synchronize at disparate addresses in memory. When the parallel program instances need to exchange data, the program instances synchronize based on a mask that identifies the program instances that are synchronizing. As each program instance reaches the point of synchronization, the program instance blocks and waits for all other program instances to reach the point of synchronization. When all program instances have reached the point of synchronization, at least one program instance executes a synchronous operation to exchange data. The program instances then continue execution at respective and disparate return addresses.
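
    A rough software analogue of this mask-based synchronization can be written with Python threads. It is a sketch under assumptions rather than the mechanism the application claims: the class `MaskSync`, the helper `run_demo`, the use of a summation as the "synchronous operation", and the string labels standing in for return addresses are all hypothetical.

```python
import threading


class MaskSync:
    """One-shot rendezvous for the program instances named in `mask`."""

    def __init__(self, mask):
        self.mask = mask      # bit i set => instance i takes part
        self.arrived = 0      # bits of instances that reached the sync point
        self.values = {}      # data contributed by each instance
        self.result = None    # outcome of the exchange, set exactly once
        self.cond = threading.Condition()

    def sync(self, lane, value, return_address):
        bit = 1 << lane
        if not self.mask & bit:
            raise ValueError(f"instance {lane} is not in the mask")
        with self.cond:
            self.values[lane] = value
            self.arrived |= bit
            if self.arrived == self.mask:
                # The last arrival performs the exchange for everyone
                # (a sum is used here purely as a stand-in operation).
                self.result = sum(self.values.values())
                self.cond.notify_all()
            else:
                # Block until every instance in the mask has arrived.
                self.cond.wait_for(lambda: self.result is not None)
        # Each instance resumes at its own, possibly different, address.
        return self.result, return_address


def run_demo():
    barrier = MaskSync(mask=0b1101)  # instances 0, 2 and 3 synchronize
    plan = {0: (10, "after_load"), 2: (20, "after_store"), 3: (30, "after_branch")}
    resumed = {}

    def instance(lane, value, return_address):
        resumed[lane] = barrier.sync(lane, value, return_address)

    threads = [threading.Thread(target=instance, args=(lane, v, ra))
               for lane, (v, ra) in plan.items()]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return resumed
```

    Calling `run_demo()` starts three instances that each block in `sync` until the mask is fully covered; every instance then receives the same exchanged value but its own return label.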

    APPROACH FOR A CONFIGURABLE PHASE-BASED PRIORITY SCHEDULER
    3.
    Invention application
    Status: granted

    Publication No.: US20140189698A1

    Publication date: 2014-07-03

    Application No.: US13728828

    Filing date: 2012-12-27

    CPC classification number: G06F9/5038 G06F9/3851 G06F9/3887 G06F9/4881

    Abstract: A streaming multiprocessor (SM) in a parallel processing subsystem schedules priority among a plurality of threads. The SM retrieves a priority descriptor associated with a thread group, and determines whether the thread group and a second thread group are both operating in the same phase. If so, then the method determines whether the priority descriptor of the thread group indicates a higher priority than the priority descriptor of the second thread group. If so, the SM skews the thread group relative to the second thread group such that the thread groups operate in different phases; otherwise, the SM increases the priority of the thread group. If the thread groups are not operating in the same phase, then the SM increases the priority of the thread group. One advantage of the disclosed techniques is that thread groups execute with increased efficiency, resulting in improved processor performance.

    APPROACH FOR A CONFIGURABLE PHASE-BASED PRIORITY SCHEDULER

    Publication No.: US20170192822A9

    Publication date: 2017-07-06

    Application No.: US14613339

    Filing date: 2015-02-03

    CPC classification number: G06F9/5038 G06F9/3851 G06F9/3887 G06F9/4881

    Abstract: A streaming multiprocessor (SM) in a parallel processing subsystem schedules priority among a plurality of threads. The SM retrieves a priority descriptor associated with a thread group, and determines whether the thread group and a second thread group are both operating in the same phase. If so, then the method determines whether the priority descriptor of the thread group indicates a higher priority than the priority descriptor of the second thread group. If so, the SM skews the thread group relative to the second thread group such that the thread groups operate in different phases; otherwise, the SM increases the priority of the thread group. If the thread groups are not operating in the same phase, then the SM increases the priority of the thread group. One advantage of the disclosed techniques is that thread groups execute with increased efficiency, resulting in improved processor performance.

    DYNAMICALLY DETECTING UNIFORMITY AND ELIMINATING REDUNDANT COMPUTATIONS TO REDUCE POWER CONSUMPTION
    5.
    Invention application
    Status: under examination (published)

    Publication No.: US20150100764A1

    Publication date: 2015-04-09

    Application No.: US14048647

    Filing date: 2013-10-08

    CPC classification number: G06F9/30072 G06F9/3836 G06F9/3851 G06F9/3887

    Abstract: One embodiment of the present invention includes techniques to decrease power consumption by reducing the number of redundant operations performed. In operation, a streaming multiprocessor (SM) identifies uniform groups of threads that, when executed, apply the same deterministic operation to uniform sets of input operands. Within each uniform group of threads, the SM designates one thread as the anchor thread. The SM disables execution units assigned to all of the threads except the anchor thread. The anchor execution unit, assigned to the anchor thread, executes the operation on the uniform set of input operands. Subsequently, the SM sets the outputs of the non-anchor threads included in the uniform group of threads to equal the value of the anchor execution unit output. Advantageously, by exploiting the uniformity of data to reduce the number of execution units that execute, the SM dramatically reduces the power consumption compared to conventional SMs.
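
    The anchor-thread idea can be illustrated with a small software model. This is a sketch under assumptions, not the hardware design described in the application: the function name `execute_with_uniformity`, the grouping of threads by identical operand tuples, and the choice of the lowest thread index as the anchor are all hypothetical. The returned execution count stands in for the number of execution units that stay enabled.

```python
from collections import defaultdict


def execute_with_uniformity(op, operands_per_thread):
    """Run `op` once per uniform group and copy the result to the rest.

    `op` must be deterministic; `operands_per_thread[i]` holds the input
    operands of thread i. Returns (outputs, number_of_executions).
    """
    # Threads whose operand sets are identical form one uniform group.
    groups = defaultdict(list)
    for tid, operands in enumerate(operands_per_thread):
        groups[tuple(operands)].append(tid)

    outputs = [None] * len(operands_per_thread)
    executions = 0
    for operands, tids in groups.items():
        anchor = tids[0]                 # assumed anchor choice: lowest thread index
        outputs[anchor] = op(*operands)  # only the anchor's unit executes
        executions += 1
        for tid in tids[1:]:
            # Non-anchor threads take the anchor's output without executing.
            outputs[tid] = outputs[anchor]
    return outputs, executions
```

    With four threads whose operands are (1, 2), (1, 2), (3, 4) and (1, 2), an addition is evaluated twice instead of four times, and the three threads sharing (1, 2) all receive the anchor's result.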