DYNAMICALLY DETECTING UNIFORMITY AND ELIMINATING REDUNDANT COMPUTATIONS TO REDUCE POWER CONSUMPTION
    1.
    Patent application
    Status: under examination (published)

    Publication number: US20150100764A1

    Publication date: 2015-04-09

    Application number: US14048647

    Filing date: 2013-10-08

    CPC classification number: G06F9/30072 G06F9/3836 G06F9/3851 G06F9/3887

    Abstract: One embodiment of the present invention includes techniques to decrease power consumption by reducing the number of redundant operations performed. In operation, a streaming multiprocessor (SM) identifies uniform groups of threads that, when executed, apply the same deterministic operation to uniform sets of input operands. Within each uniform group of threads, the SM designates one thread as the anchor thread. The SM disables execution units assigned to all of the threads except the anchor thread. The anchor execution unit, assigned to the anchor thread, executes the operation on the uniform set of input operands. Subsequently, the SM sets the outputs of the non-anchor threads included in the uniform group of threads to equal the value of the anchor execution unit output. Advantageously, by exploiting the uniformity of data to reduce the number of execution units that execute, the SM dramatically reduces the power consumption compared to conventional SMs.
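
    The anchor-thread idea in the abstract can be modeled in software. The following is a minimal sketch, not the patented hardware mechanism: it assumes each thread's operands are available as a tuple, treats threads with identical operand tuples as one uniform group, runs the operation once for the first (anchor) thread of each group, and copies that result to the other threads. The function name `execute_with_uniformity` and the execution counter are hypothetical and exist only for illustration.

```python
from typing import Callable, Dict, List, Sequence, Tuple


def execute_with_uniformity(
    op: Callable[..., int],
    operands_per_thread: Sequence[Tuple[int, ...]],
) -> Tuple[List[int], int]:
    """Software model (assumption): apply one deterministic op across threads.

    Returns (outputs, executed), where `executed` counts how many times
    the operation actually ran, i.e. the number of anchor threads.
    """
    anchors: Dict[Tuple[int, ...], int] = {}  # operand tuple -> anchor thread id
    outputs: List[int] = [0] * len(operands_per_thread)
    executed = 0

    for tid, operands in enumerate(operands_per_thread):
        key = tuple(operands)
        if key not in anchors:
            # First thread seen with these operands becomes the anchor
            # and is the only one in its group that executes the op.
            anchors[key] = tid
            outputs[tid] = op(*key)
            executed += 1
        else:
            # Non-anchor thread: reuse the anchor's output instead of executing.
            outputs[tid] = outputs[anchors[key]]

    return outputs, executed
```

    With six threads, four of which add (1, 2) and two of which add (3, 4), the operation runs only twice in this model, while all six threads still receive a result.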


    EFFICIENCY THROUGH A DISTRIBUTED INSTRUCTION SET ARCHITECTURE
    2.
    Patent application
    Status: under examination (published)

    Publication number: US20150113254A1

    Publication date: 2015-04-23

    Application number: US14061666

    Filing date: 2013-10-23

    CPC classification number: G06F9/3836

    Abstract: A subsystem is configured to support a distributed instruction set architecture with primary and secondary execution pipelines. The primary execution pipeline supports the execution of a subset of instructions in the distributed instruction set architecture that are issued frequently. The secondary execution pipeline supports the execution of another subset of instructions in the distributed instruction set architecture that are issued less frequently. Both execution pipelines also support the execution of FFMA instructions as well as a common subset of instructions in the distributed instruction set architecture. When dispatching a requested instruction, an instruction scheduling unit is configured to select between the two execution pipelines based on various criteria. Those criteria may include the power efficiency with which the instruction can be executed and the availability of execution units to support execution of the instruction.
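
    The dispatch decision described in the abstract can be illustrated with a small sketch. It is an assumption-laden model rather than the claimed design: the `Pipeline` class, the `select_pipeline` function, the per-instruction energy table, and the single `busy` flag standing in for execution-unit availability are all hypothetical. Only FFMA is named in the abstract; any other opcode used with this code is a placeholder.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Optional, Sequence


@dataclass
class Pipeline:
    name: str
    supported: FrozenSet[str]       # opcodes this pipeline can execute
    energy_cost: Dict[str, float]   # hypothetical relative energy per opcode
    busy: bool = False              # stand-in for execution-unit availability


def select_pipeline(instr: str, pipelines: Sequence[Pipeline]) -> Optional[Pipeline]:
    """Pick a pipeline for `instr` using two of the criteria in the abstract.

    1. Availability: the pipeline must support the opcode and not be busy.
    2. Power efficiency: among available pipelines, prefer the lowest cost.
    Returns None when no pipeline can take the instruction right now.
    """
    candidates = [p for p in pipelines if instr in p.supported and not p.busy]
    if not candidates:
        return None
    # Opcodes missing from the energy table are treated as most expensive.
    return min(candidates, key=lambda p: p.energy_cost.get(instr, float("inf")))
```

    In this model an instruction supported by both pipelines, such as FFMA, goes to whichever available pipeline has the lower assumed energy cost, and falls back to the other pipeline when the preferred one is busy.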

