EFFICIENCY THROUGH A DISTRIBUTED INSTRUCTION SET ARCHITECTURE
    1.
    Invention application
    Status: under examination (published)

    Publication number: US20150113254A1

    Publication date: 2015-04-23

    Application number: US14061666

    Filing date: 2013-10-23

    CPC classification number: G06F9/3836

    Abstract: A subsystem is configured to support a distributed instruction set architecture with primary and secondary execution pipelines. The primary execution pipeline supports the execution of a subset of instructions in the distributed instruction set architecture that are issued frequently. The secondary execution pipeline supports the execution of another subset of instructions in the distributed instruction set architecture that are issued less frequently. Both execution pipelines also support the execution of FFMA instructions as well as a common subset of instructions in the distributed instruction set architecture. When dispatching a requested instruction, an instruction scheduling unit is configured to select between the two execution pipelines based on various criteria. Those criteria may include the power efficiency with which the instruction can be executed and the availability of execution units to support execution of the instruction.
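
The selection step described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the patented design: the opcode sets (other than FFMA), the `Pipeline` class, the `energy_per_op` figures, and the rule "pick the cheapest pipeline that has a free unit" are all assumptions made for the example.

```python
from dataclasses import dataclass

# Hypothetical opcode sets; the abstract names only FFMA explicitly.
PRIMARY_ONLY = {"IADD", "MOV"}      # assumed frequently issued instructions
SECONDARY_ONLY = {"RSQRT", "SIN"}   # assumed less frequently issued instructions
COMMON = {"FFMA", "FADD"}           # instructions both pipelines support


@dataclass
class Pipeline:
    name: str
    supported: set
    energy_per_op: float  # assumed relative energy cost, lower is better
    free_units: int       # execution units currently available

    def can_issue(self, opcode: str) -> bool:
        """True if this pipeline supports the opcode and has a free unit."""
        return opcode in self.supported and self.free_units > 0


def select_pipeline(opcode, pipelines):
    """Return the cheapest pipeline able to issue the opcode, or None to stall."""
    candidates = [p for p in pipelines if p.can_issue(opcode)]
    if not candidates:
        return None
    return min(candidates, key=lambda p: p.energy_per_op)


primary = Pipeline("primary", PRIMARY_ONLY | COMMON, energy_per_op=1.0, free_units=1)
secondary = Pipeline("secondary", SECONDARY_ONLY | COMMON, energy_per_op=1.5, free_units=1)
```

With both pipelines free, `select_pipeline("FFMA", [primary, secondary])` picks the primary pipeline on energy cost. If the primary has no free units, the same call falls back to the secondary pipeline, which is the availability criterion at work.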

    HIERARCHICAL STAGING AREAS FOR SCHEDULING THREADS FOR EXECUTION
    2.
    Invention application
    Status: under examination (published)

    Publication number: US20150113538A1

    Publication date: 2015-04-23

    Application number: US14061170

    Filing date: 2013-10-23

    CPC classification number: G06F9/5011 G06F2209/507

    Abstract: One embodiment of the present invention is a computer-implemented method for scheduling a thread group for execution on a processing engine. The method includes identifying a first thread group included in a first set of thread groups that can be issued for execution on the processing engine, where the first thread group includes one or more threads. The method also includes transferring the first thread group from the first set of thread groups to a second set of thread groups, allocating hardware resources to the first thread group, and selecting the first thread group from the second set of thread groups for execution on the processing engine. One advantage of the disclosed technique is that a scheduler allocates limited hardware resources only to thread groups that are, in fact, ready to be issued for execution, thereby conserving those resources in a manner that is generally more efficient than conventional techniques.
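
The two-level flow in this abstract (identify, transfer, allocate, select) might look something like the sketch below. Everything named here is hypothetical: `StagedScheduler`, the use of a register count as the "hardware resource", the fixed number of staging slots, and the first-in-first-out selection order are assumptions chosen to keep the example small.

```python
from collections import deque


class StagedScheduler:
    """Toy two-level scheduler: resources are held only by staged groups."""

    def __init__(self, staging_slots, registers):
        self.pending = deque()   # first set: groups that can be issued
        self.staged = deque()    # second set: groups holding resources
        self.running = {}        # issued groups and the registers they hold
        self.staging_slots = staging_slots
        self.free_registers = registers  # stand-in for hardware resources

    def submit(self, group_id, registers_needed):
        self.pending.append((group_id, registers_needed))

    def stage(self):
        """Transfer groups to the staged set while slots and registers allow."""
        while self.pending and len(self.staged) < self.staging_slots:
            group_id, need = self.pending[0]
            if need > self.free_registers:
                break  # not enough resources yet; leave the group pending
            self.pending.popleft()
            self.free_registers -= need  # allocate on transfer, not earlier
            self.staged.append((group_id, need))

    def issue(self):
        """Select the oldest staged group for execution, or return None."""
        if not self.staged:
            return None
        group_id, need = self.staged.popleft()
        self.running[group_id] = need
        return group_id

    def retire(self, group_id):
        """Release the registers of a group that has finished executing."""
        self.free_registers += self.running.pop(group_id)
```

Because allocation happens in `stage()` rather than `submit()`, a group waiting in `pending` holds no registers, which is the resource-conserving property the abstract claims.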

    TECHNIQUE FOR PERFORMING ARBITRARY WIDTH INTEGER ARITHMETIC OPERATIONS USING FIXED WIDTH ELEMENTS
    3.
    Invention application
    Status: granted

    Publication number: US20150081753A1

    Publication date: 2015-03-19

    Application number: US14026829

    Filing date: 2013-09-13

    CPC classification number: G06F7/525 G06F2207/3824

    Abstract: One embodiment of the present invention includes a method for performing arithmetic operations on arbitrary width integers using fixed width elements. The method includes receiving a plurality of input operands, segmenting each input operand into multiple sectors, performing a plurality of multiply-add operations based on the multiple sectors to generate a plurality of multiply-add operation results, and combining the multiply-add operation results to generate a final result. One advantage of the disclosed embodiments is that, by using a common fused floating point multiply-add unit to perform arithmetic operations on integers of arbitrary width, the method avoids the area and power penalty of having additional dedicated integer units.
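
The segment, multiply-add, and combine steps can be illustrated for multiplication with a short sketch. It is an assumption-laden toy rather than the claimed hardware method: the 16-bit sector width, the function names, and the use of Python floats (an unfused multiply followed by an add) in place of a fused floating point multiply-add unit are all choices made for the example. With 16-bit sectors every product is below 2^32, so the floating point arithmetic stays exact for operands of any practical size.

```python
SECTOR_BITS = 16  # assumed sector width; sector products fit exactly in a double


def split(value, bits=SECTOR_BITS):
    """Segment a non-negative integer into little-endian fixed-width sectors."""
    mask = (1 << bits) - 1
    sectors = []
    while True:
        sectors.append(float(value & mask))
        value >>= bits
        if value == 0:
            return sectors


def wide_multiply(a, b, bits=SECTOR_BITS):
    """Multiply two non-negative integers using float multiply-add on sectors."""
    xs, ys = split(a, bits), split(b, bits)
    columns = [0.0] * (len(xs) + len(ys))
    for i, x in enumerate(xs):
        for j, y in enumerate(ys):
            # Multiply-add on floating point values: acc = x * y + acc.
            # Python rounds twice here, but both steps are exact below 2**53.
            columns[i + j] = x * y + columns[i + j]
    # Combine: weight each column by its sector position and sum.
    result = 0
    for k, column in enumerate(columns):
        result += int(column) << (k * bits)
    return result
```

Python integers are already arbitrary width, so the final combine step leans on them for carry propagation; a hardware version would have to handle carries between columns explicitly.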
