32. Processing Unit Incorporating Issue Rate-Based Predictive Thermal Management

    Type: Invention patent application

    Status: Under examination (published)

    Publication number: US20090182986A1

    Publication date: 2009-07-16

    Application number: US12015174

    Filing date: 2008-01-16

    Abstract: A circuit arrangement and method utilize an issue rate-based predictive thermal management technique in a microprocessor or other integrated circuit that tracks the rate in which instructions are issued to one or more execution units in the processing unit, and selectively delays the issuance of subsequent instructions to the execution unit(s) based upon the tracked issue rate to predictively control the thermal output of the integrated circuit.
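The issue-rate tracking described in this abstract can be illustrated with a toy model. The sketch below is not the patented circuit: it assumes a sliding window of recent cycles and a fixed issue budget per window, and the names `IssueRateThrottle`, `window`, and `max_issues` are hypothetical.

```python
from collections import deque


class IssueRateThrottle:
    """Toy model: delay issue when too many instructions were issued recently."""

    def __init__(self, window=8, max_issues=6):
        # Assumed policy: at most max_issues issues in any `window` consecutive cycles.
        self.max_issues = max_issues
        self.history = deque(maxlen=window)  # 1 = issued, 0 = delayed or idle

    def try_issue(self, wants_issue=True):
        # Tracked issue rate: number of issues over the preceding `window` cycles.
        recent = sum(self.history)
        issue = wants_issue and recent < self.max_issues
        self.history.append(1 if issue else 0)
        return issue


def simulate(cycles, window=8, max_issues=6):
    """Return the per-cycle issue decisions for a stream that always wants to issue."""
    throttle = IssueRateThrottle(window, max_issues)
    return [throttle.try_issue() for _ in range(cycles)]
```

With `window=4` and `max_issues=2`, a saturated instruction stream is allowed to issue in bursts of two and is delayed in between, which caps the modeled activity (and, by assumption, the heat it stands for) before any temperature sensor would react.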

    33. Method and Apparatus for an Area Efficient Transcendental Estimate Algorithm

    Type: Invention patent application

    Status: Lapsed

    Publication number: US20090070398A1

    Publication date: 2009-03-12

    Application number: US11851658

    Filing date: 2007-09-07

    CPC classification number: G06F7/548

    Abstract: A method, computer-readable medium, and an apparatus for generating a transcendental value. The method includes receiving an input containing an input value and an opcode and determining whether the opcode corresponds to a trigonometric operation or a power-of-two operation. The method also includes calculating a fractional value and an integer value from the input value, generating the transcendental value based on the fractional value by adding at least a portion of the fractional value with at least one of a shifted fractional value produced by shifting the portion of the fractional value and a constant value, and providing the transcendental value in response to the request. In this fashion, the same circuit area may be used to carry out both trigonometric and power-of-two calculations, leading to greater circuit area savings and performance advantages while not sacrificing significant accuracy.
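To make the integer/fraction split concrete, here is a minimal sketch of a power-of-two estimate. It is an assumption-laden illustration, not the claimed method: it uses the simplest estimate of this shape, adding the constant 1 to the fractional part, and it omits the shifted-fraction correction terms and the trigonometric path. The function name `exp2_estimate` is hypothetical.

```python
import math


def exp2_estimate(x: float) -> float:
    """Estimate 2**x by splitting x into integer and fractional parts."""
    i = math.floor(x)   # integer part: becomes the result's exponent
    f = x - i           # fractional part, always in [0, 1)
    # Assumed estimate: 2**f is approximated by 1 + f (fraction plus a constant).
    return math.ldexp(1.0 + f, i)  # (1 + f) * 2**i
```

The estimate is exact at integer inputs and stays within roughly 6.2% relative error elsewhere, which shows the area-versus-accuracy trade the abstract mentions: one adder and an exponent adjustment instead of a multiplier or a large lookup table.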

    34. Operand Multiplexor Control Modifier Instruction in a Fine Grain Multithreaded Vector Microprocessor

    Type: Invention patent application

    Status: Lapsed

    Publication number: US20080122854A1

    Publication date: 2008-05-29

    Application number: US11564072

    Filing date: 2006-11-28

    CPC classification number: G06T1/20

    Abstract: The present invention is generally related to the field of image processing, and more specifically to an instruction set for processing images. Vector processing may involve rearranging vector operands in one or more source registers prior to performing vector operations. Typically, rearranging of operands in source registers is done by issuing a plurality of permute instructions that require excessive usage of temporary registers. Furthermore, the permute instructions may cause dependencies between instructions executing in a pipeline, thereby adversely affecting performance. Embodiments of the invention provide a level of muxing between a register file and a vector unit that allow for rearrangement of vector operands in source registers prior to providing the operands to the vector unit, thereby obviating the need for permute instructions.

    35. EXECUTION UNIT WITH INLINE PSEUDORANDOM NUMBER GENERATOR

    Type: Invention patent application

    Status: Under examination (published)

    Publication number: US20120303691A1

    Publication date: 2012-11-29

    Application number: US13556464

    Filing date: 2012-07-24

    CPC classification number: G06F9/3851 G06F9/30014 G06F9/30181

    Abstract: A circuit arrangement and method couple a hardware-based pseudorandom number generator (PRNG) to an execution unit in such a manner that pseudorandom numbers generated by the PRNG may be selectively output to the execution unit for use as an operand during the execution of instructions by the execution unit. A PRNG may be coupled to an input of an operand multiplexer that outputs to an operand input of an execution unit so that operands provided by instructions supplied to the execution unit are selectively overridden with pseudorandom numbers generated by the PRNG. Furthermore, overridden operands provided by instructions supplied to the execution unit may be used as seed values for the PRNG.
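The operand override described in this abstract can be modeled in a few lines. The sketch below is illustrative only: the abstract does not say which generator is used, so a 16-bit Galois LFSR with taps `0xB400` is assumed here, and `Lfsr16`, `select_operand`, `use_prng`, and `reseed` are hypothetical names.

```python
class Lfsr16:
    """16-bit Galois LFSR (taps 0xB400), standing in for the hardware PRNG."""

    def __init__(self, seed=0xACE1):
        # An all-zero state would lock the LFSR, so fall back to 1.
        self.state = (seed & 0xFFFF) or 1

    def next(self):
        lsb = self.state & 1
        self.state >>= 1
        if lsb:
            self.state ^= 0xB400
        return self.state


def select_operand(instr_operand, use_prng, reseed, prng):
    """Operand mux: pass the instruction's operand through, or override it."""
    if not use_prng:
        return instr_operand
    if reseed:
        # The overridden operand is not wasted: it becomes the PRNG seed.
        prng.state = (instr_operand & 0xFFFF) or 1
    return prng.next()
```

Under these assumptions, an instruction whose operand is overridden still issues normally; the execution unit simply sees a pseudorandom value where the register operand would have been.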

    37. Dynamic merging of pipeline stages in an execution pipeline to reduce power consumption

    Type: Granted invention patent

    Status: In force

    Publication number: US08291201B2

    Publication date: 2012-10-16

    Application number: US12125135

    Filing date: 2008-05-22

    Abstract: A pipelined execution unit incorporates one or more low power modes that reduce power consumption by dynamically merging pipeline stages in an execution pipeline together with one another. In particular, the execution logic in successive pipeline stages in an execution pipeline may be dynamically merged together by setting one or more latches that are intermediate to such pipeline stages to a transparent state such that the output of the pipeline stage preceding such latches is passed to the subsequent pipeline stage during the same clock cycle so that both such pipeline stages effectively perform steps for the same instruction during each clock cycle. Then, with the selected pipeline stages merged, the power consumption of the execution pipeline can be reduced (e.g., by reducing the clock frequency and/or operating voltage of the execution pipeline), often with minimal adverse impact on performance.

    38. Area efficient transcendental estimate algorithm

    Type: Granted invention patent

    Status: Lapsed

    Publication number: US08275821B2

    Publication date: 2012-09-25

    Application number: US11851658

    Filing date: 2007-09-07

    CPC classification number: G06F7/548

    Abstract: A method, computer-readable medium, and an apparatus for generating a transcendental value. The method includes receiving an input containing an input value and an opcode and determining whether the opcode corresponds to a trigonometric operation or a power-of-two operation. The method also includes calculating a fractional value and an integer value from the input value, generating the transcendental value based on the fractional value by adding at least a portion of the fractional value with at least one of a shifted fractional value produced by shifting the portion of the fractional value and a constant value, and providing the transcendental value in response to the request. In this fashion, the same circuit area may be used to carry out both trigonometric and power-of-two calculations, leading to greater circuit area savings and performance advantages while not sacrificing significant accuracy.

    39. Tree Insertion Depth Adjustment Based on View Frustrum and Distance Culling

    Type: Invention patent application

    Status: In force

    Publication number: US20120236001A1

    Publication date: 2012-09-20

    Application number: US13476876

    Filing date: 2012-05-21

    CPC classification number: G06T15/06 G06T17/005

    Abstract: A computer-implemented method includes initializing a driver associated with an input/output adapter in response to receiving an initialize driver request from a client application. The computer-implemented method includes initializing the input/output adapter to enable adapter capabilities of the input/output adapter to be determined. The computer-implemented method also includes determining the adapter capabilities of the input/output adapter. The computer-implemented method further includes determining slot capabilities of a slot associated with the input/output adapter. The computer-implemented method also includes setting configurable capabilities of the input/output adapter based on the adapter capabilities and the slot capabilities.

    40. Structural power reduction in multithreaded processor

    Type: Granted invention patent

    Status: Lapsed

    Publication number: US08140830B2

    Publication date: 2012-03-20

    Application number: US12125278

    Filing date: 2008-05-22

    CPC classification number: G06F9/5044 G06F9/3851 G06F9/5094 Y02D10/22

    Abstract: A circuit arrangement and method utilize a plurality of execution units having different power and performance characteristics and capabilities within a multithreaded processor core, and selectively route instructions having different performance requirements to different execution units based upon those performance requirements. As such, instructions that have high performance requirements, such as instructions associated with primary tasks or time sensitive tasks, can be routed to a higher performance execution unit to maximize performance when executing those instructions, while instructions that have low performance requirements, such as instructions associated with background tasks or non-time sensitive tasks, can be routed to a reduced power execution unit to reduce the power consumption (and associated heat generation) associated with executing those instructions.
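The routing policy in this abstract amounts to a per-instruction dispatch decision. The following is a minimal sketch under stated assumptions: each instruction is assumed to carry a single `time_sensitive` flag standing in for its performance requirement, and the names `Unit`, `Instruction`, `route`, and `dispatch` are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Unit(Enum):
    HIGH_PERF = "high-performance execution unit"
    LOW_POWER = "reduced-power execution unit"


@dataclass
class Instruction:
    opcode: str
    time_sensitive: bool  # assumed stand-in for the performance requirement


def route(instr: Instruction) -> Unit:
    """Send time-sensitive work to the fast unit, everything else to the frugal one."""
    return Unit.HIGH_PERF if instr.time_sensitive else Unit.LOW_POWER


def dispatch(stream):
    """Group an instruction stream into per-unit queues of opcodes, in order."""
    queues = {unit: [] for unit in Unit}
    for instr in stream:
        queues[route(instr)].append(instr.opcode)
    return queues
```

In a real core the requirement would more plausibly come from thread priority or task type than from a per-instruction flag; the flag only keeps the sketch short.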
