Hybrid queue for storing instructions from fetch queue directly in out-of-order queue or temporarily in in-order queue until space is available
    1.
    Granted invention patent
    Legal status: Active

    Publication No.: US09164772B2

    Publication date: 2015-10-20

    Application No.: US13357652

    Filing date: 2012-01-25

    IPC classification: G06F9/30 G06F9/38

    Abstract: A queuing apparatus having a hierarchy of queues, in one of a number of aspects, is configured to control backpressure between processors in a multiprocessor system. A fetch queue is coupled to an instruction cache and configured to store first instructions for a first processor and second instructions for a second processor in an order fetched from the instruction cache. An in-order queue is coupled to the fetch queue and configured to store the second instructions accepted from the fetch queue in response to a write indication. An out-of-order queue is coupled to the fetch queue and to the in-order queue and configured to store the second instructions accepted from the fetch queue in response to an indication that space is available in the out-of-order queue, wherein the second instructions may be accessed out-of-order with respect to other second instructions executing on different execution pipelines.
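The routing described in this abstract can be sketched as a small software model. This is an illustrative sketch only, not the patented circuit: the class name `HybridQueue`, the capacities, and the rule that an instruction bypasses the in-order queue only when nothing older is parked there are assumptions made for the example.

```python
from collections import deque


class HybridQueue:
    """Toy model (hypothetical): route fetched instructions to an
    out-of-order queue when it has room, else park them in order."""

    def __init__(self, ooo_capacity, in_order_capacity):
        self.ooo_capacity = ooo_capacity
        self.in_order_capacity = in_order_capacity
        self.in_order = deque()  # FIFO holding area
        self.ooo = []            # entries may leave in any order

    def drain(self):
        """Move parked instructions into the out-of-order queue, oldest first."""
        while self.in_order and len(self.ooo) < self.ooo_capacity:
            self.ooo.append(self.in_order.popleft())

    def accept(self, instr):
        """Take one instruction from the fetch queue.

        Returns False when both queues are full (backpressure)."""
        self.drain()
        if not self.in_order and len(self.ooo) < self.ooo_capacity:
            self.ooo.append(instr)       # space available: bypass in-order queue
            return True
        if len(self.in_order) < self.in_order_capacity:
            self.in_order.append(instr)  # park until space opens up
            return True
        return False                     # stall the fetch queue

    def issue(self, instr):
        """Remove any instruction from the out-of-order queue, then refill it."""
        self.ooo.remove(instr)
        self.drain()
```

Issuing an instruction from the middle of the out-of-order queue frees a slot, and the oldest parked instruction takes it, which preserves fetch order on the way in while allowing out-of-order access afterwards.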

    Mode-based multiply-add recoding for denormal operands
    2.
    Granted invention patent
    Legal status: Lapsed

    Publication No.: US08447800B2

    Publication date: 2013-05-21

    Application No.: US13026335

    Filing date: 2011-02-14

    IPC classification: G06F7/38

    Abstract: In a denormal support mode, the normalization circuit of a floating-point adder is used to normalize or denormalize the output of a floating-point multiplier. Each floating-point multiply instruction is speculatively converted to a multiply-add instruction, with the addend forced to zero. This preserves the value of the product, while normalizing or denormalizing the product using the floating-point adder's normalization circuit. When the operands to the multiply operation are available, they are inspected. If the operands will not generate an unnormal intermediate product or a denormal final product, the add operation is suppressed, such as by operand-forwarding. Additionally, each non-fused floating-point multiply-add instruction is replaced with a multiply-add instruction having a zero addend, and a floating-point add instruction having the addend of the original multiply-add instruction is inserted into the instruction stream. Upon inspection of the operands, if an unnormal intermediate result or a denormal final result will not occur, the addend may be restored to the multiply-add instruction and the add instruction converted to a NOP.
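The operand inspection step can be illustrated with a conservative software check on IEEE 754 doubles. This is a sketch under assumptions, not the hardware test from the patent: the function names are hypothetical, and the exponent bound is one simple way to decide that a product is guaranteed to be normal.

```python
import math
import sys

MIN_NORMAL = sys.float_info.min  # smallest positive normal double, 2**-1022


def is_denormal(x):
    """True for nonzero values below the normal range."""
    return x != 0.0 and abs(x) < MIN_NORMAL


def needs_adder_pass(a, b):
    """Conservatively decide whether a*b may need the adder's normalizer.

    Returns False only when the product is guaranteed to be zero or normal,
    which is the case where the extra add could be suppressed."""
    if a == 0.0 or b == 0.0:
        return False
    if is_denormal(a) or is_denormal(b):
        return True  # a denormal input can give an unnormal intermediate product
    # frexp gives x = m * 2**e with 0.5 <= |m| < 1, so |a*b| >= 2**(ea + eb - 2).
    _, ea = math.frexp(a)
    _, eb = math.frexp(b)
    # The product is certainly normal when 2**(ea + eb - 2) >= 2**-1022.
    return ea + eb - 2 < sys.float_info.min_exp - 1
```

The check errs on the side of keeping the add: it may request the adder pass for a product that turns out to be normal, but it never suppresses the pass for a product that underflows into the denormal range.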

    Floating-point processor with selectable subprecision
    3.
    Granted invention patent
    Legal status: Active

    Publication No.: US07725519B2

    Publication date: 2010-05-25

    Application No.: US11244492

    Filing date: 2005-10-05

    IPC classification: G06F7/483

    Abstract: A floating-point processor with selectable subprecision includes a register configured to store a plurality of bits in a floating-point format, a controller, and a floating-point mathematical operator. The controller is configured to select a subprecision for a floating-point operation, in response to user input. The controller is configured to determine a subset of the bits, in accordance with the selected subprecision. The floating-point operator is configured to perform the floating-point operation using only the subset of the bits. Excess bits that are not used in the floating-point operation may be forced into a low-leakage state. The output value resulting from the floating-point operation is either truncated or rounded to the selected subprecision.
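As a rough software analogue of selecting a subset of the bits, the sketch below truncates the fraction of a single-precision value to a chosen number of bits. The function name and the choice of IEEE 754 binary32 are assumptions for illustration; the abstract does not specify a format, and only the truncation option (not rounding) is shown.

```python
import struct


def truncate_to_subprecision(x, fraction_bits):
    """Keep only the top `fraction_bits` of a float32's 23-bit fraction.

    The discarded low-order bits are forced to zero (truncation)."""
    if not 0 <= fraction_bits <= 23:
        raise ValueError("fraction_bits must be between 0 and 23")
    # Reinterpret the float32 encoding as an unsigned 32-bit integer.
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    dropped = 23 - fraction_bits
    mask = (0xFFFFFFFF << dropped) & 0xFFFFFFFF
    return struct.unpack("<f", struct.pack("<I", bits & mask))[0]
```

For example, 1.75 is 1.11 in binary, so keeping a single fraction bit yields 1.1 in binary, which is 1.5.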

    Latency insensitive FIFO signaling protocol
    4.
    Granted invention patent
    Legal status: Active

    Publication No.: US07454538B2

    Publication date: 2008-11-18

    Application No.: US11128135

    Filing date: 2005-05-11

    IPC classification: G06F3/00

    Abstract: Data from a source domain operating at a first data rate is transferred to a FIFO in another domain operating at a different data rate. The FIFO buffers data before transfer to a sink for further processing or storage. A source side counter tracks space available in the FIFO. In disclosed examples, the initial counter value corresponds to FIFO depth. The counter decrements in response to a data ready signal from the source domain, without delay. The counter increments in response to signaling from the sink domain of a read of data off the FIFO. Hence, incrementing is subject to the signaling latency between domains. The source may send one more beat of data when the counter indicates the FIFO is full. The last beat of data is continuously sent from the source until it is indicated that a FIFO position became available; effectively providing one more FIFO position.
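The source-side bookkeeping in this abstract resembles a credit counter with one extra held beat. The following is a minimal sketch under assumptions: the class and method names are hypothetical, clock-domain crossing and latency are not modelled, and a read notification that arrives while a beat is held is assumed to be consumed by that beat.

```python
class SourceSide:
    """Hypothetical model of the source-side space counter."""

    def __init__(self, fifo_depth):
        self.space = fifo_depth  # initial counter value equals the FIFO depth
        self.holding = False     # True while one extra beat is held on the bus

    def try_send(self):
        """Attempt to send one beat; returns 'sent', 'held', or 'stalled'."""
        if self.space > 0:
            self.space -= 1      # decrement immediately on data ready
            return "sent"
        if not self.holding:
            self.holding = True  # counter says full: drive one more beat and hold it
            return "held"
        return "stalled"

    def on_read_signal(self):
        """Handle the (delayed) sink-domain notification of a FIFO read."""
        if self.holding:
            self.holding = False  # the freed position captures the held beat
        else:
            self.space += 1
```

Because decrements happen at once and increments arrive late, the counter can only underestimate free space, so the source never overruns the FIFO regardless of the signalling latency.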

    MODE-BASED MULTIPLY-ADD RECODING FOR DENORMAL OPERANDS
    5.
    Invention patent application
    Legal status: Lapsed

    Publication No.: US20110137970A1

    Publication date: 2011-06-09

    Application No.: US13026335

    Filing date: 2011-02-14

    IPC classification: G06F7/487 G06F7/485

    Abstract: In a denormal support mode, the normalization circuit of a floating-point adder is used to normalize or denormalize the output of a floating-point multiplier. Each floating-point multiply instruction is speculatively converted to a multiply-add instruction, with the addend forced to zero. This preserves the value of the product, while normalizing or denormalizing the product using the floating-point adder's normalization circuit. When the operands to the multiply operation are available, they are inspected. If the operands will not generate an unnormal intermediate product or a denormal final product, the add operation is suppressed, such as by operand-forwarding. Additionally, each non-fused floating-point multiply-add instruction is replaced with a multiply-add instruction having a zero addend, and a floating-point add instruction having the addend of the original multiply-add instruction is inserted into the instruction stream. Upon inspection of the operands, if an unnormal intermediate result or a denormal final result will not occur, the addend may be restored to the multiply-add instruction and the add instruction converted to a NOP.

    Software Selectable Adjustment of SIMD Parallelism
    6.
    Invention patent application
    Legal status: Active

    Publication No.: US20100146315A1

    Publication date: 2010-06-10

    Application No.: US12706987

    Filing date: 2010-02-17

    Abstract: Selective power control of one or more processing elements matches a degree of parallelism to requirements of a task performed in a highly parallel programmable data processor. For example, when program operations require less than the full width of the data path, a software instruction of the program sets a mode of operation requiring a subset of the parallel processing capacity. At least one parallel processing element, that is not needed, can be shut down to conserve power. At a later time, when the added capacity is needed, execution of another software instruction sets the mode of operation to that of the wider data path, typically the full width, and the mode change reactivates the previously shut-down processing element.
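The mode-setting idea can be pictured with a toy SIMD unit whose active width is set by software. Everything here is hypothetical (the class `SimdUnit`, the method `set_width`, the lane counts); it models only the bookkeeping of which lanes are active, not actual power gating.

```python
class SimdUnit:
    """Toy SIMD unit with a software-selectable number of active lanes."""

    def __init__(self, lanes=8):
        self.lanes = lanes
        self.active = lanes  # start at the full data-path width

    def set_width(self, active):
        """Stand-in for the mode-setting instruction."""
        if not 1 <= active <= self.lanes:
            raise ValueError("active lane count out of range")
        self.active = active

    def powered_down(self):
        """Number of lanes that could be shut down in the current mode."""
        return self.lanes - self.active

    def add(self, a, b):
        """Element-wise add on the active lanes only."""
        return [x + y for x, y in zip(a[:self.active], b[:self.active])]
```

A program that needs only half the data path would call `set_width` with the smaller count before the narrow section and restore the full width afterwards.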

    Latency Insensitive FIFO Signaling Protocol
    7.
    Invention patent application
    Legal status: Active

    Publication No.: US20080281996A1

    Publication date: 2008-11-13

    Application No.: US12179970

    Filing date: 2008-07-25

    IPC classification: G06F13/38 G06F3/00

    Abstract: Data from a source domain operating at a first data rate is transferred to a FIFO in another domain operating at a different data rate. The FIFO buffers data before transfer to a sink for further processing or storage. A source side counter tracks space available in the FIFO. In disclosed examples, the initial counter value corresponds to FIFO depth. The counter decrements in response to a data ready signal from the source domain, without delay. The counter increments in response to signaling from the sink domain of a read of data off the FIFO. Hence, incrementing is subject to the signaling latency between domains. The source may send one more beat of data when the counter indicates the FIFO is full. The last beat of data is continuously sent from the source until it is indicated that a FIFO position became available; effectively providing one more FIFO position.

    Method and system for reducing power consumption of a programmable processor
    8.
    Granted invention patent
    Legal status: Active

    Publication No.: US07386747B2

    Publication date: 2008-06-10

    Application No.: US11126442

    Filing date: 2005-05-10

    IPC classification: G06F1/32

    Abstract: Control logic monitors use of a particular functional element (e.g., a divider, or multiplier or the like) in a programmable processor, and the control logic powers the unit down when it has not been used for a specified time period. A counter (local or central) and time threshold determine when the period has elapsed without use of the element. The control logic also monitors how soon the functional unit is woken up again, to determine if power control is causing thrashing. Upon the determination of such thrashing, the unit automatically adjusts its threshold period, to minimize thrashing. In an example of the logic, when it determines that it is being too conservative, it lowers the threshold. Mode bits may allow the programmer to override the power-down logic to either keep the logic always powered-up, or always powered-down.
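The adaptive threshold can be sketched as a per-cycle state machine. This is an assumed policy for illustration, not the one claimed in the patent: the class `PowerGate`, the doubling and halving steps, and the `thrash_window` test are all hypothetical, and the mode-bit overrides are omitted.

```python
class PowerGate:
    """Hypothetical idle-timeout power gate with thrash detection."""

    def __init__(self, threshold=8, thrash_window=4,
                 min_threshold=1, max_threshold=1024):
        self.threshold = threshold          # idle cycles before power-down
        self.thrash_window = thrash_window  # wake-ups sooner than this count as thrashing
        self.min_threshold = min_threshold
        self.max_threshold = max_threshold
        self.powered = True
        self.idle = 0        # consecutive unused cycles while powered
        self.off_cycles = 0  # cycles spent powered down

    def tick(self, used):
        """Advance one cycle; `used` says whether the unit is needed."""
        if used:
            if not self.powered:
                if self.off_cycles < self.thrash_window:
                    # Woken too soon: wait longer before the next power-down.
                    self.threshold = min(self.threshold * 2, self.max_threshold)
                else:
                    # Stayed off a long time: the timeout was too conservative.
                    self.threshold = max(self.threshold // 2, self.min_threshold)
                self.powered = True
            self.idle = 0
        elif self.powered:
            self.idle += 1
            if self.idle >= self.threshold:
                self.powered = False
                self.off_cycles = 0
        else:
            self.off_cycles += 1
```

Raising the threshold after a quick wake-up reduces thrashing, while lowering it after a long sleep lets the unit power down sooner next time.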

    Software selectable adjustment of SIMD parallelism
    9.
    Granted invention patent
    Legal status: Active

    Publication No.: US08799627B2

    Publication date: 2014-08-05

    Application No.: US13350949

    Filing date: 2012-01-16

    IPC classification: G06F9/00 G06F1/00

    Abstract: Selective power control of one or more processing elements matches a degree of parallelism to requirements of a task performed in a highly parallel programmable data processor. For example, when program operations require less than the full width of the data path, a software instruction of the program sets a mode of operation requiring a subset of the parallel processing capacity. At least one parallel processing element, that is not needed, can be shut down to conserve power. At a later time, when the added capacity is needed, execution of another software instruction sets the mode of operation to that of the wider data path, typically the full width, and the mode change reactivates the previously shut-down processing element.

    Processor with Hazard Tracking Employing Register Range Compares
    10.
    Invention patent application
    Legal status: Under examination (published)

    Publication No.: US20130173886A1

    Publication date: 2013-07-04

    Application No.: US13343010

    Filing date: 2012-01-04

    IPC classification: G06F9/30 G06F9/38

    CPC classification: G06F9/3838

    Abstract: Systems and methods for tracking data hazards in a processor. The processor comprises a pipelined architecture configured to execute a first instruction and a second instruction, wherein the second instruction is older than the first instruction. At least one of the first and second instructions comprises at least one operand expressed as a range of registers. Hazard detection logic is configured to compare the first instruction and the second instruction to determine if there is a data hazard, prior to expanding the second instruction.
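Comparing register ranges directly, rather than expanding them into individual registers first, reduces to an interval-overlap test. The sketch below assumes each operand is an inclusive `(first, last)` pair of register numbers and checks only a read-after-write hazard; the function names are hypothetical.

```python
def ranges_overlap(a_first, a_last, b_first, b_last):
    """True if inclusive register ranges [a_first, a_last] and
    [b_first, b_last] share at least one register."""
    return a_first <= b_last and b_first <= a_last


def has_raw_hazard(older_dest, younger_sources):
    """Check a read-after-write hazard between an older instruction's
    destination range and a younger instruction's source ranges.

    Each range is an inclusive (first, last) tuple; a single register r
    is written as (r, r)."""
    return any(ranges_overlap(*older_dest, *src) for src in younger_sources)
```

Two comparisons per pair of ranges replace one comparison per pair of individual registers, which is the saving that range-based tracking aims at.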
