Hybrid queue for storing instructions from fetch queue directly in out-of-order queue or temporarily in in-order queue until space is available
    1.
    Granted invention patent
    Status: In force

    Publication No.: US09164772B2

    Publication date: 2015-10-20

    Application No.: US13357652

    Application date: 2012-01-25

    IPC classification: G06F9/30 G06F9/38

    Abstract: A queuing apparatus having a hierarchy of queues, in one of a number of aspects, is configured to control backpressure between processors in a multiprocessor system. A fetch queue is coupled to an instruction cache and configured to store first instructions for a first processor and second instructions for a second processor in an order fetched from the instruction cache. An in-order queue is coupled to the fetch queue and configured to store the second instructions accepted from the fetch queue in response to a write indication. An out-of-order queue is coupled to the fetch queue and to the in-order queue and configured to store the second instructions accepted from the fetch queue in response to an indication that space is available in the out-of-order queue, wherein the second instructions may be accessed out-of-order with respect to other second instructions executing on different execution pipelines.

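    The routing described in the abstract can be sketched as a toy software model. This is only an illustration under stated assumptions: the class name, the capacities, and the rule that an instruction bypasses the in-order queue only when nothing older is waiting there are all hypothetical and are not taken from the patent.

```python
from collections import deque


class HybridQueue:
    """Toy model: instructions go straight to the out-of-order queue when it
    has room, otherwise they wait in the in-order queue (assumed policy)."""

    def __init__(self, ooo_capacity, in_order_capacity):
        self.ooo_capacity = ooo_capacity
        self.in_order_capacity = in_order_capacity
        self.in_order = deque()  # FIFO overflow buffer
        self.ooo = []            # entries may be removed in any order

    def accept(self, instr):
        """Take one instruction from the fetch queue.

        Returns False when both queues are full, which models backpressure."""
        # Assumption: bypass only when no older instruction is still waiting.
        if not self.in_order and len(self.ooo) < self.ooo_capacity:
            self.ooo.append(instr)
            return True
        if len(self.in_order) < self.in_order_capacity:
            self.in_order.append(instr)
            return True
        return False

    def drain(self):
        """Move waiting instructions over as out-of-order space frees up."""
        while self.in_order and len(self.ooo) < self.ooo_capacity:
            self.ooo.append(self.in_order.popleft())

    def issue(self, instr):
        """Remove any instruction from the out-of-order queue, then refill."""
        self.ooo.remove(instr)
        self.drain()
```

    In this sketch, `accept` returning False stands in for the backpressure signal toward the fetch queue; real hardware would do this with handshake signals rather than return values.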

    Processor with a coprocessor having early access to not-yet issued instructions
    2.
    Granted invention patent
    Status: In force

    Publication No.: US09304774B2

    Publication date: 2016-04-05

    Application No.: US13363541

    Application date: 2012-02-01

    IPC classification: G06F9/30 G06F9/38

    Abstract: Apparatus and methods provide early access to instructions. A fetch queue is coupled to an instruction cache and configured to store a mix of processor instructions for a first processor and coprocessor instructions for a second processor. A coprocessor instruction selector is coupled to the fetch queue and configured to copy coprocessor instructions from the fetch queue. A queue is coupled to the coprocessor instruction selector, and coprocessor instructions are accessed from it for execution before the coprocessor instruction is issued to the first processor. Execution of the copied coprocessor instruction is started in the coprocessor before the coprocessor instruction is issued to a processor. The execution of the copied coprocessor instruction is completed based on information received from the processor after the coprocessor instruction has been issued to the processor.

    Processor with a Hybrid Instruction Queue
    3.
    Invention patent application
    Status: In force

    Publication No.: US20120204004A1

    Publication date: 2012-08-09

    Application No.: US13357652

    Application date: 2012-01-25

    Abstract: A queuing apparatus having a hierarchy of queues, in one of a number of aspects, is configured to control backpressure between processors in a multiprocessor system. A fetch queue is coupled to an instruction cache and configured to store first instructions for a first processor and second instructions for a second processor in an order fetched from the instruction cache. An in-order queue is coupled to the fetch queue and configured to store the second instructions accepted from the fetch queue in response to a write indication. An out-of-order queue is coupled to the fetch queue and to the in-order queue and configured to store the second instructions accepted from the fetch queue in response to an indication that space is available in the out-of-order queue, wherein the second instructions may be accessed out-of-order with respect to other second instructions executing on different execution pipelines.

    Processor with Hazard Tracking Employing Register Range Compares
    4.
    Invention patent application
    Status: Under examination (published)

    Publication No.: US20130173886A1

    Publication date: 2013-07-04

    Application No.: US13343010

    Application date: 2012-01-04

    IPC classification: G06F9/30 G06F9/38

    CPC classification: G06F9/3838

    Abstract: Systems and methods for tracking data hazards in a processor are described. The processor comprises a pipelined architecture configured to execute a first instruction and a second instruction, wherein the second instruction is older than the first instruction. At least one of the first and second instructions comprises at least one operand expressed as a range of registers. Hazard detection logic is configured to compare the first instruction and the second instruction to determine if there is a data hazard, prior to expanding the second instruction.

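    Comparing operands as register ranges, rather than expanding them into individual registers first, reduces to an interval-overlap test. The sketch below is a minimal illustration, assuming inclusive ranges of register numbers and a read-after-write check; `RegRange`, `ranges_overlap`, and `has_raw_hazard` are hypothetical names, not terms from the patent.

```python
from typing import NamedTuple


class RegRange(NamedTuple):
    first: int  # lowest register number covered by the operand
    last: int   # highest register number covered, inclusive


def ranges_overlap(a, b):
    """True when two inclusive register ranges share at least one register."""
    return a.first <= b.last and b.first <= a.last


def has_raw_hazard(older_dest, younger_sources):
    """Read-after-write check: does the younger instruction read any register
    that the older instruction writes? No per-register expansion is needed."""
    return any(ranges_overlap(older_dest, src) for src in younger_sources)
```

    Each comparison costs two integer compares regardless of how many registers a range covers, which is the appeal of comparing before expansion.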

    Processor with a Hybrid Instruction Queue with Instruction Elaboration Between Sections
    5.
    Invention patent application
    Status: Under examination (published)

    Publication No.: US20120204008A1

    Publication date: 2012-08-09

    Application No.: US13363555

    Application date: 2012-02-01

    IPC classification: G06F9/312 G06F9/38

    Abstract: Methods and apparatus for processing instructions by elaboration of instructions prior to issuing the instructions for execution are described. An instruction is received at a hybrid instruction queue comprising a first queue and a second queue. When the second queue has available space, the instruction is elaborated to expand one or more bit fields to reduce decoding complexity when the elaborated instruction is issued, wherein the elaborated instruction is stored in the second queue. When the second queue does not have available space, the instruction is stored in an unelaborated form in the first queue. The first queue is configured as an exemplary in-order queue and the second queue is configured as an exemplary out-of-order queue.

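    The elaborate-on-entry behaviour can be illustrated with a small sketch. The 16-bit instruction layout, the field names, and the `enqueue` helper are all assumptions made up for this example; the abstract does not specify an encoding.

```python
from collections import deque


def elaborate(word):
    """Expand a packed 16-bit word into explicit fields.

    The 4/4/4/4 bit layout is hypothetical and used for illustration only."""
    return {
        "opcode": (word >> 12) & 0xF,
        "dest": (word >> 8) & 0xF,
        "src_a": (word >> 4) & 0xF,
        "src_b": word & 0xF,
    }


def enqueue(word, first_queue, second_queue, second_capacity):
    """Store the instruction elaborated in the second queue when it has space,
    otherwise unelaborated in the first queue. Returns the queue used."""
    if len(second_queue) < second_capacity:
        second_queue.append(elaborate(word))
        return "second"
    first_queue.append(word)
    return "first"
```

    Keeping the first queue unelaborated means its entries stay narrow, and the wider expanded form is paid for only in the second queue.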

    Processor with a Coprocessor having Early Access to Not-Yet Issued Instructions
    6.
    Invention patent application
    Status: In force

    Publication No.: US20120204005A1

    Publication date: 2012-08-09

    Application No.: US13363541

    Application date: 2012-02-01

    IPC classification: G06F9/312

    Abstract: Apparatus and methods provide early access to instructions. A fetch queue is coupled to an instruction cache and configured to store a mix of processor instructions for a first processor and coprocessor instructions for a second processor. A coprocessor instruction selector is coupled to the fetch queue and configured to copy coprocessor instructions from the fetch queue. A queue is coupled to the coprocessor instruction selector, and coprocessor instructions are accessed from it for execution before the coprocessor instruction is issued to the first processor. Execution of the copied coprocessor instruction is started in the coprocessor before the coprocessor instruction is issued to a processor. The execution of the copied coprocessor instruction is completed based on information received from the processor after the coprocessor instruction has been issued to the processor.

    Mode-based multiply-add recoding for denormal operands
    7.
    Granted invention patent
    Status: Lapsed

    Publication No.: US08447800B2

    Publication date: 2013-05-21

    Application No.: US13026335

    Application date: 2011-02-14

    IPC classification: G06F7/38

    Abstract: In a denormal support mode, the normalization circuit of a floating-point adder is used to normalize or denormalize the output of a floating-point multiplier. Each floating-point multiply instruction is speculatively converted to a multiply-add instruction, with the addend forced to zero. This preserves the value of the product, while normalizing or denormalizing the product using the floating-point adder's normalization circuit. When the operands to the multiply operation are available, they are inspected. If the operands will not generate an unnormal intermediate product or a denormal final product, the add operation is suppressed, such as by operand-forwarding. Additionally, each non-fused floating-point multiply-add instruction is replaced with a multiply-add instruction having a zero addend, and a floating-point add instruction having the addend of the original multiply-add instruction is inserted into the instruction stream. Upon inspection of the operands, if an unnormal intermediate result or a denormal final result will not occur, the addend may be restored to the multiply-add instruction and the add instruction converted to a NOP.

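    The operand inspection step can be approximated in software. The sketch below is a conservative illustration on Python's double-precision floats, assuming finite operands; the function names and the exponent-sum test are assumptions for this example, not the circuit described in the patent. It answers one question: can the zero-addend add be suppressed, or might the product need the adder's normalization?

```python
import math
import sys

# frexp exponent of the smallest positive normal double
MIN_NORMAL_EXP = math.frexp(sys.float_info.min)[1]


def is_denormal(x):
    """True for nonzero values below the smallest normal magnitude."""
    return x != 0.0 and abs(x) < sys.float_info.min


def needs_normalizing_add(a, b):
    """Conservatively decide whether a * b may need normalization.

    Assumes finite inputs. A True result means the add must be kept;
    False means the product is certainly zero or a normal number."""
    if a == 0.0 or b == 0.0:
        return False  # exact zero product, nothing to normalize
    if is_denormal(a) or is_denormal(b):
        return True   # denormal input gives an unnormal intermediate product
    ea = math.frexp(a)[1]
    eb = math.frexp(b)[1]
    # The product's frexp exponent is ea + eb or ea + eb - 1; use the lower.
    return ea + eb - 1 < MIN_NORMAL_EXP
```

    The check looks only at exponents and a denormal flag, which mirrors the idea that the decision can be made as soon as the operands are available, before the product itself exists.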

    Floating-point processor with selectable subprecision
    8.
    Granted invention patent
    Status: In force

    Publication No.: US07725519B2

    Publication date: 2010-05-25

    Application No.: US11244492

    Application date: 2005-10-05

    IPC classification: G06F7/483

    Abstract: A floating-point processor with selectable subprecision includes a register configured to store a plurality of bits in a floating-point format, a controller, and a floating-point mathematical operator. The controller is configured to select a subprecision for a floating-point operation, in response to user input. The controller is configured to determine a subset of the bits, in accordance with the selected subprecision. The floating-point operator is configured to perform the floating-point operation using only the subset of the bits. Excess bits that are not used in the floating-point operation may be forced into a low-leakage state. The output value resulting from the floating-point operation is either truncated or rounded to the selected subprecision.

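    Truncating or rounding a value to a selected subprecision can be imitated by masking fraction bits of an IEEE 754 single-precision number. This is a minimal sketch assuming finite inputs and a 23-bit fraction; `to_subprecision` and its parameters are hypothetical, and the rounding shown (round half away from zero) is one possible choice that the abstract does not specify.

```python
import struct


def to_subprecision(value, fraction_bits, rounding=False):
    """Keep only the top `fraction_bits` of the 23-bit single-precision
    fraction, either truncating or rounding the discarded bits."""
    if not 0 <= fraction_bits <= 23:
        raise ValueError("fraction_bits must be between 0 and 23")
    # Reinterpret the float32 encoding as an unsigned 32-bit integer.
    bits = struct.unpack("<I", struct.pack("<f", value))[0]
    drop = 23 - fraction_bits
    if rounding and drop > 0:
        # Add half of the last kept bit; a carry may bump the exponent.
        bits += 1 << (drop - 1)
    # Clear the excess low-order fraction bits.
    bits &= ~((1 << drop) - 1) & 0xFFFFFFFF
    return struct.unpack("<f", struct.pack("<I", bits))[0]
```

    For example, 1.75 is binary 1.11; with one fraction bit kept it truncates to 1.5 and rounds to 2.0.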

    Latency insensitive FIFO signaling protocol
    9.
    Granted invention patent
    Status: In force

    Publication No.: US07454538B2

    Publication date: 2008-11-18

    Application No.: US11128135

    Application date: 2005-05-11

    IPC classification: G06F3/00

    Abstract: Data from a source domain operating at a first data rate is transferred to a FIFO in another domain operating at a different data rate. The FIFO buffers data before transfer to a sink for further processing or storage. A source side counter tracks space available in the FIFO. In disclosed examples, the initial counter value corresponds to FIFO depth. The counter decrements in response to a data ready signal from the source domain, without delay. The counter increments in response to signaling from the sink domain of a read of data off the FIFO. Hence, incrementing is subject to the signaling latency between domains. The source may send one more beat of data when the counter indicates the FIFO is full. The last beat of data is continuously sent from the source until it is indicated that a FIFO position has become available, effectively providing one more FIFO position.

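    The source-side counter behaves like a credit counter. The sketch below models only the counting rules stated in the abstract (start at FIFO depth, decrement on data ready, increment on a read notification from the sink); the class and method names are hypothetical, and the extra held beat that the abstract describes when the FIFO is full is deliberately not modeled.

```python
class SourceSpaceCounter:
    """Source-side view of free FIFO positions (illustrative model only)."""

    def __init__(self, fifo_depth):
        # The initial counter value corresponds to the FIFO depth.
        self.space = fifo_depth

    def can_send(self):
        """True while the source believes the FIFO has a free position."""
        return self.space > 0

    def on_data_ready(self):
        """Decrement immediately when the source sends a beat of data."""
        if self.space == 0:
            raise RuntimeError("no FIFO space known to the source")
        self.space -= 1

    def on_read_signal(self):
        """Increment when the sink's read notification arrives; in hardware
        this arrives late because of cross-domain signaling latency."""
        self.space += 1
```

    Because decrements are immediate and increments are delayed, the counter can only underestimate free space, so the source never overruns the FIFO regardless of the latency between domains.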

    Software selectable adjustment of SIMD parallelism
    10.
    Granted invention patent
    Status: In force

    Publication No.: US08799627B2

    Publication date: 2014-08-05

    Application No.: US13350949

    Application date: 2012-01-16

    IPC classification: G06F9/00 G06F1/00

    Abstract: Selective power control of one or more processing elements matches a degree of parallelism to requirements of a task performed in a highly parallel programmable data processor. For example, when program operations require less than the full width of the data path, a software instruction of the program sets a mode of operation requiring a subset of the parallel processing capacity. At least one parallel processing element that is not needed can be shut down to conserve power. At a later time, when the added capacity is needed, execution of another software instruction sets the mode of operation to that of the wider data path, typically the full width, and the mode change reactivates the previously shut-down processing element.

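    The mode-setting idea can be mimicked with a toy data path whose active width is set by software. Everything here is an assumption for illustration: `SimdDatapath`, `set_mode`, and the lane counts are hypothetical, and "powered down" is just a number in this model rather than actual power gating.

```python
class SimdDatapath:
    """Toy SIMD data path with a software-selectable number of active lanes."""

    def __init__(self, total_lanes):
        self.total_lanes = total_lanes
        self.active_lanes = total_lanes  # start at full width

    def set_mode(self, active_lanes):
        """Stand-in for the software instruction that selects the width."""
        if not 1 <= active_lanes <= self.total_lanes:
            raise ValueError("active_lanes out of range")
        self.active_lanes = active_lanes

    @property
    def powered_down(self):
        """Number of lanes that could be shut down in the current mode."""
        return self.total_lanes - self.active_lanes

    def vector_add(self, a, b):
        """Element-wise add using only the active lanes."""
        n = self.active_lanes
        return [x + y for x, y in zip(a[:n], b[:n])]
```

    Calling `set_mode(4)` on an 8-lane data path leaves four lanes idle, and a later `set_mode(8)` restores the full width.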