System and method for providing asynchronous dynamic millicode entry prediction
    1.
    发明授权
    System and method for providing asynchronous dynamic millicode entry prediction 失效
    提供异步动态millicode条目预测的系统和方法

    公开(公告)号:US07913068B2

    公开(公告)日:2011-03-22

    申请号:US12035109

    申请日:2008-02-21

    IPC分类号: G06F9/42

    摘要: A system and method for asynchronous dynamic millicode entry prediction in a processor are provided. The system includes a branch target buffer (BTB) to hold branch information. The branch information includes: a branch type indicating that the branch represents a millicode entry (mcentry) instruction targeting a millicode subroutine, and an instruction length code (ILC) associated with the mcentry instruction. The system also includes search logic to perform a method. The method includes locating a branch address in the BTB for the mcentry instruction targeting the millicode subroutine, and determining a return address to return from the millicode subroutine as a function of the an instruction address of the mcentry instruction and the ILC. The system further includes instruction fetch controls to fetch instructions of the millicode subroutine asynchronous to the search logic. The search logic may also operate asynchronous with respect to an instruction decode unit.

    摘要翻译: 提供了一种用于在处理器中进行异步动态毫代码入口预测的系统和方法。 该系统包括用于保存分支信息的分支目标缓冲器(BTB)。 分支信息包括:分支类型,指示分支表示针对毫微米子例程的毫米条目(中心)指令,以及与中心指令相关联的指令长度代码(ILC)。 该系统还包括执行方法的搜索逻辑。 该方法包括将BTB中的分支地址定位为针对毫秒子程序的中心指令,以及根据中心指令和ILC的指令地址确定返回地址以从millicode子程序返回。 该系统还包括指令获取控制以获取与搜索逻辑异步的毫微数子程序的指令。 搜索逻辑也可以相对于指令解码单元进行异步操作。

    Reduced overhead address mode change management in a pipelined, recycling microprocessor
    2.
    发明授权
    Reduced overhead address mode change management in a pipelined, recycling microprocessor 失效
    在流水线回收微处理器中减少开销地址模式更改管理

    公开(公告)号:US07971034B2

    公开(公告)日:2011-06-28

    申请号:US12051415

    申请日:2008-03-19

    IPC分类号: G06F9/00 G06F9/44 G06F9/30

    摘要: A method, system, and computer program product for reduced overhead address mode change management in a pipelined, recycling microprocessor are provided. The recycling microprocessor includes logic executing thereon. The microprocessor also includes an instruction fetch unit (IFU) supporting computation of address adds in selected address modes and reporting non-equal comparison of the computation to the logic. The microprocessor further includes a fixed point unit determining whether the mode has changed and reporting changes to the logic. Upon determining the comparison yields an equal result but the mode has changed, a recycle event is triggered to ensure subsequent ofetches are relaunched in the correct mode and that no execution writebacks occur from work performed in an incorrect mode. For comparisons yielding a non-equal result and a changed mode, the logic clears bits set in response to the determinations, and a serialization event is taken to reset a corresponding pipeline for operation in the correct mode.

    摘要翻译: 提供了一种用于在流水线式回收微处理器中减少开销地址模式改变管理的方法,系统和计算机程序产品。 回收微处理器包括在其上执行的逻辑。 微处理器还包括一个指令提取单元(IFU),用于支持在所选择的地址模式中添加的地址的计算,并且将该计算的不相等的比较报告给逻辑。 微处理器还包括确定模式是否改变并且对逻辑进行报告改变的固定点单元。 在确定比较时,产生相等的结果,但是模式已经改变,触发一个循环事件,以确保以正确的模式重新启动后续的提取,并且在不正确的模式下执行的工作不会执行执行回写。 为了比较产生不相等的结果和改变的模式,逻辑清除响应于确定的位设置,并且采用串行化事件来复位相应的流水线以便以正确的模式进行操作。

    Operand and result forwarding between differently sized operands in a superscalar processor
    3.
    发明授权
    Operand and result forwarding between differently sized operands in a superscalar processor 失效
    操作数和结果在超标量处理器中的不同大小的操作数之间转发

    公开(公告)号:US07921279B2

    公开(公告)日:2011-04-05

    申请号:US12051792

    申请日:2008-03-19

    IPC分类号: G06F9/30

    摘要: Result and operand forwarding is provided between differently sized operands in a superscalar processor by grouping a first set of instructions for operand forwarding, and grouping a second set of instructions for result forwarding, the first set of instructions comprising a first source instruction having a first operand and a first dependent instruction having a second operand, the first dependent instruction depending from the first source instruction; the second set of instructions comprising a second source instruction having a third operand and a second dependent instruction having a fourth operand, the second dependent instruction depending from the second source instruction, performing operand forwarding by forwarding the first operand, either whole or in part, as it is being read to the first dependent instruction prior to execution; performing result forwarding by forwarding a result of the second source instruction, either whole or in part, to the second dependent instruction, after execution; wherein the operand forwarding is performed by executing the first source instruction together with the first dependent instruction; and wherein the result forwarding is performed by executing the second source instruction together with the second dependent instruction.

    摘要翻译: 通过对用于操作数转发的第一组指令进行分组,以及对用于结果转发的第二组指令进行分组,在超标量处理器中的不同大小的操作数之间提供结果和操作数转发,所述第一组指令包括具有第一操作数的第一源指令 以及具有第二操作数的第一依赖指令,所述第一依赖指令取决于所述第一源指令; 所述第二组指令包括具有第三操作数和第二从属指令的第二源指令,所述第三操作数和第二从属指令具有第四操作数,所述第二依赖指令取决于所述第二源指令,通过转发所述第一操作数全部或部分地执行操作数转发, 因为它在执行之前被读取到第一个依赖指令; 执行结果转发,将第二源指令的结果全部或部分转发到第二依赖指令; 其中通过与第一依赖指令一起执行第一源指令来执行操作数转发; 并且其中通过与第二从属指令一起执行第二源指令来执行结果转发。

    Method and system for overlapping execution of instructions through non-uniform execution pipelines in an in-order processor
    4.
    发明授权
    Method and system for overlapping execution of instructions through non-uniform execution pipelines in an in-order processor 失效
    用于通过按顺序处理器中的非均匀执行流水线重复执行指令的方法和系统

    公开(公告)号:US07913067B2

    公开(公告)日:2011-03-22

    申请号:US12034084

    申请日:2008-02-20

    IPC分类号: G06F9/38

    摘要: A system and method for overlapping execution (OE) of instructions through non-uniform execution pipelines in an in-order processor are provided. The system includes a first execution unit to perform instruction execution in a first execution pipeline. The system also includes a second execution unit to perform instruction execution in a second execution pipeline, where the second execution pipeline includes a greater number of stages than the first execution pipeline. The system further includes an instruction dispatch unit (IDU), the IDU including OE registers and logic for dispatching an OE-capable instruction to the first execution unit such that the instruction completes execution prior to completing execution of a previously dispatched instruction to the second execution unit. The system additionally includes a latch to hold a result of the execution of the OE-capable instruction until after the second execution unit completes the execution of the previously dispatched instruction.

    摘要翻译: 提供了一种用于通过在顺序处理器中的非均匀执行管线来重复执行(OE)指令的系统和方法。 该系统包括在第一执行流水线中执行指令执行的第一执行单元。 该系统还包括第二执行单元,用于在第二执行流水线中执行指令执行,其中第二执行流水线包括比第一执行流水线更多的级数。 该系统还包括一个指令调度单元(IDU),该IDU包括OE寄存器和用于向第一执行单元分配一个OE能力指令的逻辑,使得指令在完成先前发送的指令执行之前完成执行 单元。 该系统还包括一个锁存器,用于保持执行OE能力指令的结果,直到第二执行单元完成先前发送的指令的执行。

    Method, computer program product, and hardware product for eliminating or reducing operand line crossing penalty
    5.
    发明授权
    Method, computer program product, and hardware product for eliminating or reducing operand line crossing penalty 有权
    方法,计算机程序产品和硬件产品,用于消除或减少操作线越界处罚

    公开(公告)号:US09201655B2

    公开(公告)日:2015-12-01

    申请号:US12051296

    申请日:2008-03-19

    IPC分类号: G06F13/00 G06F13/28 G06F9/38

    CPC分类号: G06F9/3824

    摘要: Eliminating or reducing an operand line crossing penalty by performing an initial fetch for an operand from a data cache of a processor. The initial fetch is performed by allowing or permitting the initial fetch to occur unaligned with reference to a quadword boundary. A plurality of subsequent fetches for a corresponding plurality of operands from the data cache are performed wherein each of the plurality of subsequent fetches is aligned to any of a plurality of quadword boundaries to prevent each of a plurality of individual fetch requests from spanning a plurality of lines in the data cache. A steady stream of data is maintained by placing an operand buffer at an output of the data cache to store and merge data from the initial fetch and the plurality of subsequent fetches, and to return the stored and merged data to the processor.

    摘要翻译: 通过从处理器的数据高速缓存中执行对操作数的初始提取来消除或减少操作数线路交叉处罚。 初始提取是通过允许或允许以参考四字边界未对齐的方式发生初始提取而执行的。 执行用于来自数据高速缓存的相应多个操作数的多个后续提取,其中多个后续提取中的每一个与多个四字边界中的任何一个对齐,以防止多个单独提取请求中的每一个跨越多个 数据缓存中的行。 通过在数据高速缓存的输出处放置操作数缓冲器来存储和合并来自初始提取和多个后续提取的数据并将存储和合并的数据返回到处理器来维持稳定的数据流。

    Recycling long multi-operand instructions
    6.
    发明授权
    Recycling long multi-operand instructions 失效
    回收长操作数指令

    公开(公告)号:US07962726B2

    公开(公告)日:2011-06-14

    申请号:US12051215

    申请日:2008-03-19

    IPC分类号: G06F9/00 G06F9/44

    CPC分类号: G06F9/30065 G06F9/3832

    摘要: A pipelined microprocessor configured for long operand instructions is disclosed. The microprocessor includes a memory unit and a load-store unit. The load store unit is coupled to the memory unit and includes a data formatter receiving information from the memory unit and including an operand selector and a shift register portion. The microprocessor also includes an execution unit coupled to the load-store unit and receiving operand information there from. The execution unit includes output latches coupled to a storage location within the execution unit for storing output information from the execution unit.

    摘要翻译: 公开了配置用于长操作数指令的流水线微处理器。 微处理器包括存储单元和加载存储单元。 加载存储单元耦合到存储器单元,并且包括从存储器单元接收信息并包括操作数选择器和移位寄存器部分的数据格式化器。 微处理器还包括耦合到加载存储单元并从其接收操作数信息的执行单元。 执行单元包括耦合到执行单元内的存储位置的输出锁存器,用于存储来自执行单元的输出信息。

    Decimal multiplication for superscaler processors
    7.
    发明授权
    Decimal multiplication for superscaler processors 失效
    超标量处理器的十进制乘法

    公开(公告)号:US07412476B2

    公开(公告)日:2008-08-12

    申请号:US11460296

    申请日:2006-07-27

    IPC分类号: G06F7/523

    CPC分类号: G06F9/3001 G06F7/496

    摘要: A method for decimal multiplication in a superscaler processor comprising: obtaining a first operand and a second operand; establishing a multiplier and an effective multiplicand from the first operand and the second operand; and generating and accumulating a partial product term every two cycles. The partial product terms are created from the effective multiplicand and multiples of the multiplier, where the effective multiplicand is stored in a first register file, the multiples being ones times the effective multiplier, two times the effective multiplier, four times the effective multiplier and eight times the effective multiplier and the partial product terms are added to an accumulation of previous partial product terms shifted one digit right such that a digit shifted off is preserved as a result digit.

    摘要翻译: 一种用于在超标量处理器中进行十进制相乘的方法,包括:获得第一操作数和第二操作数; 从第一个操作数和第二个操作数建立乘数和有效的被乘数; 并且每两个周期产生和累积部分乘积项。 部分乘积项是从乘法器的有效乘数和乘数创建的,其中有效被乘数存储在第一个寄存器文件中,倍数是有效乘数的倍数,有效乘数的两倍,有效乘数的四倍和八倍 乘以有效乘数和部分乘积项添加到前一个部分乘积项的累积中,该乘积项被移位一位数字,使得数字移位被保留为结果位。

    Exercise device
    8.
    发明授权
    Exercise device 失效
    运动装置

    公开(公告)号:US6095952A

    公开(公告)日:2000-08-01

    申请号:US311762

    申请日:1999-05-13

    摘要: The exercise device has a horizontal section for walking/hiking purposes as well as a climbing section to simulate a mountain climbing exercise. The climbing section has an endless belt of interconnected panels, each of which is provided with a plurality of blocks to provide hand/foot holds. The blocks are movable from a retracted position to extended positions by electromagnets which are programmed to effect actuation of one or more of the blocks in a predetermined pattern. The blocks are retracted at the end of the curvilinear run of the endless belt to avoid obstruction with the continued movement of the endless belt.

    摘要翻译: 锻炼装置具有用于步行/徒步目的的水平部分以及模拟登山运动的攀登部分。 攀爬部分具有互连面板的环形带,每个带有多个块以提供手/脚架。 这些块通过电磁铁从缩回位置移动到延伸位置,该电磁体被编程为以预定模式实现一个或多个块的致动。 在环形带的曲线运行结束时,块被缩回以避免环形带的继续移动而阻塞。

    Modular binary multiplier for signed and unsigned operands of variable widths
    9.
    发明授权
    Modular binary multiplier for signed and unsigned operands of variable widths 有权
    具有可变宽度的有符号和无符号操作数的模块二进制乘法器

    公开(公告)号:US07490121B2

    公开(公告)日:2009-02-10

    申请号:US11749239

    申请日:2007-05-16

    IPC分类号: G06F7/52

    摘要: A method of implementing binary multiplication in a processing device includes obtaining a multiplicand and a multiplier from a storage device; in the event the multiplier is larger than a selected length, partitioning the multiplier into a plurality of multiplier subgroups; in the event the multiplicand is larger than a selected length, partitioning the multiplicand into a plurality of multiplicand subgroups and at least one of zeroing out of unused bits of the multiplicand subgroup and sign-extending a smaller portion of the multiplicand subgroup; establishing a plurality of multiplicand multiples based on at least one of a selected multiplicand subgroup of the plurality of multiplicand subgroups and the multiplicand; selecting one or more of the multiplicand multiples of the plurality of multiplicand multiples based on the each multiplier subgroup of the plurality of multiplier subgroups; and generating a first modular product based on the selected multiplicand multiples.

    摘要翻译: 在处理设备中实现二进制乘法的方法包括从存储设备获取乘法器和乘法器; 在乘数大于选定长度的情况下,将乘法器分成多个乘法器子组; 在所述被乘数大于所选择的长度的情况下,将所述被乘数划分为多个被乘数的子组和被乘数子组的未使用的比特中的至少一个,并对被乘数子组的较小部分进行符号扩展; 基于所述多个被乘数子组和被乘数中的所选择的被乘数子群中的至少一个,建立多个被乘数; 基于所述多个乘法器子组中的每个乘法器子组来选择所述多个被乘数中的一个或多个被乘数; 以及基于所选择的被乘数生成第一模块化产品。

    Reducing operand store compare penalties

    公开(公告)号:US09626189B2

    公开(公告)日:2017-04-18

    申请号:US13524356

    申请日:2012-06-15

    IPC分类号: G06F9/38 G06F9/30

    摘要: Embodiments relate to reducing operand store compare penalties by detecting potential unit of operation (UOP) dependencies. An aspect includes a computer system for reducing operation store compare penalties. The system includes memory and a processor. The system performs a method including cracking an instruction into units of operation, where each UOP includes instruction text and address determination fields. The method includes identifying a load UOP among the plurality of UOPs and comparing values of the address determination fields of the load UOP with values of address determination fields of one or more previously-decoded store UOPs. The method also includes forcing, prior to issuance of the instruction to an execution unit, a dependency between the load UOP and the one or more previously-decoded store UOPs based on the comparing.