Register renaming to reduce bypass and increase apparent physical register size
    1.
    Granted invention patent
    Legal status: lapsed

    Publication No.: US06944751B2

    Publication date: 2005-09-13

    Application No.: US10074098

    Filing date: 2002-02-11

    IPC classification: G06F9/30 G06F9/38 G06F9/312

    Abstract: The invention provides a processor architecture that bypasses data hazards. The architecture has an array of pipelines and a register file. Each of the pipelines includes an array of execution units. The register file has a first section of n registers (e.g., 128 registers) and a second section of m registers (e.g., 16 registers). A write mux couples speculative data from the execution units to the second section of m registers and non-speculative data from a write-back stage of the execution units to the first section of n registers. A read mux couples the speculative data from the second section of m registers to the execution units to bypass data hazards within the execution units. The register file preferably includes column decode logic for each of the registers in the second section of m registers to architect speculative data without moving data. The decode logic first decodes, and then selects, an age of the producer of the speculative state; the newest producer enables the decode.
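
The read path described in this abstract can be illustrated with a small Python sketch. It is only a toy model under stated assumptions: the class and method names are hypothetical, a larger `age` value is assumed to mean a newer producer, and the write-back step simply stores a value, whereas the patent describes architecting speculative data without moving it.

```python
class RegisterFile:
    """Toy model: n architected registers plus m speculative slots (names are hypothetical)."""

    def __init__(self, n=128, m=16):
        self.architected = [0] * n      # first section: non-speculative state
        self.speculative = [None] * m   # second section: (dest, age, value) tuples

    def write_speculative(self, slot, dest, age, value):
        # Write-mux path for results that have not yet reached write-back.
        self.speculative[slot] = (dest, age, value)

    def write_back(self, dest, value):
        # Write-mux path for non-speculative data from the write-back stage.
        self.architected[dest] = value

    def read(self, reg):
        # Read-mux path: prefer the newest speculative producer of `reg`
        # (assumption: larger age means newer), else the architected value.
        producers = [e for e in self.speculative if e is not None and e[0] == reg]
        if producers:
            return max(producers, key=lambda e: e[1])[2]
        return self.architected[reg]
```

A consumer that reads a register with two in-flight producers receives the value of the newer one, which is the bypass behavior the abstract attributes to the read mux.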

    Apparatus and method for shift register rate control of microprocessor instruction prefetches
    3.
    Granted invention patent
    Legal status: in force

    Publication No.: US06647487B1

    Publication date: 2003-11-11

    Application No.: US09506972

    Filing date: 2000-02-18

    IPC classification: G06F9/30

    CPC classification: G06F9/3802 G06F12/0862

    Abstract: An apparatus and methods for optimizing prefetch performance. Logical ones are shifted into the bits of a shift register from the left for each instruction address prefetched. As instruction addresses are fetched by the processor, logical zeros are shifted into the bit positions of the shift register from the right. Once initiated, prefetching continues until a logical one is stored in the n-th bit of the shift register. Detection of this logical one in the n-th bit causes prefetching to cease until a prefetched instruction address is removed from the prefetched instruction address register and a logical zero is shifted back into the n-th bit of the shift register. Thus, autonomous prefetch agents are prevented from prefetching too far ahead of the current instruction pointer, which would waste memory bandwidth and replace useful instructions in the instruction cache.
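
The throttling rule in this abstract can be sketched in a few lines of Python. This is a minimal model, not the patented circuit: the class name `PrefetchThrottle` and its methods are hypothetical, and the register is represented as a list whose last element stands for the n-th bit.

```python
class PrefetchThrottle:
    """Shift-register model: index 0 is the leftmost bit, index -1 is the n-th bit."""

    def __init__(self, depth):
        self.bits = [0] * depth

    def can_prefetch(self):
        # Prefetching stops as soon as a logical one reaches the n-th bit.
        return self.bits[-1] == 0

    def on_prefetch(self):
        if not self.can_prefetch():
            raise RuntimeError("prefetch limit reached")
        # Shift a logical one in from the left.
        self.bits = [1] + self.bits[:-1]

    def on_fetch(self):
        # The processor consumed an address: shift a logical zero in from the right.
        self.bits = self.bits[1:] + [0]

    def outstanding(self):
        # Number of prefetched addresses not yet consumed.
        return sum(self.bits)
```

With a depth of 4, the fourth prefetch sets the last bit and blocks further prefetching; a single fetch clears it again, so the prefetcher never runs more than `depth` addresses ahead.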

    Opportunistic use of pre-corrected data to improve processor performance
    4.
    Granted invention patent
    Legal status: lapsed

    Publication No.: US5961655A

    Publication date: 1999-10-05

    Application No.: US759583

    Filing date: 1996-12-05

    IPC classification: G06F11/10 G06F11/00

    CPC classification: G06F11/1052

    Abstract: Disclosed herein are methods and apparatus which provide a processor with raw, uncorrected data. The uncorrected data (or pre-corrected data) is retrieved from memory and then "bypassed" to a processing unit before its error status is known. Concurrently, error correction hardware determines the data's error status. Since the correct/incorrect indication is the first result available from error correction hardware, this result may be used to gate the actions of a processing unit prior to its taking an irrevocable action with possibly incorrect data. If bypassed data is incorrect, processing unit control logic may flag it as such and read corrected data from the output of error correction hardware. If bypassed data is correct (as will usually be the case), bypassed data may be consumed by a processing unit in due course.
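
The control flow in this abstract can be sketched as follows. The sketch makes two loud assumptions: the error-correcting code is replaced by a simple triple-copy majority vote (the abstract does not name a code), and the function names `ecc_check` and `load_with_bypass` are hypothetical.

```python
def ecc_check(copies):
    """Stand-in for error correction hardware: bitwise majority of three stored copies.

    Returns (raw_was_wrong, corrected_value), where the raw value is the first copy.
    """
    a, b, c = copies
    corrected = (a & b) | (a & c) | (b & c)
    return corrected != a, corrected


def load_with_bypass(copies, compute):
    """Start computing on raw data, then gate the result on the error status."""
    raw = copies[0]
    speculative = compute(raw)                  # begun before the error status is known
    raw_was_wrong, corrected = ecc_check(copies)
    if raw_was_wrong:
        # Discard the speculative result and redo the work on corrected data.
        return compute(corrected), True
    return speculative, False
```

In the common case the raw value is correct and the speculative result is used as is; only a corrupted raw value triggers the replay path, which is where the latency saving comes from.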

    Associative cache memory with improved hit time
    5.
    Granted invention patent
    Legal status: lapsed

    Publication No.: US5860097A

    Publication date: 1999-01-12

    Application No.: US717786

    Filing date: 1996-09-23

    IPC classification: G06F12/08

    CPC classification: G06F12/0864

    Abstract: An associative cache memory for a computer with improved cache hit times. All possible data items are presented to bus driver circuits, thereby deferring data selection as long as possible. Driving and multiplexing are combined. The output of tag comparison directly selects at most one set of driver circuits. As a result, the only processing time in series with tag comparison is driver circuit selection. Since the data selection delay in series with tag comparison delay is reduced, the time delay is reduced for a clock edge for data driving after tag comparison, thereby enabling a faster clock.
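
The late-select idea can be illustrated functionally in Python. The sketch cannot show timing, only the data flow: every way's data is made available first, and the tag comparison produces enable signals that pick at most one driver. The function name `read_set` and the `(tag, data)` tuple layout are assumptions.

```python
def read_set(ways, tag):
    """Late-select read of one cache set; `ways` is a list of (stored_tag, data) pairs."""
    # All candidate data items are presented to the drivers up front.
    candidates = [data for _, data in ways]
    # Tag comparison yields one enable per way; at most one may be active.
    enables = [stored_tag == tag for stored_tag, _ in ways]
    if sum(enables) > 1:
        raise ValueError("duplicate tag in set")
    driven = [data for data, enable in zip(candidates, enables) if enable]
    return driven[0] if driven else None   # None models a miss
```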

    Integrated debug trigger method and apparatus for an integrated circuit
    6.
    Granted invention patent
    Legal status: lapsed

    Publication No.: US5751735A

    Publication date: 1998-05-12

    Application No.: US749188

    Filing date: 1996-11-14

    IPC classification: G06F11/36 G06F11/27

    CPC classification: G06F11/27

    Abstract: Presented is an internal integrated debug trigger apparatus for use in debugging functional and electrical failures of an integrated circuit chip. The debug trigger apparatus includes a plurality of software programmable trigger registers and a plurality of software programmable trigger function blocks. Each trigger register monitors a plurality of integrated circuit signals which may include signals sent to the external pins of the integrated circuit and signals present internal to the chip. If the value of the monitored signals matches the programmed trigger condition, the trigger register produces a trigger match signal. Each trigger function block receives a combination of the trigger match signals generated by the trigger registers and each computes its programmed Boolean minterm function on its inputs. Each trigger function block produces a trigger capture signal which may be true or false according to the computed function of the inputs. The debug trigger may also include a programmable iteration counter which allows for repetition of a trigger condition before it is provided external to the chip. The output of the iteration counter may be connected as input to one or more of the other trigger registers to allow for different start and end conditions.
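
The three building blocks named in this abstract can be modeled in Python as below. This is a sketch under assumptions: the class names are hypothetical, the trigger condition is modeled as a mask-and-value compare, and the programmed function is a single product term in which each input is required true, required false, or ignored (`None`).

```python
class TriggerRegister:
    """Matches when the masked signal bits equal the programmed value."""

    def __init__(self, mask, value):
        self.mask = mask
        self.value = value

    def match(self, signals):
        return (signals & self.mask) == (self.value & self.mask)


class TriggerFunction:
    """Product term over match signals: True, False, or None (ignored) per input."""

    def __init__(self, required):
        self.required = required

    def evaluate(self, matches):
        return all(want is None or got == want
                   for got, want in zip(matches, self.required))


class IterationCounter:
    """Passes the capture signal through only from the count-th occurrence onward."""

    def __init__(self, count):
        self.remaining = count

    def tick(self, captured):
        if not captured:
            return False
        if self.remaining > 1:
            self.remaining -= 1
            return False
        return True
```

A capture signal is obtained by evaluating every trigger register against the current signal word and feeding the resulting list of booleans to a `TriggerFunction`; the counter then delays the external trigger until the condition has repeated the programmed number of times.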

    Method and apparatus for fetching instructions from the memory subsystem of a mixed architecture processor into a hardware emulation engine
    7.
    Granted invention patent
    Legal status: lapsed

    Publication No.: US07356674B2

    Publication date: 2008-04-08

    Application No.: US10717671

    Filing date: 2003-11-21

    IPC classification: G06F15/00

    Abstract: A method of, and apparatus for, interfacing the hardware of a processor capable of processing instructions from more than one type of instruction set. More particularly, an engine responsible for fetching native instructions from a memory subsystem (such as an EM fetch engine) is interfaced with an engine that processes emulated instructions (such as an x86 engine). This is achieved using a handshake protocol, whereby the x86 engine sends an explicit fetch request signal to the EM fetch engine along with a fetch address. The EM fetch engine then accesses the memory subsystem and retrieves a line of instructions for subsequent decode and execution. The EM fetch engine sends this line of instructions to the x86 engine along with an explicit fetch complete signal. The EM fetch engine also includes a fetch address queue capable of holding the fetch addresses before they are processed by the EM fetch engine. The fetch requests are processed such that more than one fetch request may be pending at the same time. If a pending fetch request is canceled due to a pipeline flush, then the fetch address queue is cleared and the pending fetch requests are canceled. The system also prevents macroinstruction (MIQ)-related stalls by using a speculative write pointer to control the issuance of fetch requests, thereby preventing the MIQ from becoming oversubscribed.
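
The handshake and the queue-occupancy check in this abstract can be sketched in Python. All names here are hypothetical, memory is a plain dictionary of address to instruction line, and the speculative write pointer is simplified to a counter of MIQ entries reserved by in-flight requests.

```python
from collections import deque


class EMFetchEngine:
    def __init__(self, memory):
        self.memory = memory              # assumed: dict mapping address -> line
        self.address_queue = deque()      # fetch addresses awaiting service

    def request(self, address):
        # Explicit fetch request plus fetch address from the x86 engine.
        self.address_queue.append(address)

    def step(self):
        # Service the oldest request; the returned pair models "fetch complete".
        if not self.address_queue:
            return None
        address = self.address_queue.popleft()
        return address, self.memory[address]

    def flush(self):
        # Pipeline flush: clear the queue, canceling all pending requests.
        self.address_queue.clear()


class X86Engine:
    def __init__(self, fetch_engine, miq_capacity):
        self.fetch_engine = fetch_engine
        self.miq = deque()                # macroinstruction queue
        self.miq_capacity = miq_capacity
        self.reserved = 0                 # entries claimed by in-flight requests

    def try_fetch(self, address):
        # Issue only if the MIQ still has room once in-flight lines arrive.
        if len(self.miq) + self.reserved >= self.miq_capacity:
            return False
        self.reserved += 1
        self.fetch_engine.request(address)
        return True

    def on_fetch_complete(self, line):
        self.reserved -= 1
        self.miq.append(line)

    def flush(self):
        self.fetch_engine.flush()
        self.reserved = 0
```

Reserving an entry at request time rather than at arrival time is what keeps several requests pending at once without overfilling the queue.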

    Method and apparatus for fetching instructions from the memory subsystem of a mixed architecture processor into a hardware emulation engine
    8.
    Granted invention patent
    Legal status: in force

    Publication No.: US06678817B1

    Publication date: 2004-01-13

    Application No.: US09510010

    Filing date: 2000-02-22

    IPC classification: G06F15/00

    Abstract: A method of, and apparatus for, interfacing the hardware of a processor capable of processing instructions from more than one type of instruction set. More particularly, an engine responsible for fetching native instructions from a memory subsystem (such as an EM fetch engine) is interfaced with an engine that processes emulated instructions (such as an x86 engine). This is achieved using a handshake protocol, whereby the x86 engine sends an explicit fetch request signal to the EM fetch engine along with a fetch address. The EM fetch engine then accesses the memory subsystem and retrieves a line of instructions for subsequent decode and execution. The EM fetch engine sends this line of instructions to the x86 engine along with an explicit fetch complete signal. The EM fetch engine also includes a fetch address queue capable of holding the fetch addresses before they are processed by the EM fetch engine. The fetch requests are processed such that more than one fetch request may be pending at the same time. If a pending fetch request is canceled due to a pipeline flush, then the fetch address queue is cleared and the pending fetch requests are canceled. The system also prevents macroinstruction (MIQ)-related stalls by using a speculative write pointer to control the issuance of fetch requests, thereby preventing the MIQ from becoming oversubscribed.

    Mechanism for broadside reads of CAM structures
    9.
    Granted invention patent
    Legal status: lapsed

    Publication No.: US06493792B1

    Publication date: 2002-12-10

    Application No.: US09495155

    Filing date: 2000-01-31

    IPC classification: G11C15/00

    CPC classification: G11C15/00 G06F12/1027

    Abstract: A CAM providing for the identification of a plurality of multiple-bit tag values stored in the CAM, having logic circuitry for comparing each bit of an inputted test value to the corresponding bits of all stored tag values. A bit select is employed for generating a plurality of test bits for sequential input into the logic circuitry. The logic circuitry compares the plurality of test bits to the corresponding bit of each stored tag value and generates a “hit” signal if the selected bit is the same as the corresponding bit of the stored tag value. Storage means are employed for recording the results of the compare with the M hit signal.
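
One way to read this abstract is that every stored tag can be recovered by probing one bit position at a time and recording the per-entry hit signals. The Python sketch below follows that reading; it is an interpretation, and the function names and the choice of always probing with a test bit of 1 are assumptions.

```python
def bit_compare(tags, position, test_bit):
    """One compare cycle: a hit per entry if its bit at `position` equals `test_bit`."""
    return [((tag >> position) & 1) == test_bit for tag in tags]


def broadside_read(tags, width):
    """Recover all stored tags using `width` single-bit compare cycles."""
    recovered = [0] * len(tags)
    for position in range(width):
        hits = bit_compare(tags, position, 1)
        for entry, hit in enumerate(hits):
            if hit:
                # A hit against test bit 1 means this entry holds a 1 here.
                recovered[entry] |= 1 << position
    return recovered
```

The cost is one compare cycle per tag bit regardless of how many entries the CAM holds, since each cycle reads the same bit of every entry in parallel.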

    Method and apparatus for implementing two architectures in a chip
    10.
    Granted invention patent
    Legal status: lapsed

    Publication No.: US07343479B2

    Publication date: 2008-03-11

    Application No.: US10602916

    Filing date: 2003-06-25

    IPC classification: G06F9/455

    Abstract: The present invention is a method for implementing two architectures on a single chip. The method uses a fetch engine to retrieve instructions. If the instructions are macroinstructions, then it decodes the macroinstructions into microinstructions, and then bundles those microinstructions using a bundler, within an emulation engine. The bundles are issued in parallel and dispatched to the execution engine and contain pre-decode bits so that the execution engine treats them as microinstructions. Before being transferred to the execution engine, the instructions may be held in a buffer. The method also selects between bundled microinstructions from the emulation engine and native microinstructions coming directly from the fetch engine, by using a multiplexer or other means. Both native microinstructions and bundled microinstructions may be held in the buffer. The method also sends additional information to the execution engine.
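
The decode, bundle, and select steps in this abstract can be sketched in Python. Everything concrete here is an assumption made for illustration: the `MICROCODE` table and its instruction names are invented, the bundle width of three is arbitrary, and short bundles are padded with a `nop`.

```python
# Hypothetical decode table: macroinstruction -> list of microinstructions.
MICROCODE = {
    "push": ["sub sp", "store"],
    "inc": ["add"],
}


def decode(macros):
    """Expand each macroinstruction into its microinstructions, in order."""
    return [uop for macro in macros for uop in MICROCODE[macro]]


def bundle(uops, width=3, pad="nop"):
    """Group microinstructions into fixed-width bundles, padding the last one."""
    bundles = []
    for start in range(0, len(uops), width):
        chunk = uops[start:start + width]
        chunk += [pad] * (width - len(chunk))
        bundles.append(tuple(chunk))
    return bundles


def mux(emulation_mode, native_bundle, emulated_bundle):
    """Select which source feeds the execution engine."""
    return emulated_bundle if emulation_mode else native_bundle
```

For example, `bundle(decode(["push", "inc", "push"]))` yields two bundles, the second padded with one `nop`, and `mux` then chooses between such a bundle and one arriving directly from the fetch engine.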