System and method for predictive early allocation of stores in a microprocessor
    61.
    Granted invention patent
    Status: Lapsed

    Publication No.: US07600099B2

    Publication date: 2009-10-06

    Application No.: US11683843

    Filing date: 2007-03-08

    IPC classification: G06F9/00

    Abstract: A system and method for predictive early allocation of stores in a microprocessor is presented. During instruction dispatch, an instruction dispatch unit retrieves an instruction from an instruction cache (Icache). When the retrieved instruction is an interruptible instruction, the instruction dispatch unit loads the interruptible instruction's instruction tag (IITAG) into an interruptible instruction tag register. A load store unit loads subsequent instruction information (instruction tag and store data) along with the interruptible instruction tag in a store data queue entry. Comparison logic receives a completing instruction tag from completion logic, and compares the completing instruction tag with the interruptible instruction tags included in the store data queue entries. In turn, deallocation logic deallocates those store data queue entries that include an interruptible instruction tag that matches the completing instruction tag.
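
The tag-matching deallocation described in this abstract can be sketched as a small software model. This is only an illustration under stated assumptions: the class names, the `dispatch` and `complete` methods, and the simplification that an interruptible instruction is never itself a store are hypothetical and do not come from the patent.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class StoreEntry:
    itag: int             # tag of the store instruction itself
    data: int             # store data
    iitag: Optional[int]  # tag of the latest interruptible instruction seen at dispatch


class StoreDataQueue:
    """Toy model of a store data queue keyed by interruptible instruction tags."""

    def __init__(self) -> None:
        self.iitag_register: Optional[int] = None  # interruptible instruction tag register
        self.entries: List[StoreEntry] = []

    def dispatch(self, itag, interruptible=False, store_data=None):
        # An interruptible instruction updates the IITAG register; a store
        # captures the current register value in its queue entry.
        if interruptible:
            self.iitag_register = itag
        elif store_data is not None:
            self.entries.append(StoreEntry(itag, store_data, self.iitag_register))

    def complete(self, completing_itag):
        # Deallocate every entry whose recorded IITAG matches the completing tag.
        freed = [e for e in self.entries if e.iitag == completing_itag]
        self.entries = [e for e in self.entries if e.iitag != completing_itag]
        return freed
```

In this model, completing the interruptible instruction with tag 10 frees only the stores dispatched while 10 was held in the register; stores tagged with a later interruptible instruction stay allocated.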

    Data shifting through scan registers
    62.
    Granted invention patent
    Status: In force

    Publication No.: US07551475B2

    Publication date: 2009-06-23

    Application No.: US11278439

    Filing date: 2006-04-03

    IPC classification: G11C11/00 G11C19/00

    CPC classification: G01R31/318541

    Abstract: A circuit permits a user to present signals to control the flow of data from a first-type cell to a second-type cell. The circuit can load each cell individually, or load cells by scanning input serially from a low-order cell to a higher-order cell. The circuit may be copied as a series of cells wherein a bit held in each first-type cell is copied to the next higher second-type cell.

    STORE STREAM PREFETCHING IN A MICROPROCESSOR
    63.
    Invention patent application
    Status: Lapsed

    Publication No.: US20090070556A1

    Publication date: 2009-03-12

    Application No.: US11969677

    Filing date: 2008-01-04

    IPC classification: G06F9/38

    Abstract: In a microprocessor having a load/store unit and prefetch hardware, the prefetch hardware includes a prefetch queue containing entries indicative of allocated data streams. A prefetch engine receives an address associated with a store instruction executed by the load/store unit. The prefetch engine determines whether to allocate an entry in the prefetch queue corresponding to the store instruction by comparing entries in the queue to a window of addresses encompassing multiple cache blocks, where the window of addresses is derived from the received address. The prefetch engine compares entries in the prefetch queue to a window of 2M contiguous cache blocks. The prefetch engine suppresses allocation of a new entry when any entry in the prefetch queue is within the address window. The prefetch engine further suppresses allocation of a new entry when the data address of the store instruction is equal to an address in a border area of the address window.
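
The allocation filter in this abstract can be illustrated with a short sketch. The abstract does not say how the window is derived from the store address or how wide the border area is, so the aligned window, the constants `BLOCK_SIZE`, `M` and `BORDER`, and the function names below are all assumptions made for illustration.

```python
BLOCK_SIZE = 128  # bytes per cache block (assumed)
M = 4             # the window spans 2*M contiguous cache blocks (assumed value)
BORDER = 1        # blocks at each edge treated as the border area (assumed)


def window_for(addr):
    """Return (first_block, last_block) of the aligned 2*M-block window containing addr."""
    block = addr // BLOCK_SIZE
    first = (block // (2 * M)) * (2 * M)
    return first, first + 2 * M - 1


def should_allocate(store_addr, prefetch_queue):
    """Decide whether a store may allocate a new stream entry.

    prefetch_queue is a list of cache block numbers, one per allocated stream.
    """
    first, last = window_for(store_addr)
    block = store_addr // BLOCK_SIZE
    # Suppress allocation if any existing stream already falls inside the window.
    if any(first <= b <= last for b in prefetch_queue):
        return False
    # Suppress allocation if the store address lies in the window's border area.
    if block - first < BORDER or last - block < BORDER:
        return False
    return True
```

With these assumed constants, a store to block 3 allocates a stream when the queue is empty, but not when a stream at block 5 already exists, and a store to block 0 or block 7 is always filtered as a border access.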

    Branch encoding before instruction cache write
    64.
    Granted invention patent
    Status: In force

    Publication No.: US07487334B2

    Publication date: 2009-02-03

    Application No.: US11050350

    Filing date: 2005-02-03

    IPC classification: G06F9/34

    CPC classification: G06F9/322 G06F9/382

    Abstract: Method, system and computer program product for determining the targets of branches in a data processing system. A method for determining the target of a branch in a data processing system includes performing at least one pre-calculation relating to determining the target of the branch prior to writing the branch into a Level 1 (L1) cache to provide a pre-decoded branch, and then writing the pre-decoded branch into the L1 cache. By pre-calculating matters relating to the targets of branches before the branches are written into the L1 cache, for example, by re-encoding relative branches as absolute branches, a reduction in branch redirect delay can be achieved, thus providing a substantial improvement in overall processor performance.
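
The example given in this abstract, re-encoding relative branches as absolute branches before the L1 write, can be sketched as follows. The `Branch` record, the 4-byte instruction size, the 32-bit address width and the dictionary standing in for the L1 cache are hypothetical simplifications, not details from the patent.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Branch:
    target_field: int  # displacement if relative, full address if absolute
    absolute: bool


def predecode(branch, instr_addr, addr_bits=32):
    """Re-encode a relative branch as an absolute one using its own address."""
    if branch.absolute:
        return branch
    target = (instr_addr + branch.target_field) % (1 << addr_bits)
    return replace(branch, target_field=target, absolute=True)


def write_line_to_l1(l1, line_addr, instrs, instr_size=4):
    """Pre-decode every branch in a fetched line, then write the line into the L1 model."""
    l1[line_addr] = [
        predecode(ins, line_addr + n * instr_size) if isinstance(ins, Branch) else ins
        for n, ins in enumerate(instrs)
    ]
```

Because the addition is done once at fill time, a later fetch from the L1 model reads the branch target directly instead of recomputing it, which is the source of the reduced redirect delay the abstract mentions.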

    Multi-Mode Register Rename Mechanism for a Highly Threaded Simultaneous Multi-Threaded Microprocessor
    65.
    Invention patent application
    Status: In force

    Publication No.: US20080250226A1

    Publication date: 2008-10-09

    Application No.: US11696363

    Filing date: 2007-04-04

    IPC classification: G06F15/00

    Abstract: A multi-mode register rename mechanism which allows a simultaneous multi-threaded processor to support full out-of-order thread execution when the number of threads is low and in-order thread execution when the number of threads increases. Responsive to changing an execution mode of a processor to operate in in-order thread execution mode, the illustrative embodiments switch a physical register in the data processing system to an architected facility, thereby forming a switched physical register. When an instruction is issued to an execution unit, wherein the issued instruction comprises a thread bit, the thread bit is examined to determine if the instruction accesses an architected facility. If the issued instruction accesses an architected facility, the instruction is executed, and the results of the executed instruction are written to the switched physical register.

    Configurable Microprocessor
    66.
    Invention patent application
    Status: Under examination (published)

    Publication No.: US20080229065A1

    Publication date: 2008-09-18

    Application No.: US11685428

    Filing date: 2007-03-13

    IPC classification: G06F9/30

    Abstract: A configurable microprocessor which combines a plurality of corelets into a single microprocessor core to handle high computing-intensive workloads. The process first selects two or more corelets in the plurality of corelets. The process combines resources of the two or more corelets to form combined resources, wherein each combined resource comprises a larger amount of a resource available to each individual corelet. The process then forms a single microprocessor core from the two or more corelets by assigning the combined resources to the single microprocessor core, wherein the combined resources are dedicated to the single microprocessor core, and wherein the single microprocessor core processes instructions with the dedicated combined resources.

    Instruction grouping history on fetch-side dispatch group formation
    67.
    Granted invention patent
    Status: Lapsed

    Publication No.: US07269715B2

    Publication date: 2007-09-11

    Application No.: US11050344

    Filing date: 2005-02-03

    IPC classification: G06F9/38

    Abstract: An improved method, apparatus, and computer instructions for grouping instructions processed in equal sized sets. A current set of instructions is received in an instruction cache for dispatching. A determination is made, using a history data structure, as to whether any instructions in the current set of instructions are part of a group that includes a prior set of instructions received in the instruction cache, wherein the history data structure contains data regarding instructions in the prior set of instructions. Any such instructions are grouped into the group in response to a determination that they are part of the group. Instructions in the group are dispatched for execution using the history data structure, so that invalid instruction dispatch groupings are avoided.

    Counting latencies of an instruction table flush, refill and instruction execution using a plurality of assigned counters
    68.
    Granted invention patent
    Status: Lapsed

    Publication No.: US06970999B2

    Publication date: 2005-11-29

    Application No.: US10210415

    Filing date: 2002-07-31

    IPC classification: G06F9/38 G06F9/44 G06F15/00

    Abstract: A method and system for analyzing cycles per instruction (CPI) performance in a processor. A completion table corresponds to the instructions in a group to be processed by the processor. An empty completion table indicates that there has been some type of catastrophe that caused a table flush. While the table is empty, a performance monitoring counter (PMC), located in a performance monitoring unit (PMU) in the processor, counts the number of clock cycles that the table is empty. Preferably, a separate PMC is utilized depending on the reason that the completion table is empty. A second PMC likewise counts the number of clock cycles spent re-filling the empty completion table. A third PMC counts the number of clock cycles spent actually executing the instructions in the completion table. The information in the PMCs can be used to evaluate the true cause for degradation of CPI performance.
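
The three-way cycle accounting in this abstract can be modeled in a few lines. This is a sketch only: the per-cycle trace format, the state names, and the example flush reason `branch_mispredict` are hypothetical and chosen for illustration.

```python
from collections import Counter


def count_cycles(trace):
    """Attribute each clock cycle to one counter, as separate PMCs would.

    trace is an iterable of (state, reason) pairs, one per cycle, where state
    is "empty", "refill" or "execute" and reason is only used for "empty".
    """
    pmc = Counter()
    for state, reason in trace:
        if state == "empty":
            # One counter per reason the completion table was flushed.
            pmc["empty:" + str(reason)] += 1
        elif state in ("refill", "execute"):
            pmc[state] += 1
        else:
            raise ValueError("unknown state: " + repr(state))
    return pmc
```

Comparing the resulting counters shows how much of the total cycle count went to sitting empty after a flush, to refilling, and to useful execution, which is the CPI breakdown the abstract describes.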

    Superscalar processor and method for incrementally issuing store instructions
    69.
    Granted invention patent
    Status: Lapsed

    Publication No.: US06463524B1

    Publication date: 2002-10-08

    Application No.: US09383607

    Filing date: 1999-08-26

    IPC classification: G06F13364

    Abstract: A superscalar processor and method are disclosed for efficiently executing a store instruction. The store instruction is stored in an issue queue within the processor. A first part of the store instruction is issued from the issue queue to a first one of different execution units in response to a first operand becoming available. A second part of the store instruction is issued from the issue queue to a second one of the different execution units in response to a second operand becoming available. The store instruction is completed in response to executing the first part of the store instruction by the first one of the execution units and the second part of the store instruction by the second one of the execution units.
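
The incremental issue described in this abstract can be sketched as a queue entry that releases each half of the store independently. The class, the unit labels `unit_1` and `unit_2`, and the simplification that a part executes as soon as it issues are assumptions for illustration only.

```python
from dataclasses import dataclass


@dataclass
class QueuedStore:
    """Toy issue-queue entry for a store split into two independently issued parts."""

    first_issued: bool = False
    second_issued: bool = False

    def operand_ready(self, which):
        """Issue the matching part when its operand arrives; return the target unit."""
        if which == "first" and not self.first_issued:
            self.first_issued = True
            return "unit_1"
        if which == "second" and not self.second_issued:
            self.second_issued = True
            return "unit_2"
        return None  # unknown operand, or that part was already issued

    @property
    def completed(self):
        # The store completes only after both parts have been handled.
        return self.first_issued and self.second_issued
```

The operands may arrive in either order, so neither part waits on the other, and the store completes once both parts are done.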

    Just-in-time register renaming technique
    70.
    Granted invention patent
    Status: Lapsed

    Publication No.: US06311267B1

    Publication date: 2001-10-30

    Application No.: US09196908

    Filing date: 1998-11-20

    IPC classification: G06F9/38

    Abstract: A target register of an instruction is assigned a rename register in response to the instruction being issued. That is, the target register is renamed at issue time, not at dispatch time. To handle the new deadlock issue that this gives rise to, the rename register allocation/deallocation logic of the present invention includes logic for allocating and deallocating two sets of rename registers, one set from a regular rename buffer and another set from an overflow rename buffer. According to this allocation/deallocation logic, if the oldest dispatched, noncompleted instruction is ready for assignment of a rename register and the regular rename buffer is full, then a rename register is assigned from the overflow rename buffer to this instruction.
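
The two-buffer allocation policy in this abstract can be sketched as below. The class and method names are hypothetical, and reserving the overflow buffer exclusively for the oldest dispatched, noncompleted instruction is one reading of the abstract rather than a confirmed detail.

```python
class RenameAllocator:
    """Toy model of issue-time renaming with a regular and an overflow buffer."""

    def __init__(self, regular_size, overflow_size):
        self.free = {
            "regular": list(range(regular_size)),
            "overflow": list(range(overflow_size)),
        }

    def allocate_at_issue(self, is_oldest_noncompleted):
        """Return (buffer, index) for the issuing instruction, or None if it must wait."""
        if self.free["regular"]:
            return ("regular", self.free["regular"].pop(0))
        # Regular buffer is full: only the oldest noncompleted instruction may
        # draw from the overflow buffer, so it can always make progress.
        if is_oldest_noncompleted and self.free["overflow"]:
            return ("overflow", self.free["overflow"].pop(0))
        return None

    def deallocate(self, reg):
        """Return a rename register to the buffer it came from."""
        buffer, index = reg
        self.free[buffer].append(index)
```

The overflow path matters because younger instructions that issue first can exhaust the regular buffer; without a reserved register, the oldest instruction could never issue and nothing would complete to free registers.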
