Store to load forward predictor training using delta tag
    121.
    Granted invention patent
    Status: In force

    Publication No.: US06622237B1

    Publication date: 2003-09-16

    Application No.: US09476192

    Filing date: 2000-01-03

    IPC classification: G06F9/00

    CPC classification: G06F9/3834 G06F9/3838

    Abstract: A processor employs a store to load forward (STLF) predictor which may indicate, for dispatching loads, a dependency on a store. The dependency is indicated for a store which, during a previous execution, interfered with the execution of the load. Since a dependency is indicated on the store, the load is prevented from scheduling and/or executing prior to the store. The STLF predictor is trained with information for a particular load and store in response to executing the load and store and detecting the interference. Additionally, the STLF predictor may be untrained (e.g. information for a particular load and store may be deleted) if a load is indicated by the STLF predictor as dependent upon a particular store and the dependency does not actually occur. In one implementation, the STLF predictor records at least a portion of the PC of a store which interferes with the load in a first table indexed by the load PC. A second table maintains a corresponding portion of the store PCs of recently dispatched stores, along with tags identifying the recently dispatched stores. In another implementation, the STLF predictor records a difference between the tags assigned to a load and a store which interferes with the load in a first table indexed by the load PC. The PC of the dispatching load is used to select a difference from the table, and the difference is added to the tag assigned to the load.
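The delta-tag variant described in the abstract can be illustrated with a small software model. This is only a sketch under stated assumptions: the class name `DeltaTagPredictor`, the table size, the modulo indexing of the load PC, the use of unbounded integer tags (no wraparound), and the sign convention (store tag minus load tag) are all hypothetical choices, not details taken from the patent.

```python
class DeltaTagPredictor:
    """Toy model of a delta-tag store-to-load-forward (STLF) predictor.

    Assumptions (not from the patent): tags are plain integers that grow
    in dispatch order, the table is indexed by load PC modulo its size,
    and the stored delta is (store_tag - load_tag).
    """

    def __init__(self, num_entries=256):
        self.num_entries = num_entries
        self.table = {}  # table index -> recorded tag difference

    def _index(self, load_pc):
        # Hypothetical indexing scheme: low bits of the load PC.
        return load_pc % self.num_entries

    def train(self, load_pc, load_tag, store_tag):
        """Record the tag difference after an interference is detected."""
        self.table[self._index(load_pc)] = store_tag - load_tag

    def untrain(self, load_pc):
        """Delete the entry when a predicted dependency did not occur."""
        self.table.pop(self._index(load_pc), None)

    def predict(self, load_pc, load_tag):
        """Return the tag of the store the load should wait for, or None."""
        delta = self.table.get(self._index(load_pc))
        if delta is None:
            return None
        # The difference is added to the tag assigned to the dispatching load.
        return load_tag + delta
```

For example, after `train(0x400, load_tag=10, store_tag=7)` the table holds a difference of -3 for that load PC, so a later dispatch of the same load with tag 25 is predicted to depend on the store with tag 22. Calling `untrain(0x400)` removes the entry again.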


    Computer system implementing a system and method for ordering input/output (IO) memory operations within a coherent portion thereof
    122.
    Granted invention patent
    Status: In force

    Publication No.: US06557048B1

    Publication date: 2003-04-29

    Application No.: US09431364

    Filing date: 1999-11-01

    IPC classification: G06F13/00

    CPC classification: G06F13/4059

    Abstract: A computer system is presented which implements a system and method for ordering input/output (I/O) memory operations. In one embodiment, the computer system includes a processing subsystem and an I/O subsystem. The processing subsystem includes multiple processing nodes interconnected via coherent communication links. Each processing node may include a processor executing software instructions. The I/O subsystem includes one or more I/O nodes serially coupled via non-coherent communication links. Each I/O node may embody one or more I/O functions (e.g., modem, sound card, etc.). One of the processing nodes includes a host bridge which translates packets moving between the processing subsystem and the I/O subsystem. One of the I/O nodes is coupled to the processing node including the host bridge. The I/O node coupled to the processing node produces and/or provides transactions having destinations or targets within the processing subsystem to the processing node including the host bridge. The I/O node may, for example, produce and/or provide a first transaction followed by a second transaction. The host bridge may dispatch the second transaction with respect to the first transaction according to a predetermined set of ordering rules. For example, the host bridge may: (i) receive the first and second transactions, (ii) dispatch the first transaction within the processing subsystem, and (iii) dispatch the second transaction within the processing subsystem dependent upon progress of the first transaction within the processing subsystem and the predetermined set of ordering rules.


    Computer system implementing flush operation
    123.
    Granted invention patent
    Status: In force

    Publication No.: US06553430B1

    Publication date: 2003-04-22

    Application No.: US09410852

    Filing date: 1999-10-01

    Applicant: James B. Keller

    Inventor: James B. Keller

    IPC classification: G06F13/16

    CPC classification: G06F13/405

    Abstract: A computer system is presented which implements a “flush” operation providing a response to a source which signifies that all posted write operations previously issued by the source have been properly ordered within their targets with respect to other pending operations. The computer system includes multiple processing nodes within a processing subsystem and at least one input/output (I/O) node coupled to a processing node including a host bridge. The host bridge receives non-coherent posted write commands from the I/O node and responsively generates corresponding coherent posted write commands within the processing subsystem. Each posted write command has a target within the processing subsystem. The host bridge includes a data buffer for storing data used to track the status of non-coherent posted write commands. The I/O node issues a flush command to ensure that all previously issued non-coherent posted write commands have at least reached points of coherency within the processing subsystem. The host bridge issues a non-coherent target done response to the I/O node in response to: (i) the flush command, and (ii) coherent target done responses received from all targets of posted write commands previously issued by the I/O node. Coherent target done responses signify write commands have at least reached points of coherency within the processing subsystem. The non-coherent target done response signals the I/O node that all non-coherent posted write commands previously issued by the I/O node have at least reached points of coherency within the processing subsystem.
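The bookkeeping implied by the flush abstract can be sketched as a small state machine. The class and method names below are hypothetical, the model covers a single I/O node, and it assumes that no new posted writes arrive between a flush command and its response. The abstract does not address that last case.

```python
class HostBridge:
    """Toy model of flush tracking for one I/O node.

    Assumptions (not from the patent): each posted write gets a simple
    integer ID, and no new posted write is issued while a flush is pending.
    """

    def __init__(self):
        self.pending = set()       # posted writes awaiting coherent target done
        self.flush_waiting = False
        self._next_id = 0

    def posted_write(self):
        """Accept a non-coherent posted write and start tracking it."""
        write_id = self._next_id
        self._next_id += 1
        self.pending.add(write_id)
        return write_id

    def coherent_target_done(self, write_id):
        """A target reports that the write reached its point of coherency."""
        self.pending.discard(write_id)
        return self._maybe_respond()

    def flush(self):
        """The I/O node asks for confirmation of all earlier posted writes."""
        self.flush_waiting = True
        return self._maybe_respond()

    def _maybe_respond(self):
        # Respond only when a flush is outstanding and nothing is pending.
        if self.flush_waiting and not self.pending:
            self.flush_waiting = False
            return "noncoherent_target_done"
        return None
```

In this model a flush issued while two writes are outstanding returns nothing until both `coherent_target_done` calls have arrived. The second of those calls returns the response string that stands in for the non-coherent target done response.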


    Method and apparatus for performing speculative memory fills into a microprocessor
    124.
    Granted invention patent
    Status: No longer in force

    Publication No.: US06493802B1

    Publication date: 2002-12-10

    Application No.: US09099396

    Filing date: 1998-06-18

    IPC classification: G06F12/12

    Abstract: According to the present invention, a cache within a multiprocessor system is speculatively filled. To speculatively fill a designated cache, the present invention first determines an address which identifies information located in a main memory. The address may also identify one or more other versions of the information located in one or more caches. The process of filling the designated cache with the information is started by locating the information in the main memory and locating other versions of the information identified by the address in the caches. The validity of the information located in the main memory is determined after locating the other versions of the information. The process of filling the designated cache with the information located in the main memory is initiated before determining the validity of the information located in main memory. Thus, the memory reference is speculative.


    Method and apparatus for optimizing bcache tag performance by inferring bcache tag state from internal processor state
    125.
    Granted invention patent
    Status: No longer in force

    Publication No.: US06401173B1

    Publication date: 2002-06-04

    Application No.: US09237519

    Filing date: 1999-01-26

    IPC classification: G06F12/00

    CPC classification: G06F12/0897

    Abstract: An architecture which splits primary and secondary cache memory buses and maintains cache hierarchy consistency without performing an explicit invalidation of the secondary cache tag. Two explicit rules are used to determine the status of a block read from the primary cache. First, if any memory reference subset matches a block in the primary cache, the associated secondary cache block is ignored. Second, if any memory reference subset matches a block in the miss address file, the associated secondary cache block is ignored. Therefore, any further references which subset match the first reference are not allowed to proceed until the fill back to main memory has been completed and the associated miss address file entry has been retired. This ensures that no agent in the host processor or an external agent can illegally use the stale secondary cache data.


    Method and apparatus for developing multiprocessor cache control protocols using a memory management system generating atomic probe commands and system data control response commands
    126.
    Granted invention patent
    Status: No longer in force

    Publication No.: US06349366B1

    Publication date: 2002-02-19

    Application No.: US09099385

    Filing date: 1998-06-18

    IPC classification: G06F12/00

    CPC classification: G06F12/0815

    Abstract: A memory management system couples processors to each other and to a main memory. Each processor may have one or more associated caches local to that processor. A system port of the memory management system receives a request from a source processor of the processors to access a block of data from the main memory. A memory manager of the memory management system then converts the request into a probe command having a data movement part identifying a condition for movement of the block out of a cache of a target processor and a next coherence state part indicating a next state of the block in the cache of the target processor.


    Physical rename register for efficiently storing floating point, integer, condition code, and multimedia values
    127.
    Granted invention patent
    Status: In force

    Publication No.: US06266763B1

    Publication date: 2001-07-24

    Application No.: US09225982

    Filing date: 1999-01-05

    IPC classification: G06F9/312

    Abstract: A register renaming apparatus includes one or more physical registers which may be assigned to store a floating point value, a multimedia value, an integer value and corresponding condition codes, or condition codes only. The classification of the instruction (e.g. floating point, multimedia, integer, flags-only) defines which lookahead register state is updated (e.g. floating point, integer, flags, etc.), but the physical register can be selected from the one or more physical registers for any of the instruction types. Determining if enough physical registers are free for assignment to the instructions being selected for dispatch includes considering the number of instructions selected for dispatch and the number of free physical registers, but excludes the data type of the instruction. When a code sequence includes predominantly instructions of a particular data type, many of the physical registers may be assigned to that data type (efficiently using the physical register resource). By contrast, if different sets of physical registers are provided for different data types, only the physical registers used for the particular data type may be used for the aforementioned code sequence. Additional efficiencies may be realized in embodiments in which an integer register and condition codes are both updated by many instructions. One physical register may concurrently represent the architected state of both the flags register and the integer register. Accordingly, a given functional unit may forward a single physical register number for both results.
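The dispatch check described in the abstract, which counts instructions and free registers but ignores data type, can be sketched as follows. The class `UnifiedRenamePool`, its method names, and the assumption that every instruction needs exactly one destination register are hypothetical simplifications rather than details from the patent.

```python
class UnifiedRenamePool:
    """Toy model of a single physical register pool shared by all data types.

    Assumption (not from the patent): each dispatched instruction needs
    exactly one destination physical register.
    """

    def __init__(self, num_physical):
        self.free = list(range(num_physical))  # free physical register numbers

    def can_dispatch(self, kinds):
        # Only the instruction count and the free-register count matter;
        # the data type of each instruction is deliberately not examined.
        return len(kinds) <= len(self.free)

    def allocate(self, kinds):
        """Assign one physical register per instruction, or None if short."""
        if not self.can_dispatch(kinds):
            return None
        return [(kind, self.free.pop()) for kind in kinds]

    def release(self, reg):
        """Return a physical register to the free pool."""
        self.free.append(reg)
```

With a pool of four registers, a run of four floating point instructions can take all four registers. A pool split by data type would leave some of them idle for that code sequence.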


Circuit and method for maintaining order of memory access requests initiated by devices coupled to a multiprocessor system

    Publication No.: US6167492A

    Publication date: 2000-12-26

    Application No.: US220487

    Filing date: 1998-12-23

    IPC classification: G06F13/16 G06F13/00 G06F12/00

    CPC classification: G06F13/1621

    Abstract: A circuit and method are disclosed for preserving the order for memory requests originating from I/O devices coupled to a multiprocessor computer system. The multiprocessor computer system includes a plurality of circuit nodes and a plurality of memories. Each circuit node includes at least one microprocessor coupled to a memory controller which in turn is coupled to one of the plurality of memories. The circuit nodes are in data communication with each other, each circuit node being uniquely identified by a node number. At least one of the circuit nodes is coupled to an I/O bridge which in turn is coupled directly or indirectly to one or more I/O devices. The I/O bridge generates non-coherent memory access transactions in response to memory access requests originating with one of the I/O devices. The circuit node coupled to the I/O bridge receives the non-coherent memory access transactions. For example, the circuit node coupled to the I/O bridge receives first and second non-coherent memory access transactions. The first and second non-coherent memory access transactions include first and second memory addresses, respectively. The first and second non-coherent memory access transactions further include first and second pipe identifications, respectively. The circuit node maps the first and second memory addresses to first and second node numbers, respectively. The first and second pipe identifications are compared. If the first and second pipe identifications compare equally, then the first and second node numbers are compared. First and second coherent memory access transactions are generated by the node coupled to the I/O bridge wherein the first and second coherent memory access transactions correspond to the first and second non-coherent memory access transactions, respectively. The first coherent memory access transaction is transmitted to one of the nodes of the multiprocessor computer system. However, the second coherent memory access transaction is not transmitted unless the first and second pipe identifications do not compare equally or the first and second node numbers compare equally.
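The transmit condition at the end of the abstract reduces to a two-step comparison, sketched below. The `NonCoherentTxn` type, the function names, and the address-to-node mapping (upper address bits select the node) are hypothetical; the abstract only says that addresses are mapped to node numbers. What later releases a held transaction is not stated in the abstract and is not modeled here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NonCoherentTxn:
    address: int   # memory address carried by the transaction
    pipe_id: int   # pipe identification carried by the transaction


def node_for_address(address, node_shift=30):
    # Hypothetical mapping: each node owns one 1 GiB slice of the address space.
    return address >> node_shift


def may_transmit_second(first, second):
    """Return True if the second coherent transaction may be transmitted."""
    # Different pipes: no ordering is enforced between the two transactions.
    if first.pipe_id != second.pipe_id:
        return True
    # Same pipe: transmit only if both transactions target the same node.
    return node_for_address(first.address) == node_for_address(second.address)
```

Two transactions on the same pipe that map to different nodes are the only case in which the second one is held back in this sketch.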

    Method and apparatus for accelerated addition of sliced addends
    129.
    Granted invention patent
    Status: No longer in force

    Publication No.: US4878193A

    Publication date: 1989-10-31

    Application No.: US176594

    Filing date: 1988-04-01

    Abstract: The invention is directed to a method and circuit for performing an addition operation in successive pipelined instructions which utilize a sliced ALU. Successive microinstructions are monitored to determine if both microinstructions are add operations. Further, it is determined whether the use of the destination of the first microinstruction is a source for the add operation in the second microinstruction. If both microinstructions are add operations and the destination of the first microinstruction is used as the source for the second microinstruction and one of the addends of the second microinstruction is a small addend then the circuit detects whether a carry-out occurred in the least significant slice of the second instruction. If there is no carry-out, the result for the more significant slice of the second microinstruction answer. However, if a carry-out was detected, then the result for the second microinstruction's more significant slice is the sum+1 of the second microinstruction.
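The carry-select idea in this abstract can be illustrated with a two-slice adder model. The sentence describing the no-carry case is incomplete in the abstract, so the sketch assumes one reading: with no carry-out, the more significant slice of the previous result is reused unchanged, and with a carry-out it is that slice plus one. The 16-bit slice width, the 32-bit word, and the name `add_small` are likewise assumptions.

```python
SLICE_BITS = 16                      # assumed slice width
SLICE_MASK = (1 << SLICE_BITS) - 1   # mask for one slice


def add_small(prev_result, small_addend):
    """Add a small addend to a 32-bit previous result, slice by slice.

    Assumed reading of the abstract: the high slice is never fully
    re-added; it is either reused as-is or replaced by high + 1.
    """
    if not 0 <= small_addend <= SLICE_MASK:
        raise ValueError("addend must fit in the least significant slice")

    # Least significant slice: an ordinary add that may carry out.
    low_sum = (prev_result & SLICE_MASK) + small_addend
    carry_out = low_sum >> SLICE_BITS

    # More significant slice: select between high and high + 1.
    high = (prev_result >> SLICE_BITS) & SLICE_MASK
    new_high = (high + 1) & SLICE_MASK if carry_out else high

    return (new_high << SLICE_BITS) | (low_sum & SLICE_MASK)
```

Under these assumptions the function agrees with a plain 32-bit addition, for example `add_small(0x0001FFFF, 1)` gives `0x00020000`, while the high slice only ever needs an increment rather than a full add.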


Apparatus and method for responding to an aborted signal exchange between subsystems in a data processing system
    130.
    Granted invention patent
    Status: No longer in force

    Publication No.: US4858173A

    Publication date: 1989-08-15

    Application No.: US823775

    Filing date: 1986-01-29

    CPC classification: G06F13/364

    Abstract: In a data processing system in which access to a second unit by a first unit through a system bus is determined by an arbitration unit, when a requesting unit that receives access to the system bus is unable to use that access for interaction with the second unit, a busy signal is provided to the arbitration unit and to the units. The busy signal causes the units to reinstitute a request for access to the system bus when the subsystem had an aborted transaction. The busy signal enforces a delay in the next arbitration for the system bus until a unit, with an aborted transaction as a result of the busy signal, can reassert the request for access signal. Moreover, apparatus can be included with the arbitration unit that permits rearbitrating access to the bus using the priority conditions in effect at the time of the original arbitration.
