System and method of increasing bandwidth for issuing ordered transactions into a distributed communication system
    61.
    Granted invention patent
    Status: in force

    Publication No.: US06745272B2

    Publication date: 2004-06-01

    Application No.: US09826262

    Filing date: 2001-04-04

    IPC classification: G06F13/00

    CPC classification: H04L1/1671

    Abstract: A method and system of expediting issuance of a second request of a pair of ordered requests into a distributed coherent communication fabric. The first request of the ordered pair is issued into the coherent communication fabric and directed to a first target. Issuance of the second request into the coherent communication fabric is stalled until the first target receives and orders the first request and transmits a response acknowledging the same.
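
The stall-until-acknowledged rule in this abstract can be illustrated with a minimal Python sketch. The class names `Target` and `OrderedIssuer`, the tuple-shaped acknowledgment, and the synchronous call model are all assumptions made for illustration; the patent describes a hardware fabric, not this code.

```python
class Target:
    """Hypothetical target node: orders requests on arrival and acknowledges them."""

    def __init__(self, name):
        self.name = name
        self.ordered = []  # requests in the order this target has committed to

    def receive(self, request):
        self.ordered.append(request)
        return ("ack", request)  # response acknowledging receipt and ordering


class OrderedIssuer:
    """Issues an ordered pair; the second request waits for the first ack."""

    def __init__(self):
        self.log = []

    def issue_pair(self, first, first_target, second, second_target):
        self.log.append(f"issue {first} -> {first_target.name}")
        response = first_target.receive(first)
        # Stall point: the second request is held until this acknowledgment arrives.
        if response != ("ack", first):
            raise RuntimeError("first request was not acknowledged")
        self.log.append(f"ack {first}")
        self.log.append(f"issue {second} -> {second_target.name}")
        second_target.receive(second)
        return self.log


a, b = Target("A"), Target("B")
log = OrderedIssuer().issue_pair("W1", a, "W2", b)
print(log)
```

The point of the sketch is that the second request only needs the first target's ordering acknowledgment, not completion of the first request, before it is issued.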

    Conserving system memory bandwidth during a memory read operation in a multiprocessing computer system
    62.
    Granted invention patent
    Status: in force

    Publication No.: US06728841B2

    Publication date: 2004-04-27

    Application No.: US10002753

    Filing date: 2001-10-31

    Applicant: James B. Keller

    Inventor: James B. Keller

    IPC classification: G06F13/00

    CPC classification: G06F12/0813

    Abstract: A messaging scheme that conserves system memory bandwidth during a memory read operation in a multiprocessing computer system is described. A source processing node sends a memory read command to a target processing node to read data from a designated memory location in a system memory associated with the target processing node. The target node transmits a read response to the source node containing the requested data and also concurrently transmits a probe command to one or more of the remaining nodes in the multiprocessing computer system. In response to the probe command, each remaining processing node checks whether it has a cached copy of the requested data. If a processing node, other than the source and the target nodes, finds a modified cached copy of the designated memory location, that processing node responds with a memory cancel response sent to the target node and a read response sent to the source node. The read response contains the modified cache block containing the requested data, and the memory cancel response causes the target node to abort further processing of the memory read command, and to stop transmission of the read response if the target node has not yet transmitted it. The memory cancel message thus attempts to avoid relatively lengthy and time-consuming system memory accesses when the system memory holds stale data.
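
A rough Python sketch of the memory cancel flow follows. `TargetNode`, `read`, and the dictionary-based caches are hypothetical names introduced here. The sketch also simplifies the timing: the abstract has the target respond concurrently with the probes, whereas here the probes are resolved first so that the cancel always arrives in time.

```python
class TargetNode:
    """Hypothetical home node for the requested memory location."""

    def __init__(self, memory):
        self.memory = memory
        self.cancelled = set()  # addresses whose memory read was cancelled
        self.memory_accesses = 0

    def mem_cancel(self, addr):
        self.cancelled.add(addr)

    def read_response(self, addr):
        # Abort if a memory cancel arrived before the response was sent.
        if addr in self.cancelled:
            return None
        self.memory_accesses += 1
        return self.memory[addr]


def read(addr, target, other_caches):
    """Return (data, supplier) as seen by the source node."""
    responses = []
    # Probe phase: a node holding a modified copy cancels the memory read
    # and sends its own read response to the source.
    for name, cache in other_caches.items():
        entry = cache.get(addr)
        if entry is not None and entry["state"] == "modified":
            target.mem_cancel(addr)
            responses.append((entry["data"], name))
    from_memory = target.read_response(addr)
    if from_memory is not None:
        responses.append((from_memory, "memory"))
    return responses[0]


target = TargetNode({0x40: "stale"})
caches = {"node2": {0x40: {"state": "modified", "data": "fresh"}}, "node3": {}}
print(read(0x40, target, caches), target.memory_accesses)
```

When no remote node holds a modified copy, the same call falls through to the memory access and returns the memory contents.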

    Method and apparatus for developing multiprocessor cache control protocols using an external acknowledgement signal to set a cache to a dirty state
    63.
    Granted invention patent
    Status: lapsed

    Publication No.: US06651144B1

    Publication date: 2003-11-18

    Application No.: US09099384

    Filing date: 1998-06-18

    IPC classification: G06F12/00

    CPC classification: G06F12/0815 G06F12/0817

    Abstract: A computer system includes an external unit governing a cache which generates a set-dirty request as a function of a coherence state of a block in the cache to be modified. The external unit modifies the block of the cache only if an acknowledgment granting permission is received from a memory management system responsive to the set-dirty request. The memory management system receives the set-dirty request, determines the acknowledgment based on contents of the plurality of caches and the main memory according to a cache protocol and sends the acknowledgment to the external unit in response to the set-dirty request. The acknowledgment will either grant permission or deny permission to set the block to the dirty state.
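
The request/acknowledgment handshake can be sketched in Python as below. The grant policy (grant only if the requester is still recorded as a holder of the block) is an assumption chosen for illustration, as are the names `MemoryManager` and `ExternalUnit`; the abstract leaves the actual cache protocol open.

```python
class MemoryManager:
    """Hypothetical directory recording which caches hold each block."""

    def __init__(self):
        self.sharers = {}  # block address -> set of cache ids

    def set_dirty_request(self, cache_id, addr):
        holders = self.sharers.get(addr, set())
        # Assumed policy: deny if the requester no longer holds the block.
        if cache_id not in holders:
            return False
        # Grant: the requester becomes the sole holder. Delivering the
        # invalidations to the other caches is not modeled here.
        self.sharers[addr] = {cache_id}
        return True


class ExternalUnit:
    """Hypothetical unit governing one cache."""

    def __init__(self, cache_id, manager):
        self.cache_id = cache_id
        self.manager = manager
        self.state = {}  # addr -> "clean" or "dirty"
        self.data = {}

    def write(self, addr, value):
        state = self.state.get(addr)
        if state == "clean":
            # A clean block needs permission before it may be modified.
            if not self.manager.set_dirty_request(self.cache_id, addr):
                return False  # denied: the block is left unmodified
            self.state[addr] = "dirty"
        elif state != "dirty":
            return False  # block not present; a fill is not modeled
        self.data[addr] = value
        return True


mgr = MemoryManager()
mgr.sharers[0x10] = {"c0", "c1"}
c0 = ExternalUnit("c0", mgr)
c0.state[0x10], c0.data[0x10] = "clean", "old"
print(c0.write(0x10, "new"), c0.state[0x10])
```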

    Training line predictor for branch targets
    64.
    Granted invention patent
    Status: in force

    Publication No.: US06647490B2

    Publication date: 2003-11-11

    Application No.: US09419832

    Filing date: 1999-10-14

    IPC classification: G06F9/38

    Abstract: A line predictor caches alignment information for instructions. In response to each fetch address, the line predictor provides alignment information for the instruction beginning at the fetch address, as well as one or more additional instructions subsequent to that instruction. The line predictor may include a memory having multiple entries, each entry storing up to a predefined maximum number of instruction pointers and a fetch address corresponding to the instruction identified by a first one of the instruction pointers. Additionally, each entry may include a link to another entry storing instruction pointers to the next instructions within the predicted instruction stream, and a next fetch address corresponding to the first instruction within the next entry. The next fetch address may be provided to the instruction cache to fetch the corresponding instruction bytes. If the terminating instruction within the entry is a branch instruction, the line predictor is trained with respect to the next fetch address (and next index within the line predictor, which provides the link to the next entry). As line predictor entries are created, a set of branch predictors may be accessed to provide an initial next fetch address and index. The initial training is verified by accessing the branch predictors at each fetch of the line predictor entry, and updated as dictated by the state of the branch predictors at each fetch.
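
The entry layout and the retraining step can be pictured with a small Python sketch. The field names, the per-entry limit `MAX_POINTERS = 4`, and the lookup table `index_of` are assumptions for illustration only; `predicted_target` stands in for whatever the branch predictors report at fetch time.

```python
from dataclasses import dataclass
from typing import List, Optional

MAX_POINTERS = 4  # assumed per-entry maximum, not taken from the patent


@dataclass
class LineEntry:
    fetch_addr: int
    instr_pointers: List[int]   # offsets of instructions in this line
    next_fetch_addr: int        # address sent to the instruction cache next
    next_index: Optional[int]   # link to the next entry, None if unknown
    ends_in_branch: bool = False


class LinePredictor:
    def __init__(self):
        self.entries = []
        self.index_of = {}  # fetch address -> entry index

    def add(self, entry):
        if len(entry.instr_pointers) > MAX_POINTERS:
            raise ValueError("too many instruction pointers for one entry")
        self.index_of[entry.fetch_addr] = len(self.entries)
        self.entries.append(entry)
        return self.index_of[entry.fetch_addr]

    def fetch(self, index, predicted_target=None):
        entry = self.entries[index]
        retrain = (entry.ends_in_branch and predicted_target is not None
                   and predicted_target != entry.next_fetch_addr)
        if retrain:
            # The branch predictors disagree with the stored link: update it.
            entry.next_fetch_addr = predicted_target
            entry.next_index = self.index_of.get(predicted_target)
        return entry.next_fetch_addr, entry.next_index


lp = LinePredictor()
i0 = lp.add(LineEntry(0x100, [0, 3, 5], 0x110, 1, ends_in_branch=True))
lp.add(LineEntry(0x110, [0, 2], 0x120, None))
lp.add(LineEntry(0x200, [0], 0x210, None))
print(lp.fetch(i0), lp.fetch(i0, predicted_target=0x200))
```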

    Scheduler which discovers non-speculative nature of an instruction after issuing and reissues the instruction
    65.
    Granted invention patent
    Status: in force

    Publication No.: US06564315B1

    Publication date: 2003-05-13

    Application No.: US09476322

    Filing date: 2000-01-03

    IPC classification: G06F9/312

    Abstract: A scheduler issues instruction operations for execution, but also retains the instruction operations. If a particular instruction operation is subsequently found to be required to execute non-speculatively, the particular instruction operation is still stored in the scheduler. Subsequent to determining that the particular operation has become non-speculative (through the issuance and execution of instruction operations prior to the particular instruction operation), the particular instruction operation may be reissued from the scheduler. The penalty for incorrect scheduling of instruction operations which are to execute non-speculatively may be reduced as compared to purging the particular instruction operation and younger instruction operations from the pipeline and refetching the particular instruction operation. Additionally, the scheduler may maintain the dependency indications for each instruction operation which has been issued. If the particular instruction operation is reissued, the instruction operations which are dependent on the particular instruction operation (directly or indirectly) may be identified via the dependency indications. The scheduler reissues the dependent instruction operations as well. Instruction operations which are subsequent to the particular instruction operation in program order but which are not dependent on the particular instruction operation are not reissued. Accordingly, the penalty for incorrect scheduling of instruction operations which are to be executed non-speculatively may be further decreased over the purging of the particular instruction and all younger instruction operations and refetching the particular instruction operation.
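
The selective reissue described here amounts to finding everything that depends, directly or indirectly, on the reissued operation. A minimal Python sketch, assuming the dependency indications are available as a mapping from each operation to the operations it reads from (the function name `ops_to_reissue` and the operation labels are illustrative):

```python
def ops_to_reissue(reissued_op, deps):
    """Return reissued_op plus every op that depends on it, directly or not.

    deps maps each issued op to the set of ops it directly depends on.
    """
    result = {reissued_op}
    changed = True
    while changed:
        changed = False
        for op, sources in deps.items():
            # An op joins the set once any of its sources is being reissued.
            if op not in result and sources & result:
                result.add(op)
                changed = True
    return result


deps = {
    "load": set(),
    "add": {"load"},   # direct dependent of the load
    "mul": {"add"},    # indirect dependent of the load
    "sub": set(),      # younger than the load but independent of it
}
print(sorted(ops_to_reissue("load", deps)))
```

The independent `sub` stays out of the result, which is where the saving over a full pipeline purge comes from.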

    Computer system implementing system and method for ordering write operations and maintaining memory coherency
    66.
    Granted invention patent
    Status: in force

    Publication No.: US06529999B1

    Publication date: 2003-03-04

    Application No.: US09428642

    Filing date: 1999-10-27

    IPC classification: G06F12/00

    CPC classification: G06F12/0813 G06F12/0831

    Abstract: A computer system is presented implementing a system and method for properly ordering write operations. The system and method for properly ordering write operations aids in maintaining memory coherency within the computer system. The computer system includes multiple interconnected processing nodes. One or more of the processing nodes includes a central processing unit (CPU) and/or a cache memory, and one or more of the processing nodes includes a memory controller coupled to a memory. The CPU or cache generates a write command to store data within the memory. The memory controller receives the write command and responds to the write command by issuing a target done response to the CPU or cache after the memory controller: (i) properly orders the write command within the memory controller with respect to other commands pending within the memory controller, and (ii) determines that a coherency state with respect to the write command has been established within the computer system.
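
The two conditions for the target done response can be sketched in Python as follows. Treating "coherency state established" as "every probed node has responded" is an assumption made for this sketch, and `MemoryController`, `receive_write`, and `probe_response` are hypothetical names.

```python
class MemoryController:
    """Hypothetical controller that gates the target done response."""

    def __init__(self):
        self.queue = []           # pending commands, in the order committed to
        self.pending_probes = {}  # write id -> nodes that have not yet responded

    def receive_write(self, write_id, probed_nodes):
        # (i) Order the write with respect to other pending commands.
        self.queue.append(write_id)
        self.pending_probes[write_id] = set(probed_nodes)
        return self._maybe_done(write_id)

    def probe_response(self, write_id, node):
        self.pending_probes[write_id].discard(node)
        return self._maybe_done(write_id)

    def _maybe_done(self, write_id):
        # (ii) Assumed coherency test: no probe responses are outstanding.
        if write_id in self.queue and not self.pending_probes[write_id]:
            return ("target_done", write_id)
        return None


mc = MemoryController()
print(mc.receive_write("w1", ["n1", "n2"]))
print(mc.probe_response("w1", "n1"))
print(mc.probe_response("w1", "n2"))
```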

    Unload counter adjust logic for a receiver buffer
    67.
    Granted invention patent
    Status: in force

    Publication No.: US06434640B1

    Publication date: 2002-08-13

    Application No.: US09320134

    Filing date: 1999-05-25

    Applicant: James B. Keller

    Inventor: James B. Keller

    IPC classification: G06F3/00

    CPC classification: G06F5/14

    Abstract: A computer system employs a distributed set of links between processing nodes (each processing node including at least one processor). Each link includes a clock signal which is transmitted with and in the same direction as the signals carrying information on the link. The line carrying the clock signal may be matched to the information lines, controlling skew and transport time differences to allow for high frequency operation. Because the clock signals at a transmitter and a receiver may not have a common source, a receive buffer may be employed. Data transmitted across the link is stored into the receive buffer responsive to the transmitter clock signal (e.g. by maintaining a load pointer controlled according to the transmitter clock), and is removed from the buffer responsive to the receiver clock signal (e.g. by maintaining an unload pointer controlled according to the receiver clock). The buffer includes sufficient entries for data to account for clock uncertainties (e.g. skew and jitter). Additionally, the receiver includes unload pointer adjust logic which monitors the transmitter clock signal and the receiver clock signal for differences (e.g. differences in frequency). The unload pointer adjust logic makes adjustments to the unload pointer to account for the differences in the clock signals, and hence to maintain integrity of the data transmitted by preventing the load and unload pointers from overrunning each other in the buffer.
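
A toy Python model of the receive buffer with separate load and unload pointers is given below. The adjust policy shown (hold the unload pointer for a cycle when the receiver has caught up with the transmitter) is an assumed simplification; the abstract only says that the adjust logic monitors both clocks and corrects the unload pointer. `ReceiveBuffer`, `tx_clock`, and `rx_clock` are illustrative names.

```python
class ReceiveBuffer:
    """Circular buffer written on the transmitter clock, read on the receiver clock."""

    def __init__(self, size=8):
        self.size = size
        self.slots = [None] * size
        self.load = 0    # advanced once per transmitter clock
        self.unload = 0  # advanced once per receiver clock

    def occupancy(self):
        return self.load - self.unload

    def tx_clock(self, data):
        if self.occupancy() >= self.size:
            raise OverflowError("load pointer would overrun unload pointer")
        self.slots[self.load % self.size] = data
        self.load += 1

    def rx_clock(self):
        # Assumed adjust policy: when the receiver has caught up, hold the
        # unload pointer this cycle rather than read an unwritten slot.
        if self.occupancy() == 0:
            return None
        data = self.slots[self.unload % self.size]
        self.unload += 1
        return data


buf = ReceiveBuffer(size=4)
buf.tx_clock("a")
buf.tx_clock("b")
print(buf.rx_clock(), buf.rx_clock(), buf.rx_clock())
```

In this model a receiver that runs slightly fast simply sees an occasional idle cycle, and the pointers never cross.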

    Method and apparatus for developing multiprocessor cache control protocols by presenting a clean victim signal to an external system
    68.
    Granted invention patent
    Status: lapsed

    Publication No.: US06397302B1

    Publication date: 2002-05-28

    Application No.: US09099304

    Filing date: 1998-06-18

    IPC classification: G06F12/12

    CPC classification: G06F12/0822

    Abstract: A multiprocessor system includes a plurality of processors, each processor having one or more caches local to the processor, and a memory controller connectable to the plurality of processors and a main memory. The memory controller manages the caches and the main memory of the multiprocessor system. A processor of the multiprocessor system is configurable to evict from its cache a block of data. The selected block may have a clean coherence state or a dirty coherence state. The processor communicates a notify signal indicating eviction of the selected block to the memory controller. In addition to sending a write victim notify signal if the selected block has a dirty coherence state, the processor sends a clean victim notify signal if the selected block has a clean coherence state.
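
The choice between the two notify signals can be written down as a short Python sketch. The signal strings, the `DirectoryController` class, and its `holders` map are assumptions for illustration; the abstract does not say how the memory controller records which caches hold a block.

```python
def victim_signal(block_state):
    """Pick the notify signal for an evicted block based on its coherence state."""
    if block_state == "dirty":
        return "write_victim"  # modified data accompanies the notification
    if block_state == "clean":
        return "clean_victim"  # no data, but the eviction is still reported
    raise ValueError(f"unknown coherence state: {block_state}")


class DirectoryController:
    """Hypothetical memory controller tracking which processors hold a block."""

    def __init__(self):
        self.holders = {}  # addr -> set of processor ids
        self.memory = {}

    def notify(self, proc, addr, signal, data=None):
        # Either signal tells the controller the block has left proc's cache.
        self.holders.get(addr, set()).discard(proc)
        if signal == "write_victim":
            self.memory[addr] = data


ctrl = DirectoryController()
ctrl.holders[0x20] = {"p0", "p1"}
ctrl.notify("p0", 0x20, victim_signal("clean"))
print(ctrl.holders[0x20])
```

Without the clean victim signal, the controller in this model would keep listing `p0` as a holder and might probe it needlessly later.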

    Memory cancel response optionally cancelling memory controller's providing of data in response to a read operation
    69.
    Granted invention patent
    Status: in force

    Publication No.: US06370621B1

    Publication date: 2002-04-09

    Application No.: US09217699

    Filing date: 1998-12-21

    Applicant: James B. Keller

    Inventor: James B. Keller

    IPC classification: G06F13/00

    CPC classification: G06F12/0813

    Abstract: A messaging scheme that conserves system memory bandwidth during a memory read operation in a multiprocessing computer system is described. A source processing node sends a memory read command to a target processing node to read data from a designated memory location in a system memory associated with the target processing node. The target node transmits a read response to the source node containing the requested data and also concurrently transmits a probe command to one or more of the remaining nodes in the multiprocessing computer system. In response to the probe command, each remaining processing node checks whether it has a cached copy of the requested data. If a processing node, other than the source and the target nodes, finds a modified cached copy of the designated memory location, that processing node responds with a memory cancel response sent to the target node and a read response sent to the source node. The read response contains the modified cache block containing the requested data, and the memory cancel response causes the target node to abort further processing of the memory read command, and to stop transmission of the read response if the target node has not yet transmitted it. The memory cancel message thus attempts to avoid relatively lengthy and time-consuming system memory accesses when the system memory holds stale data.

    Method and apparatus for resolving probes in multi-processor systems which do not use external duplicate tags for probe filtering
    70.
    Granted invention patent
    Status: lapsed

    Publication No.: US06295583B1

    Publication date: 2001-09-25

    Application No.: US09099400

    Filing date: 1998-06-18

    IPC classification: G06F12/00

    CPC classification: G06F12/0855 G06F12/0831

    Abstract: A processor of a multiprocessor system is configured to transmit a full probe to a cache associated with the processor to transfer data from the stored data of the cache. The data corresponding to the full probe is transferred during a time period. A first tag-only probe is also transmitted to the cache during the same time period to determine if the data corresponding to the tag-only probe is part of the stored data stored in the cache. A stream of probes accesses the cache in two stages. The cache is composed of a tag structure and a data structure. In the first stage, a probe is designated a tag-only probe and accesses the tag structure, but not the data structure, to determine tag information indicating a hit or a miss. In the second stage, if the probe returns tag information indicating a cache hit, the probe is designated to be a full probe and accesses the data structure of the cache. If the probe returns tag information indicating a cache miss, the probe does not proceed to the second stage.
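
The two-stage probe can be sketched in Python as below. The `Cache` class and `resolve_probe` function are hypothetical, and the sketch runs the stages one after another for a single probe; it does not model the overlap in which one probe's data transfer proceeds while the next probe checks the tags.

```python
class Cache:
    """Hypothetical cache split into a tag structure and a data structure."""

    def __init__(self, lines):
        self.tags = set(lines)   # tag structure: addresses present
        self.data = dict(lines)  # data structure: address -> block
        self.data_accesses = 0

    def tag_only_probe(self, addr):
        # Stage 1: consult the tags only, leaving the data structure untouched.
        return addr in self.tags

    def full_probe(self, addr):
        # Stage 2: read the block out of the data structure.
        self.data_accesses += 1
        return self.data[addr]


def resolve_probe(cache, addr):
    """Run a probe through both stages; a miss stops after stage 1."""
    if not cache.tag_only_probe(addr):
        return None
    return cache.full_probe(addr)


cache = Cache({0x80: "block"})
print(resolve_probe(cache, 0x80), resolve_probe(cache, 0x90), cache.data_accesses)
```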
