Managing a multi-way associative cache
    41.
    Granted invention patent
    Status: Active

    Publication No.: US07237067B2

    Publication date: 2007-06-26

    Application No.: US10829186

    Filing date: 2004-04-22

    IPC classification: G06F12/00

    CPC classification: G06F12/128

    Abstract: Methods for storing replacement data in a multi-way associative cache are disclosed. One method comprises logically dividing the cache's cache sets into segments of at least one cache way; searching a cache set in accordance with a segment search sequence for a segment currently comprising a way which has not yet been accessed during a current cycle of the segment search sequence; searching the current segment in accordance with a way search sequence for a way which has not yet been accessed during a current way search cycle; and storing the replacement data in a first way which has not yet been accessed during a current cycle of the way search sequence. A cache controller that performs such methods is also disclosed.
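
The two-level search in this abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the class name `SegmentedCacheSet`, the ascending segment and way search orders, and the rule that a new cycle starts once every way has been accessed are all assumptions made for illustration.

```python
class SegmentedCacheSet:
    """One cache set whose ways are grouped into fixed-size segments (hypothetical model)."""

    def __init__(self, num_ways, ways_per_segment):
        if num_ways % ways_per_segment != 0:
            raise ValueError("num_ways must be a multiple of ways_per_segment")
        self.data = [None] * num_ways
        self.accessed = [False] * num_ways  # one "accessed this cycle" bit per way
        self.segments = [
            list(range(start, start + ways_per_segment))
            for start in range(0, num_ways, ways_per_segment)
        ]

    def touch(self, way):
        """Record a hit on a way so it is skipped until the next cycle."""
        self.accessed[way] = True

    def _find_victim(self):
        # Segment search sequence (assumed ascending): first segment that
        # still contains a way not accessed during the current cycle.
        for segment in self.segments:
            if any(not self.accessed[way] for way in segment):
                # Way search sequence (assumed ascending) inside that segment.
                for way in segment:
                    if not self.accessed[way]:
                        return way
        return None

    def store(self, value):
        """Store replacement data in the first not-yet-accessed way; return its number."""
        way = self._find_victim()
        if way is None:
            # Assumption: once every way has been accessed, a new cycle begins.
            self.accessed = [False] * len(self.data)
            way = self._find_victim()
        self.data[way] = value
        self.accessed[way] = True
        return way
```

With four ways in segments of two, successive calls to `store` fill ways 0 through 3, and the fifth call starts a new cycle at way 0. A way marked with `touch` is skipped until that new cycle begins.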

    Livelock prevention by delaying surrender of ownership upon intervening ownership request during load locked / store conditional atomic memory operation
    42.
    Granted invention patent
    Status: Expired

    Publication No.: US06801986B2

    Publication date: 2004-10-05

    Application No.: US09933536

    Filing date: 2001-08-20

    IPC classification: G06F12/00

    Abstract: A method for executing a load locked and a store conditional instruction in a processor achieves an atomic read-write operation to a memory block. First the load locked instruction is executed to read a memory block, and the processor in response to executing the load locked instruction issues a read modify system command to read the block and to take ownership of the block by the processor, and also sets a lock flag for the address of the memory block, and writes a value of the memory block into a cache of the processor as a cache copy of the memory block. The lock flag, upon receipt of an invalidate message by the processor for the cache copy of the memory block, is reset if any invalidate messages for the memory block are received by the processor. The processor waits for a selected time interval before the processor surrenders ownership of the memory block upon receipt of an ownership request message, if any is received by the processor after execution of the load locked instruction. The processor executes the store conditional instruction, and the processor in response to executing the store conditional instruction tests the lock flag, and if the lock flag is set, writes to the cache copy of the memory block. In the event that the lock flag is reset, the processor ends the store conditional instruction and does not write to the cache copy of the memory block.
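
The interaction between the lock flag and the delayed surrender of ownership can be sketched in Python as follows. This is a toy model under stated assumptions, not the patented design: the `Processor` class, the cycle-based `tick` clock, the write-back on surrender, and the treatment of surrender as an invalidation of the cache copy are all hypothetical.

```python
class Processor:
    """Toy model of load-locked / store-conditional with delayed ownership surrender."""

    def __init__(self, memory, surrender_delay):
        self.memory = memory                  # shared dict: address -> value
        self.surrender_delay = surrender_delay
        self.cache = {}                       # cache copies owned by this processor
        self.lock_addr = None                 # lock flag: locked address, or None if reset
        self.now = 0                          # hypothetical cycle counter
        self.surrender_at = {}                # address -> cycle at which ownership is given up

    def load_locked(self, addr):
        # Read the block, take ownership, set the lock flag, fill the cache copy.
        self.cache[addr] = self.memory[addr]
        self.lock_addr = addr
        return self.cache[addr]

    def receive_invalidate(self, addr):
        # An invalidate for the locked block resets the lock flag.
        self.cache.pop(addr, None)
        if self.lock_addr == addr:
            self.lock_addr = None

    def receive_ownership_request(self, addr):
        # Do not surrender immediately: wait for the selected interval first.
        self.surrender_at.setdefault(addr, self.now + self.surrender_delay)

    def tick(self, cycles=1):
        self.now += cycles
        for addr, when in list(self.surrender_at.items()):
            if self.now >= when:
                del self.surrender_at[addr]
                if addr in self.cache:
                    # Assumption: surrendering ownership writes the block back.
                    self.memory[addr] = self.cache[addr]
                self.receive_invalidate(addr)

    def store_conditional(self, addr, value):
        if self.lock_addr != addr:
            return False                      # lock flag reset: end without writing
        self.cache[addr] = value              # lock flag set: write the cache copy
        self.lock_addr = None
        return True
```

In this model a store conditional issued within `surrender_delay` cycles of an intervening ownership request still succeeds, which is the livelock-avoidance effect the abstract describes. An invalidate received before the store conditional makes it fail.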

    Mechanism for optimizing generation of commit-signals in a distributed shared-memory system
    43.
    Granted invention patent
    Status: Expired

    Publication No.: US06209065B1

    Publication date: 2001-03-27

    Application No.: US08957230

    Filing date: 1997-10-24

    IPC classification: G06F13/14

    CPC classification: G06F9/542 G06F9/52

    Abstract: A mechanism optimizes the generation of a commit-signal by control logic of the multiprocessor system in response to a memory reference operation issued by a processor to a local node of a multiprocessor system having a hierarchical switch for interconnecting a plurality of nodes. The mechanism generally comprises a structure that indicates whether the memory reference operation affects other processors of other nodes of the multiprocessor system. An ordering point of the local node generates an optimized commit-signal when the structure indicates that the memory reference operation does not affect the other processors.

Method and apparatus for releasing victim data buffers of computer systems by comparing a probe counter with a service counter
    44.
    Granted invention patent
    Status: Expired

    Publication No.: US06105108A

    Publication date: 2000-08-15

    Application No.: US957509

    Filing date: 1997-10-24

    IPC classification: G06F12/08 G06F12/00 G06F13/00

    CPC classification: G06F12/0811

    Abstract: A multiprocessor computer system releases a victim data buffer storing victim data, when system control logic determines that a count of the number of probe messages pending at a specified time equals the number of such probe messages that have had an address comparison performed after the specified time. The specified time occurs when a command to write the victim data element to main memory passes a serialization point of the computer system. The address comparison compares a target address of a probe message with addresses of data stored in the victim data buffer and the associated cache of a CPU of the computer system.
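
The counter comparison in this abstract can be sketched in Python. The sketch assumes a FIFO probe queue, so that the first probes serviced after the serialization point are exactly the ones that were pending at that point. The names `Cpu`, `VictimBuffer`, `victim_write_serialized`, and `service_next_probe` are hypothetical.

```python
from collections import deque


class VictimBuffer:
    def __init__(self, address, data):
        self.address = address
        self.data = data
        self.released = False
        self.probe_counter = None   # probes pending at the serialization point
        self.service_counter = 0    # of those, how many have been address-compared


class Cpu:
    """Toy CPU with one victim data buffer and a FIFO probe queue (assumed)."""

    def __init__(self):
        self.probe_queue = deque()
        self.victim = None

    def evict(self, address, data):
        self.victim = VictimBuffer(address, data)

    def enqueue_probe(self, target_address):
        self.probe_queue.append(target_address)

    def _maybe_release(self):
        v = self.victim
        if v.probe_counter is not None and v.service_counter == v.probe_counter:
            v.released = True

    def victim_write_serialized(self):
        # The write-victim command has passed the serialization point:
        # snapshot how many probes are pending right now.
        self.victim.probe_counter = len(self.probe_queue)
        self._maybe_release()

    def service_next_probe(self):
        """Compare the next probe's target with the victim address; return True on a hit."""
        target = self.probe_queue.popleft()
        v = self.victim
        hit = v is not None and not v.released and target == v.address
        if v is not None and v.probe_counter is not None and not v.released:
            v.service_counter += 1
            self._maybe_release()
        return hit
```

Probes that arrive after the snapshot sit behind the counted ones in the queue, so the buffer is released before they are serviced and they never delay it.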

    Separate victim buffer read and release control
    45.
    Granted invention patent
    Status: Expired

    Publication No.: US6101581A

    Publication date: 2000-08-08

    Application No.: US957217

    Filing date: 1997-10-24

    IPC classification: G06F12/08 G06F12/12

    CPC classification: G06F12/0804 G06F12/0831

    Abstract: In accordance with the present invention, a method and apparatus are provided for maintaining the coherency of victim data from a time when the data is stored in a victim data buffer until a time when the data is written into a main memory. Alternatively, the coherency of the victim data is preserved until a determination is made that pending probe messages do not target the victim data. At that time the victim data buffer can be deallocated. With both arrangements, a central processing unit can release a victim data buffer at a point in time other than when the data that is stored therein is read from the buffer. Thus, the central processor unit can perform the release or deallocation of the buffer when it is most efficient and when no further access to the data is required.

Mechanism for reducing latency of memory barrier operations on a multiprocessor system
    46.
    Granted invention patent
    Status: Expired

    Publication No.: US6088771A

    Publication date: 2000-07-11

    Application No.: US957501

    Filing date: 1997-10-24

    IPC classification: G06F9/45 G06F13/00

    Abstract: A technique reduces the latency of a memory barrier (MB) operation used to impose an inter-reference order between sets of memory reference operations issued by a processor to a multiprocessor system having a shared memory. The technique comprises issuing the MB operation immediately after issuing a first set of memory reference operations (i.e., the pre-MB operations) without waiting for responses to those pre-MB operations. Issuance of the MB operation to the system results in serialization of that operation and generation of a MB Acknowledgment (MB-Ack) command. The MB-Ack is loaded into a probe queue of the issuing processor and, according to the invention, functions to pull-in all previously ordered invalidate and probe commands in that queue. By ensuring that the probes and invalidates are ordered before the MB-Ack is received at the issuing processor, the inventive technique provides the appearance that all pre-MB references have completed.

Technique for reducing latency of inter-reference ordering using commit signals in a multiprocessor system having shared caches
    47.
    Granted invention patent
    Status: Expired

    Publication No.: US6055605A

    Publication date: 2000-04-25

    Application No.: US957544

    Filing date: 1997-10-24

    IPC classification: G06F12/08 G06F13/00 G06F12/00

    CPC classification: G06F12/084

    Abstract: A technique reduces the latency of inter-reference ordering between sets of memory reference operations in a multiprocessor system having a shared memory that is distributed among a plurality of processors that share a cache. According to the technique, each processor sharing a cache inherits a commit-signal that is generated by control logic of the multiprocessor system in response to a memory reference operation issued by another processor sharing that cache. The commit-signal facilitates serialization among the processors and shared memory entities of the multiprocessor system by indicating the apparent completion of the memory reference operation to those entities of the system.

    Apparatus and method for serialized set prediction
    48.
    Granted invention patent
    Status: Expired

    Publication No.: US5966737A

    Publication date: 1999-10-12

    Application No.: US971630

    Filing date: 1997-11-17

    IPC classification: G06F12/08 G06F9/38

    CPC classification: G06F12/0864 G06F2212/6082

    Abstract: A prediction mechanism for improving direct-mapped cache performance is shown to include a direct-mapped cache, partitioned into a plurality of pseudo-banks. Prediction means are employed to provide a prediction index which is appended to the cache index to provide the entire address for addressing the direct-mapped cache. One embodiment of the prediction means includes a prediction cache which is advantageously larger than the pseudo-banks of the direct-mapped cache and is used to store the prediction index for each cache location. A second embodiment includes a plurality of partial tag stores, each including a predetermined number of tag bits for the data in each bank. A comparison of the tags generates a match in one of the plurality of tag stores, and is used in turn to generate a prediction index. A third embodiment for use with a direct-mapped cache divided into two partitions includes a distinguishing bit RAM, which is used to provide the bit number of any bit which differs between the tags at the same location in the different banks. The bit number is used in conjunction with a complement signal to provide the prediction index for addressing the direct-mapped cache.

    Apparatus and method for intelligent multiple-probe cache allocation
    49.
    Granted invention patent
    Status: Expired

    Publication No.: US5829051A

    Publication date: 1998-10-27

    Application No.: US223069

    Filing date: 1994-04-04

    IPC classification: G06F12/08 G06F9/26

    CPC classification: G06F12/0864

    Abstract: An apparatus for allocating data to and retrieving data from a cache includes a memory subsystem coupled between a processor and a memory to provide quick access of memory data to the processor. The memory subsystem includes a cache memory. The address provided to the memory subsystem is divided into a cache index and a tag, and the cache index is hashed to provide a plurality of alternative addresses for accessing the cache. During a cache read, each of the alternative addresses is selected to search for the data responsive to an indicator of the validity of the data at the locations. The selection of the alternative address may be done through a mask having a number of bits corresponding to the number of alternative addresses. Each bit indicates whether the alternative address at that location should be used during the access of the cache in search of the data. Alternatively, a memory device which has more entries than the cache has blocks may be used to store the select value of the best alternative address to use to locate the data. Data is allocated to each alternative address based upon a modified least recently used technique wherein a quantum number and modulo counter are used to time stamp the data.
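
The mask-guided probing described in this abstract can be sketched in Python. Several details are assumptions made for illustration: the hash in `_alternative` (a fixed stride added to the cache index), the class and field names, and the use of a plain incrementing counter as the time stamp in place of the abstract's quantum-number scheme.

```python
from dataclasses import dataclass


@dataclass
class Slot:
    tag: int
    home: int      # cache index the block was hashed from
    alt: int       # which alternative address the block occupies
    stamp: int     # last-use time stamp (plain counter, an assumption)
    data: object


class MultiProbeCache:
    def __init__(self, num_blocks=8, num_alternatives=2):
        self.num_blocks = num_blocks
        self.num_alternatives = num_alternatives
        self.slots = [None] * num_blocks
        self.masks = [0] * num_blocks   # per index: bit k set => probe alternative k
        self.clock = 0
        self.probes = 0                 # number of cache locations actually examined

    def _alternative(self, index, k):
        # Hypothetical hash: alternative k lies k strides away from the index.
        stride = max(1, self.num_blocks // self.num_alternatives)
        return (index + k * stride) % self.num_blocks

    def _tick(self):
        self.clock += 1
        return self.clock

    def _find(self, tag, index):
        for k in range(self.num_alternatives):
            if (self.masks[index] >> k) & 1:      # skip alternatives the mask rules out
                self.probes += 1
                slot = self.slots[self._alternative(index, k)]
                if slot.tag == tag:
                    return slot
        return None

    def read(self, address):
        tag, index = divmod(address, self.num_blocks)
        slot = self._find(tag, index)
        if slot is None:
            return None
        slot.stamp = self._tick()
        return slot.data

    def write(self, address, data):
        tag, index = divmod(address, self.num_blocks)
        slot = self._find(tag, index)
        if slot is not None:
            slot.data, slot.stamp = data, self._tick()
            return
        locs = [self._alternative(index, k) for k in range(self.num_alternatives)]
        free = [k for k, loc in enumerate(locs) if self.slots[loc] is None]
        # Prefer an empty location, otherwise replace the least recently used one.
        k = free[0] if free else min(
            range(self.num_alternatives), key=lambda j: self.slots[locs[j]].stamp)
        old = self.slots[locs[k]]
        if old is not None:
            self.masks[old.home] &= ~(1 << old.alt)   # evicted block is no longer probed
        self.slots[locs[k]] = Slot(tag, index, k, self._tick(), data)
        self.masks[index] |= 1 << k
```

A read of an index whose mask is zero returns a miss without examining any cache location, which is the saving the mask is meant to provide.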

    Memory reference tagging
    50.
    Granted invention patent
    Status: Expired

    Publication No.: US5619662A

    Publication date: 1997-04-08

    Application No.: US289613

    Filing date: 1994-08-12

    IPC classification: G06F9/30 G06F9/32 G06F9/38

    Abstract: A pipelined processor includes an instruction box including a register mapper, to map register operand fields of a set of instructions and an instruction scheduler, fed by the set of instructions, to reorder the issuance of the set of instructions from the instruction processor. The mapped register operand fields are associated with the corresponding instructions of the reordered set of instructions prior to issuance of the instructions. The processor further includes a branch prediction table which maps a stored pattern of past histories associated with a branch instruction to a more likely prediction direction of the branch instruction. The processor further includes a memory reference tagging store associated with the instruction scheduler so that the scheduler can reorder memory reference instructions without knowing the actual memory location addressed by the memory reference instruction.
