Cache chain structure to implement high bandwidth low latency cache memory subsystem
    1.
    Granted invention patent
    Legal status: In force

    Publication No.: US06557078B1

    Publication date: 2003-04-29

    Application No.: US09510283

    Filing date: 2000-02-21

    IPC classification: G06F13/00

    Abstract: The inventive cache uses a queuing structure which provides out-of-order cache memory access support for multiple accesses, as well as support for managing bank conflicts and address conflicts. The inventive cache can support four data accesses that are hits per clock, one access that misses the L1 cache every clock, and one instruction access every clock. The responses are interspersed in the pipeline, so that conflicts in the queue are minimized. Non-conflicting accesses are not inhibited; conflicting accesses, however, are held up until the conflict clears. The inventive cache provides out-of-order support after the retirement stage of a pipeline.
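
The queuing behavior described in this abstract can be illustrated with a minimal sketch. It is not the patented design: the bank count, the address-to-bank mapping, and the names `Access`, `bank_of`, `issue_one_clock`, and `drain` are all assumptions made for illustration. Only the limit of four data hits per clock and the rule that conflicting accesses wait while non-conflicting ones proceed come from the abstract.

```python
from dataclasses import dataclass

NUM_BANKS = 16   # assumption: the abstract does not give a bank count
MAX_ISSUE = 4    # from the abstract: four data accesses that hit per clock


@dataclass
class Access:
    tag: str
    addr: int


def bank_of(addr: int) -> int:
    # Hypothetical mapping: 8-byte-wide banks interleaved on low address bits.
    return (addr >> 3) % NUM_BANKS


def issue_one_clock(queue: list) -> list:
    """Issue up to MAX_ISSUE queued accesses, oldest first, skipping bank conflicts."""
    busy_banks = set()
    issued = []
    for acc in queue:
        if len(issued) == MAX_ISSUE:
            break
        bank = bank_of(acc.addr)
        if bank in busy_banks:
            continue  # conflicting access is held; younger accesses may still issue
        busy_banks.add(bank)
        issued.append(acc)
    for acc in issued:
        queue.remove(acc)
    return issued


def drain(queue: list) -> list:
    """Run clocks until the queue is empty; return the tags issued in each clock."""
    schedule = []
    while queue:
        schedule.append([acc.tag for acc in issue_one_clock(queue)])
    return schedule
```

With accesses A (0x00), B (0x80), C (0x08), D (0x10), and E (0x18) queued in that order, A and B map to the same bank under the assumed mapping, so `drain` issues A, C, D, E in the first clock and B in the second: the younger accesses pass the held one.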

    Cache address conflict mechanism without store buffers
    2.
    Granted invention patent
    Legal status: In force

    Publication No.: US06539457B1

    Publication date: 2003-03-25

    Application No.: US09510279

    Filing date: 2000-02-21

    IPC classification: G06F12/00

    CPC classification: G06F12/0897

    Abstract: The inventive cache manages address conflicts and maintains program order without using a store buffer. The cache utilizes an issue algorithm to ensure that accesses issued in the same clock are actually issued in an order that is consistent with program order. This is enabled by performing address comparisons prior to insertion of the accesses into the queue. Additionally, when accesses are separated by one or more clocks, address comparisons are performed, and accesses that would get data from the cache memory array before a prior update has actually updated the array are canceled. This guarantees that program order is maintained, as an access is not allowed to complete until it is assured that the most recent data will be received upon access of the array.
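
The cancellation rule in this abstract can be sketched as follows. This is an illustrative model only: the two latency constants and the function name `loads_to_cancel` are assumptions, since the abstract gives no timing figures. The sketch compares each load's address against earlier stores and cancels the load if it would read the array before the store has updated it.

```python
STORE_UPDATE_LATENCY = 3  # assumption: clocks from store issue to array update
LOAD_READ_LATENCY = 1     # assumption: clocks from load issue to array read


def loads_to_cancel(accesses):
    """accesses: list of (issue_clock, kind, addr) tuples in program order.

    Returns the indices of loads that would read the array before an
    earlier store to the same address has updated it.
    """
    cancelled = []
    for i, (clk, kind, addr) in enumerate(accesses):
        if kind != "load":
            continue
        read_time = clk + LOAD_READ_LATENCY
        for s_clk, s_kind, s_addr in accesses[:i]:
            update_time = s_clk + STORE_UPDATE_LATENCY
            if s_kind == "store" and s_addr == addr and update_time > read_time:
                cancelled.append(i)  # load would see stale data
                break
    return cancelled
```

Under these assumed latencies, a load issued one clock after a store to the same address is canceled, while a load to a different address, or one issued four clocks later, is allowed to complete.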

    Distributed MUX scheme for bi-endian rotator circuit
    3.
    Granted invention patent
    Legal status: Expired

    Publication No.: US06687262B1

    Publication date: 2004-02-03

    Application No.: US09510280

    Filing date: 2000-02-21

    IPC classification: H04J3/00

    Abstract: The inventive control logic provides the selection signals for a bi-endian rotator MUX. The logic determines the starting point for the data transfer by determining which input register byte is going to Byte 0 of the output register. The control logic passes the starting point to a single decoder. The decoded value is then sent to a plurality of MUXs, one for each of the output register bytes. Each of the MUXs is prewired to receive a portion of the bits of the decoded value, and the portion is arranged in a particular order. The MUXs then send their respective outputs to the rotator MUX as selection control signals.
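
The idea of decoding the starting point once and distributing it through fixed wiring can be sketched in software. This is a simplified model under stated assumptions: an 8-byte register, a plain rotation, and no endian switching, none of which the abstract specifies. The names `decode_one_hot`, `select_lines`, and `rotate` are hypothetical.

```python
N_BYTES = 8  # assumption: register width in bytes


def decode_one_hot(start):
    """Single decoder: one-hot vector marking which input byte goes to Byte 0."""
    return [1 if k == start else 0 for k in range(N_BYTES)]


def select_lines(decoded, out_byte):
    """Fixed wiring for one output byte.

    Select line j of output byte i taps decoded bit (j - i) mod N_BYTES,
    so every output byte reuses the same decoded value in a rotated order.
    """
    return [decoded[(j - out_byte) % N_BYTES] for j in range(N_BYTES)]


def rotate(in_bytes, start):
    """Drive each output byte from the input byte its one-hot select picks."""
    decoded = decode_one_hot(start)
    out = []
    for i in range(N_BYTES):
        sel = select_lines(decoded, i)
        out.append(in_bytes[sel.index(1)])
    return out
```

In this model only one decode is performed per transfer; the per-byte select signals differ solely in how the decoded bits are wired, which mirrors the "prewired in a particular order" language of the abstract without claiming to reproduce the actual circuit.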

    Efficient combined array for 2n bit n bit multiplications
    4.
    Granted invention patent
    Legal status: Expired

    Publication No.: US5880985A

    Publication date: 1999-03-09

    Application No.: US735058

    Filing date: 1996-10-18

    IPC classification: G06F7/52

    CPC classification: G06F7/5338 G06F2207/382

    Abstract: In order to multiply operands of different binary lengths using a common combined array, for example to do both 8-bit by 8-bit and 16-bit by 16-bit multiplications, 2^(m-1) multiplications are performed, where m is the number of different bit lengths to be multiplied. For example, where 8×8-bit and 16×16-bit multiplications are done, 2 different multiplications are done. Each multiplication is an n × n/2^(m-1) multiplication, e.g., a 16×8-bit multiplication. Sign correction is performed by adding a correction vector or by modifying one of the partial products. The results of the multiplications are added together to obtain a 2n-bit result. Groups of bits from said 2n-bit result are selected depending on the length of the operands being multiplied.
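
The arithmetic behind combining two 16×8-bit multiplications into one 16×16-bit product can be checked with a short sketch. It illustrates only the identity, not the array hardware: the function name `mul16_via_16x8` is hypothetical, and sign handling is done here by treating the high byte of the multiplier as signed rather than by the correction vector the abstract mentions.

```python
def mul16_via_16x8(a, b):
    """Signed 16x16-bit product built from two 16x8-bit multiplications.

    b is split as b = b_hi * 256 + b_lo, with b_lo unsigned (0..255)
    and b_hi signed, so a * b = (a * b_hi) * 256 + a * b_lo.
    """
    b_lo = b & 0xFF   # low byte, unsigned
    b_hi = b >> 8     # high byte; arithmetic shift preserves the sign
    return ((a * b_hi) << 8) + a * b_lo
```

For example, `mul16_via_16x8(-1234, 5678)` equals `-1234 * 5678`. With m = 2 operand lengths and n = 16, this matches the abstract's count of 2^(m-1) = 2 multiplications, each of size 16 × 8.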
