System on a chip bus with automatic pipeline stage insertion for timing closure
    12.
    Invention Grant
    System on a chip bus with automatic pipeline stage insertion for timing closure (In Force)

    Publication No.: US06834378B2

    Publication Date: 2004-12-21

    Application No.: US10264162

    Filing Date: 2002-10-03

    IPC Class: G06F9/45

    CPC Class: G06F17/5045

    Abstract: A method of designing a system on a chip (SoC) to operate with varying latencies and frequencies. A layout of the chip is designed with specific placement of devices, including a bus controller, initiator, and target devices. The time for a signal to propagate from a source device to a destination device is determined relative to a default propagation time. A pipeline stage is then inserted into a bus path between said source device and destination device for each additional time the signal takes to propagate. Each device (i.e., initiators, targets, and bus controller) is designed with logic to control a protocol that functions with a variety of response latencies. With the additional logic, the devices do not need to be changed when pipeline stages are inserted in the various paths. Registers are utilized as the pipeline stages that are inserted within the paths.
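    The stage-counting rule in the abstract can be sketched as a small helper: one pipeline register is inserted for each additional default-time interval the signal needs beyond the default. This is a minimal illustration, not the patent's implementation; the function name and nanosecond units are invented for the example.

    ```python
    import math

    def pipeline_stages_needed(propagation_time_ns: float, default_time_ns: float) -> int:
        """Stages to insert on a bus path: one register for each additional
        default-time interval the signal needs beyond the default.
        (Illustrative sketch; names and units are not from the patent.)"""
        if propagation_time_ns <= default_time_ns:
            return 0
        extra = propagation_time_ns - default_time_ns
        return math.ceil(extra / default_time_ns)
    ```

    A path that meets the default time gets no stages; a path needing 2.5x the default time gets two inserted registers.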


    Reducing power in a snooping cache based multiprocessor environment
    13.
    Invention Grant
    Reducing power in a snooping cache based multiprocessor environment (Expired)

    Publication No.: US06826656B2

    Publication Date: 2004-11-30

    Application No.: US10059537

    Filing Date: 2002-01-28

    IPC Class: G06F12/08

    Abstract: A method and system for reducing power in a snooping cache based environment. A memory may be coupled to a plurality of processing units via a bus. Each processing unit may comprise a cache controller coupled to a cache associated with the processing unit. The cache controller may comprise a segment register comprising N bits where each bit in the segment register may be associated with a segment of memory divided into N segments. The cache controller may be configured to snoop a requested address on the bus. Upon determining which bit in the segment register is associated with the snooped requested address, the segment register may determine if the bit associated with the snooped requested address is set. If the bit is not set, then a cache search may not be performed thereby mitigating the power consumption associated with a snooped request cache search.
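    The segment-register filter described above can be modeled in a few lines: one bit per 1/N-th of memory, set when a line from that segment is cached, checked before any snoop-triggered tag search. A rough sketch under assumed parameters (memory size, segment count, and method names are all invented for illustration):

    ```python
    class SnoopFilter:
        """Segment register: one bit per 1/N-th of memory. A snooped
        address only triggers a cache search when its segment's bit is
        set. (Illustrative model, not the patent's implementation.)"""

        def __init__(self, mem_size: int, n_segments: int):
            self.seg_size = mem_size // n_segments
            self.bits = 0  # N-bit segment register, initially clear

        def note_cached(self, addr: int) -> None:
            # Set the bit for the segment the cached address falls in
            self.bits |= 1 << (addr // self.seg_size)

        def search_needed(self, snooped_addr: int) -> bool:
            # A clear bit lets the controller skip the tag search,
            # saving the power of a full snoop lookup
            return bool(self.bits & (1 << (snooped_addr // self.seg_size)))
    ```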


    Representing loop branches in a branch history register with multiple bits
    14.
    Invention Grant
    Representing loop branches in a branch history register with multiple bits (In Force)

    Publication No.: US08904155B2

    Publication Date: 2014-12-02

    Application No.: US11378712

    Filing Date: 2006-03-17

    CPC Class: G06F9/3848

    Abstract: In response to a property of a conditional branch instruction associated with a loop, such as a property indicating that the branch is a loop-ending branch, a count of the number of iterations of the loop is maintained, and a multi-bit value indicative of the loop iteration count is stored in a Branch History Register (BHR). In one embodiment, the multi-bit value may comprise the actual loop count, in which case the number of bits is variable. In another embodiment, the number of bits is fixed (e.g., two) and loop iteration counts are mapped to one of a fixed number of multi-bit values (e.g., four) by comparison to thresholds. Separate iteration counts may be maintained for nested loops, and a multi-bit value stored in the BHR may indicate a loop iteration count of only an inner loop, only the outer loop, or both.
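    The fixed-width embodiment above (two bits, four codes, threshold comparison) can be sketched directly. The threshold values, BHR width, and function names below are invented for illustration; the patent only says counts are mapped to a fixed number of multi-bit values by comparison to thresholds.

    ```python
    BHR_WIDTH = 16  # bits in the branch history register (illustrative)

    def encode_loop_count(count: int, thresholds=(4, 16, 64)) -> int:
        """Map an iteration count to one of four 2-bit codes by threshold
        comparison. Threshold values are assumptions, not from the patent."""
        for code, limit in enumerate(thresholds):
            if count < limit:
                return code
        return len(thresholds)  # code 3: very long loop

    def push_loop_branch(bhr: int, count: int) -> int:
        """Shift the 2-bit code for a loop-ending branch into the BHR,
        instead of one bit per taken/not-taken outcome."""
        return ((bhr << 2) | encode_loop_count(count)) & ((1 << BHR_WIDTH) - 1)
    ```

    Recording a whole loop as one 2-bit code keeps a short loop from flooding the BHR with identical taken bits.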


    Predecode repair cache for instructions that cross an instruction cache line
    15.
    Invention Grant
    Predecode repair cache for instructions that cross an instruction cache line (In Force)

    Publication No.: US08898437B2

    Publication Date: 2014-11-25

    Application No.: US11934108

    Filing Date: 2007-11-02

    IPC Class: G06F9/30 G06F9/38

    Abstract: A predecode repair cache is described in a processor capable of fetching and executing variable-length instructions of at least two lengths, which may be mixed in a program. An instruction cache is operable to store, in an instruction cache line, instructions having at least a first length and a second length, the second length being longer than the first. A predecoder is operable to predecode instructions fetched from the instruction cache that have invalid predecode information to form repaired predecode information. A predecode repair cache is operable to store the repaired predecode information associated with instructions of the second length that span across two cache lines in the instruction cache. Methods for filling the predecode repair cache and for executing an instruction that spans across two cache lines are also described.
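    A minimal sketch of the structure: a predicate that detects when an instruction straddles a cache-line boundary, and a small cache of repaired predecode bits keyed by instruction address. The line size, method names, and dictionary-backed storage are illustrative assumptions, not details from the patent.

    ```python
    LINE_SIZE = 32  # bytes per instruction cache line (assumed for the example)

    def spans_two_lines(addr: int, length: int) -> bool:
        """True when an instruction of `length` bytes at `addr` crosses a
        cache-line boundary, so its predecode info may need repair."""
        return (addr % LINE_SIZE) + length > LINE_SIZE

    class PredecodeRepairCache:
        """Holds repaired predecode bits for line-spanning instructions,
        keyed by instruction address. (Illustrative model only.)"""

        def __init__(self):
            self.entries = {}

        def fill(self, addr: int, predecode_bits: int) -> None:
            self.entries[addr] = predecode_bits

        def lookup(self, addr: int):
            # A hit avoids re-running the predecoder after a refetch
            return self.entries.get(addr)
    ```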


    Method for filtering traffic to a physically-tagged data cache
    16.
    Invention Grant
    Method for filtering traffic to a physically-tagged data cache (In Force)

    Publication No.: US08612690B2

    Publication Date: 2013-12-17

    Application No.: US13426647

    Filing Date: 2012-03-22

    IPC Class: G06F12/10 G06F7/04

    Abstract: Embodiments of a data cache are disclosed that substantially decrease the number of accesses to a physically-tagged tag array of the data cache. In general, the data cache includes a data array that stores data elements, a physically-tagged tag array, and a virtually-tagged tag array. In one embodiment, the virtually-tagged tag array receives a virtual address. If there is a match for the virtual address in the virtually-tagged tag array, the virtually-tagged tag array outputs, to the data array, a way stored in the virtually-tagged tag array for the virtual address. In addition, in one embodiment, the virtually-tagged tag array disables the physically-tagged tag array. Using the way output by the virtually-tagged tag array, a desired data element in the data array is addressed.
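    The filtering idea can be modeled as a small front end: a virtually-tagged map remembers the way for recent virtual tags, and a hit there supplies the way without activating the physically-tagged array. The class, the way count, and the stand-in physical lookup are invented for illustration.

    ```python
    class FilteredDataCache:
        """Virtually-tagged filter in front of a physically-tagged tag
        array: a virtual-tag hit supplies the way directly and leaves
        the physical array disabled. (Illustrative sketch only.)"""
        NUM_WAYS = 4  # assumed associativity

        def __init__(self):
            self.vtag_to_way = {}
            self.physical_lookups = 0  # how often the tag array fired

        def lookup(self, vtag: int) -> int:
            way = self.vtag_to_way.get(vtag)
            if way is not None:
                return way  # physically-tagged array stays disabled
            # Filter miss: fall back to the physical tag array, then
            # remember the result for future accesses to this vtag
            self.physical_lookups += 1
            way = self._physical_tag_lookup(vtag)
            self.vtag_to_way[vtag] = way
            return way

        def _physical_tag_lookup(self, vtag: int) -> int:
            return vtag % self.NUM_WAYS  # stand-in for the real tag compare
    ```

    Repeated accesses to the same virtual tag then cost only one physical tag-array activation.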


    Auto-Ordering of Strongly Ordered, Device, and Exclusive Transactions Across Multiple Memory Regions
    18.
    Invention Application
    Auto-Ordering of Strongly Ordered, Device, and Exclusive Transactions Across Multiple Memory Regions (In Force)

    Publication No.: US20130151799A1

    Publication Date: 2013-06-13

    Application No.: US13315370

    Filing Date: 2011-12-09

    IPC Class: G06F12/00

    CPC Class: G06F13/1621

    Abstract: Efficient techniques are described for controlling ordered accesses in a weakly ordered storage system. A stream of memory requests is split into two or more streams of memory requests and a memory access counter is incremented for each memory request. A memory request requiring ordered memory accesses is identified in one of the two or more streams of memory requests. The memory request requiring ordered memory accesses is stalled upon determining a previous memory request from a different stream of memory requests is pending. The memory access counter is decremented for each memory request guaranteed to complete. A count value in the memory access counter that is different from an initialized state of the memory access counter indicates there are pending memory requests. The memory request requiring ordered memory accesses is processed upon determining there are no further pending memory requests.
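    The counter protocol above reduces to three operations: increment on issue, decrement on guaranteed completion, and compare against the initialized state before releasing an ordered request. A minimal sketch (class and method names are invented; a real implementation would keep one counter per stream):

    ```python
    class OrderedAccessCounter:
        """Memory access counter for one stream: increment on issue,
        decrement when a request is guaranteed to complete. An ordered
        request in another stream may proceed only when the counter is
        back at its initialized state. (Illustrative model only.)"""

        def __init__(self):
            self.pending = 0  # initialized state

        def issue(self) -> None:
            self.pending += 1

        def guaranteed_complete(self) -> None:
            self.pending -= 1

        def ordered_request_may_proceed(self) -> bool:
            # Any nonzero count means earlier requests are still pending,
            # so the ordered request must stall
            return self.pending == 0
    ```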


    Method and a system for accelerating procedure return sequences
    19.
    Invention Grant
    Method and a system for accelerating procedure return sequences (In Force)

    Publication No.: US08341383B2

    Publication Date: 2012-12-25

    Application No.: US11934264

    Filing Date: 2007-11-02

    IPC Class: G06F9/44

    Abstract: A method for retrieving a return address from a link stack when returning from a procedure in a pipeline processor is disclosed. The method identifies a retrieve instruction operable to retrieve a return address from a software stack. The method further identifies a branch instruction operable to branch to the return address. In response to both the retrieve instruction and the branch instruction being identified, the method retrieves the return address from the link stack and fetches instructions using the return address.
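    The hardware link stack that backs this method behaves like a small bounded stack: calls push the return address, and a recognized retrieve-plus-branch return sequence pops it so fetch can proceed without waiting on the software stack in memory. A rough sketch; the depth and method names are assumptions for the example.

    ```python
    class LinkStack:
        """Bounded hardware link stack. Calls push the return address;
        a recognized return sequence pops it for early instruction
        fetch. (Illustrative model, not the patent's implementation.)"""

        def __init__(self, depth: int = 8):
            self.depth = depth
            self._stack = []

        def on_call(self, return_addr: int) -> None:
            if len(self._stack) == self.depth:
                self._stack.pop(0)  # oldest entry is silently dropped
            self._stack.append(return_addr)

        def on_return(self):
            # Supplies the fetch address immediately; None models an
            # empty link stack, where the pipeline must wait for the
            # software-stack load instead
            return self._stack.pop() if self._stack else None
    ```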


    Apparatus and Methods to Reduce Castouts in a Multi-Level Cache Hierarchy
    20.
    Invention Application
    Apparatus and Methods to Reduce Castouts in a Multi-Level Cache Hierarchy (In Force)

    Publication No.: US20120059995A1

    Publication Date: 2012-03-08

    Application No.: US13292651

    Filing Date: 2011-11-09

    IPC Class: G06F12/08

    Abstract: Techniques and methods are used to reduce allocations to a higher level cache of cache lines displaced from a lower level cache. The allocations of the displaced cache lines are prevented for displaced cache lines that are determined to be redundant in the next level cache, whereby castouts are reduced. To such ends, a line is selected to be displaced in a lower level cache. Information associated with the selected line is identified which indicates that the selected line is present in a higher level cache. An allocation of the selected line in the higher level cache is prevented based on the identified information. Preventing an allocation of the selected line saves power that would be associated with the allocation.
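    The decision the abstract describes is a single predicate: allocate the castout only when the displaced line is not already present in the higher-level cache. A minimal sketch (the function name and the set-of-tags representation of the higher-level cache's presence information are invented for illustration):

    ```python
    def allocate_castout(line_tag: int, higher_level_tags: set) -> bool:
        """Decide whether a line displaced from the lower-level cache
        should be allocated in the higher-level cache. Skipping the
        allocation when the line is already present there avoids a
        redundant castout and the power it would cost.
        (Illustrative sketch only.)"""
        return line_tag not in higher_level_tags
    ```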
