Method and apparatus for cutting senior store latency using store prefetching
    4.
    Granted invention patent (in force)

    Publication No.: US09405545B2

    Publication date: 2016-08-02

    Application No.: US13993508

    Filing date: 2011-12-30

    IPC classification: G06F12/08 G06F9/38

    Abstract: In accordance with embodiments disclosed herein, there are provided methods, systems, mechanisms, techniques, and apparatuses for cutting senior store latency using store prefetching. For example, in one embodiment, such means may include an integrated circuit or an out of order processor means that processes out of order instructions and enforces in-order requirements for a cache. Such an integrated circuit or out of order processor means further includes means for receiving a store instruction; means for performing address generation and translation for the store instruction to calculate a physical address of the memory to be accessed by the store instruction; and means for executing a pre-fetch for a cache line based on the store instruction and the calculated physical address before the store instruction retires.

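
The abstract above describes issuing a prefetch for a store's cache line once its physical address is known, before the store retires. The following is a minimal sketch of why that can shorten the wait of a retired ("senior") store, not the patented implementation. The latencies, the identity `translate` function, and the names `Cache` and `run_store` are all assumptions made for illustration.

```python
from dataclasses import dataclass, field

LINE_SIZE = 64       # bytes per cache line (assumed)
MISS_LATENCY = 100   # cycles to bring a line into the cache (assumed)
HIT_LATENCY = 1      # cycles to write a line that is already present (assumed)


@dataclass
class Cache:
    """Toy cache: maps a line number to the cycle at which the line is ready."""
    ready_at: dict = field(default_factory=dict)

    def prefetch(self, phys_addr: int, now: int) -> None:
        line = phys_addr // LINE_SIZE
        # Start a fill only if the line is not already present or in flight.
        self.ready_at.setdefault(line, now + MISS_LATENCY)

    def write(self, phys_addr: int, now: int) -> int:
        """Commit a senior store and return the cycle at which it completes."""
        line = phys_addr // LINE_SIZE
        if line not in self.ready_at:
            self.ready_at[line] = now + MISS_LATENCY  # demand miss at retirement
        return max(now, self.ready_at[line]) + HIT_LATENCY


def translate(virt_addr: int) -> int:
    """Stand-in for address generation and translation (identity mapping)."""
    return virt_addr


def run_store(virt_addr: int, exec_cycle: int, retire_cycle: int,
              use_prefetch: bool) -> int:
    """Return the cycles the store waits after retirement before it completes."""
    cache = Cache()
    phys = translate(virt_addr)            # physical address known at execute
    if use_prefetch:
        cache.prefetch(phys, exec_cycle)   # issued before the store retires
    done = cache.write(phys, retire_cycle) # in-order write after retirement
    return done - retire_cycle


without_prefetch = run_store(0x1000, exec_cycle=10, retire_cycle=60, use_prefetch=False)
with_prefetch = run_store(0x1000, exec_cycle=10, retire_cycle=60, use_prefetch=True)
```

In this toy model the prefetch overlaps the fill with the 50 cycles between execution and retirement, so the store waits 51 cycles after retirement instead of 101.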

    Method and system to reduce the power consumption of a memory device
    6.
    Granted invention patent (in force)

    Publication No.: US08352683B2

    Publication date: 2013-01-08

    Application No.: US12823047

    Filing date: 2010-06-24

    Abstract: A method and system to reduce the power consumption of a memory device. In one embodiment of the invention, the memory device is an N-way set-associative level one (L1) cache memory and there is logic coupled with the data cache memory to facilitate access to only part of the N ways of the N-way set-associative L1 cache memory in response to a load instruction or a store instruction. By reducing the number of ways accessed in the N-way set-associative L1 cache memory for each load or store request, the power requirements of the N-way set-associative L1 cache memory are reduced in one embodiment of the invention. In one embodiment of the invention, when a prediction is made that the accesses to cache memory only require the data arrays of the N-way set-associative L1 cache memory, access to the fill buffers is deactivated or disabled.

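
The abstract above describes probing only part of the N ways on each load or store. The following is a minimal sketch of that idea, counting activated ways as a rough proxy for access energy. The abstract does not say how the subset is chosen, so the caller-supplied `candidate_ways` list is a hypothetical stand-in for that logic, and `N_WAYS = 8` is an assumed value.

```python
N_WAYS = 8  # associativity (assumed value)


def lookup(cache_set, tag, candidate_ways=None):
    """Probe one set of an N-way set-associative cache.

    cache_set: list of N_WAYS stored tags, one per way.
    candidate_ways: ways to activate; None means activate all of them.
    Returns (hit, ways_activated).
    """
    ways = range(N_WAYS) if candidate_ways is None else candidate_ways
    activated = 0
    hit = False
    for way in ways:
        activated += 1             # every probed way costs an array read
        if cache_set[way] == tag:
            hit = True
    return hit, activated


cache_set = [0xA0 + way for way in range(N_WAYS)]  # tag 0xA2 lives in way 2

full = lookup(cache_set, 0xA2)                            # probes all ways
partial = lookup(cache_set, 0xA2, candidate_ways=[2, 3])  # probes two ways
```

Both lookups hit, but the restricted one activates 2 ways instead of 8. A real design would also need a fallback for the case where the line sits outside the selected ways; the sketch leaves that out.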

    METHOD AND APPARATUS FOR CUTTING SENIOR STORE LATENCY USING STORE PREFETCHING
    7.
    Invention application (in force)

    Publication No.: US20140223105A1

    Publication date: 2014-08-07

    Application No.: US13993508

    Filing date: 2011-12-30

    IPC classification: G06F9/38 G06F12/08

    Abstract: In accordance with embodiments disclosed herein, there are provided methods, systems, mechanisms, techniques, and apparatuses for cutting senior store latency using store prefetching. For example, in one embodiment, such means may include an integrated circuit or an out of order processor means that processes out of order instructions and enforces in-order requirements for a cache. Such an integrated circuit or out of order processor means further includes means for receiving a store instruction; means for performing address generation and translation for the store instruction to calculate a physical address of the memory to be accessed by the store instruction; and means for executing a pre-fetch for a cache line based on the store instruction and the calculated physical address before the store instruction retires.


    EXTENDING CACHE COHERENCY PROTOCOLS TO SUPPORT LOCALLY BUFFERED DATA
    9.
    Invention application (in force)

    Publication No.: US20100169581A1

    Publication date: 2010-07-01

    Application No.: US12346543

    Filing date: 2008-12-30

    IPC classification: G06F12/08 G06F12/00

    Abstract: A method and apparatus for extending cache coherency to hold buffered data to support transactional execution is herein described. A transactional store operation referencing an address associated with a data item is performed in a buffered manner. Here, the coherency state associated with the cache lines that hold the data item is transitioned to a buffered state. In response to local requests for the buffered data item, the data item is provided to ensure internal transactional sequential ordering. However, in response to external access requests, a miss response is provided to ensure the transactionally updated data item is not made globally visible until commit. Upon commit, the buffered lines are transitioned to a modified state to make the data item globally visible.

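
The state transitions in the abstract above can be sketched as a small state machine for a single cache line. This is an illustrative model only: the class and method names are hypothetical, only three states are modeled, and returning `None` stands in for the miss response.

```python
from enum import Enum


class State(Enum):
    INVALID = "I"
    MODIFIED = "M"
    BUFFERED = "B"   # the extra state for transactionally written data


class Line:
    """One cache line holding a single data item."""

    def __init__(self):
        self.state = State.INVALID
        self.data = None

    def transactional_store(self, value):
        # The store is performed in a buffered manner.
        self.data = value
        self.state = State.BUFFERED

    def local_read(self):
        # Local requests see the buffered value, preserving program order
        # inside the transaction.
        if self.state in (State.BUFFERED, State.MODIFIED):
            return self.data
        return None

    def external_read(self):
        # External requests get a miss while the line is buffered, so the
        # update is not globally visible before commit.
        if self.state is State.MODIFIED:
            return self.data
        return None

    def commit(self):
        # On commit, buffered lines become modified and globally visible.
        if self.state is State.BUFFERED:
            self.state = State.MODIFIED


line = Line()
line.transactional_store(42)
local_before = line.local_read()        # buffered value is visible locally
external_before = line.external_read()  # miss response before commit
line.commit()
external_after = line.external_read()   # visible after commit
```

Before commit, the local read returns the stored value while the external read returns the miss placeholder; after commit, the external read returns the value as well.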