EVICT ON WRITE, A MANAGEMENT STRATEGY FOR A PREFETCH UNIT AND/OR FIRST LEVEL CACHE IN A MULTIPROCESSOR SYSTEM WITH SPECULATIVE EXECUTION
    81.
    Invention Application
    Status: Pending (Published)

    Publication Number: US20150006821A1

    Publication Date: 2015-01-01

    Application Number: US14486413

    Filing Date: 2014-09-15

    IPC Classification: G06F12/08

    Abstract: In a multiprocessor system with at least two levels of cache, a speculative thread may run on a core processor in parallel with other threads. When the thread seeks to write to main memory, the access is written through the first level cache to the second level cache. After the write-through, the corresponding line is deleted from the first level cache and/or prefetch unit, so that any further accesses to the same location in main memory must be retrieved from the second level cache. The second level cache keeps track of multiple versions of data where more than one speculative thread is running in parallel, while the first level cache holds none of the versions during speculation. A switch allows choosing between modes of operation of a speculation-blind first level cache.

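    A minimal C++ sketch of the evict-on-write flow described above, under simplifying assumptions: the `L1Cache`/`L2Cache` classes and the per-thread version map are illustrative stand-ins, not the patented hardware. Every write goes through to the L2, which tracks speculative versions, and the L1 drops its own copy so later reads are resolved by the L2.

```cpp
// Sketch only: a speculation-blind L1 that writes through to a versioned L2
// and evicts its own copy on every write, so later reads go to the L2.
#include <cstdint>
#include <iostream>
#include <unordered_map>

struct L2Cache {
    // L2 keeps per-thread ("versioned") speculative values; key = addr, then thread id.
    std::unordered_map<uint64_t, std::unordered_map<int, uint32_t>> versions;
    void write(uint64_t addr, int tid, uint32_t val) { versions[addr][tid] = val; }
    uint32_t read(uint64_t addr, int tid) { return versions[addr][tid]; }
};

struct L1Cache {
    std::unordered_map<uint64_t, uint32_t> lines;  // holds only non-speculative data
    L2Cache* l2;

    void write(uint64_t addr, int tid, uint32_t val) {
        l2->write(addr, tid, val);  // write through to L2, which tracks versions
        lines.erase(addr);          // evict on write: drop the L1 copy
    }
    uint32_t read(uint64_t addr, int tid) {
        auto it = lines.find(addr);
        if (it != lines.end()) return it->second;  // L1 hit (non-speculative data)
        return l2->read(addr, tid);                // miss: L2 resolves the version
    }
};

int main() {
    L2Cache l2;
    L1Cache l1{{}, &l2};
    l1.write(0x100, /*tid=*/1, 42);          // write-through + evict from L1
    std::cout << l1.read(0x100, 1) << "\n";  // served by the versioned L2: 42
}
```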

COHERENCE PROCESSING WITH PRE-KILL MECHANISM TO AVOID DUPLICATED TRANSACTION IDENTIFIERS
    82.
    Invention Application
    Status: In Force

    Publication Number: US20140310469A1

    Publication Date: 2014-10-16

    Application Number: US13860885

    Filing Date: 2013-04-11

    Applicant: APPLE INC.

    IPC Classification: G06F12/08

    Abstract: An apparatus for processing coherency transactions in a computing system is disclosed. The apparatus may include a request queue circuit, a duplicate tag circuit, and a memory interface unit. The request queue circuit may be configured to generate a speculative read request dependent upon a received read transaction. The duplicate tag circuit may be configured to store copies of tags from one or more cache memories, and to generate a kill message in response to a determination that data requested in the received read transaction is stored in a cache memory. The memory interface unit may be configured to store the generated speculative read request dependent upon a stall condition. The stored speculative read request may be sent to a memory controller dependent upon the stall condition. The memory interface unit may be further configured to delete the speculative read request in response to the kill message.

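    A rough illustration of the pre-kill flow follows; the `DuplicateTags` and `MemoryInterfaceUnit` names and the single-slot queue are assumptions made for the sketch, not Apple's design. A speculative read is held while the memory path is stalled and is deleted when a duplicate-tag hit produces a kill, so no duplicate transaction reaches the memory controller.

```cpp
// Sketch only: pre-kill of a held speculative DRAM read on a duplicate-tag hit.
#include <cstdint>
#include <iostream>
#include <optional>
#include <unordered_set>

struct DuplicateTags {
    std::unordered_set<uint64_t> cached_lines;  // copies of tags from the caches
    bool hit(uint64_t addr) const { return cached_lines.count(addr) != 0; }
};

struct MemoryInterfaceUnit {
    std::optional<uint64_t> held_speculative;  // held while the stall condition is set
    bool stalled = true;

    void hold(uint64_t addr) { if (stalled) held_speculative = addr; }
    void kill(uint64_t addr) {                 // pre-kill: drop the request before issue
        if (held_speculative == addr) held_speculative.reset();
    }
    void unstall() {
        stalled = false;
        if (held_speculative)
            std::cout << "issue read 0x" << std::hex << *held_speculative << "\n";
        else
            std::cout << "speculative read was killed; nothing sent to memory controller\n";
    }
};

int main() {
    DuplicateTags tags{{0x2000}};        // line 0x2000 is already cached somewhere
    MemoryInterfaceUnit miu;
    uint64_t addr = 0x2000;
    miu.hold(addr);                      // speculative read generated and held (stall)
    if (tags.hit(addr)) miu.kill(addr);  // duplicate-tag hit -> kill message
    miu.unstall();                       // stall clears: the killed request never issues
}
```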

METHODS AND SYSTEMS FOR REDUCING THE AMOUNT OF TIME AND COMPUTING RESOURCES THAT ARE REQUIRED TO PERFORM A HARDWARE TABLE WALK (HWTW)
    84.
    Invention Application
    Status: In Force

    Publication Number: US20140258586A1

    Publication Date: 2014-09-11

    Application Number: US13785877

    Filing Date: 2013-03-05

    IPC Classification: G06F12/10

    Abstract: A computer system and a method are provided that reduce the amount of time and computing resources required to perform a hardware table walk (HWTW) when a translation lookaside buffer (TLB) miss occurs. If a TLB miss occurs while performing a stage 2 (S2) HWTW to find the physical address (PA) at which a stage 1 (S1) page table is stored, the memory management unit (MMU) uses the intermediate physical address (IPA) to predict the corresponding PA, thereby avoiding the need to perform any of the S2 table lookups. This greatly reduces the number of lookups required for these HWTW read transactions, which greatly reduces the processing overhead and performance penalties associated with them.

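    A hedged sketch of the prediction step follows. The fixed IPA-to-PA offset used by `predict_pa_from_ipa` is purely an assumed predictor, chosen only to show how the S2 lookups for an S1 page table can be skipped; the application's actual prediction scheme may differ.

```cpp
// Sketch only: guess the PA of an S1 page table directly from its IPA,
// skipping the nested S2 walk; a full walk is only needed on a misprediction.
#include <cstdint>
#include <iostream>

// Hypothetical predictor: assume the hypervisor places S1 page tables at a
// fixed IPA-to-PA offset, so the PA can be computed without S2 table lookups.
uint64_t predict_pa_from_ipa(uint64_t ipa, uint64_t fixed_offset) {
    return ipa + fixed_offset;  // prediction is verified later against the real mapping
}

int main() {
    uint64_t ipa_of_s1_table = 0x40012340;
    uint64_t predicted_pa = predict_pa_from_ipa(ipa_of_s1_table, 0x80000000);
    // The walker reads the S1 descriptor at predicted_pa directly; on a
    // misprediction it falls back to the normal S2 HWTW.
    std::cout << std::hex << "predicted PA = 0x" << predicted_pa << "\n";
}
```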

Snoop filter for filtering snoop requests
    85.
    Granted Patent
    Status: In Force

    Publication Number: US08677073B2

    Publication Date: 2014-03-18

    Application Number: US13587420

    Filing Date: 2012-08-16

    IPC Classification: G06F13/28 G06F12/00

    Abstract: A method and apparatus for supporting cache coherency in a multiprocessor computing environment having multiple processing units, each processing unit having one or more local cache memories associated and operatively connected therewith. The method comprises providing a snoop filter device associated with each processing unit, each snoop filter device having a plurality of dedicated input ports for receiving snoop requests from dedicated memory writing sources in the multiprocessor computing environment. Each snoop filter device includes a plurality of parallel operating port snoop filters in correspondence with the plurality of dedicated input ports, each port snoop filter implementing one or more parallel operating sub-filter elements that are adapted to concurrently filter snoop requests received from respective dedicated memory writing sources and forward a subset of those requests to its associated processing unit.

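    To make the port/sub-filter structure concrete, here is a small sketch; the address-range and low-bits sub-filters are invented placeholders, not the patented filter algorithms. Each port snoop filter forwards a snoop to its processing unit only if some sub-filter indicates the line may be cached locally.

```cpp
// Sketch only: one port snoop filter (per dedicated writing source) whose
// sub-filters conceptually run in parallel; non-matching snoops are dropped.
#include <cstdint>
#include <functional>
#include <iostream>
#include <vector>

using SubFilter = std::function<bool(uint64_t addr)>;  // true = line may be cached locally

struct PortSnoopFilter {
    std::vector<SubFilter> sub_filters;   // conceptually operate in parallel
    bool forward(uint64_t addr) const {
        for (const auto& f : sub_filters)
            if (f(addr)) return true;     // any sub-filter match -> forward the snoop
        return false;                     // otherwise filter it out
    }
};

int main() {
    // Toy sub-filters: an address-range check and a coarse low-bits check.
    PortSnoopFilter port{{
        [](uint64_t a) { return a >= 0x1000 && a < 0x2000; },
        [](uint64_t a) { return (a & 0xFF) == 0x40; },
    }};
    for (uint64_t addr : {0x1040ull, 0x9000ull})
        std::cout << std::hex << "0x" << addr << ": "
                  << (port.forward(addr) ? "forward to core" : "filtered") << "\n";
}
```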

Data transfer to memory over an input/output (I/O) interconnect
    86.
    Granted Patent
    Status: In Force

    Publication Number: US08510509B2

    Publication Date: 2013-08-13

    Application Number: US11958418

    Filing Date: 2007-12-18

    IPC Classification: G06F12/00 G06F13/00 G06F13/28

    Abstract: A method, system, and computer program product for data transfer to memory over an input/output (I/O) interconnect are provided. The method includes reading a mailbox stored on an I/O adapter in response to a request to initiate an I/O transaction. The mailbox stores a directive that defines a condition under which cache injection for data values in the I/O transaction will not be performed. The method also includes embedding a hint into the I/O transaction when the directive in the mailbox matches data received in the request, and executing the I/O transaction. The execution of the I/O transaction causes a system chipset or I/O hub for a processor receiving the I/O transaction to directly store the data values from the I/O transaction into system memory and to suppress the cache injection of the data values into a cache memory upon presence of the hint in a header of the I/O transaction.

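    The hint flow can be pictured with the following sketch; the `Directive`, `IoTransaction`, and `no_inject_hint` names are assumptions made for illustration, not the product's actual mailbox or header format. The adapter compares the request against the mailbox directive and, on a match, sets a "no cache injection" hint that the chipset honors by writing the payload straight to system memory.

```cpp
// Sketch only: mailbox-directed suppression of cache injection for a DMA write.
#include <cstdint>
#include <iostream>

struct Directive { uint32_t match_tag; };                      // stored in the adapter mailbox
struct IoTransaction { uint32_t tag; bool no_inject_hint = false; };

IoTransaction build_transaction(uint32_t request_tag, const Directive& mbox) {
    IoTransaction txn{request_tag};
    if (request_tag == mbox.match_tag)   // directive matches data in the request
        txn.no_inject_hint = true;       // embed the hint in the transaction header
    return txn;
}

void chipset_execute(const IoTransaction& txn) {
    if (txn.no_inject_hint)
        std::cout << "DMA payload written to system memory only\n";
    else
        std::cout << "DMA payload injected into cache\n";
}

int main() {
    Directive mbox{0xBEEF};
    chipset_execute(build_transaction(0xBEEF, mbox));  // hint set: injection suppressed
    chipset_execute(build_transaction(0x1234, mbox));  // no hint: normal cache injection
}
```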

Probe speculative address file
    87.
    Granted Patent
    Status: Expired

    Publication Number: US08438335B2

    Publication Date: 2013-05-07

    Application Number: US12892476

    Filing Date: 2010-09-28

    IPC Classification: G06F12/00

    CPC Classification: G06F12/0815 G06F2212/507

    Abstract: An apparatus for resolving cache coherency is presented. In one embodiment, the apparatus includes a microprocessor comprising one or more processing cores. The apparatus also includes a probe speculative address file unit, coupled to a cache memory, comprising a plurality of entries. Each entry includes a timer and a tag associated with a memory line. The apparatus further includes control logic to determine whether to service an incoming probe based at least in part on a timer value.

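    A compact sketch of the timer-gated decision is shown below; the countdown-per-entry policy is an assumed simplification of the control logic described in the abstract. A probe that hits a tracked tag is deferred until the entry's timer runs out, after which probes are serviced normally.

```cpp
// Sketch only: a probe speculative address file keyed by tag with per-entry timers.
#include <cstdint>
#include <iostream>
#include <unordered_map>

struct ProbeSpeculativeAddressFile {
    std::unordered_map<uint64_t, int> entries;   // tag -> remaining timer ticks

    void track(uint64_t tag, int ticks) { entries[tag] = ticks; }
    void tick() {                                // advance all timers by one cycle
        for (auto it = entries.begin(); it != entries.end(); ) {
            if (--it->second <= 0) it = entries.erase(it);
            else ++it;
        }
    }
    bool service_probe(uint64_t tag) const {     // false = defer the probe for now
        return entries.find(tag) == entries.end();
    }
};

int main() {
    ProbeSpeculativeAddressFile psaf;
    psaf.track(0x40, /*ticks=*/2);
    std::cout << psaf.service_probe(0x40) << "\n";  // 0: deferred, timer still running
    psaf.tick(); psaf.tick();
    std::cout << psaf.service_probe(0x40) << "\n";  // 1: timer expired, probe serviced
}
```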

Apparatus and method for handling data in a cache
    88.
    Granted Patent
    Status: In Force

    Publication Number: US08375170B2

    Publication Date: 2013-02-12

    Application Number: US12656709

    Filing Date: 2010-02-12

    IPC Classification: G06F12/00 G06F13/00 G06F13/28

    Abstract: A data processing apparatus for forming a portion of a coherent cache system comprises at least one master device for performing data processing operations, and a cache coupled to the at least one master device and arranged to store data values for access by that at least one master device when performing the data processing operations. Cache coherency circuitry is responsive to a coherency request from another portion of the coherent cache system to cause a coherency action to be taken in respect of at least one data value stored in the cache. Responsive to an indication that the coherency action has resulted in invalidation of that at least one data value in the cache, refetch control circuitry is used to initiate a refetch of that at least one data value into the cache. Such a mechanism causes the refetch of data into the cache to be triggered by the coherency action performed in response to a coherency request from another portion of the coherent cache system, rather than relying on any actions taken by the at least one master device, thereby providing a flexible and efficient mechanism for reducing cache latency in a coherent cache system.

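    The invalidate-then-refetch behaviour might look like the following sketch, where `fetch_from_system` stands in for whatever interconnect read the refetch control circuitry would issue (an assumption made purely for illustration). The refetch is triggered by the coherency action itself, not by the master device.

```cpp
// Sketch only: a coherency-triggered invalidation immediately followed by a refetch.
#include <cstdint>
#include <functional>
#include <iostream>
#include <unordered_map>

struct LocalCache {
    std::unordered_map<uint64_t, uint32_t> lines;
    std::function<uint32_t(uint64_t)> fetch_from_system;  // e.g. an interconnect read

    // Coherency action requested by another portion of the coherent system.
    void handle_invalidate(uint64_t addr) {
        lines.erase(addr);                      // coherency action: invalidate the line
        lines[addr] = fetch_from_system(addr);  // refetch control: bring it straight back
    }
    bool hit(uint64_t addr) const { return lines.count(addr) != 0; }
};

int main() {
    LocalCache cache{{{0x80, 7u}}, [](uint64_t) { return 9u; }};
    cache.handle_invalidate(0x80);              // invalidated, then refetched with new data
    std::cout << "hit after coherency action: " << cache.hit(0x80) << "\n";  // 1
}
```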

Adaptive mechanisms and methods for supplying volatile data copies in multiprocessor systems
    89.
    Granted Patent
    Status: In Force

    Publication Number: US08131938B2

    Publication Date: 2012-03-06

    Application Number: US12248209

    Filing Date: 2008-10-09

    IPC Classification: G06F12/00

    Abstract: In a computer system with a memory hierarchy, when a high-level cache supplies a data copy to a low-level cache, the shared copy can be either volatile or non-volatile. When the data copy is later replaced from the low-level cache, if the data copy is non-volatile, it needs to be written back to the high-level cache; otherwise it can simply be flushed from the low-level cache. The high-level cache can employ a volatile-prediction mechanism that adaptively determines whether a volatile copy or a non-volatile copy should be supplied when the high-level cache needs to send data to the low-level cache. An exemplary volatile-prediction mechanism suggests use of a non-volatile copy if the cache line has been accessed consecutively by the low-level cache. Further, the low-level cache can employ a volatile-promotion mechanism that adaptively changes a data copy from volatile to non-volatile according to some promotion policy, or changes a data copy from non-volatile to volatile according to some demotion policy.

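    One way to picture the volatile-prediction policy is the sketch below; the "two consecutive supplies" threshold is an invented stand-in for whatever predictor the patent actually uses. A volatile copy can be dropped silently on replacement, while a non-volatile copy must be written back to the high-level cache.

```cpp
// Sketch only: the high-level cache predicts whether to hand out a volatile
// or non-volatile copy, and the replacement path treats the two differently.
#include <cstdint>
#include <iostream>
#include <unordered_map>

enum class CopyKind { Volatile, NonVolatile };

struct HighLevelCache {
    std::unordered_map<uint64_t, uint64_t> supply_streak;  // line -> consecutive supplies

    CopyKind supply(uint64_t line) {
        uint64_t streak = ++supply_streak[line];
        // Consecutive accesses by the low-level cache suggest reuse: hand out a
        // non-volatile copy; otherwise a volatile (drop-on-evict) copy suffices.
        return streak >= 2 ? CopyKind::NonVolatile : CopyKind::Volatile;
    }
};

void on_replacement(CopyKind kind) {
    if (kind == CopyKind::NonVolatile)
        std::cout << "write line back to high-level cache\n";
    else
        std::cout << "flush line silently\n";
}

int main() {
    HighLevelCache l2;
    on_replacement(l2.supply(0x100));  // first access -> volatile -> silent flush
    on_replacement(l2.supply(0x100));  // consecutive access -> non-volatile -> write back
}
```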

Updating partial cache lines in a data processing system
    90.
    Granted Patent
    Status: In Force

    Publication Number: US08117390B2

    Publication Date: 2012-02-14

    Application Number: US12424434

    Filing Date: 2009-04-15

    IPC Classification: G06F13/00

    Abstract: A processing unit for a data processing system includes a processor core having one or more execution units for processing instructions and a register file for storing data accessed in processing of the instructions. The processing unit also includes a multi-level cache hierarchy coupled to and supporting the processor core. The multi-level cache hierarchy includes at least one upper level of cache memory having a lower access latency and at least one lower level of cache memory having a higher access latency. The lower level of cache memory, responsive to receipt of a memory access request that hits only a partial cache line in the lower level cache memory, sources the partial cache line to the at least one upper level cache memory to service the memory access request. The at least one upper level cache memory services the memory access request without caching the partial cache line.

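    The partial-line sourcing path can be sketched as follows; the 64-byte line size and the `PartialLine` record are illustrative assumptions. The lower level holds only a fragment of a line, sources it upward on a hit, and the upper level services the request without installing the incomplete line.

```cpp
// Sketch only: servicing a request from a partial cache line sourced by the
// lower-level cache, without caching that partial line in the upper level.
#include <cstdint>
#include <iostream>
#include <map>
#include <optional>

struct PartialLine { uint64_t base; uint32_t valid_bytes; };   // only a fragment is valid

struct LowerCache {
    std::map<uint64_t, PartialLine> partials;                  // line base -> fragment held
    std::optional<PartialLine> lookup(uint64_t addr) const {
        auto it = partials.find(addr & ~0x3Full);              // 64-byte lines assumed
        if (it == partials.end()) return std::nullopt;
        return it->second;
    }
};

struct UpperCache {
    // Services the request with the sourced data but does NOT cache the partial line.
    void service(uint64_t addr, const PartialLine& frag) {
        std::cout << "serve 0x" << std::hex << addr << " from partial line @0x"
                  << frag.base << " (" << std::dec << frag.valid_bytes
                  << " valid bytes), not installed in upper cache\n";
    }
};

int main() {
    LowerCache l3{{{0x1000, {0x1000, 32}}}};   // fragment of line 0x1000 held below
    UpperCache l2;
    if (auto frag = l3.lookup(0x1010)) l2.service(0x1010, *frag);
}
```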