PROCESSOR PERFORMANCE IMPROVEMENT FOR INSTRUCTION SEQUENCES THAT INCLUDE BARRIER INSTRUCTIONS
    Invention Application (Pending, Published)

    Publication No.: US20130205121A1

    Publication Date: 2013-08-08

    Application No.: US13687306

    Application Date: 2012-11-28

    Abstract: A technique for processing an instruction sequence that includes a barrier instruction, a load instruction preceding the barrier instruction, and a subsequent memory access instruction following the barrier instruction includes determining, by a processor core, that the load instruction is resolved based upon receipt by the processor core of the earliest of a good combined response for a read operation corresponding to the load instruction and the data for the load instruction. The technique also includes, if execution of the subsequent memory access instruction is not initiated prior to completion of the barrier instruction, initiating, by the processor core in response to determining that the barrier instruction has completed, execution of the subsequent memory access instruction. The technique further includes, if execution of the subsequent memory access instruction is initiated prior to completion of the barrier instruction, discontinuing, by the processor core in response to determining that the barrier instruction has completed, tracking of the subsequent memory access instruction with respect to invalidation.

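    As an illustration only, the following minimal C++ sketch models the abstract's core idea: a load counts as resolved at the earlier of a good combined coherence response or data arrival, and the barrier can complete once the prior load is resolved. The types and names are hypothetical, not the patented hardware.

        // Illustrative software model, not the patented implementation.
        #include <cstdio>

        struct LoadTracker {
            bool got_good_cresp = false;  // good combined response received
            bool got_data       = false;  // load data received
            // Per the abstract: resolved at the earliest of the two events.
            bool resolved() const { return got_good_cresp || got_data; }
        };

        struct Barrier {
            const LoadTracker* prior_load;  // load preceding the barrier
            bool complete() const { return prior_load->resolved(); }
        };

        int main() {
            LoadTracker ld;
            Barrier sync{&ld};

            ld.got_good_cresp = true;  // coherence response arrives before data
            if (sync.complete()) {
                // The subsequent memory access may now be initiated (or, if it
                // was started early, invalidation tracking can stop).
                std::puts("barrier complete: initiate subsequent access");
            }
        }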

    SPECULATIVE DELIVERY OF DATA FROM A LOWER LEVEL OF A MEMORY HIERARCHY IN A DATA PROCESSING SYSTEM

    Publication No.: US20230042778A1

    Publication Date: 2023-02-09

    Application No.: US17394136

    Application Date: 2021-08-04

    Abstract: A multiprocessor data processing system includes multiple vertical cache hierarchies supporting a plurality of processor cores, a system memory, and an interconnect fabric coupled to the system memory and the multiple vertical cache hierarchies. Based on a request of a requesting processor core among the plurality of processor cores, a master in the multiprocessor data processing system issues, via the interconnect fabric, a read-type memory access request. The master receives via the interconnect fabric at least one beat of conditional data issued speculatively on the interconnect fabric by a controller of the system memory prior to receipt by the controller of a systemwide coherence response for the read-type memory access request. The master forwards the at least one beat of conditional data to the requesting processor core.
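    A rough C++ sketch of the idea, assuming invented names for the master, beats, and response codes (the real fabric protocol is hardware): conditional data beats are forwarded to the core before the systemwide combined response arrives, then confirmed or squashed when it does.

        // Illustrative model of speculative conditional-data delivery.
        #include <cstdint>
        #include <cstdio>
        #include <vector>

        enum class Cresp { Pending, Good, Retry };

        struct Master {
            std::vector<uint64_t> beats;   // conditional beats forwarded so far
            Cresp cresp = Cresp::Pending;

            void receive_conditional_beat(uint64_t beat) {
                beats.push_back(beat);     // forward speculatively to the core
                std::printf("forwarded conditional beat %llu\n",
                            (unsigned long long)beat);
            }

            void receive_cresp(Cresp r) {
                cresp = r;
                if (r == Cresp::Good) {
                    std::puts("cresp good: conditional data confirmed");
                } else {
                    beats.clear();         // squash; core discards and retries
                    std::puts("cresp retry: conditional data discarded");
                }
            }
        };

        int main() {
            Master m;
            m.receive_conditional_beat(0xDEAD);  // memory controller speculates
            m.receive_conditional_beat(0xBEEF);
            m.receive_cresp(Cresp::Good);        // systemwide response arrives later
        }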

    TRANSLATION ENTRY INVALIDATION IN A MULTITHREADED DATA PROCESSING SYSTEM

    Publication No.: US20200183853A1

    Publication Date: 2020-06-11

    Application No.: US16216705

    Application Date: 2018-12-11

    Abstract: A multiprocessor data processing system includes a processor core having a translation structure for buffering a plurality of translation entries. The processor core receives a sequence of a plurality of translation invalidation requests. In response to receipt of each of the plurality of translation invalidation requests, the processor core determines that each of the plurality of translation invalidation requests indicates that it does not require draining of memory referent instructions for which address translation has been performed by reference to a respective one of a plurality of translation entries to be invalidated. Based on the determination, the processor core invalidates the plurality of translation entries in the translation structure without regard to draining from the processor core of memory access requests for which address translation was performed by reference to the plurality of translation entries.
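    A minimal C++ sketch of the fast path described above, assuming a toy TLB model with invented names: when every invalidation request in the sequence indicates that draining of in-flight memory accesses is not required, the entries are invalidated immediately.

        // Illustrative TLB model, not the patented translation structure.
        #include <algorithm>
        #include <cstdint>
        #include <cstdio>
        #include <unordered_map>
        #include <vector>

        struct InvalRequest {
            uint64_t virt_page;
            bool requires_drain;   // analogous to the abstract's drain indication
        };

        struct Tlb {
            std::unordered_map<uint64_t, uint64_t> entries;  // vpage -> ppage

            void process(const std::vector<InvalRequest>& seq) {
                bool any_drain = std::any_of(seq.begin(), seq.end(),
                    [](const InvalRequest& r) { return r.requires_drain; });
                if (any_drain) {
                    std::puts("drain in-flight memory accesses first (slow path)");
                }
                for (const auto& r : seq) entries.erase(r.virt_page);  // fast path
            }
        };

        int main() {
            Tlb tlb;
            tlb.entries = {{0x1000, 0x8000}, {0x2000, 0x9000}};
            tlb.process({{0x1000, false}, {0x2000, false}});  // no drain needed
            std::printf("entries left: %zu\n", tlb.entries.size());
        }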

    SELECTIVELY PREVENTING PRE-COHERENCE POINT READS IN A CACHE HIERARCHY TO REDUCE BARRIER OVERHEAD

    Publication No.: US20200174931A1

    Publication Date: 2020-06-04

    Application No.: US16209604

    Application Date: 2018-12-04

    Abstract: A data processing system includes a processor core having a shared store-through upper level cache and a store-in lower level cache. The processor core executes a plurality of simultaneous hardware threads of execution including at least a first thread and a second thread, and the shared store-through upper level cache stores a first cache line accessible to both the first thread and the second thread. The processor core executes in the first thread a store instruction that generates a store request specifying a target address of a storage location corresponding to the first cache line. Based on the target address hitting in the shared store-through upper level cache, the first cache line is temporarily marked, in the shared store-through upper level cache, as private to the first thread, such that any memory access request by the second thread targeting the storage location will miss in the shared store-through upper level cache.
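    A minimal C++ sketch of the marking behavior, assuming a toy single-line store-through L1 shared by two hardware threads (purely illustrative, not the patented design): a store hit marks the line private to the storing thread, so a lookup by the sibling thread misses.

        // Illustrative shared store-through L1 model.
        #include <cstdint>
        #include <cstdio>

        struct CacheLine {
            uint64_t tag = 0;
            bool valid = false;
            int  private_to = -1;   // -1: shared by all threads; else thread id
        };

        struct L1 {
            CacheLine line;         // single-line "cache" for illustration

            bool lookup(uint64_t tag, int tid) const {
                if (!line.valid || line.tag != tag) return false;
                // A line marked private to another thread behaves as a miss.
                return line.private_to == -1 || line.private_to == tid;
            }

            void store_hit(uint64_t tag, int tid) {
                if (line.valid && line.tag == tag)
                    line.private_to = tid;  // temporarily private to storing thread
            }
        };

        int main() {
            L1 l1;
            l1.line.tag = 0x42;
            l1.line.valid = true;
            l1.store_hit(0x42, /*tid=*/0);                          // thread 0 stores
            std::printf("thread 0 hit: %d\n", l1.lookup(0x42, 0));  // 1: hits
            std::printf("thread 1 hit: %d\n", l1.lookup(0x42, 1));  // 0: forced miss
        }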

    SYNCHRONIZED ACCESS TO DATA IN SHARED MEMORY BY RESOLVING CONFLICTING ACCESSES BY CO-LOCATED HARDWARE THREADS

    Publication No.: US20200150960A1

    Publication Date: 2020-05-14

    Application No.: US16184522

    Application Date: 2018-11-08

    Abstract: A processing unit for a data processing system includes a cache memory having reservation logic and a processor core coupled to the cache memory. The processor core includes an execution unit that executes instructions in a plurality of concurrent hardware threads of execution including at least first and second hardware threads. The instructions include, within the first hardware thread, a first load-reserve instruction that identifies a target address for which a reservation is requested. The processor core additionally includes a load unit that records the target address of the first load-reserve instruction and that, responsive to detecting, in the second hardware thread, a second load-reserve instruction identifying the target address recorded by the load unit, blocks the second load-reserve instruction from establishing a reservation for the target address in the reservation logic.
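    A minimal C++ sketch of the blocking behavior, assuming a toy load unit shared by co-located hardware threads (names and structure are invented for illustration): the first thread's load-reserve records its target address, and a sibling thread's load-reserve to the same address is blocked from establishing a reservation until the first completes.

        // Illustrative load-unit model, not the patented reservation logic.
        #include <cstdint>
        #include <cstdio>

        struct LoadUnit {
            uint64_t reserved_addr = 0;
            int      owner_tid     = -1;   // -1: no outstanding reservation

            // Returns true if the reservation is established, false if blocked.
            bool load_reserve(uint64_t addr, int tid) {
                if (owner_tid != -1 && owner_tid != tid && reserved_addr == addr)
                    return false;          // conflicting co-located reservation
                reserved_addr = addr;
                owner_tid = tid;
                return true;
            }

            void release() { owner_tid = -1; }
        };

        int main() {
            LoadUnit lu;
            std::printf("T0 larx: %d\n", lu.load_reserve(0x100, 0));  // 1: granted
            std::printf("T1 larx: %d\n", lu.load_reserve(0x100, 1));  // 0: blocked
            lu.release();                                             // T0 completes
            std::printf("T1 larx: %d\n", lu.load_reserve(0x100, 1));  // 1: granted
        }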

    ENSURING FORWARD PROGRESS FOR NESTED TRANSLATIONS IN A MEMORY MANAGEMENT UNIT

    Publication No.: US20190065398A1

    Publication Date: 2019-02-28

    Application No.: US15683615

    Application Date: 2017-08-22

    Abstract: Ensuring forward progress for nested translations in a memory management unit (MMU) including receiving a plurality of nested translation requests, wherein each of the plurality of nested translation requests requires at least one congruence class lock; detecting, using a congruence class scoreboard, a collision of the plurality of nested translation requests based on the required congruence class locks; quiescing, in response to detecting the collision of the plurality of nested translation requests, a translation pipeline in the MMU including switching operation of the translation pipeline from a multi-thread mode to a single-thread mode and marking a first subset of the plurality of nested translation requests as high-priority nested translation requests; and servicing the high-priority nested translation requests through the translation pipeline in the single-thread mode.
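    A rough C++ sketch of the collision handling, assuming a simplified congruence-class scoreboard with invented names: when two in-flight nested translation requests need the same congruence-class lock, the pipeline switches from multi-thread to single-thread mode and marks the colliding requests high priority to guarantee forward progress.

        // Illustrative scoreboard model, not the patented MMU pipeline.
        #include <cstdio>
        #include <vector>

        struct NestedRequest {
            int congruence_class;   // lock each request requires
            bool high_priority = false;
        };

        struct TranslationPipeline {
            bool single_thread_mode = false;

            void schedule(std::vector<NestedRequest>& reqs) {
                std::vector<bool> locked(64, false);   // scoreboard, 64 classes
                for (auto& r : reqs) {
                    if (locked[r.congruence_class]) {  // collision detected
                        single_thread_mode = true;     // quiesce the pipeline
                        r.high_priority = true;        // service this one first
                    }
                    locked[r.congruence_class] = true;
                }
            }
        };

        int main() {
            TranslationPipeline pipe;
            std::vector<NestedRequest> reqs = {{5}, {12}, {5}};  // two need class 5
            pipe.schedule(reqs);
            std::printf("single-thread mode: %d, req2 high-priority: %d\n",
                        (int)pipe.single_thread_mode, (int)reqs[2].high_priority);
        }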
