TRACKING SPECULATIVE DATA CACHING
    11.
    Invention application

    Publication No.: US20210026641A1

    Publication date: 2021-01-28

    Application No.: US17043963

    Filing date: 2019-03-21

    Applicant: Arm Limited

    Abstract: An apparatus and method of operating a data processing apparatus are disclosed. The apparatus comprises data processing circuitry to perform data processing operations in response to a sequence of instructions, wherein the data processing circuitry is capable of performing speculative execution of at least some of the sequence of instructions. A cache structure comprising entries stores temporary copies of data items which are subjected to the data processing operations. Speculative execution tracking circuitry monitors correctness of the speculative execution and is responsive to an indication of incorrect speculative execution to cause entries in the cache structure allocated by the incorrect speculative execution to be evicted from the cache structure.
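The eviction behaviour described in this abstract can be illustrated with a minimal software sketch. It assumes each cache fill is tagged with a hypothetical speculation "epoch" identifier; the class name, the method names, and the epoch scheme are illustrative assumptions and are not taken from the application.

```python
from dataclasses import dataclass, field


@dataclass
class SpeculativeCache:
    """Toy cache that remembers which speculation epoch allocated each entry."""
    entries: dict = field(default_factory=dict)       # address -> data
    allocated_by: dict = field(default_factory=dict)  # address -> epoch id, or None

    def fill(self, address, data, epoch=None):
        # epoch is None for fills made by non-speculative execution (assumption).
        self.entries[address] = data
        self.allocated_by[address] = epoch

    def resolve(self, epoch, correct):
        """Called once the speculation identified by `epoch` has been checked."""
        tagged = [a for a, e in self.allocated_by.items() if e == epoch]
        for address in tagged:
            if correct:
                # Speculation was right: the entry becomes an ordinary entry.
                self.allocated_by[address] = None
            else:
                # Speculation was wrong: evict what it allocated.
                del self.entries[address]
                del self.allocated_by[address]


cache = SpeculativeCache()
cache.fill(0x100, "a")             # non-speculative fill
cache.fill(0x200, "b", epoch=1)    # filled under speculation epoch 1
cache.fill(0x300, "c", epoch=2)    # filled under speculation epoch 2
cache.resolve(epoch=1, correct=False)
cache.resolve(epoch=2, correct=True)
remaining = sorted(cache.entries)
```

After both epochs are resolved, only the non-speculative entry and the entry from the correctly speculated epoch remain in the cache.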

    AN APPARATUS AND METHOD FOR PREFETCHING DATA ITEMS

    Publication No.: US20210019148A1

    Publication date: 2021-01-21

    Application No.: US17041312

    Filing date: 2019-03-14

    Applicant: Arm Limited

    Abstract: Examples of the present disclosure relate to an apparatus comprising execution circuitry to execute instructions defining data processing operations on data items. The apparatus comprises cache storage to store temporary copies of the data items. The apparatus comprises prefetching circuitry to a) predict that a data item will be subject to the data processing operations by the execution circuitry by determining that the data item is consistent with an extrapolation of previous data item retrieval by the execution circuitry, and identifying that at least one control flow element of the instructions indicates that the data item will be subject to the data processing operations by the execution circuitry; and b) prefetch the data item into the cache storage.
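The two conditions in this abstract (an extrapolation of previous retrievals, plus a control flow indication) can be sketched as follows. The sketch assumes a simple constant-stride extrapolation and models the control flow indication as a boolean named `loop_will_continue`; both choices, and all function names, are hypothetical and not taken from the application.

```python
def predict_next_address(history):
    """Return the next address if the last accesses share a constant stride, else None."""
    if len(history) < 3:
        return None
    strides = [b - a for a, b in zip(history, history[1:])]
    # Require the two most recent strides to match and be non-zero.
    if strides[-1] != strides[-2] or strides[-1] == 0:
        return None
    return history[-1] + strides[-1]


def maybe_prefetch(history, loop_will_continue, cache):
    """Prefetch only when both the extrapolation and the control flow hint agree."""
    candidate = predict_next_address(history)
    if candidate is not None and loop_will_continue:
        cache.add(candidate)
        return candidate
    return None


accesses = [0x1000, 0x1040, 0x1080]
cache_a = set()
cache_b = set()
fetched = maybe_prefetch(accesses, loop_will_continue=True, cache=cache_a)
skipped = maybe_prefetch(accesses, loop_will_continue=False, cache=cache_b)
```

With a stride of 0x40, the first call prefetches 0x10C0. The second call prefetches nothing, because the assumed control flow hint says the access pattern will not continue.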

    MODE SWITCHING IN DEPENDENCE UPON A NUMBER OF ACTIVE THREADS
    13.
    Invention application
    Status: Under examination (published)

    Publication No.: US20160357565A1

    Publication date: 2016-12-08

    Application No.: US15133329

    Filing date: 2016-04-20

    Applicant: ARM LIMITED

    Abstract: Apparatus for processing data 2 is provided with fetch circuitry 16 for fetching program instructions for execution from one or more active threads of instructions having respective program counter values. Pipeline circuitry 22, 24 has a first operating mode and a second operating mode. Mode switching circuitry 30 switches the pipeline circuitry 22, 24, between the first operating mode and the second operating mode in dependence upon a number of active threads of program instructions having program instructions available to be executed. The first operating mode has a lower average energy consumption per instruction executed than the second operating mode and the second operating mode has a higher average rate of instruction execution for a single thread than the first operating mode. The first operating mode may utilise a barrel processing pipeline 22 to perform interleaved multiple thread processing. The second operating mode may utilise an out-of-order processing pipeline 24 for performing out-of-order processing.
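A minimal sketch of the mode selection follows. The abstract states only that switching depends on the number of active threads; the threshold value and the direction of the comparison (barrel mode when several threads are runnable) are assumptions made for illustration.

```python
from enum import Enum


class Mode(Enum):
    BARREL = "barrel"              # first mode: lower average energy per instruction
    OUT_OF_ORDER = "out_of_order"  # second mode: higher single-thread execution rate


def select_mode(active_threads, threshold=2):
    """Pick a pipeline mode from the number of threads with instructions ready.

    `threshold` is a hypothetical tuning parameter, not a value from the application.
    """
    if active_threads >= threshold:
        return Mode.BARREL
    return Mode.OUT_OF_ORDER


single = select_mode(1)
many = select_mode(4)
```

With one runnable thread the sketch picks the out-of-order pipeline for single-thread speed; with four runnable threads it picks the interleaved barrel pipeline.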

    DATA PROCESSING APPARATUS AND METHOD FOR PRE-DECODING INSTRUCTIONS TO BE EXECUTED BY PROCESSING CIRCUITRY
    14.
    Invention application
    Status: Granted

    Publication No.: US20140317384A1

    Publication date: 2014-10-23

    Application No.: US13868186

    Filing date: 2013-04-23

    Applicant: ARM Limited

    CPC classification number: G06F9/3802 G06F9/382 G06F12/0811

    Abstract: A hierarchical cache structure has at least a unified cache, used to store both instructions and data values, and a further cache coupled between processing circuitry and the unified cache. The unified cache has a plurality of cache lines identified as an instruction cache line or a data cache line. Each data cache line stores at least one data value and the associated information. Pre-decode circuitry is associated with the unified cache and performs a first pre-decode operation on a received instruction for that instruction cache line in order to generate a corresponding partially pre-decoded instruction for storing in the instruction cache line. Further pre-decode circuitry is associated with the further cache, and, when a partially pre-decoded instruction is routed to the further cache, performs a further pre-decode operation on the partially pre-decoded instruction to generate a corresponding pre-decoded instruction for storage in the further cache.

    DATA PROCESSING APPARATUS AND METHOD FOR TRANSFERRING WORKLOAD BETWEEN SOURCE AND DESTINATION PROCESSING CIRCUITRY
    15.
    Invention application
    Status: Under examination (published)

    Publication No.: US20130311725A1

    Publication date: 2013-11-21

    Application No.: US13873597

    Filing date: 2013-04-30

    Applicant: ARM Limited

    Abstract: In response to a transfer stimulus, performance of a processing workload is transferred from a source processing circuitry to a destination processing circuitry, in preparation for the source processing circuitry to be placed in a power saving condition following the transfer. To reduce the number of memory fetches required by the destination processing circuitry following the transfer, a cache of the source processing circuitry is maintained in a powered state for a snooping period. During the snooping period, cache snooping circuitry snoops data values in the source cache and retrieves the snoop data values for the destination processing circuitry.

    BRANCH TARGET ADDRESS PROVISION
    17.
    Invention application

    Publication No.: US20190303160A1

    Publication date: 2019-10-03

    Application No.: US15939722

    Filing date: 2018-03-29

    Applicant: Arm Limited

    Abstract: An apparatus and method of operating an apparatus are provided. The apparatus comprises execution circuitry to perform data processing operations specified by instructions and instruction retrieval circuitry to retrieve the instructions from memory, wherein the instructions comprise branch instructions. The instruction retrieval circuitry comprises branch target storage to store target instruction addresses for the branch instructions and branch target prefetch circuitry to prepopulate the branch target storage with predicted target instruction addresses for the branch instructions. An improved hit rate in the branch target storage may thereby be supported.
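The effect of prepopulating branch target storage can be shown with a small sketch. The dictionary-backed storage, the method names, and the hit and miss counters are all illustrative assumptions; the abstract does not say how targets are predicted or stored.

```python
class BranchTargetStorage:
    """Toy branch target storage: maps a branch address to its target address."""

    def __init__(self):
        self.targets = {}
        self.hits = 0
        self.misses = 0

    def prepopulate(self, predicted):
        # Install predicted targets ahead of use, without overwriting known entries.
        for branch_addr, target in predicted.items():
            self.targets.setdefault(branch_addr, target)

    def lookup(self, branch_addr):
        target = self.targets.get(branch_addr)
        if target is None:
            self.misses += 1
        else:
            self.hits += 1
        return target


btb = BranchTargetStorage()
btb.prepopulate({0x400: 0x480})    # hypothetical predicted target
hit_target = btb.lookup(0x400)     # found because it was prepopulated
miss_target = btb.lookup(0x500)    # never installed, so this lookup misses
```

The first lookup hits only because the entry was installed before the branch was first encountered, which is the hit rate improvement the abstract refers to.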

    DATA PROCESSING
    19.
    Invention application
    Status: Under examination (published)

    Publication No.: US20170139708A1

    Publication date: 2017-05-18

    Application No.: US14941840

    Filing date: 2015-11-16

    Applicant: ARM LIMITED

    CPC classification number: G06F9/3851 G06F9/3836

    Abstract: Data processing circuitry comprises instruction queue circuitry to maintain one or more instruction queues to store fetched instructions; instruction decode circuitry to decode instructions dispatched from the one or more instruction queues, the instruction decode circuitry being configured to allocate one or more processor resources of a set of processor resources to a decoded instruction for use in execution of that decoded instruction; detection circuitry to detect, for an instruction to be dispatched from a given instruction queue, a prediction indicating whether sufficient processor resources are predicted to be available for allocation to that instruction by the instruction decode circuitry; and dispatch circuitry to dispatch an instruction from the given instruction queue to the instruction decode circuitry, the dispatch circuitry being responsive to the detection circuitry to allow deletion of the dispatched instruction from that instruction queue when the prediction indicates that sufficient processor resources are predicted to be available for allocation to that instruction by the instruction decode circuitry.

    CONTROLLING EXECUTION OF INSTRUCTIONS FOR A PROCESSING PIPELINE HAVING FIRST AND SECOND EXECUTION CIRCUITRY
    20.
    Invention application
    Status: Granted

    Publication No.: US20160357554A1

    Publication date: 2016-12-08

    Application No.: US14731789

    Filing date: 2015-06-05

    Applicant: ARM LIMITED

    CPC classification number: G06F9/3836 G06F9/3855 G06F9/3873 G06F9/3889

    Abstract: An apparatus comprises a processing pipeline comprising out-of-order execution circuitry and second execution circuitry. Control circuitry monitors at least one reordering metric indicative of an extent to which instructions are executed out of order by the out-of-order execution circuitry, and controls whether instructions are executed using the out-of-order execution circuitry or the second execution circuitry based on the reordering metric. A speculation metric indicative of a fraction of executed instructions that are flushed due to a mis-speculation can also be used to determine whether to execute instructions on first or second execution circuitry having different performance or energy consumption characteristics.
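One possible reordering metric can be sketched as follows. The abstract does not define the metric, so this sketch assumes it is the fraction of instructions that execute after a younger instruction has already executed; the threshold and the returned labels are also hypothetical.

```python
def reordering_metric(execution_order):
    """Fraction of instructions executed after a younger instruction.

    `execution_order` lists program-order indices in the order they executed.
    """
    if not execution_order:
        return 0.0
    reordered = 0
    youngest_seen = -1
    for program_index in execution_order:
        if program_index < youngest_seen:
            reordered += 1  # an older instruction ran after a younger one
        youngest_seen = max(youngest_seen, program_index)
    return reordered / len(execution_order)


def choose_circuitry(execution_order, threshold=0.2):
    """Use out-of-order circuitry only when reordering is actually happening."""
    if reordering_metric(execution_order) >= threshold:
        return "out_of_order"
    return "second"


in_order = choose_circuitry([0, 1, 2, 3])
shuffled = choose_circuitry([0, 2, 1, 3])
```

When instructions execute strictly in program order the metric is zero and the sketch selects the second execution circuitry. One out-of-place instruction in four gives a metric of 0.25, which exceeds the assumed threshold.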
