Methods and apparatus for monitoring prefetcher accuracy information using a prefetch flag independently accessible from prefetch tag information

    Publication No.: US11086781B2

    Publication date: 2021-08-10

    Application No.: US16658463

    Filing date: 2019-10-21

    Applicant: Arm Limited

    Abstract: Examples of the present disclosure relate to an apparatus comprising processing circuitry to perform data processing operations and a hierarchical cache structure. The cache structure comprises a plurality of cache levels to store data for access by the processing circuitry, and includes a highest cache level arranged to receive data requests directly from the processing circuitry. The apparatus comprises a plurality of prefetch units, each prefetch unit being associated with a cache level and being arranged to prefetch data into the associated cache level in anticipation of the processing circuitry requiring the data, wherein: each cache level has a plurality of entries and is arranged to maintain prefetch tag information for each entry which, when a given entry contains prefetched data, indicates which prefetch unit of that cache level and/or of a lower cache level prefetched that data; and each cache level is arranged, responsive to a data request from a higher cache level, to provide to the higher cache level the requested data and the prefetch tag information corresponding to the requested data. The apparatus further comprises accuracy information storage to: maintain accuracy inferring information for each prefetch unit; and when given data is evicted from a cache level, update the accuracy inferring information based on the prefetch tag information.
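The eviction-time bookkeeping described in this abstract can be sketched for a single cache level. This is a minimal illustration, not the patented design: the `used` bit, the FIFO eviction order, the useful/useless counters, and every class and method name below are assumptions introduced for the example.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class Entry:
    data: int
    prefetch_tag: Optional[str] = None  # which prefetch unit brought the data in, if any
    used: bool = False                  # assumption: set when a demand request hits the entry


@dataclass
class Accuracy:
    useful: int = 0   # prefetched entries that were used before eviction
    useless: int = 0  # prefetched entries evicted without ever being used


class CacheLevel:
    """Hypothetical single cache level with FIFO eviction."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.entries: Dict[int, Entry] = {}      # address -> entry, in insertion order
        self.accuracy: Dict[str, Accuracy] = {}  # per-prefetch-unit accuracy counters

    def _evict_if_full(self) -> None:
        if len(self.entries) < self.capacity:
            return
        oldest = next(iter(self.entries))
        victim = self.entries.pop(oldest)
        # On eviction, the prefetch tag says which unit to credit or blame.
        if victim.prefetch_tag is not None:
            acc = self.accuracy.setdefault(victim.prefetch_tag, Accuracy())
            if victim.used:
                acc.useful += 1
            else:
                acc.useless += 1

    def prefetch(self, addr: int, data: int, unit: str) -> None:
        if addr in self.entries:
            return
        self._evict_if_full()
        self.entries[addr] = Entry(data, prefetch_tag=unit)

    def demand_access(self, addr: int) -> Optional[Tuple[int, Optional[str]]]:
        """Return the data together with its prefetch tag, as a higher level would receive it."""
        entry = self.entries.get(addr)
        if entry is None:
            return None
        entry.used = True
        return entry.data, entry.prefetch_tag
```

Returning the tag alongside the data mirrors the abstract's requirement that a lower level hands both to the requesting level, so the tag can follow the data up the hierarchy.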

    Increasing effective cache associativity

    Publication No.: US11138119B2

    Publication date: 2021-10-05

    Application No.: US16247912

    Filing date: 2019-01-15

    Applicant: Arm Limited

    Abstract: There is provided an apparatus that includes storage circuitry. The storage circuitry is made up from a plurality of sets, each of the sets having at least one storage location. Receiving circuitry receives an access request that includes an input address. Lookup circuitry obtains a plurality of candidate sets that correspond with an index part of the input address. The lookup circuitry determines a selected storage location from the candidate sets using an access policy. The access policy causes the lookup circuitry to iterate through the candidate sets to attempt to locate an appropriate storage location. The appropriate storage location is accessed in response to the appropriate storage location being found.
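A rough sketch of iterating over several candidate sets for one index follows. The abstract does not say how candidate sets are derived, so the fixed-stride mapping, the use of the full address as the tag, and all names below are assumptions made for illustration only.

```python
from typing import List, Optional, Tuple


class MultiSetCache:
    """Hypothetical storage in which one index maps to several candidate sets."""

    def __init__(self, num_sets: int, ways: int, num_candidates: int = 2):
        self.num_sets = num_sets
        self.ways = ways
        self.num_candidates = num_candidates
        # Each set holds at most `ways` (tag, value) pairs.
        self.sets: List[List[Tuple[int, int]]] = [[] for _ in range(num_sets)]

    def candidate_sets(self, address: int) -> List[int]:
        index = address % self.num_sets
        stride = self.num_sets // self.num_candidates  # assumed mapping, not from the patent
        return [(index + k * stride) % self.num_sets for k in range(self.num_candidates)]

    def lookup(self, address: int) -> Optional[int]:
        # Access policy: walk the candidate sets in order until the tag matches.
        for s in self.candidate_sets(address):
            for tag, value in self.sets[s]:
                if tag == address:
                    return value
        return None

    def insert(self, address: int, value: int) -> bool:
        # Place the line in the first candidate set that has a free location.
        for s in self.candidate_sets(address):
            if len(self.sets[s]) < self.ways:
                self.sets[s].append((address, value))
                return True
        return False  # every candidate is full; replacement is out of scope here
```

With `ways=1` and two candidate sets, two addresses that share an index can both be resident, which is the sense in which the effective associativity goes up.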

    Marking long latency instruction as branch in pending instruction table and handle as mis-predicted branch upon interrupting event to return to checkpointed state

    Publication No.: US09513925B2

    Publication date: 2016-12-06

    Application No.: US14031281

    Filing date: 2013-09-19

    Applicant: ARM LIMITED

    CPC classification number: G06F9/3861 G06F9/30145 G06F9/3842 G06F9/3863

    Abstract: A data processing apparatus and method of data processing are provided. The data processing apparatus comprises execution circuitry configured to execute a sequence of program instructions. Checkpoint circuitry is configured to identify an instance of a predetermined type of instruction in the sequence of program instructions and to store checkpoint information associated with that instance. The checkpoint information identifies a state of the data processing apparatus prior to execution of that instance of the predetermined type of instruction, wherein the predetermined type of instruction has an expected long completion latency. If the execution circuitry does not complete execution of that instance of the predetermined type of instruction due to occurrence of a predetermined event, the data processing apparatus is arranged to reinstate the state of the data processing apparatus with reference to the checkpoint information, such that the execution circuitry is then configured to recommence execution of the sequence of program instructions at that instance of the predetermined type of instruction.
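The checkpoint-and-restart behaviour can be sketched with a toy in-order model. Everything here is an assumption for illustration: the instruction tuple format, treating `div` as the long-latency type, and the `interrupted` argument standing in for the predetermined event.

```python
from typing import Dict, List, Optional, Tuple

Instr = Tuple[str, str, int, int]  # (opcode, destination register, operand a, operand b)
LONG_LATENCY = {"div"}             # assumed set of long-latency opcodes


class Core:
    """Hypothetical core that checkpoints state before a long-latency instruction."""

    def __init__(self, program: List[Instr]):
        self.program = program
        self.pc = 0
        self.regs: Dict[str, Optional[int]] = {}
        self.checkpoint: Optional[Tuple[int, Dict[str, Optional[int]]]] = None

    def step(self, interrupted: bool = False) -> bool:
        """Execute one instruction; return False if it was abandoned and state was rolled back."""
        op, dst, a, b = self.program[self.pc]
        if op in LONG_LATENCY:
            # Record the state as it was before this instruction began.
            self.checkpoint = (self.pc, dict(self.regs))
            self.regs[dst] = None  # result is in flight
            if interrupted:
                # Reinstate the checkpoint so execution recommences at this instruction.
                saved_pc, saved_regs = self.checkpoint
                self.pc = saved_pc
                self.regs = dict(saved_regs)
                return False
        self.regs[dst] = a // b if op == "div" else a + b
        self.pc += 1
        return True
```

After a rollback, the next call to `step()` re-executes the same long-latency instruction from the restored state.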

    Handling move instructions via register renaming or writing to a different physical register using control flags

    Publication No.: US10528355B2

    Publication date: 2020-01-07

    Application No.: US14757576

    Filing date: 2015-12-24

    Applicant: ARM LIMITED

    Abstract: An apparatus has processing circuitry, register rename circuitry and control circuitry which selects one of first and second move handling techniques for handling a move instruction specifying a source logical register and a destination logical register. In the first technique, the register rename circuitry maps the destination logical register of the move to the same physical register as the source logical register. In the second technique, the processing circuitry writes a data value read from a physical register corresponding to the source logical register to a different physical register corresponding to the destination logical register. The second technique is selected when the move instruction specifies the same source logical register as one of the source and destination logical registers of an earlier move instruction handled according to the first technique, and the register mapping used for that register when handling the earlier move instruction is still current.
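The choice between the two move handling techniques can be sketched with a toy rename table. The per-register flag set, the free list that never recycles registers, and all names below are assumptions for illustration; the abstract only states the selection condition.

```python
from typing import Dict, List, Set


class Renamer:
    """Hypothetical rename table with one control flag per logical register."""

    def __init__(self, logical_regs: List[str], num_physical: int):
        self.map: Dict[str, int] = {r: i for i, r in enumerate(logical_regs)}
        self.free: List[int] = list(range(len(logical_regs), num_physical))
        self.values: List[int] = [0] * num_physical
        # Flag set: logical registers whose current mapping took part in a renamed move.
        self.shared: Set[str] = set()

    def write(self, dst: str, value: int) -> None:
        """An ordinary instruction writes dst: allocate a fresh physical register."""
        p = self.free.pop(0)
        self.map[dst] = p
        self.values[p] = value
        self.shared.discard(dst)  # the earlier mapping is no longer current

    def move(self, dst: str, src: str) -> str:
        if src in self.shared:
            # Second technique: copy the value into a different physical register.
            p = self.free.pop(0)
            self.values[p] = self.values[self.map[src]]
            self.map[dst] = p
            self.shared.discard(dst)
            return "copy"
        # First technique: point dst at the physical register src already uses.
        self.map[dst] = self.map[src]
        self.shared.update((src, dst))
        return "rename"
```

In this sketch a second move out of `x0` or `x1` after `move("x1", "x0")` falls back to a copy until the flagged register is written again.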

    Prefetch store filtering

    Publication No.: US12141069B2

    Publication date: 2024-11-12

    Application No.: US18147068

    Filing date: 2022-12-28

    Applicant: Arm Limited

    Abstract: A data processing apparatus is provided. Prefetch circuitry generates a prefetch request for a cache line prior to the cache line being explicitly requested. The cache line is predicted to be required for a store operation in the future. Issuing circuitry issues the prefetch request to a memory hierarchy and filter circuitry filters the prefetch request based on at least one other prefetch request made to the cache line, to control whether the prefetch request is issued by the issuing circuitry.
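One possible reading of the filter is duplicate suppression: a store prefetch is dropped when another prefetch to the same cache line was made recently. The abstract does not commit to that policy, so the sketch below is an assumption, as are the 64-byte line size, the small LRU table, and every name used.

```python
from collections import OrderedDict
from typing import List

LINE_BYTES = 64  # assumed cache line size


class StorePrefetchFilter:
    """Hypothetical filter that remembers recently prefetched cache lines."""

    def __init__(self, capacity: int = 4):
        self.capacity = capacity
        self.recent: "OrderedDict[int, bool]" = OrderedDict()  # line number -> seen, LRU order
        self.issued: List[int] = []  # lines actually sent to the memory hierarchy

    def request(self, address: int) -> bool:
        """Return True if the prefetch is issued, False if it is filtered out."""
        line = address // LINE_BYTES
        if line in self.recent:
            self.recent.move_to_end(line)  # refresh recency, drop the duplicate
            return False
        self.recent[line] = True
        if len(self.recent) > self.capacity:
            self.recent.popitem(last=False)  # forget the least recently seen line
        self.issued.append(line)
        return True
```

Because the table is finite, a line that has aged out can be prefetched again, which trades a little redundancy for a bounded filter size.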

    Methods and apparatus for predicting instructions for execution

    Publication No.: US11900121B2

    Publication date: 2024-02-13

    Application No.: US17501257

    Filing date: 2021-10-14

    Applicant: Arm Limited

    CPC classification number: G06F9/3844 G06F9/30047 G06F9/3802 G06F9/3861

    Abstract: Aspects of the present disclosure relate to an apparatus comprising prediction circuitry having a plurality of hierarchical prediction units to perform respective hierarchical predictions of instructions for execution, wherein predictions higher in the hierarchy have a higher expected accuracy than predictions lower in the hierarchy. Responsive to a given prediction higher in the hierarchy being different to a corresponding prediction lower in the hierarchy, the corresponding prediction lower in the hierarchy is corrected. A prediction correction metric determination unit determines a prediction correction metric indicative of an incidence of uncorrected predictions performed by the prediction circuitry. Fetch circuitry fetches instructions predicted by at least one of said plurality of hierarchical predictions, and delays said fetching based on the prediction correction metric indicating an incidence of uncorrected predictions below a threshold.
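The correction metric and the fetch-delay decision can be sketched for a two-level hierarchy. The sliding window, the rate-based metric, the threshold value, and all names are assumptions for illustration; the abstract only requires a metric indicating the incidence of uncorrected predictions.

```python
from collections import deque
from typing import Deque


class CorrectionMetric:
    """Hypothetical tracker of how often the lower-level prediction survives uncorrected."""

    def __init__(self, window: int = 8, threshold: float = 0.5):
        self.history: Deque[bool] = deque(maxlen=window)  # True = prediction left uncorrected
        self.threshold = threshold

    def record(self, low_pred: int, high_pred: int) -> int:
        """Compare the two predictions and return the (possibly corrected) target."""
        self.history.append(low_pred == high_pred)
        return high_pred  # the higher, more accurate prediction wins on a mismatch

    def uncorrected_rate(self) -> float:
        if not self.history:
            return 1.0  # assumption: trust the lower predictor until evidence arrives
        return sum(self.history) / len(self.history)

    def should_delay_fetch(self) -> bool:
        # Delay fetching when uncorrected predictions fall below the threshold,
        # i.e. the lower predictor is being corrected too often to act on early.
        return self.uncorrected_rate() < self.threshold
```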

    Speculative register file read suppression

    Publication No.: US09542194B2

    Publication date: 2017-01-10

    Application No.: US14482146

    Filing date: 2014-09-10

    Applicant: ARM LIMITED

    CPC classification number: G06F9/3857 G06F9/384

    Abstract: A single threaded out-of-order processor 2 includes an architected register file 22 and a speculative register file 20. Speculative register allocation circuitry 24 serves to allocate speculative registers for use in accordance with an allocation sequence and taken from a position determined by a tail point. Read suppression circuitry 30 serves to maintain a boundary pointer corresponding to a position within the allocation sequence such that no speculative register more recently allocated within the allocation sequence than that corresponding to the boundary pointer can have a valid register value. The read suppression circuitry 30 serves to suppress read operations for source operands lying within a read-suppression region delimited by the tail point and the boundary pointer. Separate boundary pointers may be maintained for different types of register values, such as integer register values and floating point register values.
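The read-suppression region between the boundary pointer and the tail point can be sketched as follows. This is a simplified model under stated assumptions: monotonically increasing sequence numbers stand in for circular pointers, retirement and wraparound are not modelled, and the class and counter names are hypothetical.

```python
from typing import List, Optional


class SpeculativeRegisterFile:
    """Hypothetical speculative register file with a boundary pointer."""

    def __init__(self, size: int):
        self.size = size
        self.values: List[Optional[int]] = [None] * size
        self.tail = 0        # sequence number of the next register to allocate
        self.boundary = -1   # nothing allocated after this number can hold a valid value
        self.reads_performed = 0

    def allocate(self) -> int:
        seq = self.tail
        self.values[seq % self.size] = None
        self.tail += 1
        return seq

    def write(self, seq: int, value: int) -> None:
        self.values[seq % self.size] = value
        # The newest written register bounds where valid values can exist.
        self.boundary = max(self.boundary, seq)

    def read(self, seq: int) -> Optional[int]:
        if self.boundary < seq < self.tail:
            return None  # inside the read-suppression region: skip the array access
        self.reads_performed += 1
        return self.values[seq % self.size]
```

The boundary is conservative: a register older than the boundary may still be unwritten, in which case the read is performed and simply finds no valid value.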
