CACHE ACCESS DETECTION AND PREDICTION
    11.
    Invention application

    Publication No.: US20200250098A1

    Publication date: 2020-08-06

    Application No.: US16267613

    Filing date: 2019-02-05

    Applicant: Arm Limited

    Abstract: An apparatus comprises a cache memory to store data as a plurality of cache lines each having a data size and an associated physical address in a memory, access circuitry to access the data stored in the cache memory, detection circuitry to detect, for at least a set of sub-units of the cache lines stored in the cache memory, whether a number of accesses by the access circuitry to a given sub-unit exceeds a predetermined threshold, in which each sub-unit has a data size that is smaller than the data size of a cache line, prediction circuitry to generate a prediction, for a given region of a plurality of regions of physical address space, of whether data stored in that region comprises streaming data in which each of one or more portions of the given cache line is predicted to be subject to a maximum of one read operation or multiple access data in which each of the one or more portions of the given cache line is predicted to be subject to more than one read operation, the prediction circuitry being configured to generate the prediction in response to a detection by the detection circuitry of whether the number of accesses to a sub-unit of a cache line having an associated physical address in the given region exceeds the predetermined threshold, and allocation circuitry to selectively allocate a next cache line to the cache memory in dependence upon the prediction applicable to the region of physical address space containing that next cache line.
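
The detect, predict, allocate flow in this abstract can be sketched in software. This is only an illustrative model, not the claimed circuitry: the sizes, the threshold, the `RegionPredictor` class, and the choice to update the prediction when a line is evicted are all assumptions made for the example.

```python
from collections import defaultdict

LINE_SIZE = 64      # bytes per cache line (assumed)
SUBUNIT_SIZE = 8    # bytes per tracked sub-unit (assumed)
REGION_SIZE = 4096  # bytes per physical-address region (assumed)
THRESHOLD = 1       # a sub-unit accessed more than this counts as reused (assumed)


class RegionPredictor:
    """Hypothetical model: per-sub-unit counters feeding a per-region prediction."""

    def __init__(self):
        self.counts = defaultdict(int)  # sub-unit index -> accesses while cached
        self.streaming = {}             # region index -> True if predicted streaming

    def record_access(self, addr):
        # Detection: count accesses at sub-unit granularity.
        self.counts[addr // SUBUNIT_SIZE] += 1

    def on_evict(self, line_addr):
        # Prediction: when a line-aligned address leaves the cache, check whether
        # any of its sub-units exceeded the threshold and update its region.
        first = line_addr // SUBUNIT_SIZE
        seen = [self.counts.pop(s, 0) for s in range(first, first + LINE_SIZE // SUBUNIT_SIZE)]
        reused = any(c > THRESHOLD for c in seen)
        self.streaming[line_addr // REGION_SIZE] = not reused

    def should_allocate(self, addr):
        # Allocation: skip lines in regions predicted to hold streaming data.
        # Regions with no prediction yet are allocated (an assumed default).
        return not self.streaming.get(addr // REGION_SIZE, False)
```

In this model a line whose sub-units were each read once marks its 4 KiB region as streaming, so later lines from that region bypass allocation until a reused line flips the prediction back.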

    APPARATUS AND METHOD FOR ESTIMATING A SHIFT AMOUNT WHEN PERFORMING FLOATING-POINT SUBTRACTION

    Publication No.: US20180285076A1

    Publication date: 2018-10-04

    Application No.: US15473841

    Filing date: 2017-03-30

    Applicant: ARM Limited

    CPC classification number: G06F7/485 G06F5/012

    Abstract: An apparatus and method are provided for estimating a shift amount when employing processing circuitry to perform a subtraction operation to subtract a second significand value of a second floating-point operand from a first significand value of a first floating-point operand in order to generate a difference value. Shift estimation circuitry then determines an estimated shift amount to be applied to the difference value. The shift estimation circuitry comprises significand analysis circuitry to generate, from analysis of the significand values of the two floating-point operands, a first bit string identifying a most significant bit position within the difference value that is predicted to have its bit set to a determined value. In parallel, shift limiting circuitry generates from an exponent value a second bit string identifying a shift limit bit position. The shift limiting circuitry has computation circuitry to perform, for each bit position in at least a subset of bit positions of the second bit string, an associated computation using bits of the exponent value to determine a value for that bit position within the second bit string. The associated computation is different for different bit positions. Combining circuitry then generates a combined bit string from the first and second bit strings, and shift determination circuitry determines the estimated shift amount from the combined bit string.
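
The combination of the two bit strings can be illustrated with a small sketch. Several things here are assumptions for the example: the 24-bit width, the rule that the biased exponent must stay at least 1, and the function names. The sketch also derives the first bit string from the exact difference, whereas hardware would predict it from the operands in parallel with the subtraction, and it builds the limit string with a shift rather than with per-bit computations on the exponent.

```python
WIDTH = 24  # significand width in bits (assumed)


def leading_one_mask(diff):
    """One-hot string marking the most significant set bit of diff (0 if diff is 0)."""
    return 1 << (diff.bit_length() - 1) if diff else 0


def shift_limit_mask(exponent):
    """One-hot string marking the largest left shift the exponent allows."""
    max_shift = exponent - 1  # assumed: biased exponent must not drop below 1
    return 1 << (WIDTH - 1 - max_shift) if 0 <= max_shift < WIDTH else 0


def estimate_shift(sig_a, sig_b, exponent):
    """Estimated normalising shift; assumes sig_a >= sig_b."""
    combined = leading_one_mask(sig_a - sig_b) | shift_limit_mask(exponent)
    # The shift amount is the number of leading zeros in the combined string.
    return WIDTH - combined.bit_length() if combined else WIDTH
```

For example, `estimate_shift(0x800000, 0x7FFFFF, 100)` gives 23 because the difference has a single set bit at position 0, while the same operands with `exponent=5` give 4 because the limit bit sits above the leading one and caps the shift.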

    DATA STORAGE
    13.
    Invention application
    Status: under examination (published)

    Publication No.: US20170249085A1

    Publication date: 2017-08-31

    Application No.: US15440254

    Filing date: 2017-02-23

    Applicant: ARM Limited

    Abstract: Data storage apparatus comprises detection circuitry configured to detect a match between a multi-bit reference memory address and a test address, the test address being a combination of a multi-bit base address and a multi-bit address offset, the detection circuitry comprising: a comparator configured to compare, as a first comparison, a first subset of bits of the reference memory address with a combination of the corresponding first subset of bits of the base address and the corresponding first subset of bits of the address offset; the comparator being configured to compare, as a second comparison, a second, different subset of bits of the reference memory address with the corresponding second subset of bits of the base address; a detector configured to detect the match between the reference memory address and the test address when both of the first comparison and the second comparison detect a respective match; and control circuitry configured to control operation of the data storage apparatus in dependence upon the reference memory address when a match is detected by the detector.
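
The two-part comparison can be modelled as below. This is a sketch under assumptions: the first subset is taken to be the low 12 bits and the second subset the remaining high bits, and `addresses_match` is a hypothetical name. The abstract does not say which bits belong to which subset.

```python
LOW_BITS = 12                   # assumed size of the first subset (low-order bits)
LOW_MASK = (1 << LOW_BITS) - 1


def addresses_match(reference, base, offset):
    """Detect a match without forming the full base + offset sum."""
    # First comparison: low bits of the reference against the low bits of
    # base and offset combined (a narrow add, truncated to LOW_BITS).
    low_sum = ((base & LOW_MASK) + (offset & LOW_MASK)) & LOW_MASK
    first = (reference & LOW_MASK) == low_sum
    # Second comparison: high bits of the reference against the high bits
    # of the base alone; the offset does not take part.
    second = (reference >> LOW_BITS) == (base >> LOW_BITS)
    return first and second
```

Because the second comparison ignores the offset, this sketch reports no match when the low-bit addition carries into the high bits, even if the full sum would equal the reference; it never reports a match for an offset smaller than `1 << LOW_BITS` unless the addresses really are equal.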

    APPARATUS AND METHOD FOR PROCESSING INSTRUCTIONS FROM A PLURALITY OF THREADS

    Publication No.: US20170132011A1

    Publication date: 2017-05-11

    Application No.: US14935820

    Filing date: 2015-11-09

    Applicant: ARM Limited

    Abstract: An apparatus and method are provided for processing instructions from a plurality of threads. The apparatus comprises a processing pipeline to process instructions, including fetch circuitry to fetch instructions from a plurality of threads for processing by the processing pipeline, and execution circuitry to execute the fetched instructions. Execution hint instruction handling circuitry is then responsive to the fetch circuitry fetching an execution hint instruction for a first thread, to treat the execution hint instruction, at least in a presence of a suspension condition, as a predicted branch instruction with a predicted behaviour, and to cause the fetch circuitry to suspend fetching of instructions for the first thread. The execution circuitry is then arranged to execute the predicted branch instruction with a behaviour different to the predicted behaviour, in order to trigger a misprediction condition. The fetch circuitry is then responsive to the misprediction condition to resume fetching of instructions for the first thread. This provides a reliable mechanism for temporarily suspending fetching of instructions for a thread in response to a hint instruction, whilst still reliably resuming fetching in due course.

    INSTRUCTION FUSION
    15.
    Invention application
    Status: under examination (published)

    Publication No.: US20170123808A1

    Publication date: 2017-05-04

    Application No.: US14929904

    Filing date: 2015-11-02

    Applicant: ARM Limited

    Abstract: An apparatus includes a processing pipeline comprising a plurality of stages, the plurality of stages including at least one instruction fusing stage to detect whether a block of instructions to be processed comprises a fusible group of instructions, and to generate a fused instruction to be processed by a subsequent stage of the processing pipeline when said block of instructions comprises said fusible group. However, when said block of instructions comprises a partial subset of said fusible group of instructions, the instruction fusing stage is configured to delay handling of said partial subset of said fusible group of instructions until the instruction fusing stage has determined whether at least one subsequent block of instructions to be processed comprises a remaining subset of instructions of said fusible group.
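
The delayed handling of a partial fusible group can be sketched as follows. The two-instruction group `("A", "B")`, the `"A+B"` fused form, and the `fuse_blocks` function are hypothetical placeholders chosen for the example; the abstract does not name specific instructions.

```python
FUSIBLE = ("A", "B")  # hypothetical fusible group of two instructions
FUSED = "A+B"         # hypothetical fused instruction


def fuse_blocks(blocks):
    """Process blocks in order, holding back a trailing partial group."""
    out = []
    pending = []  # partial subset delayed from the previous block
    for block in blocks:
        instrs = pending + list(block)
        pending = []
        emitted = []
        i = 0
        while i < len(instrs):
            if tuple(instrs[i:i + 2]) == FUSIBLE:
                emitted.append(FUSED)      # whole group present: fuse it
                i += 2
            elif instrs[i] == FUSIBLE[0] and i == len(instrs) - 1:
                pending = [instrs[i]]      # partial group at block end: delay it
                i += 1
            else:
                emitted.append(instrs[i])  # not fusible: pass through unchanged
                i += 1
        out.append(emitted)
    if pending:
        out.append(pending)                # no more blocks: release the held instruction
    return out
```

With input `[["X", "A"], ["B", "Y"]]` the first block emits only `X`, and the held `A` is fused with `B` when the second block arrives; if the second block does not start with `B`, the held `A` is emitted unfused.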

    DATA PROCESSING APPARATUS AND METHOD FOR PERFORMING DATA PROCESSING OPERATION WITH A CONDITIONAL PROCESSING STEP
    16.
    Invention application
    Status: under examination (published)

    Publication No.: US20150261542A1

    Publication date: 2015-09-17

    Application No.: US14210621

    Filing date: 2014-03-14

    Applicant: ARM LIMITED

    CPC classification number: G06F9/30014 G06F9/3875 G06F9/3893

    Abstract: A data processing apparatus has a pipeline for performing a processing operation involving a conditional step which is required only if at least one input operand satisfies a predetermined condition. Control circuitry detects whether the condition is satisfied. If not, then the pipeline is controlled to perform the operation bypassing the conditional step to generate the output operand a first number of cycles later than a start cycle in which the operation starts, and the output operand is forwarded over a forwarding path. If the condition is satisfied, then the pipeline performs the operation including the conditional step to generate the output operand a second number of cycles later than the start cycle, where the second number is greater than the first number. The output operand is written to a destination register the same number of cycles later than the start cycle regardless of whether the condition is satisfied.
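
The timing behaviour described here can be summarised in a few lines. The cycle counts and the `schedule` function are assumptions for illustration only; the abstract requires just that the second latency exceeds the first and that the register write happens at a fixed offset from the start cycle.

```python
FAST_CYCLES = 2       # assumed latency when the conditional step is bypassed
SLOW_CYCLES = 3       # assumed latency when the conditional step is performed
WRITEBACK_CYCLES = 3  # assumed fixed register-write latency (at least SLOW_CYCLES)


def schedule(start_cycle, condition_satisfied):
    """Return when the result is produced and when it reaches the register."""
    latency = SLOW_CYCLES if condition_satisfied else FAST_CYCLES
    return {
        "result_ready": start_cycle + latency,
        # The early result is made available over the forwarding path.
        "forwarded_early": not condition_satisfied,
        # The register write does not depend on the condition.
        "register_write": start_cycle + WRITEBACK_CYCLES,
    }
```

Keeping the register write at a fixed cycle means dependent instructions can pick up the fast result from the forwarding path while the write port schedule stays the same in both cases.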

    METHOD AND APPARATUS FOR INTERRUPT HANDLING
    17.
    Invention application
    Status: granted

    Publication No.: US20140351472A1

    Publication date: 2014-11-27

    Application No.: US13900777

    Filing date: 2013-05-23

    Applicant: ARM LIMITED

    CPC classification number: G06F13/24 G06F9/4812 G06F9/4818

    Abstract: A data processing device comprises a plurality of system registers and a set of interrupt handling registers for controlling handling of an incoming interrupt. The device also includes processing circuitry configured to execute software at a plurality of execution levels, interrupt controller circuitry configured to route said incoming interrupt to interrupt handling software that is configured to run at one of said plurality of execution levels, and register access control circuitry configured to dynamically control access to at least some of said interrupt handling registers in dependence upon the one of said plurality of execution levels that said incoming interrupt is routed to. The interrupt handling software configured to run at a particular execution level does not have access to interrupt handling registers for handling a different incoming interrupt that is routed to interrupt handling software that is configured to run at a more privileged execution level.

    REGISTER-BASED MATRIX MULTIPLICATION

    Publication No.: US20220291923A1

    Publication date: 2022-09-15

    Application No.: US17678221

    Filing date: 2022-02-23

    Applicant: Arm Limited

    Abstract: Techniques for performing matrix multiplication in a data processing apparatus are disclosed, comprising apparatuses, matrix multiply instructions, methods of operating the apparatuses, and virtual machine implementations. Registers, each register for storing at least four data elements, are referenced by a matrix multiply instruction and in response to the matrix multiply instruction a matrix multiply operation is carried out. First and second matrices of data elements are extracted from first and second source registers, and plural dot product operations, acting on respective rows of the first matrix and respective columns of the second matrix are performed to generate a square matrix of result data elements, which is applied to a destination register. A higher computation density for a given number of register operands is achieved with respect to vector-by-element techniques.
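
A minimal software model of the operation, for the smallest case of four elements per register, might look like this. The row-major layout of both source registers, the accumulation into the destination, and the `matmul_2x2` name are assumptions of the sketch rather than details taken from the abstract.

```python
def matmul_2x2(src1, src2, dest):
    """Treat 4-element registers as 2x2 matrices and accumulate src1 x src2 into dest."""
    rows = [src1[0:2], src1[2:4]]    # rows of the first matrix (row-major, assumed)
    cols = [src2[0::2], src2[1::2]]  # columns of the second matrix (row-major, assumed)
    # One dot product per (row, column) pair yields the 2x2 result matrix.
    return [
        dest[2 * i + j] + sum(x * y for x, y in zip(rows[i], cols[j]))
        for i in range(2)
        for j in range(2)
    ]
```

Calling `matmul_2x2([1, 2, 3, 4], [5, 6, 7, 8], [0, 0, 0, 0])` returns `[19, 22, 43, 50]`: two source registers of four elements each produce four dot products of length two, which is the computation density the abstract contrasts with vector-by-element techniques.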

    CACHE CONTROL IN PRESENCE OF SPECULATIVE READ OPERATIONS

    Publication No.: US20210042227A1

    Publication date: 2021-02-11

    Application No.: US16979624

    Filing date: 2019-03-12

    Applicant: Arm Limited

    Abstract: Coherency control circuitry (10) supports processing of a safe-speculative-read transaction received from a requesting master device (4). The safe-speculative-read transaction is of a type requesting that target data is returned to a requesting cache (11) of the requesting master device (4) while prohibiting any change in coherency state associated with the target data in other caches (12) in response to the safe-speculative-read transaction. In response, at least when the target data is cached in a second cache associated with a second master device, at least one of the coherency control circuitry (10) and the second cache (12) is configured to return a safe-speculative-read response while maintaining the target data in the same coherency state within the second cache. This helps to mitigate against speculative side-channel attacks.

    TRACKING SPECULATIVE DATA CACHING
    20.
    Invention application

    Publication No.: US20210026641A1

    Publication date: 2021-01-28

    Application No.: US17043963

    Filing date: 2019-03-21

    Applicant: Arm Limited

    Abstract: An apparatus and method of operating a data processing apparatus are disclosed. The apparatus comprises data processing circuitry to perform data processing operations in response to a sequence of instructions, wherein the data processing circuitry is capable of performing speculative execution of at least some of the sequence of instructions. A cache structure comprising entries stores temporary copies of data items which are subjected to the data processing operations, and speculative execution tracking circuitry monitors correctness of the speculative execution and is responsive to an indication of incorrect speculative execution to cause entries in the cache structure allocated by the incorrect speculative execution to be evicted from the cache structure.
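
The tracking idea can be sketched with a cache whose entries remember which speculation allocated them. The `SpeculativeCache` class, the integer speculation identifiers, and the step that clears the tag on correct speculation are assumptions made for the example.

```python
class SpeculativeCache:
    """Hypothetical cache model that tags entries with the speculation that filled them."""

    def __init__(self):
        self.entries = {}  # address -> (data, speculation id or None)

    def fill(self, addr, data, spec_id=None):
        # spec_id is None for entries allocated by non-speculative execution.
        self.entries[addr] = (data, spec_id)

    def resolve(self, spec_id, correct):
        tagged = [a for a, (_, sid) in self.entries.items() if sid == spec_id]
        for addr in tagged:
            if correct:
                # Speculation was right: keep the entry and drop its tag.
                self.entries[addr] = (self.entries[addr][0], None)
            else:
                # Speculation was wrong: evict what it allocated.
                del self.entries[addr]
```

After `fill(0x20, "b", spec_id=1)` and `resolve(1, correct=False)`, address `0x20` is no longer present, while entries filled non-speculatively or under other speculation identifiers are untouched.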
