APPARATUS AND METHOD FOR ACCESSING AN ADDRESS TRANSLATION CACHE

    Publication Number: US20190310948A1

    Publication Date: 2019-10-10

    Application Number: US15945900

    Application Date: 2018-04-05

    Applicant: Arm Limited

    Abstract: An apparatus and method are provided for accessing an address translation cache. The address translation cache has a plurality of entries, where each entry is used to store address translation data used when converting a virtual address into a corresponding physical address of a memory system. The virtual address is generated from a plurality of source values. Allocation circuitry is responsive to received address translation data, to allocate an entry within the address translation cache to store the received address translation data. A hash value indication is associated with the allocated entry, where the hash value indication is computed from the plurality of source values used to generate a virtual address associated with the received address translation data. Lookup circuitry is responsive to an access request associated with a target virtual address, to perform a lookup process employing a target hash value computed from the plurality of source values used to generate the target virtual address, in order to identify any candidate matching entry in the address translation cache. When there is at least one candidate matching entry, a virtual address check process is then performed in order to determine whether any candidate matching entry is an actual matching entry whose address translation data enables the target virtual address to be translated to a corresponding target physical address. Such an approach can significantly improve the performance of accesses to the address translation cache, and can also give rise to power consumption savings.
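The lookup flow described above can be modelled in software. The following is a minimal conceptual sketch, not the patented circuitry: the hash function, the rule that the virtual address is the sum of the source values, and all names are assumptions for illustration only.

```python
def hash_sources(sources, bits=8):
    """Fold the source values that generate a virtual address into a small hash."""
    h = 0
    for s in sources:
        h ^= s
        h = (h ^ (h >> bits)) & ((1 << bits) - 1)
    return h

class HashFilteredTLB:
    def __init__(self):
        # Each entry: (hash value indication, virtual address, physical page)
        self.entries = []

    def allocate(self, sources, physical_page):
        # Assumption: the VA is generated by adding the source values.
        va = sum(sources)
        self.entries.append((hash_sources(sources), va, physical_page))

    def lookup(self, sources):
        target_hash = hash_sources(sources)
        target_va = sum(sources)
        for h, va, page in self.entries:
            # Cheap hash compare rejects most entries; the full virtual
            # address check runs only on candidate matching entries.
            if h == target_hash and va == target_va:
                return page
        return None
```

The point of the two-step check is that the hash can be computed from the source values before the full address addition completes, so most non-matching entries never pay for a wide comparator.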

    APPARATUS AND METHOD FOR HANDLING PAGE INVALIDATE REQUESTS IN AN ADDRESS TRANSLATION CACHE

    Publication Number: US20190294551A1

    Publication Date: 2019-09-26

    Application Number: US15928165

    Application Date: 2018-03-22

    Applicant: Arm Limited

    Inventor: ABHISHEK RAJA

    Abstract: An apparatus is provided having processing circuitry for executing multiple items of supervised software under the control of a supervising element, and a set associative address translation cache having a plurality of entries, where each entry stores address translation data used when converting a virtual address into a corresponding physical address of a memory system comprising multiple pages. The address translation data is obtained by a multi-stage address translation process comprising a first stage translation process managed by an item of supervised software and a second stage translation process managed by the supervising element. Allocation circuitry is responsive to receipt of obtained address translation data for a specified virtual address, to allocate the obtained address translation data into an entry of a selected set of the address translation cache, where the selected set is identified using a subset of bits of the specified virtual address chosen in dependence on a final page size associated with the obtained address translation data. Filter circuitry is provided having a plurality of filter entries, and is responsive to detecting that a splinter condition exists for the obtained address translation data, to indicate in a chosen filter entry that the splinter condition has been detected for the specified item of supervised software that is associated with the obtained address translation data. The splinter condition exists when a first stage page size used in the multi-stage translation process exceeds the final page size. Maintenance circuitry is then responsive to a page invalidate request associated with an item of supervised software, to reference the filter circuitry to determine which entries of the address translation cache need to be checked in order to process the page invalidate request, in dependence on whether a filter entry of the filter circuitry indicates presence of the splinter condition for that item of supervised software.
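A conceptual software model of the filter's role in invalidation, with all names and the set-selection rule assumed for illustration: a guest (item of supervised software) with no recorded splinter condition only needs its naturally indexed set checked, while a splintered guest forces a wider search.

```python
class SplinterFilter:
    def __init__(self):
        # Guest IDs for which a splinter condition has been recorded.
        self.splintered = set()

    def allocate(self, guest_id, stage1_page_size, final_page_size):
        # Splinter condition: stage-1 page size exceeds the final page size.
        if stage1_page_size > final_page_size:
            self.splintered.add(guest_id)

    def sets_to_check(self, guest_id, va, num_sets, page_size):
        """Sets of the translation cache a page invalidate must examine."""
        if guest_id in self.splintered:
            # Splintered pages may sit in sets chosen by a smaller final
            # page size, so every set must be checked.
            return list(range(num_sets))
        # No splinter condition: only the set the VA naturally indexes.
        return [(va // page_size) % num_sets]
```

In the common case (no splintering), a guest's invalidate touches one set instead of all of them, which is where the saving comes from.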

    MEMORY ADDRESS TRANSLATION
    Invention Application

    Publication Number: US20230135599A1

    Publication Date: 2023-05-04

    Application Number: US17512888

    Application Date: 2021-10-28

    Applicant: Arm Limited

    Abstract: Circuitry comprises a translation lookaside buffer to store memory address translations, each memory address translation being between an input memory address range defining a contiguous range of one or more input memory addresses in an input memory address space and a translated output memory address range defining a contiguous range of one or more output memory addresses in an output memory address space; in which the translation lookaside buffer is configured selectively to store the memory address translations as a cluster of memory address translations, a cluster defining memory address translations in respect of a contiguous set of input memory address ranges by encoding one or more memory address offsets relative to a respective base memory address; memory management circuitry to retrieve data representing memory address translations from a memory, for storage by the translation lookaside buffer, when a required memory address translation is not stored by the translation lookaside buffer; detector circuitry to detect an action consistent with access, by the translation lookaside buffer, to a given cluster of memory address translations; and prefetch circuitry, responsive to a detection of the action consistent with access to a cluster of memory address translations, to prefetch data from the memory representing one or more further memory address translations of a further set of input memory address ranges adjacent to the contiguous set of input memory address ranges for which the given cluster defines memory address translations.
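The cluster-plus-prefetch behaviour can be sketched as follows. This is a software caricature under stated assumptions: cluster size, the offset encoding, and the "prefetch the adjacent cluster on access" trigger are illustrative choices, not details taken from the patent.

```python
PAGES_PER_CLUSTER = 4  # assumed cluster width

class ClusteredTLB:
    def __init__(self, page_table):
        self.page_table = page_table  # backing translations in "memory"
        self.clusters = {}            # cluster index -> (base, per-page offsets)
        self.prefetched = []          # clusters fetched ahead of demand

    def _fill(self, cluster_idx):
        # One cluster covers a contiguous set of input page ranges,
        # encoded as offsets relative to a base output address.
        first = cluster_idx * PAGES_PER_CLUSTER
        base = self.page_table[first]
        offsets = [self.page_table[first + i] - base
                   for i in range(PAGES_PER_CLUSTER)]
        self.clusters[cluster_idx] = (base, offsets)

    def translate(self, input_page):
        idx = input_page // PAGES_PER_CLUSTER
        if idx not in self.clusters:
            self._fill(idx)           # demand fetch on a miss
        # Detector: access to this cluster prompts a prefetch of the
        # adjacent input range's translations.
        nxt = idx + 1
        if nxt not in self.clusters and nxt * PAGES_PER_CLUSTER in self.page_table:
            self._fill(nxt)
            self.prefetched.append(nxt)
        base, offsets = self.clusters[idx]
        return base + offsets[input_page % PAGES_PER_CLUSTER]
```

Storing offsets rather than full output addresses is what makes the cluster compact; prefetching the neighbouring cluster exploits the spatial locality that made clustering worthwhile in the first place.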

    HANDLING OF SINGLE-COPY-ATOMIC LOAD/STORE INSTRUCTION

    Publication Number: US20230017802A1

    Publication Date: 2023-01-19

    Application Number: US17374149

    Application Date: 2021-07-13

    Applicant: Arm Limited

    Abstract: In response to a single-copy-atomic load/store instruction for requesting an atomic transfer of a target block of data between the memory system and the registers, where the target block has a given size greater than a maximum data size supported for a single load/store micro-operation by a load/store data path, instruction decoding circuitry maps the single-copy-atomic load/store instruction to two or more mapped load/store micro-operations each for requesting transfer of a respective portion of the target block of data. In response to the mapped load/store micro-operations, load/store circuitry triggers issuing of a shared memory access request to the memory system to request the atomic transfer of the target block of data of said given size to or from the memory system, and triggers separate transfers of respective portions of the target block of data over the load/store data path.
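The decode-time cracking can be illustrated with a toy model. Assumptions flagged up front: an 8-byte data path, a 16-byte target block, and all function names are invented for this sketch; the key property preserved is one shared memory request for the whole block, with separate per-portion transfers over the narrow path.

```python
PATH_BYTES = 8  # assumed maximum data size for one load/store micro-operation

def decode_atomic_load(address, block_size):
    """Map one single-copy-atomic load to micro-ops covering the block."""
    assert block_size > PATH_BYTES and block_size % PATH_BYTES == 0
    return [{"addr": address + i, "bytes": PATH_BYTES}
            for i in range(0, block_size, PATH_BYTES)]

def execute(memory, address, block_size):
    uops = decode_atomic_load(address, block_size)
    # One shared request fetches the whole block atomically from memory...
    block = memory[address:address + block_size]
    # ...then each micro-op moves only its portion over the data path.
    return [block[u["addr"] - address: u["addr"] - address + u["bytes"]]
            for u in uops]
```

Splitting the transfer but not the memory request is the crux: the memory system sees a single atomic access of the full size, while the core's data path never has to widen.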

    ALLOCATION OF STORE REQUESTS
    Invention Publication

    Publication Number: US20240078012A1

    Publication Date: 2024-03-07

    Application Number: US17903293

    Application Date: 2022-09-06

    Applicant: Arm Limited

    CPC classification number: G06F3/0611 G06F3/0656 G06F3/0673

    Abstract: There is provided an apparatus, method and medium. The apparatus comprises a store buffer to store a plurality of store requests, where each of the plurality of store requests identifies a storage address and a data item to be transferred to storage beginning at the storage address, where the data item comprises a predetermined number of bytes. The apparatus is responsive to a memory access instruction indicating a store operation specifying storage of N data items, to determine an address allocation order of N consecutive store requests based on a copy direction hint indicative of whether the memory access instruction is one of a sequence of memory access instructions each identifying one of a sequence of sequentially decreasing addresses, and to allocate the N consecutive store requests to the store buffer in the address allocation order.
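A small sketch of the allocation-order decision, with the hint encoding and all names assumed: for a descending copy, the N consecutive store requests are allocated in reverse index order, so the store buffer still observes addresses in a consistent direction.

```python
def allocate_stores(store_buffer, base_addr, data_items, item_bytes, descending):
    """Allocate N consecutive store requests for one store instruction.

    descending: copy direction hint - True when this instruction belongs to
    a sequence of memory accesses with sequentially decreasing addresses.
    """
    n = len(data_items)
    # The hint flips the allocation order; the requests themselves are
    # unchanged (same addresses, same data).
    order = range(n - 1, -1, -1) if descending else range(n)
    for i in order:
        store_buffer.append((base_addr + i * item_bytes, data_items[i]))
```

The design intuition is that a backwards memcpy emits instructions at descending addresses; allocating each instruction's requests in reverse keeps the overall address stream monotonic, which buffer-merging logic typically prefers.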

    DETERMINING A RESTART POINT IN OUT-OF-ORDER EXECUTION

    Publication Number: US20230214223A1

    Publication Date: 2023-07-06

    Application Number: US17569157

    Application Date: 2022-01-05

    Applicant: Arm Limited

    CPC classification number: G06F9/3855 G06F9/3016

    Abstract: There is provided a data processing apparatus comprising decode circuitry responsive to receipt of a block of instructions to generate control signals indicative of each of the block of instructions, and to analyse the block of instructions to detect a potential hazard instruction. The data processing apparatus is provided with decode circuitry to encode information indicative of a clean restart point into the control signals associated with the potential hazard instruction. The data processing apparatus is provided with data processing circuitry to perform out-of-order execution of at least some of the block of instructions, and control circuitry responsive to a determination, at execution of the potential hazard instruction, that data values used as operands for the potential hazard instruction have been modified by out-of-order execution of a subsequent instruction, to restart execution from the clean restart point and to flush held data values from the data processing circuitry.
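A toy model of the restart mechanism, heavily simplified: hazard detection is reduced to a predicate, and the hazard is assumed to resolve after one flush because re-execution from the clean point proceeds in order. Everything here is illustrative, not the patented pipeline.

```python
def run(block, hazard_detected):
    """block: list of {'name', 'clean_restart'} dicts; the clean restart
    index is encoded into the control signals at decode time.
    hazard_detected(name) -> True if the instruction's operands were
    modified by out-of-order execution of a subsequent instruction."""
    pc, flushed = 0, False
    executed = []
    while pc < len(block):
        insn = block[pc]
        if hazard_detected(insn["name"]) and not flushed:
            pc = insn["clean_restart"]   # resume from the clean restart point
            executed = executed[:pc]     # flush held values from the hazard window
            flushed = True               # in-order re-execution resolves the hazard
            continue
        executed.append(insn["name"])
        pc += 1
    return executed
```

Encoding the restart point at decode time is the interesting part: the expensive analysis happens once, before execution, instead of being reconstructed at flush time.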

    APPARATUS AND METHOD FOR MAINTAINING ADDRESS TRANSLATION DATA WITHIN AN ADDRESS TRANSLATION CACHE

    Publication Number: US20180107604A1

    Publication Date: 2018-04-19

    Application Number: US15293467

    Application Date: 2016-10-14

    Applicant: ARM LIMITED

    Abstract: An apparatus and method are provided for maintaining address translation data within an address translation cache. The address translation cache has a plurality of entries, where each entry is used to store address translation data used when converting a virtual address into a corresponding physical address of a memory system. Control circuitry is used to perform an allocation process to determine the address translation data to be stored in each entry. The address translation cache is used to store address translation data of a plurality of different types representing address translation data specified at respective different levels of address translation within a multiple-level page table walk. The plurality of different types comprises a final level type of address translation data that identifies a full translation from the virtual address to the physical address, and at least one intermediate level type of address translation data that identifies a partial translation of the virtual address. The control circuitry is arranged, when performing the allocation process, to apply an allocation policy that permits each of the entries to be used for any of the different types of address translation data, and to store type identification data in association with each entry to enable the type of the address translation data stored therein to be determined. Such an approach enables very efficient usage of the address translation cache resources, for example by allowing the proportion of the entries used for full address translation data and the proportion of the entries used for partial address translation data to be dynamically adapted to changing workload conditions.
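The type-tagged unified cache can be sketched as below, with the replacement policy and all names assumed for illustration. The essential property is that any entry may hold any type, with the type identification data stored alongside.

```python
class UnifiedTranslationCache:
    def __init__(self, num_entries):
        # Each entry holds (type_tag, virtual_page, data) or None if free.
        self.entries = [None] * num_entries

    def allocate(self, type_tag, virtual_page, data):
        # Allocation policy permits any entry to hold any type; pick the
        # first free entry, else (crudely) evict entry 0.
        victim = self.entries.index(None) if None in self.entries else 0
        self.entries[victim] = (type_tag, virtual_page, data)

    def lookup(self, virtual_page, wanted_type):
        for e in self.entries:
            # The stored type tag lets one structure answer both full
            # (final-level) and partial (intermediate-level) queries.
            if e and e[0] == wanted_type and e[1] == virtual_page:
                return e[2]
        return None
```

Because no capacity is reserved per type, the split between full and partial translations drifts automatically with the workload, which is the efficiency claim the abstract closes on.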

    CONTROL OF BULK MEMORY INSTRUCTIONS
    Invention Publication

    Publication Number: US20240036760A1

    Publication Date: 2024-02-01

    Application Number: US17875758

    Application Date: 2022-07-28

    Applicant: Arm Limited

    Abstract: An apparatus supports decoding and execution of a bulk memory instruction specifying a block size parameter. The apparatus comprises control circuitry to determine whether the block size corresponding to the block size parameter exceeds a predetermined threshold, and performs a micro-architectural control action to influence the handling of at least one bulk memory operation by memory operation processing circuitry. The micro-architectural control action varies depending on whether the block size exceeds the predetermined threshold, and further depending on the states of other components and operations within or coupled with the apparatus. The micro-architectural control action could include an alignment correction action, cache allocation control action, or processing circuitry selection action.
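A sketch of the threshold decision. The threshold value, the action names, and the single extra state input are all assumptions for illustration; the abstract only commits to the action varying with block size and with the state of surrounding components.

```python
THRESHOLD = 4096  # assumed block-size threshold in bytes

def control_action(block_size, cache_busy=False):
    """Pick a micro-architectural control action for one bulk memory op."""
    if block_size <= THRESHOLD:
        # Small block: data is likely to be reused soon, so allocate it.
        return "allocate_in_cache"
    # Large block: avoid polluting the cache; offload entirely when the
    # cache is already under pressure (one example of "state of other
    # components" influencing the choice).
    return "offload_to_dma" if cache_busy else "cache_bypass"
```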

    CIRCUITRY AND METHOD
    Invention Application

    Publication Number: US20220318051A1

    Publication Date: 2022-10-06

    Application Number: US17218425

    Application Date: 2021-03-31

    Applicant: Arm Limited

    Abstract: Circuitry comprises two or more clusters of execution units, each cluster comprising one or more execution units to execute processing instructions; and scheduler circuitry to maintain one or more queues of processing instructions, the scheduler circuitry comprising picker circuitry to select a queued processing instruction for issue to an execution unit of one of the clusters for execution. The scheduler circuitry maintains dependency data associated with each queued processing instruction, the dependency data indicating any source operands which are required to be available for use in execution of that instruction, and inhibits issue of that instruction until all of its required source operands are available; the scheduler circuitry is responsive to an indication of the availability of a given operand as a source operand for use in execution of queued processing instructions. In response to an indication of availability of one or more last awaited source operands for a given queued processing instruction, the scheduler circuitry inhibits issue of that instruction to an execution unit in any cluster of execution units other than a cluster containing an execution unit which generated at least one of those last awaited source operands.
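The steering policy in the last sentence can be sketched as a small selection function. The majority-vote tie-break among producing clusters is an assumption of this sketch; the abstract only requires that issue be confined to a cluster that produced a last awaited operand.

```python
from collections import Counter

def pick_cluster(last_awaited_producers, default_cluster=0):
    """Choose an execution cluster for a now-ready instruction.

    last_awaited_producers: the cluster index that generated each of the
    instruction's last awaited source operands. Issue to any other cluster
    is inhibited, keeping the operand forwarding path within one cluster.
    """
    if not last_awaited_producers:
        return default_cluster  # no affinity constraint applies
    # Among the permitted clusters, prefer the one that produced the most
    # last-awaited operands (illustrative tie-break).
    return Counter(last_awaited_producers).most_common(1)[0][0]
```

The motivation is the cross-cluster forwarding penalty: an operand produced in cluster 1 reaches cluster 1's execution units a cycle or more sooner than it reaches cluster 0's, so a just-woken instruction is cheapest to run next to its producer.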
