-
Publication No.: US20210026632A1
Publication date: 2021-01-28
Application No.: US16521663
Filing date: 2019-07-25
Applicant: Arm Limited
Inventor: Miles Robert DOOLEY , Balaji VIJAYAN , Huzefa Moiz SANJELIWALA , . ABHISHEK RAJA , Sharmila SHRIDHAR
IPC: G06F9/30
Abstract: An apparatus is described, comprising load issuing circuitry configured to issue load operations to load data from memory, and memory ordering tracking storage circuitry configured to store memory ordering tracking information on issued load operations. The apparatus also includes control circuitry configured to access the memory ordering tracking storage circuitry to determine, using the memory ordering tracking information, whether at least one load operation has been issued in disagreement with a memory ordering requirement, and, if so, to determine whether to re-issue one or more issued load operations or to continue issuing load operations despite disagreement with the memory ordering requirement. Furthermore, the control circuitry is capable of merging the memory ordering tracking information for a plurality of issued load operations into a merged entry in the memory ordering tracking storage circuitry.
-
Publication No.: US20180253387A1
Publication date: 2018-09-06
Application No.: US15446235
Filing date: 2017-03-01
Applicant: ARM Limited
Inventor: Huzefa Moiz SANJELIWALA , Klas Magnus BRUCE , Leigang KOU , Michael FILIPPO , Miles Robert DOOLEY , Matthew Andrew RAFACZ
IPC: G06F12/12 , G06F12/0897
CPC classification number: G06F12/0897 , G06F12/0862 , G06F2212/1028 , G06F2212/1041 , G06F2212/60
Abstract: A data processing apparatus is provided that includes a plurality of storage elements. Receiving circuitry receives a plurality of incoming data beats from cache circuitry and stores the incoming data beats in the storage elements. At least one existing data beat in the storage elements is replaced by an equal number of the incoming data beats belonging to a different cache line of the cache circuitry. The existing data beats stored in said plurality of storage elements form an incomplete cache line.
-
Publication No.: US20160328320A1
Publication date: 2016-11-10
Application No.: US14702972
Filing date: 2015-05-04
Applicant: ARM LIMITED
Inventor: Miles Robert DOOLEY , Todd RAFACZ , Guy LARRI
IPC: G06F12/08
CPC classification number: G06F12/0895 , G06F12/0864 , G06F12/1027 , Y02B70/30 , Y02D10/13
Abstract: A cache is provided comprising a plurality of ways, each way of the plurality of ways comprising a data array, wherein a data item stored by the cache is stored in the data array of one of the plurality of ways. A way tracker of the cache has a plurality of entries, each entry of the plurality of entries for storing a data item identifier and for storing, in association with the data item identifier, an indication of a selected way of the plurality of ways to indicate that a data item identified by the data item identifier is stored in the selected way. Each entry of the way tracker is further for storing a miss indicator in association with the data item identifier, wherein the miss indicator is set by the cache when a lookup for a data item identified by that data item identifier has resulted in a cache miss. A corresponding method of caching data is also provided.
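As a rough illustration of the way tracker described in this abstract, the following Python sketch models a set of ways plus a tracker that remembers either the way holding an item or the fact that the last lookup missed. The class name `TrackedCache`, the `way_probes` counter, and the unbounded tracker dictionary are assumptions made for the example; they are not taken from the patent, and a real tracker would have a fixed number of entries.

```python
class TrackedCache:
    """Toy model of a multi-way cache with a way tracker (hypothetical names)."""

    def __init__(self, num_ways=4):
        # Each way is modelled as a dict: data item identifier -> data.
        self.ways = [dict() for _ in range(num_ways)]
        # Tracker entry: identifier -> (way index or None, miss indicator).
        # Assumption: unbounded here; real hardware would have limited entries.
        self.tracker = {}
        # Counts how many way data arrays were probed, as a crude cost proxy.
        self.way_probes = 0

    def fill(self, ident, data, way):
        """Store an item in a way and record the way (clearing any miss indicator)."""
        self.ways[way][ident] = data
        self.tracker[ident] = (way, False)

    def lookup(self, ident):
        """Return the data for ident, or None on a miss."""
        entry = self.tracker.get(ident)
        if entry is not None:
            way, missed = entry
            if missed:
                return None          # known miss: no way is probed at all
            self.way_probes += 1     # known way: probe only that way
            return self.ways[way][ident]
        # No tracker entry: probe every way, then remember the outcome.
        self.way_probes += len(self.ways)
        for index, way in enumerate(self.ways):
            if ident in way:
                self.tracker[ident] = (index, False)
                return way[ident]
        self.tracker[ident] = (None, True)   # set the miss indicator
        return None


cache = TrackedCache(num_ways=4)
cache.ways[2]["A"] = "data"      # place an item without informing the tracker
first = cache.lookup("A")        # probes all four ways, then tracks way 2
second = cache.lookup("A")       # probes only way 2
```

In this sketch a repeated lookup for an item known to be absent costs no way probes, which is the saving the miss indicator is meant to capture.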
-
Publication No.: US20220129186A1
Publication date: 2022-04-28
Application No.: US17078304
Filing date: 2020-10-23
Applicant: Arm Limited
Inventor: Ho-Seop KIM , Joseph Michael PUSDESRIS , Miles Robert DOOLEY
IPC: G06F3/06
Abstract: A request node is provided, that includes request circuitry for issuing outgoing memory access requests to a remote node. Status receiving circuitry receives statuses regarding remote memory access requests at the remote node and control circuitry controls at least one of a rate or an aggression at which the outgoing memory access requests are issued to the remote node in dependence on at least some of the statuses. The control circuitry is inhibited from controlling the rate or the aggression until multiple statuses are received.
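The idea of holding back rate adjustments until several statuses have arrived can be sketched as below. This is only one possible reading of the abstract: the class name `RequestThrottle`, the majority-vote rule, the halving and incrementing steps, and the default thresholds are all assumptions chosen for illustration.

```python
class RequestThrottle:
    """Toy rate controller that ignores statuses until enough have been received."""

    def __init__(self, min_statuses=3, rate=8, max_rate=8):
        self.min_statuses = min_statuses  # statuses needed before any adjustment
        self.rate = rate                  # current outgoing request rate
        self.max_rate = max_rate
        self.pending = []                 # statuses received since the last adjustment

    def on_status(self, remote_busy):
        """Record one status from the remote node and return the current rate."""
        self.pending.append(bool(remote_busy))
        if len(self.pending) < self.min_statuses:
            return self.rate              # control is inhibited: too few statuses
        busy = sum(self.pending)
        if busy * 2 > len(self.pending):
            # Assumed policy: halve the rate when most statuses report congestion.
            self.rate = max(1, self.rate // 2)
        else:
            # Assumed policy: recover slowly otherwise.
            self.rate = min(self.max_rate, self.rate + 1)
        self.pending.clear()
        return self.rate


throttle = RequestThrottle()
rates = [throttle.on_status(True) for _ in range(3)]
```

With the defaults above, the first two congestion reports leave the rate untouched and only the third triggers a change, so a single outlier status cannot swing the request rate.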
-
Publication No.: US20200097411A1
Publication date: 2020-03-26
Application No.: US16140625
Filing date: 2018-09-25
Applicant: Arm Limited
Inventor: Joseph Michael PUSDESRIS , Miles Robert DOOLEY , Alexander Cole SHULYAK , Krishnendra NATHELLA , Dam SUNWOO
IPC: G06F12/0862 , G06F5/06 , G06F9/30
Abstract: Apparatuses and methods for prefetch generation are disclosed. Prefetching circuitry receives addresses specified by load instructions and can cause retrieval of a data value from an address before that address is received. Stride determination circuitry determines stride values as a difference between a current address and a previously received address. Plural stride values corresponding to a sequence of received addresses are determined. Multiple stride storage circuitry stores the plurality of stride values determined by the stride determination circuitry. New address comparison circuitry determines whether a current address corresponds to a matching stride value based on the plurality of stride values stored in the multiple stride storage circuitry. Prefetch initiation circuitry can cause a data value to be retrieved from a further address, wherein the further address is the current address modified by the matching stride value of the plurality of stride values. By the use of multiple stride values, more complex load address patterns can be prefetched.
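A minimal software sketch of multi-stride prefetching, under one simplified reading of this abstract, is given below. The class name `MultiStridePrefetcher`, the FIFO stride storage, and the rule that a stride "matches" when the difference between the current and previous address equals a stored stride are assumptions for illustration, not details confirmed by the patent.

```python
from collections import deque


class MultiStridePrefetcher:
    """Toy prefetcher that remembers several recent strides (hypothetical model)."""

    def __init__(self, capacity=4):
        # Multiple stride storage: oldest stride is dropped when full (assumption).
        self.strides = deque(maxlen=capacity)
        self.last_addr = None

    def access(self, addr):
        """Observe a load address; return an address to prefetch, or None."""
        prefetch = None
        if self.last_addr is not None:
            stride = addr - self.last_addr       # stride determination
            if stride != 0:
                if stride in self.strides:       # matches a stored stride
                    prefetch = addr + stride     # current address modified by it
                else:
                    self.strides.append(stride)  # learn a new stride
        self.last_addr = addr
        return prefetch


pf = MultiStridePrefetcher()
trace = [0, 8, 16, 116, 216, 224]
predictions = [pf.access(a) for a in trace]
# predictions == [None, None, 24, None, 316, 232]
```

In the trace, the stride of 8 is still stored after the pattern switches to a stride of 100, so when the 8-byte stride returns at address 224 the sketch predicts 232 immediately instead of relearning it.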
-
Publication No.: US20240126458A1
Publication date: 2024-04-18
Application No.: US17966071
Filing date: 2022-10-14
Applicant: Arm Limited
Inventor: Stefano GHIGGINI , Natalya Bondarenko , Luca NASSI , Geoffray Matthieu LACOURBA , Huzefa Moiz SANJELIWALA , Miles Robert DOOLEY , . ABHISHEK RAJA
IPC: G06F3/06
CPC classification number: G06F3/0634 , G06F3/0604 , G06F3/0659 , G06F3/0673
Abstract: An apparatus is provided for controlling the operating mode of control circuitry, such that the control circuitry may change between two operating modes. In an allocation mode, data that is loaded in response to an instruction is allocated into storage circuitry from an intermediate buffer, and the data is read from the storage circuitry. In a non-allocation mode, the data is not allocated to the storage circuitry, and is read directly from the intermediate buffer. The control of the operating mode may be performed by mode control circuitry, and the mode may be changed in dependence on the type of instruction that calls the data, and whether the data may be used again in the near future, or whether it is expected to be used only once.
-
Publication No.: US20200073576A1
Publication date: 2020-03-05
Application No.: US16118610
Filing date: 2018-08-31
Applicant: Arm Limited
Inventor: Adrian MONTERO , Miles Robert DOOLEY , Joseph Michael PUSDESRIS , Klas Magnus BRUCE , Chris ABERNATHY
IPC: G06F3/06 , G11B19/04 , G06F9/50 , G06F12/0862
Abstract: Storage circuitry is provided, that is designed to form part of a memory hierarchy. The storage circuitry comprises receiver circuitry for receiving a request to obtain data from the memory hierarchy. Transfer circuitry causes the data to be stored at a selected destination in response to the request, wherein the selected destination is selected in dependence on at least one selection condition. Tracker circuitry tracks the request while the request is unresolved. If at least one selection condition is met then the destination is the storage circuitry and otherwise the destination is other storage circuitry in the memory hierarchy.
-
Publication No.: US20190065400A1
Publication date: 2019-02-28
Application No.: US15685186
Filing date: 2017-08-24
Applicant: ARM LIMITED
Inventor: Rakesh SHAJI LAL , Miles Robert DOOLEY
IPC: G06F12/1045 , G06F12/02
Abstract: An apparatus and method are provided for efficient utilisation of an address translation cache. The apparatus has an address translation cache with a plurality of entries, where each entry stores address translation data used when converting a virtual address into a corresponding physical address of a memory system. Each entry identifies whether the address translation data stored therein is coalesced or non-coalesced address translation data, and also identifies a page size for a page within the memory system that is associated with that address translation data. Control circuitry is responsive to a virtual address, to perform a lookup operation within the address translation cache to produce, for each page size supported by the address translation cache, a hit indication to indicate whether a hit has been detected for an entry storing address translation data of the associated page size. The control circuitry is further arranged to determine, from at least each hit indication for a page size that is able to be associated with coalesced address translation data, a coalesced multi-hit indication which is set when a hit is detected for both an entry containing coalesced address translation data and for an entry containing non-coalesced address translation data. The control circuitry is then arranged, when the lookup operation has completed, to determine whether multiple hits have been detected, and in that instance to reference the coalesced multi-hit indication to determine whether multiple hits have resulted from both coalesced address translation data and non-coalesced address translation data in the address translation cache. This provides an efficient and precise mechanism for distinguishing between multiple hits caused by hardware coalescing and multiple hits caused by software induced issues.
-
Publication No.: US20180107606A1
Publication date: 2018-04-19
Application No.: US15294031
Filing date: 2016-10-14
Applicant: ARM LIMITED
Inventor: Barry Duane WILLIAMSON , Michael FILIPPO , . ABHISHEK RAJA , Adrian MONTERO , Miles Robert DOOLEY
IPC: G06F12/1045 , G06F12/128 , G06F12/1009
CPC classification number: G06F12/1063 , G06F12/1009 , G06F12/128 , G06F2212/621 , G06F2212/68 , G06F2212/69
Abstract: A data processing system 2 includes an address translation cache 12 to store a plurality of address translation entries. Eviction control circuitry 10 selects a victim entry for eviction from address translation cache 12 using an eviction control parameter. The address translation cache 12 can store multiple different types of entry corresponding to respective different levels of address translation within a multiple-level page table walk. The different types of entry have different eviction control parameters assigned at the time of allocation. Eviction from the address translation cache is dependent upon the entry type, as well as the subsequent accesses to the entry concerned and the other entries within the address translation cache.
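The type-dependent eviction described in this abstract can be approximated with an aging scheme in which each entry receives an initial eviction value based on its type. The sketch below is an assumption-laden illustration: the class name `TranslationCache`, the concrete values in `INITIAL_VALUE` and `MAX_VALUE`, the choice to favour final-level entries, and the aging rule are all invented for the example and are not stated in the patent.

```python
# Assumed initial eviction values per entry type; lower means "keep longer".
INITIAL_VALUE = {"final": 1, "intermediate": 2}
MAX_VALUE = 3  # an entry whose value reaches this is eligible for eviction


class TranslationCache:
    """Toy address translation cache with type-dependent eviction (hypothetical)."""

    def __init__(self, capacity):
        self.capacity = capacity          # assumed to be at least 1
        self.entries = {}                 # key -> [entry type, eviction value]

    def access(self, key):
        """Return True on a hit; a hit makes the entry least likely to be evicted."""
        entry = self.entries.get(key)
        if entry is None:
            return False
        entry[1] = 0
        return True

    def allocate(self, key, entry_type):
        """Insert an entry, evicting a victim first if the cache is full."""
        if key not in self.entries and len(self.entries) >= self.capacity:
            del self.entries[self._select_victim()]
        # The eviction control parameter is assigned at allocation, by type.
        self.entries[key] = [entry_type, INITIAL_VALUE[entry_type]]

    def _select_victim(self):
        # Age every entry until one reaches MAX_VALUE, then evict that one.
        while True:
            for key, (_, value) in self.entries.items():
                if value >= MAX_VALUE:
                    return key
            for entry in self.entries.values():
                entry[1] += 1


tlb = TranslationCache(capacity=2)
tlb.allocate("a", "final")
tlb.allocate("b", "intermediate")
tlb.allocate("c", "final")        # evicts "b", which started closer to MAX_VALUE
```

Because victim selection ages every resident entry, an entry's fate depends both on its own type and hits and on what happens to the other entries, which mirrors the dependence the abstract describes.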
-
Publication No.: US20180107604A1
Publication date: 2018-04-19
Application No.: US15293467
Filing date: 2016-10-14
Applicant: ARM LIMITED
IPC: G06F12/1009 , G06F12/0802
CPC classification number: G06F12/1009 , G06F12/0802 , G06F12/1027 , G06F2212/60 , G06F2212/651 , G06F2212/681 , G06F2212/682
Abstract: An apparatus and method are provided for maintaining address translation data within an address translation cache. The address translation cache has a plurality of entries, where each entry is used to store address translation data used when converting a virtual address into a corresponding physical address of a memory system. Control circuitry is used to perform an allocation process to determine the address translation data to be stored in each entry. The address translation cache is used to store address translation data of a plurality of different types representing address translation data specified at respective different levels of address translation within a multiple-level page table walk. The plurality of different types comprises a final level type of address translation data that identifies a full translation from the virtual address to the physical address, and at least one intermediate level type of address translation data that identifies a partial translation of the virtual address. The control circuitry is arranged, when performing the allocation process, to apply an allocation policy that permits each of the entries to be used for any of the different types of address translation data, and to store type identification data in association with each entry to enable the type of the address translation data stored therein to be determined. Such an approach enables very efficient usage of the address translation cache resources, for example by allowing the proportion of the entries used for full address translation data and the proportion of the entries used for partial address translation data to be dynamically adapted to changing workload conditions.