-
Publication No.: US20210349823A1
Publication date: 2021-11-11
Application No.: US16870358
Filing date: 2020-05-08
Applicant: Apple Inc.
Inventor: Michael L. Karm , Gideon N. Levinsky
IPC: G06F12/0815
Abstract: In one embodiment, a processor includes a write combining buffer that includes a memory having a plurality of entries. The entries may be allocated to committed store operations transmitted by a load/store unit in the processor, and subsequent committed store operations may merge data with previous store memory operations in the buffer if the subsequent committed store operations are to addresses that match addresses of the previous committed store operations within a predefined granularity (e.g. the width of a cache port). The write combining buffer may be configured to retain up to N entries of committed store operations, but may also be configured to write one or more of the entries to the data cache responsive to receiving more than a threshold amount of non-merging committed store operations in the write combining buffer.
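The merge-and-threshold behavior described in this abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the patented design: the class name, the 16-byte granule, the oldest-first eviction order, and the rule that the non-merging counter resets after each write-back are all hypothetical choices made for illustration. Stores are assumed not to cross a granule boundary.

```python
class WriteCombiningBuffer:
    """Toy model of a write combining buffer (all names and sizes are illustrative)."""

    def __init__(self, max_entries=4, granule=16, nonmerge_threshold=2):
        self.max_entries = max_entries            # "up to N entries"
        self.granule = granule                    # merge granularity, e.g. a cache port width
        self.nonmerge_threshold = nonmerge_threshold
        self.entries = {}                         # granule base -> {byte offset: value}, oldest first
        self.nonmerging = 0                       # non-merging stores since the last write-back
        self.cache_writes = []                    # (base, bytes) pairs written to the data cache

    def _write_oldest_to_cache(self):
        base = next(iter(self.entries))           # dicts preserve insertion order
        self.cache_writes.append((base, self.entries.pop(base)))

    def store(self, addr, data):
        """Accept one committed store; assumes it does not cross a granule boundary."""
        base = addr - addr % self.granule
        if base not in self.entries:
            # Non-merging store: allocate a new entry, making room if the buffer is full.
            if len(self.entries) == self.max_entries:
                self._write_oldest_to_cache()
            self.entries[base] = {}
            self.nonmerging += 1
        # Merge the store bytes into the entry (new or existing).
        for i, byte in enumerate(data):
            self.entries[base][addr - base + i] = byte
        # Assumed policy: too many non-merging stores triggers an early write-back.
        if self.nonmerging > self.nonmerge_threshold:
            self._write_oldest_to_cache()
            self.nonmerging = 0
```

In this model, two stores to 0x100 and 0x104 share one entry, while a run of stores to unrelated granules pushes the oldest entry out to the cache before the buffer is actually full.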
-
Publication No.: US11256622B2
Publication date: 2022-02-22
Application No.: US16870358
Filing date: 2020-05-08
Applicant: Apple Inc.
Inventor: Michael L. Karm , Gideon N. Levinsky
IPC: G06F12/08 , G06F12/0815 , G06F12/0811
Abstract: In one embodiment, a processor includes a write combining buffer that includes a memory having a plurality of entries. The entries may be allocated to committed store operations transmitted by a load/store unit in the processor, and subsequent committed store operations may merge data with previous store memory operations in the buffer if the subsequent committed store operations are to addresses that match addresses of the previous committed store operations within a predefined granularity (e.g. the width of a cache port). The write combining buffer may be configured to retain up to N entries of committed store operations, but may also be configured to write one or more of the entries to the data cache responsive to receiving more than a threshold amount of non-merging committed store operations in the write combining buffer.
-
Publication No.: US11119767B1
Publication date: 2021-09-14
Application No.: US16906396
Filing date: 2020-06-19
Applicant: Apple Inc.
Inventor: Brian R. Mestan , Gideon N. Levinsky , Michael L. Karm
Abstract: In an embodiment, a processor comprises an atomic predictor circuit to predict whether or not an atomic operation will complete successfully. The prediction may be used when a subsequent load operation to the same memory location as the atomic operation is executed, to determine whether or not to forward store data from the atomic operation to the subsequent load operation. If the prediction is successful, the store data may be forwarded. If the prediction is unsuccessful, the store data may not be forwarded. In cases where an atomic operation has been failing (not successfully performing the store operation), the prediction may prevent the forwarding of the store data and thus may prevent a subsequent flush of the load.
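The forwarding decision described in this abstract can be sketched as a small predictor model. The abstract does not say how the predictor is built, so the structure below is an assumption: a table of 2-bit saturating counters indexed by the atomic operation's program counter. The names `AtomicPredictor` and `speculative_load_value` are hypothetical.

```python
class AtomicPredictor:
    """Toy success predictor for atomic operations (structure is an assumption)."""

    def __init__(self, table_size=64, initial=2, max_count=3):
        self.table_size = table_size
        self.max_count = max_count
        self.counters = [initial] * table_size   # one saturating counter per table slot

    def _index(self, pc):
        return pc % self.table_size

    def predict_success(self, pc):
        # Predict success when the counter is in its upper half.
        return self.counters[self._index(pc)] >= (self.max_count + 1) // 2

    def update(self, pc, succeeded):
        # Train on the actual outcome once the atomic operation completes.
        i = self._index(pc)
        if succeeded:
            self.counters[i] = min(self.max_count, self.counters[i] + 1)
        else:
            self.counters[i] = max(0, self.counters[i] - 1)


def speculative_load_value(predictor, atomic_pc, atomic_store_data, memory_value):
    """Value a younger load to the same location observes speculatively.

    Store data is forwarded only when the atomic is predicted to succeed;
    otherwise the load uses the value already in memory, which avoids a
    flush if the atomic then fails.
    """
    if predictor.predict_success(atomic_pc):
        return atomic_store_data
    return memory_value
```

With the default initial counter value, an atomic is first predicted to succeed; a single observed failure is enough to stop forwarding until a later success retrains the counter.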
-
Publication No.: US12229557B2
Publication date: 2025-02-18
Application No.: US18601640
Filing date: 2024-03-11
Applicant: Apple Inc.
Inventor: Brian R. Mestan , Gideon N. Levinsky , Michael L. Karm
Abstract: In an embodiment, a processor comprises an atomic predictor circuit to predict whether or not an atomic operation will complete successfully. The prediction may be used when a subsequent load operation to the same memory location as the atomic operation is executed, to determine whether or not to forward store data from the atomic operation to the subsequent load operation. If the prediction is successful, the store data may be forwarded. If the prediction is unsuccessful, the store data may not be forwarded. In cases where an atomic operation has been failing (not successfully performing the store operation), the prediction may prevent the forwarding of the store data and thus may prevent a subsequent flush of the load.
-
Publication No.: US20240248717A1
Publication date: 2024-07-25
Application No.: US18601640
Filing date: 2024-03-11
Applicant: Apple Inc.
Inventor: Brian R. Mestan , Gideon N. Levinsky , Michael L. Karm
CPC classification number: G06F9/30043 , G06F9/3004 , G06F9/30087 , G06F9/321 , G06F9/3826 , G06F9/3834 , G06F9/3842 , G06F9/528 , G06F2209/521
Abstract: In an embodiment, a processor comprises an atomic predictor circuit to predict whether or not an atomic operation will complete successfully. The prediction may be used when a subsequent load operation to the same memory location as the atomic operation is executed, to determine whether or not to forward store data from the atomic operation to the subsequent load operation. If the prediction is successful, the store data may be forwarded. If the prediction is unsuccessful, the store data may not be forwarded. In cases where an atomic operation has been failing (not successfully performing the store operation), the prediction may prevent the forwarding of the store data and thus may prevent a subsequent flush of the load.
-
Publication No.: US20150293577A1
Publication date: 2015-10-15
Application No.: US14251508
Filing date: 2014-04-11
Applicant: Apple Inc.
Inventor: Ronald P. Hall , Michael L. Karm , Ian D. Kountanis , David J. Williamson
CPC classification number: G06F1/3234 , G06F9/30058 , G06F9/30065 , G06F9/325 , G06F9/381 , G06F9/3844
Abstract: Techniques are disclosed relating to power reduction during execution of instruction loops. Multiple different power saving modes may be used by a processor, such as a first power saving mode after only a few loop iterations (e.g., 2-3) and a second, deeper power saving mode after a greater number of loop iterations. The first power saving mode may include keeping a branch predictor and/or other structures active, but the second power saving mode may include reducing power to the branch predictor and/or other structures. An observation mode and an instruction capture mode may also be used by a processor prior to entering a power saving mode for loop execution. Power saving modes may also be achieved during execution of complex loops having multiple backward branches (e.g., nested loops).
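The two-level power saving scheme in this abstract can be sketched as a mode selector keyed on the loop iteration count. This is a minimal illustration under assumptions: the first threshold of 3 iterations follows the abstract's "e.g., 2-3" example, the second threshold of 32 is a made-up value, and the observation and instruction capture modes are left out for brevity. All names are hypothetical.

```python
from enum import Enum


class Mode(Enum):
    NORMAL = "normal execution"
    FIRST_SAVE = "first power saving mode"
    SECOND_SAVE = "second, deeper power saving mode"


def select_mode(loop_iterations, first_after=3, second_after=32):
    """Pick a power mode from the number of loop iterations seen so far."""
    if loop_iterations >= second_after:
        return Mode.SECOND_SAVE
    if loop_iterations >= first_after:
        return Mode.FIRST_SAVE
    return Mode.NORMAL


def branch_predictor_powered(mode):
    # The branch predictor stays active in the first mode and is
    # powered down only in the deeper second mode.
    return mode is not Mode.SECOND_SAVE
```

Keeping the branch predictor active in the first mode means a short loop that exits early pays little penalty, while a long-running loop eventually moves to the deeper mode.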
-
Publication No.: US20150205725A1
Publication date: 2015-07-23
Application No.: US14160242
Filing date: 2014-01-21
Applicant: APPLE INC.
Inventor: Muawya M. Al-Otoom , Ian D. Kountanis , Ronald P. Hall , Michael L. Karm
CPC classification number: G06F9/3844 , G06F9/3808 , G06F9/381 , G06F9/3867 , G06F12/0862 , Y02D10/13
Abstract: Techniques are disclosed relating to a cache for patterns of instructions. In some embodiments, an apparatus includes an instruction cache and is configured to detect a pattern of execution of instructions by an instruction processing pipeline. The pattern of execution may involve execution of only instructions in a particular group of instructions. The instructions may include multiple backward control transfers and/or a control transfer instruction that is taken in one iteration of the pattern and not taken in another iteration of the pattern. The apparatus may be configured to store the instructions in the instruction cache and fetch and execute the instructions from the instruction cache. The apparatus may include a branch predictor dedicated to predicting the direction of control transfer instructions for the instruction cache. Various embodiments may reduce power consumption associated with instruction processing.
-
Publication No.: US11928467B2
Publication date: 2024-03-12
Application No.: US17473076
Filing date: 2021-09-13
Applicant: Apple Inc.
Inventor: Brian R. Mestan , Gideon N. Levinsky , Michael L. Karm
CPC classification number: G06F9/30043 , G06F9/3004 , G06F9/30087 , G06F9/321 , G06F9/3826 , G06F9/3834 , G06F9/3842 , G06F9/528 , G06F2209/521
Abstract: In an embodiment, a processor comprises an atomic predictor circuit to predict whether or not an atomic operation will complete successfully. The prediction may be used when a subsequent load operation to the same memory location as the atomic operation is executed, to determine whether or not to forward store data from the atomic operation to the subsequent load operation. If the prediction is successful, the store data may be forwarded. If the prediction is unsuccessful, the store data may not be forwarded. In cases where an atomic operation has been failing (not successfully performing the store operation), the prediction may prevent the forwarding of the store data and thus may prevent a subsequent flush of the load.
-
Publication No.: US20220091846A1
Publication date: 2022-03-24
Application No.: US17473076
Filing date: 2021-09-13
Applicant: Apple Inc.
Inventor: Brian R. Mestan , Gideon N. Levinsky , Michael L. Karm
Abstract: In an embodiment, a processor comprises an atomic predictor circuit to predict whether or not an atomic operation will complete successfully. The prediction may be used when a subsequent load operation to the same memory location as the atomic operation is executed, to determine whether or not to forward store data from the atomic operation to the subsequent load operation. If the prediction is successful, the store data may be forwarded. If the prediction is unsuccessful, the store data may not be forwarded. In cases where an atomic operation has been failing (not successfully performing the store operation), the prediction may prevent the forwarding of the store data and thus may prevent a subsequent flush of the load.
-
Publication No.: US09632791B2
Publication date: 2017-04-25
Application No.: US14160242
Filing date: 2014-01-21
Applicant: Apple Inc.
Inventor: Muawya M. Al-Otoom , Ian D. Kountanis , Ronald P. Hall , Michael L. Karm
IPC: G06F12/08 , G06F9/38 , G06F12/0862
CPC classification number: G06F9/3844 , G06F9/3808 , G06F9/381 , G06F9/3867 , G06F12/0862 , Y02D10/13
Abstract: Techniques are disclosed relating to a cache for patterns of instructions. In some embodiments, an apparatus includes an instruction cache and is configured to detect a pattern of execution of instructions by an instruction processing pipeline. The pattern of execution may involve execution of only instructions in a particular group of instructions. The instructions may include multiple backward control transfers and/or a control transfer instruction that is taken in one iteration of the pattern and not taken in another iteration of the pattern. The apparatus may be configured to store the instructions in the instruction cache and fetch and execute the instructions from the instruction cache. The apparatus may include a branch predictor dedicated to predicting the direction of control transfer instructions for the instruction cache. Various embodiments may reduce power consumption associated with instruction processing.
-