Array of Pointers Prefetching
    2.
    Patent application publication

    Publication No.: US20230305849A1

    Publication date: 2023-09-28

    Application No.: US17704627

    Filing date: 2022-03-25

    CPC classification number: G06F9/3802 G06F9/30043

    Abstract: Array of pointers prefetching is described. In accordance with described techniques, a pointer target instruction is detected by identifying that a destination location of a load instruction is used in an address compute for a memory operation and the load instruction is included in a sequence of load instructions having addresses separated by a step size. An instruction for fetching data of a future load instruction is injected in an instruction stream of a processor. The data of the future load instruction is stored in a temporary register. An additional instruction is injected in the instruction stream for prefetching a pointer target based on an address of the memory operation and the data of the future load instruction.
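
The abstract's flow can be illustrated with a small software model. This is only a sketch under stated assumptions, not the patented hardware: memory is modeled as a Python dict, and the names `detect_stride`, `pointer_target_prefetch`, `target_offset`, and `distance` are hypothetical.

```python
from typing import Dict, List, Optional


def detect_stride(load_addresses: List[int]) -> Optional[int]:
    """Return the constant step between consecutive load addresses, or None."""
    if len(load_addresses) < 3:
        return None
    steps = {b - a for a, b in zip(load_addresses, load_addresses[1:])}
    if len(steps) != 1:
        return None
    step = steps.pop()
    return step if step != 0 else None


def pointer_target_prefetch(
    memory: Dict[int, int],
    load_addresses: List[int],
    target_offset: int,
    distance: int,
) -> Optional[int]:
    """Compute the address a pointer-target prefetch would touch.

    Assumptions (not from the source): the dependent memory operation
    addresses `pointer + target_offset`, and the prefetcher runs
    `distance` strides ahead of the latest load.
    """
    step = detect_stride(load_addresses)
    if step is None:
        return None
    # First injected instruction: fetch the data of a future load.
    future_addr = load_addresses[-1] + distance * step
    temp_register = memory.get(future_addr)
    if temp_register is None:
        return None
    # Second injected instruction: prefetch the pointer target.
    return temp_register + target_offset
```

With an array of pointers at 0x100, 0x108, 0x110, ... and a distance of two strides, the model reads the pointer stored at 0x120 and returns that pointer plus the offset as the prefetch address.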

    Load Dependent Branch Prediction
    3.
    Patent application publication

    Publication No.: US20230297381A1

    Publication date: 2023-09-21

    Application No.: US17699855

    Filing date: 2022-03-21

    CPC classification number: G06F9/3806 G06F9/30043

    Abstract: Load dependent branch prediction is described. In accordance with described techniques, a load dependent branch instruction is detected by identifying that a destination location of a load instruction is used in an operation for determining whether a conditional branch is taken or not taken. The load instruction is included in a sequence of load instructions having addresses separated by a step size. An instruction is injected in an instruction stream of a processor for fetching data of a future load instruction using an address of the load instruction offset by a distance based on the step size. An additional instruction is injected in the instruction stream of the processor for precomputing an outcome of a load dependent branch using an address computed based on an address of the operation and the data of the future load instruction.
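
A minimal software model of the precomputation step, assuming a dict-based memory and a branch condition applied to a value loaded through the future data. The function name, `operand_base`, and `condition` are hypothetical and only illustrate the idea.

```python
from typing import Callable, Dict, Optional


def precompute_branch_outcome(
    memory: Dict[int, int],
    load_addr: int,
    step: int,
    distance: int,
    operand_base: int,
    condition: Callable[[int], bool],
) -> Optional[bool]:
    """Precompute whether a future load dependent branch is taken.

    Assumption (not from the source): the branch operand lives at
    `operand_base + future_data`, where `future_data` is what the
    strided load will return `distance` iterations from now.
    """
    # First injected instruction: fetch the data of the future load.
    future_addr = load_addr + distance * step
    future_data = memory.get(future_addr)
    if future_data is None:
        return None
    # Second injected instruction: evaluate the branch condition early.
    operand = memory.get(operand_base + future_data)
    if operand is None:
        return None
    return condition(operand)
```

The returned value stands in for the precomputed outcome that a branch predictor could consume when the real branch arrives.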

    Multi-class multi-label classification using clustered singular decision trees for hardware adaptation

    Publication No.: US11455252B2

    Publication date: 2022-09-27

    Application No.: US16454027

    Filing date: 2019-06-26

    Abstract: Techniques for generating a model for predicting when different hybrid prefetcher configurations should be used are disclosed. Techniques for using the model to predict when different hybrid prefetcher configurations should be used are also disclosed. The techniques for generating the model include obtaining a set of input data, and generating trees based on the training data. Each tree is associated with a different hybrid prefetcher configuration and the trees output certainty scores for the associated hybrid prefetcher configuration based on hardware feature measurements. To decide on a hybrid prefetcher configuration to use, a prefetcher traverses multiple trees to obtain certainty scores for different hybrid prefetcher configurations and identifies a hybrid prefetcher configuration to use based on a comparison of the certainty scores.
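
The selection step (one tree per configuration, compare certainty scores) might look like the following sketch. The tree layout, the feature name `l2_miss_rate`, and the configuration names are invented for illustration; training of the trees is not shown.

```python
from dataclasses import dataclass
from typing import Dict, Union


@dataclass
class Leaf:
    certainty: float  # certainty score for the tree's configuration


@dataclass
class Node:
    feature: str      # name of a hardware feature measurement
    threshold: float
    low: Union["Node", Leaf]   # taken when measurement <= threshold
    high: Union["Node", Leaf]  # taken when measurement > threshold


def score(tree: Union[Node, Leaf], features: Dict[str, float]) -> float:
    """Traverse one tree and return its certainty score."""
    node = tree
    while isinstance(node, Node):
        node = node.low if features[node.feature] <= node.threshold else node.high
    return node.certainty


def choose_configuration(
    trees: Dict[str, Union[Node, Leaf]], features: Dict[str, float]
) -> str:
    """Pick the configuration whose tree reports the highest certainty."""
    scores = {name: score(tree, features) for name, tree in trees.items()}
    return max(scores, key=scores.get)
```

Taking the maximum is one possible comparison; the abstract only says the choice is based on comparing the scores.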

    SCHEDULER QUEUE ASSIGNMENT BURST MODE

    Publication No.: US20210173702A1

    Publication date: 2021-06-10

    Application No.: US16709527

    Filing date: 2019-12-10

    Abstract: Systems, apparatuses, and methods for implementing scheduler queue assignment burst mode are disclosed. A scheduler queue assignment unit receives a dispatch packet with a plurality of operations from a decode unit in each clock cycle. The scheduler queue assignment unit determines if the number of operations in the dispatch packet for any class of operations is greater than a corresponding threshold for dispatching to the scheduler queues in a single cycle. If the number of operations for a given class is greater than the corresponding threshold, and if a burst mode counter is less than a burst mode window threshold, the scheduler queue assignment unit dispatches the extra number of operations for the given class in a single cycle. By operating in burst mode for a given operation class during a small number of cycles, processor throughput can be increased without starving the processor of other operation classes.
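
The burst-mode decision for a single operation class can be modeled as below. This is a sketch under assumptions: the abstract does not say how the burst mode counter is reset, so the reset on a non-burst cycle is a guess, and the class name `BurstModeAssigner` is hypothetical.

```python
class BurstModeAssigner:
    """Toy model of the burst-mode decision for one operation class."""

    def __init__(self, class_threshold: int, burst_window: int) -> None:
        self.class_threshold = class_threshold  # normal per-cycle limit
        self.burst_window = burst_window        # burst mode window threshold
        self.burst_counter = 0

    def dispatch(self, op_count: int) -> int:
        """Return how many operations of this class dispatch this cycle."""
        if op_count <= self.class_threshold:
            # Assumption: a cycle that needs no burst resets the counter.
            self.burst_counter = 0
            return op_count
        if self.burst_counter < self.burst_window:
            # Burst mode: dispatch the extra operations in a single cycle.
            self.burst_counter += 1
            return op_count
        # Window exhausted: fall back to the normal limit.
        return self.class_threshold
```

Bounding the number of consecutive burst cycles is what keeps one operation class from starving the others in this model.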

    Dynamic evaluation and reconfiguration of a data prefetcher
    6.
    Granted patent
    Status: In force

    Publication No.: US09058277B2

    Publication date: 2015-06-16

    Application No.: US13671801

    Filing date: 2012-11-08

    Abstract: Methods and systems for prefetching data for a processor are provided. A system is configured for, and a method includes, selecting one of a first prefetching control logic and a second prefetching control logic of the processor as a candidate feature, capturing a performance metric of the processor over an inactive sample period when the candidate feature is inactive, capturing the performance metric of the processor over an active sample period when the candidate feature is active, comparing the performance metric of the processor for the active and inactive sample periods, and setting a status of the candidate feature as enabled when the performance metric in the active period indicates improvement over the performance metric in the inactive period, and as disabled when the performance metric in the inactive period indicates improvement over the performance metric in the active period.
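
The sample-and-compare loop can be sketched as follows, assuming a metric where higher is better (for example, instructions per cycle). The callbacks `measure` and `set_active` are hypothetical stand-ins for hardware counters and control bits.

```python
from typing import Callable


def evaluate_candidate(
    measure: Callable[[], float],
    set_active: Callable[[bool], None],
    currently_enabled: bool,
) -> bool:
    """Sample the metric with the candidate feature off and on, then decide.

    Assumptions (not from the source): a higher metric is better, and a
    tie leaves the current status unchanged.
    """
    set_active(False)
    inactive_metric = measure()  # inactive sample period
    set_active(True)
    active_metric = measure()    # active sample period

    if active_metric > inactive_metric:
        enabled = True
    elif inactive_metric > active_metric:
        enabled = False
    else:
        enabled = currently_enabled

    set_active(enabled)
    return enabled
```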


    Data Reuse Cache
    7.
    Patent application publication
    Status: Under examination (published)

    Publication No.: US20240111674A1

    Publication date: 2024-04-04

    Application No.: US17955618

    Filing date: 2022-09-29

    CPC classification number: G06F12/0811 G06F12/0875 G06F12/0884

    Abstract: Data reuse cache techniques are described. In one example, a load instruction is generated by an execution unit of a processor unit. In response to the load instruction, data is loaded by a load-store unit for processing by the execution unit and is also stored to a data reuse cache communicatively coupled between the load-store unit and the execution unit. Upon receipt of a subsequent load instruction for the data from the execution unit, the data is loaded from the data reuse cache for processing by the execution unit.
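
A toy model of the load path described above, assuming a dict as backing memory. Capacity limits, eviction, and invalidation on stores are not modeled, and the class and attribute names are hypothetical.

```python
from typing import Dict


class DataReuseCache:
    """Small cache sitting between the load-store unit and the execution unit."""

    def __init__(self, backing_memory: Dict[int, int]) -> None:
        self.backing_memory = backing_memory  # stands in for the load-store unit
        self.entries: Dict[int, int] = {}
        self.lsu_loads = 0  # number of loads that reached the load-store unit

    def load(self, address: int) -> int:
        """Serve a load, reusing previously loaded data when available."""
        if address in self.entries:
            return self.entries[address]
        # First load of this address: go through the load-store unit and
        # also keep a copy in the reuse cache.
        self.lsu_loads += 1
        value = self.backing_memory[address]
        self.entries[address] = value
        return value
```

In this model, a repeated load of the same address is served from `entries` without touching the load-store unit again.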

    SCHEDULER QUEUE ASSIGNMENT
    8.
    Patent application

    Publication No.: US20220206798A1

    Publication date: 2022-06-30

    Application No.: US17698955

    Filing date: 2022-03-18

    Abstract: Systems, apparatuses, and methods for implementing scheduler queue assignment logic are disclosed. A processor includes at least a decode unit, scheduler queue assignment logic, scheduler queues, pickers, and execution units. The assignment logic receives a plurality of operations from a decode unit in each clock cycle. The assignment logic includes a separate logical unit for each different type of operation which is executable by the different execution units of the processor. For each different type of operation, the assignment logic determines which of the possible assignment permutations are valid for assigning different numbers of operations to scheduler queues in a given clock cycle. The assignment logic receives an indication of how many operations to assign in the given clock cycle, and then the assignment logic selects one of the valid assignment permutations for the number of operations specified by the indication.
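
For one operation type, the two-phase idea (precompute valid assignment permutations for each possible operation count, then select once the count is known) can be sketched as below. The model assumes each scheduler queue accepts at most one operation per cycle and that the first valid permutation is chosen; both are illustrative assumptions, as are the function names.

```python
from itertools import permutations
from typing import Dict, List, Optional, Tuple


def valid_assignments(queue_free_slots: List[int], num_ops: int) -> List[Tuple[int, ...]]:
    """List queue orderings that can accept `num_ops` operations this cycle.

    Position i of each tuple is the queue that receives operation i.
    """
    queues = range(len(queue_free_slots))
    return [
        perm
        for perm in permutations(queues, num_ops)
        if all(queue_free_slots[q] > 0 for q in perm)
    ]


def assign(
    queue_free_slots: List[int], max_ops: int, ops_to_assign: int
) -> Optional[Tuple[int, ...]]:
    """Precompute valid permutations per count, then pick one for the given count."""
    table: Dict[int, List[Tuple[int, ...]]] = {
        n: valid_assignments(queue_free_slots, n) for n in range(max_ops + 1)
    }
    candidates = table[ops_to_assign]
    return candidates[0] if candidates else None
```

Building the table before the count arrives mirrors the abstract's ordering, where validity is determined for different numbers of operations and the selection happens afterwards.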

    Scheduler queue assignment
    9.
    Granted patent

    Publication No.: US11294678B2

    Publication date: 2022-04-05

    Application No.: US15991088

    Filing date: 2018-05-29

    Abstract: Systems, apparatuses, and methods for implementing scheduler queue assignment logic are disclosed. A processor includes at least a decode unit, scheduler queue assignment logic, scheduler queues, pickers, and execution units. The assignment logic receives a plurality of operations from a decode unit in each clock cycle. The assignment logic includes a separate logical unit for each different type of operation which is executable by the different execution units of the processor. For each different type of operation, the assignment logic determines which of the possible assignment permutations are valid for assigning different numbers of operations to scheduler queues in a given clock cycle. The assignment logic receives an indication of how many operations to assign in the given clock cycle, and then the assignment logic selects one of the valid assignment permutations for the number of operations specified by the indication.
