APPLICATION PROGRAMMING INTERFACE FOR FINE GRAINED LOW LATENCY DECOMPRESSION WITHIN PROCESSOR CORE

    Publication No.: EP4020230A1

    Publication date: 2022-06-29

    Application No.: EP21197700.4

    Filing date: 2021-09-20

    Applicant: Intel Corporation

    IPC classification: G06F12/0886

    Abstract: Methods and apparatus relating to an Application Programming Interface (API) for fine grained low latency decompression within a processor core are described. In an embodiment, a decompression API receives an input handle to a data object. The data object includes compressed data and metadata. Decompression Engine (DE) circuitry decompresses the compressed data to generate uncompressed data. The DE circuitry decompresses the compressed data in response to invocation of a decompression instruction by the decompression API. The metadata comprises a first operand to indicate a location of the compressed data, a second operand to indicate a size of the compressed data, a third operand to indicate a location at which the data decompressed by the DE circuitry is to be stored, and a fourth operand to indicate a size of the decompressed data. Other embodiments are also disclosed and claimed.
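The four-operand metadata layout in this abstract can be illustrated with a small software model. This is only a sketch under stated assumptions: the names `DecompressMetadata`, `DataObject`, `decompress_instruction`, and `decompress_api` are hypothetical, the "handle" is modeled as a plain object reference, and `zlib` stands in for the DE circuitry, whose actual compression format the abstract does not specify.

```python
import zlib
from dataclasses import dataclass


@dataclass
class DecompressMetadata:
    src_offset: int  # first operand: location of the compressed data
    src_size: int    # second operand: size of the compressed data
    dst_offset: int  # third operand: where the decompressed data is stored
    dst_size: int    # fourth operand: size of the decompressed data


@dataclass
class DataObject:
    memory: bytearray             # holds compressed input and output region
    metadata: DecompressMetadata


def decompress_instruction(obj: DataObject) -> None:
    """Software stand-in for the decompression instruction (hypothetical)."""
    md = obj.metadata
    compressed = bytes(obj.memory[md.src_offset:md.src_offset + md.src_size])
    data = zlib.decompress(compressed)  # zlib is an assumption, not the DE format
    if len(data) != md.dst_size:
        raise ValueError("decompressed size does not match metadata")
    obj.memory[md.dst_offset:md.dst_offset + md.dst_size] = data


def decompress_api(handle: DataObject) -> bytes:
    """Hypothetical API entry point: takes a handle, invokes the instruction."""
    decompress_instruction(handle)
    md = handle.metadata
    return bytes(handle.memory[md.dst_offset:md.dst_offset + md.dst_size])


def make_object(payload: bytes) -> DataObject:
    """Build a demo object: compressed bytes first, output region after them."""
    compressed = zlib.compress(payload)
    memory = bytearray(len(compressed) + len(payload))
    memory[:len(compressed)] = compressed
    md = DecompressMetadata(0, len(compressed), len(compressed), len(payload))
    return DataObject(memory, md)
```

Calling `decompress_api(make_object(b"abc" * 100))` returns the original payload, with the output written into the region named by the third and fourth operands.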

    DEVICE, METHOD, AND SYSTEM TO FACILITATE IMPROVED BANDWIDTH OF A BRANCH PREDICTION UNIT

    Publication No.: EP4202661A1

    Publication date: 2023-06-28

    Application No.: EP22203532.1

    Filing date: 2022-10-25

    Applicant: Intel Corporation

    IPC classification: G06F9/38

    Abstract: Techniques and mechanisms are described for a processor to determine an execution of instructions based on a prediction of a taken branch. In an embodiment, a first prediction unit generates each of multiple branch predictions in one cycle of successive branch prediction cycles. An indication of the branch predictions is provided to an execution pipeline, which prepares to execute an instruction based on the indication. Where a first one of the branch predictions is determined to be of a low confidence type, said first branch prediction is further indicated to a second prediction unit, which performs a second branch prediction based on the same branch instruction for which the first branch prediction was made. In another embodiment, the second prediction unit signals that a state of the execution pipeline is to be cleared, based on a determination that the first and second branch predictions are inconsistent with each other.
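The interaction between the two prediction units can be sketched as follows. Everything concrete here is an assumption made for illustration: the first unit is modeled as a table of 2-bit saturating counters whose two middle values count as "low confidence", the second unit is a simple lookup table, and the second prediction is taken as the final direction after a pipeline clear. The abstract only states that low-confidence predictions are forwarded and that an inconsistency triggers the clear.

```python
from dataclasses import dataclass


@dataclass
class Prediction:
    taken: bool
    low_confidence: bool


class FirstPredictor:
    """Fast predictor: one 2-bit saturating counter per branch (assumed)."""

    def __init__(self, counters=None):
        self.counters = dict(counters or {})

    def predict(self, pc: int) -> Prediction:
        c = self.counters.get(pc, 1)  # default: weakly not-taken
        # Counter values 1 and 2 are the "weak" states, treated as low confidence.
        return Prediction(taken=c >= 2, low_confidence=c in (1, 2))


class SecondPredictor:
    """Slower predictor, modeled as a plain direction table (assumed)."""

    def __init__(self, table=None):
        self.table = dict(table or {})

    def predict(self, pc: int) -> bool:
        return self.table.get(pc, False)


def predict_branch(pc: int, first: FirstPredictor, second: SecondPredictor):
    """Return (final_taken, clear_pipeline) for one branch instruction."""
    p1 = first.predict(pc)
    if not p1.low_confidence:
        return p1.taken, False  # second unit is not consulted
    p2_taken = second.predict(pc)
    if p2_taken != p1.taken:
        # Inconsistent predictions: signal that pipeline state must be cleared.
        return p2_taken, True
    return p1.taken, False
```

With a weak counter at one address and a disagreeing second-unit entry, `predict_branch` reports a pipeline clear; with a strong counter the second unit is never consulted.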

    DEVICE, METHOD AND SYSTEM TO PROVIDE A PREDICTED VALUE WITH A SEQUENCE OF MICRO-OPERATIONS

    Publication No.: EP4202652A1

    Publication date: 2023-06-28

    Application No.: EP22205225.0

    Filing date: 2022-11-03

    Applicant: Intel Corporation

    IPC classification: G06F9/30 G06F9/38

    Abstract: Techniques and mechanisms are described for efficiently making value prediction information available for use in a processor. In an embodiment, execution of an instruction is to include loading data to a first location (e.g., a first register). A decoder of the processor accesses reference information which indicates that the execution is to comprise multiple micro-operations (µops) including a LoadCheck µop and a Move µop. The LoadCheck µop loads a first value to the first location, and checks whether the loaded first value is the same as a previously-determined second value which represents a prediction of what the first value would be. The Move µop moves the second value to the first location. In another embodiment, the Move µop is scheduled for execution out-of-order with respect to the LoadCheck µop, resulting in an early availability of the second value for access in a register file by another µop.
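A minimal sketch of the LoadCheck/Move pairing follows, modeling the register file and memory as dictionaries. The function names, the dependent "add one" consumer, and the replay of that consumer after a misprediction are all assumptions for illustration; the abstract does not describe the recovery path.

```python
def move_uop(regs: dict, dst: str, predicted: int) -> None:
    """Move µop: place the predicted (second) value in the destination."""
    regs[dst] = predicted


def load_check_uop(regs: dict, memory: dict, dst: str, addr: int,
                   predicted: int) -> bool:
    """LoadCheck µop: load the real value and compare it with the prediction."""
    loaded = memory[addr]
    regs[dst] = loaded
    return loaded == predicted


def run_load_with_prediction(memory: dict, addr: int, predicted: int):
    """Execute Move ahead of LoadCheck so a consumer can start early."""
    regs = {}
    move_uop(regs, "r1", predicted)   # scheduled early, out of order
    dependent = regs["r1"] + 1        # consumer reads the predicted value
    ok = load_check_uop(regs, memory, "r1", addr, predicted)
    if not ok:
        # Assumed recovery: replay the consumer with the actual loaded value.
        dependent = regs["r1"] + 1
    return regs["r1"], dependent, ok
```

When the prediction is correct, the consumer's early result stands; when it is wrong, the sketch recomputes the consumer from the loaded value.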

    TECHNOLOGY FOR DYNAMICALLY TUNING PROCESSOR FEATURES

    Publication No.: EP4075280A1

    Publication date: 2022-10-19

    Application No.: EP22174476.6

    Filing date: 2020-05-20

    Applicant: Intel Corporation

    IPC classification: G06F11/34 G06F12/0862

    Abstract: Disclosed is a processor with a first cache, a second cache coupled to the first cache, an arithmetic logic unit (ALU) to perform arithmetic operations, and a circuit coupled to the ALU. After the processor has executed a workload for a first execution window with a microarchitectural feature disabled and for a second execution window with the microarchitectural feature enabled, the circuit is to: determine whether the processor achieved worse performance in the second execution window, relative to the first execution window; and in response to a determination that the processor achieved the worse performance in the second execution window, update a state for an address associated with an instruction towards a bad final state, wherein when the state for the address reaches the bad final state, the processor is to disable the microarchitectural feature for the address associated with the instruction. Other embodiments are described and claimed.
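The per-address state update can be sketched as a small state table. The three-state ladder, the state names, and the use of a "higher is better" performance number (for example, instructions per cycle) are assumptions; the abstract only says that the state moves towards a bad final state and that the feature is disabled once that state is reached.

```python
from enum import IntEnum


class State(IntEnum):
    NEUTRAL = 0
    LIKELY_BAD = 1
    BAD = 2  # assumed "bad final state"


class TuningTable:
    """Tracks one state per instruction address (hypothetical structure)."""

    def __init__(self):
        self.state = {}

    def record_window_pair(self, addr: int, perf_disabled: float,
                           perf_enabled: float) -> None:
        """Compare the feature-disabled and feature-enabled windows."""
        if perf_enabled < perf_disabled:  # worse with the feature enabled
            cur = self.state.get(addr, State.NEUTRAL)
            # Step one state towards BAD, saturating at the final state.
            self.state[addr] = State(min(cur + 1, State.BAD))

    def feature_enabled(self, addr: int) -> bool:
        """The feature stays on until the address reaches the final state."""
        return self.state.get(addr, State.NEUTRAL) != State.BAD
```

In this model two worse-performing window pairs for the same address disable the feature for that address, while addresses that perform better with the feature never leave the neutral state.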

    SPECULATIVE DECOMPRESSION WITHIN PROCESSOR CORE CACHES

    Publication No.: EP4020231A1

    Publication date: 2022-06-29

    Application No.: EP21198841.5

    Filing date: 2021-09-24

    Applicant: Intel Corporation

    IPC classification: G06F12/0886 G06F9/30 G06F9/38

    Abstract: Methods and apparatus relating to speculative decompression within processor core caches are described. In an embodiment, decode circuitry decodes a decompression instruction into a first micro operation and a second micro operation. The first micro operation causes one or more load operations to fetch data into a plurality of cachelines of a cache of a processor core. Decompression Engine (DE) circuitry decompresses the fetched data from the plurality of cachelines of the cache of the processor core in response to the second micro operation. The decompression instruction causes the DE circuitry to perform an out-of-order decompression of the plurality of cachelines. Other embodiments are also disclosed and claimed.
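The split of one decompression instruction into a load micro operation and a decompress micro operation can be sketched in software. This model is deliberately limited: it shows only the two-step decode and the engine reading from cachelines rather than from memory, and it does not attempt to reproduce the out-of-order decompression itself. The 64-byte line size, the tuple encoding of the instruction, the arbitrary load completion order, and the use of `zlib` are all assumptions.

```python
import random
import zlib

LINE = 64  # assumed cacheline size in bytes


def decode(instr):
    """Decode the hypothetical instruction (addr, size) into two µops."""
    addr, size = instr
    return [("load_lines", addr, size), ("decompress", addr, size)]


def load_lines(memory: bytes, cache: dict, addr: int, size: int, rng) -> None:
    """First µop: fetch every cacheline that covers the compressed data."""
    lines = list(range(addr // LINE, (addr + size - 1) // LINE + 1))
    rng.shuffle(lines)  # assumption: loads may complete in any order
    for n in lines:
        cache[n] = bytes(memory[n * LINE:(n + 1) * LINE])


def decompress_from_cache(cache: dict, addr: int, size: int) -> bytes:
    """Second µop: the engine reads the cachelines, not main memory."""
    first, last = addr // LINE, (addr + size - 1) // LINE
    raw = b"".join(cache[n] for n in range(first, last + 1))
    start = addr - first * LINE
    return zlib.decompress(raw[start:start + size])  # zlib is a stand-in


def execute(memory: bytes, instr, seed: int = 0) -> bytes:
    """Run both µops in program order against an initially empty cache."""
    cache, rng, result = {}, random.Random(seed), None
    for op, addr, size in decode(instr):
        if op == "load_lines":
            load_lines(memory, cache, addr, size, rng)
        else:
            result = decompress_from_cache(cache, addr, size)
    return result
```

Given a memory image that contains a zlib stream at some offset, `execute(memory, (offset, length))` returns the decompressed bytes regardless of the order in which the lines were filled.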

    TECHNOLOGY FOR DYNAMICALLY TUNING PROCESSOR FEATURES

    Publication No.: EP3796177A1

    Publication date: 2021-03-24

    Application No.: EP20175698.8

    Filing date: 2020-05-20

    Applicant: Intel Corporation

    IPC classification: G06F11/34 G06F12/0862

    Abstract: A processor comprises a microarchitectural feature and dynamic tuning unit (DTU) circuitry. The processor executes a program for first and second execution windows with the microarchitectural feature disabled and enabled, respectively. The DTU circuitry automatically determines whether the processor achieved worse performance in the second execution window. In response to determining that the processor achieved worse performance in the second execution window, the DTU circuitry updates a usefulness state for a selected address of the program to denote worse performance. In response to multiple consecutive determinations that the processor achieved worse performance with the microarchitectural feature enabled, the DTU circuitry automatically updates the usefulness state to denote a confirmed bad state. In response to the usefulness state denoting the confirmed bad state, the DTU circuitry automatically disables the microarchitectural feature for the selected address for execution windows after the second execution window. Other embodiments are described and claimed.