Supporting binary translation alias detection in an out-of-order processor

    公开(公告)号:US10228956B2

    公开(公告)日:2019-03-12

    申请号:US15282266

    申请日:2016-09-30

    Abstract: In one implementation, a processing device is provided that includes a memory to store instructions and a processor core to execute the instructions. The processor core is to receive a sequence of instructions reordered by a binary translator for execution. A first load of the sequence of instructions is identified. The first load references a memory location that stores a data item to be loaded. An occurrence of a second load is detected. The second load to access the memory location subsequent to an execution of the first load instruction. A protection field in the first load is enabled based on the detected occurrence of the second load. The enabled protection field indicates that the first load is to be checked for an aliasing associated with the memory location with respect to a subsequent store instruction. The second load is eliminated based on the enabled of the protection field.

    Instruction and Logic for Dynamic Store Elimination

    公开(公告)号:US20180074827A1

    公开(公告)日:2018-03-15

    申请号:US15265587

    申请日:2016-09-14

    Abstract: A processor for redundant stores includes a front end including circuitry to decode instructions from an instruction stream, a data cache unit including circuitry to cache data for the processor, a binary translator, and a memory execution unit. The binary translator includes circuitry to identify a first region of the instruction stream including a redundant store, mark a first starting instruction of the first region with a protection designator, mark a first ending instruction of the first region with a clear designator, and store an amended instruction stream with the markings. The memory execution unit includes circuitry to track the first redundant store based on the protection designator and the clear designator to eliminate the first redundant store.

    Auxiliary Cache for Reducing Instruction Fetch and Decode Bandwidth Requirements

    公开(公告)号:US20170286110A1

    公开(公告)日:2017-10-05

    申请号:US15087786

    申请日:2016-03-31

    Abstract: A hardware-software co-designed processor includes a front end to decode an instruction, an execution unit to execute the instruction, an auxiliary cache to store auxiliary information for consumption during execution of the instruction, an instruction blender, and a retirement unit to retire the instruction. The auxiliary information may include long immediate values, non-working instructions for emulating an untranslated instruction stream, or execution hints, and is not decoded by the front end. The auxiliary cache includes circuitry to receive the auxiliary information from a binary translator, to store the auxiliary information in the auxiliary cache, and to provide the auxiliary information to the instruction blender prior to execution. The instruction blender includes circuitry to receive the auxiliary information, to blend the instruction with the auxiliary information, and to provide the blended instruction to the execution unit. Use of the auxiliary cache may reduce fetch and decode bandwidth requirements.

    System, method, and apparatus for enhanced pointer identification and prefetching

    公开(公告)号:US12216581B2

    公开(公告)日:2025-02-04

    申请号:US18320780

    申请日:2023-05-19

    Abstract: System and method for prefetching pointer-referenced data. A method embodiment includes: tracking a plurality of load instructions which includes a first load instruction to access a first data that identifies a first memory location; detecting a second load instruction which accesses a second memory location for a second data, the second memory location matching the first memory location identified by the first data; responsive to the detecting, updating a list of pointer load instructions to include information identifying the first load instruction as a pointer load instruction; prefetching a third data for a third load instruction prior to executing the third load instruction; identifying the third load instruction as a pointer load instruction based on information from the list of pointer load instructions and responsively prefetching a fourth data from a fourth memory location, wherein the fourth memory location is identified by the third data.

    SYSTEM, METHOD, AND APPARATUS FOR ENHANCED POINTER IDENTIFICATION AND PREFETCHING

    公开(公告)号:US20210365377A1

    公开(公告)日:2021-11-25

    申请号:US17391962

    申请日:2021-08-02

    Abstract: System and method for prefetching pointer-referenced data. A method embodiment includes: tracking a plurality of load instructions which includes a first load instruction to access a first data that identifies a first memory location; detecting a second load instruction which accesses a second memory location for a second data, the second memory location matching the first memory location identified by the first data; responsive to the detecting, updating a list of pointer load instructions to include information identifying the first load instruction as a pointer load instruction; prefetching a third data for a third load instruction prior to executing the third load instruction; identifying the third load instruction as a pointer load instruction based on information from the list of pointer load instructions and responsively prefetching a fourth data from a fourth memory location, wherein the fourth memory location is identified by the third data.

    System, method, and apparatus for enhanced pointer identification and prefetching

    公开(公告)号:US11080194B2

    公开(公告)日:2021-08-03

    申请号:US16234135

    申请日:2018-12-27

    Abstract: System and method for prefetching pointer-referenced data. A method embodiment includes: tracking a plurality of load instructions which includes a first load instruction to access a first data that identifies a first memory location; detecting a second load instruction which accesses a second memory location for a second data, the second memory location matching the first memory location identified by the first data; responsive to the detecting, updating a list of pointer load instructions to include information identifying the first load instruction as a pointer load instruction; prefetching a third data for a third load instruction prior to executing the third load instruction; identifying the third load instruction as a pointer load instruction based on information from the list of pointer load instructions and responsively prefetching a fourth data from a fourth memory location, wherein the fourth memory location is identified by the third data.

    Hybrid atomicity support for a binary translation based microprocessor

    公开(公告)号:US10296343B2

    公开(公告)日:2019-05-21

    申请号:US15474666

    申请日:2017-03-30

    Abstract: A processing device including a first shadow register, a second shadow register, and an instruction execution circuit, communicatively coupled to the first shadow register and the second shadow register, to receive a sequence of instructions comprising a first local commit marker, a first global commit marker, and a first register access instruction referencing an architectural register, speculatively execute the first register access instruction to generate a speculative register state value associated with a physical register, responsive to identifying the first local commit marker, store, in the first shadow register, the speculative register state value, and responsive to identifying the first global commit marker, store, in the second shadow register, the speculative register state value.

Patent Agency Ranking