-
Publication number: US10061587B2
Publication date: 2018-08-28
Application number: US14496113
Filing date: 2014-09-25
Applicant: Intel Corporation
Inventor: David Pardo Keppel , Denis M. Khartikov , Fernando LaTorre , Marc Lupon , Grigorios Magklis , Naveen Neelakantam , Georgios Tournavitis , Polychronis Xekalakis
CPC classification number: G06F9/30185 , G06F9/384 , G06F9/3857
Abstract: A processor includes a front end, a decoder, an allocator, and a retirement unit. The decoder includes logic to identify an end-of-live-range (EOLR) indicator. The EOLR indicator specifies an architectural register and a location in code for which the architectural register is unused. The allocator includes logic to scan for a mapping of the architectural register to a physical register, based upon the EOLR indicator. The allocator also includes logic to generate a request to disassociate the architectural register from the physical register. The retirement unit includes logic to disassociate the architectural register from the physical register.
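The allocate / request / retire flow in this abstract can be illustrated with a small software model. This is only a sketch under stated assumptions: the class `RenameTable`, its method names, and the free-list policy are hypothetical and are not taken from the patent.

```python
class RenameTable:
    """Toy model of an architectural-to-physical register map (hypothetical names)."""

    def __init__(self, num_physical):
        self.free = list(range(num_physical))  # unallocated physical registers
        self.mapping = {}                      # architectural name -> physical index
        self.pending = []                      # disassociation requests awaiting retirement

    def allocate(self, arch_reg):
        """Map an architectural register to a free physical register."""
        old = self.mapping.get(arch_reg)
        if old is not None:
            # Simplification: release the previous mapping immediately.
            self.free.append(old)
        phys = self.free.pop(0)
        self.mapping[arch_reg] = phys
        return phys

    def on_eolr(self, arch_reg):
        """Allocator step: scan for a mapping and queue a disassociation request."""
        if arch_reg in self.mapping:
            self.pending.append(arch_reg)

    def retire(self):
        """Retirement step: carry out the queued disassociations."""
        for arch_reg in self.pending:
            phys = self.mapping.pop(arch_reg, None)
            if phys is not None:
                self.free.append(phys)
        self.pending.clear()
```

In this model the physical register returns to the free list only at `retire()`, mirroring the abstract's split between the allocator (which generates the request) and the retirement unit (which performs the disassociation).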
-
Publication number: US09823938B2
Publication date: 2017-11-21
Application number: US14742908
Filing date: 2015-06-18
Applicant: Intel Corporation
CPC classification number: G06F9/45516 , G06F9/30 , G06F11/00
Abstract: In one embodiment, a processor includes a front end unit to fetch and decode an instruction. The front end unit includes a first random number generator to generate a random value responsive to a profileable event associated with the instruction. The processor further includes a profile logic to collect profile information associated with the instruction responsive to a sample signal, where the sample signal is based on at least a portion of the random value. Other embodiments are described and claimed.
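The idea of deriving a sample signal from part of a random value can be sketched in software. The function names, the 16-bit random width, and the "low bits all zero" rule below are assumptions made for illustration; the abstract does not specify them.

```python
import random


def make_sampler(sample_bits, seed=None):
    """Return a function that decides, per profileable event, whether to sample."""
    rng = random.Random(seed)
    mask = (1 << sample_bits) - 1  # selects the low-order portion of the value

    def should_sample():
        value = rng.getrandbits(16)   # random value generated for this event
        return (value & mask) == 0    # sample signal based on a portion of it

    return should_sample


def profile(events, should_sample):
    """Collect per-event counts only for events where the sample signal fires."""
    counts = {}
    for event in events:
        if should_sample():
            counts[event] = counts.get(event, 0) + 1
    return counts
```

With `sample_bits=0` every event is sampled; each additional bit halves the expected sampling rate.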
-
13.
Publication number: US10545735B2
Publication date: 2020-01-28
Application number: US15813021
Filing date: 2017-11-14
Applicant: Intel Corporation
Inventor: Polychronis Xekalakis , Jason M. Agron
Abstract: An apparatus and method for a dual return stack buffer (RSB) for use in binary translation systems. For example, one embodiment of a processor comprises: a dual return stack buffer (DRSB) comprising a native RSB and an extended RSB (XRSB), the dual RSB to be used within a binary translation execution environment in which guest call-return instruction sequences are translated to native call-return instruction sequences to be executed directly by the processor; the native RSB to store native return addresses associated with the native call-return instruction sequences; and the XRSB to store emulated return addresses associated with the guest call-return instruction sequences, wherein each native return address stored in the RSB is associated with an emulated return address stored in the XRSB.
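A minimal model of the paired buffers might look like the following. The class `DualRSB`, the fixed depth, the drop-oldest overflow policy, and the check of the popped emulated address against the actual guest return address are all assumptions for illustration; the abstract states only that each native return address is associated with an emulated one.

```python
class DualRSB:
    """Toy dual return stack buffer: native and emulated addresses kept in lockstep."""

    def __init__(self, depth):
        self.depth = depth
        self.native = []  # native return addresses (translated code)
        self.xrsb = []    # emulated return addresses (guest code), same index

    def call(self, native_ret, guest_ret):
        """Push an associated pair; drop the oldest pair when full (assumed policy)."""
        if len(self.native) == self.depth:
            self.native.pop(0)
            self.xrsb.pop(0)
        self.native.append(native_ret)
        self.xrsb.append(guest_ret)

    def ret(self, actual_guest_ret):
        """Pop a pair; return the native target only if the guest address matches."""
        if not self.native:
            return None
        native_ret = self.native.pop()
        guest_ret = self.xrsb.pop()
        return native_ret if guest_ret == actual_guest_ret else None
```

Returning `None` stands in for falling back to a slower lookup path, which is outside the scope of this sketch.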
-
Publication number: US09612840B2
Publication date: 2017-04-04
Application number: US14228690
Filing date: 2014-03-28
Applicant: Intel Corporation
Inventor: Denis M. Khartikov , Naveen Neelakantam , John H. Kelm , Polychronis Xekalakis
CPC classification number: G06F9/3802 , G06F9/30145 , G06F9/3836 , G06F9/3853 , G06F9/3855 , G06F9/3891 , G06F9/46
Abstract: A hardware/software co-design for an optimized dynamic out-of-order Very Long Instruction Word (VLIW) pipeline. For example, one embodiment of an apparatus comprises: an instruction fetch unit to fetch Very Long Instruction Words (VLIWs) in their program order from memory, each of the VLIWs comprising a plurality of reduced instruction set computing (RISC) instruction syllables grouped into the VLIWs in an order which removes data-flow dependencies and false output dependencies between the syllables; a decode unit to decode the VLIWs in their program order and output the syllables of each decoded VLIW in parallel; and an out-of-order execution engine to execute the syllables preferably in parallel with other syllables, wherein at least some of the syllables are to be executed in a different order than the order in which they are received from the decode unit, the out-of-order execution engine having one or more processing stages which do not check for data-flow dependencies and false output dependencies between the syllables when performing operations.
-
15.
Publication number: US20160092222A1
Publication date: 2016-03-31
Application number: US14496113
Filing date: 2014-09-25
Applicant: Intel Corporation
Inventor: David Pardo Keppel , Denis M. Khartikov , Fernando LaTorre , Marc Lupon , Grigorios Magklis , Naveen Neelakantam , Georgios Tournavitis , Polychronis Xekalakis
IPC: G06F9/30
CPC classification number: G06F9/30185 , G06F9/384 , G06F9/3857
Abstract: A processor includes a front end, a decoder, an allocator, and a retirement unit. The decoder includes logic to identify an end-of-live-range (EOLR) indicator. The EOLR indicator specifies an architectural register and a location in code for which the architectural register is unused. The allocator includes logic to scan for a mapping of the architectural register to a physical register, based upon the EOLR indicator. The allocator also includes logic to generate a request to disassociate the architectural register from the physical register. The retirement unit includes logic to disassociate the architectural register from the physical register.
-
Publication number: US12248785B2
Publication date: 2025-03-11
Application number: US17062556
Filing date: 2020-10-03
Applicant: Intel Corporation
Inventor: Polychronis Xekalakis , Sumit Ahuja
Abstract: A processor includes a binary translator and a decoder. The binary translator includes logic to analyze a stream of atomic instructions, identify words by boundary bits in the atomic instructions, generate a mask to identify the words, and load the mask and the plurality of words into an instruction cache line. The words include atomic instructions. At least one word includes more than one atomic instruction. The decoder includes logic to apply the mask to identify a first word from the instruction cache line and decode the first word based upon the applied mask.
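The mask-based word identification can be sketched as below. The encoding is an assumption: a boundary bit of 1 is taken to mark the last atomic instruction of a word, and the mask is an integer whose bit i is set when atom i ends a word. The function names are hypothetical.

```python
def build_mask(boundary_bits):
    """Build an integer mask with bit i set when atom i ends a word (assumed encoding)."""
    mask = 0
    for i, bit in enumerate(boundary_bits):
        if bit:
            mask |= 1 << i
    return mask


def first_word(atoms, mask):
    """Apply the mask to slice the first word out of a line of atomic instructions."""
    if mask == 0:
        return []
    # (mask & -mask) isolates the lowest set bit; its bit_length is that index + 1.
    end = (mask & -mask).bit_length()
    return atoms[:end]
```

For example, boundary bits `[0, 1, 0, 1]` over atoms `["a", "b", "c", "d"]` give a first word of `["a", "b"]`, a word made of more than one atomic instruction.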
-
17.
Publication number: US10387159B2
Publication date: 2019-08-20
Application number: US14614264
Filing date: 2015-02-04
Applicant: Intel Corporation
Inventor: Jason M Agron , Polychronis Xekalakis , Paul Caprioli , Jiwei Oliver Lu , Koichi Yamada
Abstract: Methods and apparatuses relate to emulating architectural performance monitoring in a binary translation system. In one embodiment, a processor includes an architectural performance counter to maintain an architectural value associated with instruction execution, a register to store the architectural value of the architectural performance counter, binary translation logic to embed an architectural value from the architectural performance counter into a stream of translated instructions having a transactional code region and to store the architectural value into the register, and an execution unit to execute the transactional code region of the stream of translated instructions. The binary translation logic is configured to add the architectural value from the register to the architectural performance counter upon completion of the transactional code region of the stream of translated instructions. In one embodiment, a binary translation system overcomes software incompatibilities by using microarchitectural support to transparently and accurately emulate architectural performance counter behavior.
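The stage-then-commit counter update described here can be modeled roughly as follows. The class and method names are hypothetical, a Python exception stands in for a transactional abort, and discarding the staged value on abort is an assumption rather than something the abstract states.

```python
class PerfCounterEmulator:
    """Toy model: stage a value in a register, add it to the counter on completion."""

    def __init__(self):
        self.counter = 0   # architectural performance counter
        self.register = 0  # staging register written by the translated code

    def run_region(self, embedded_value, body):
        """Run one transactional region; commit the staged value only on completion."""
        self.register = embedded_value  # value embedded by the translator
        try:
            body()
        except RuntimeError:            # stand-in for a transactional abort
            self.register = 0           # assumed: staged value is discarded
            return False
        self.counter += self.register   # update on completion of the region
        self.register = 0
        return True
```

Because the counter changes only when a region completes, an aborted region leaves the architecturally visible count untouched in this model.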
-
Publication number: US10338927B2
Publication date: 2019-07-02
Application number: US15477374
Filing date: 2017-04-03
Applicant: Intel Corporation
Inventor: Denis M. Khartikov , Naveen Neelakantam , John H. Kelm , Polychronis Xekalakis
Abstract: A hardware/software co-design for an optimized dynamic out-of-order Very Long Instruction Word (VLIW) pipeline. For example, one embodiment of an apparatus comprises: an instruction fetch unit to fetch Very Long Instruction Words (VLIWs) in their program order from memory, each of the VLIWs comprising a plurality of reduced instruction set computing (RISC) instruction syllables grouped into the VLIWs in an order which removes data-flow dependencies and false output dependencies between the syllables; a decode unit to decode the VLIWs in their program order and output the syllables of each decoded VLIW in parallel; and an out-of-order execution engine to execute the syllables preferably in parallel with other syllables, wherein at least some of the syllables are to be executed in a different order than the order in which they are received from the decode unit, the out-of-order execution engine having one or more processing stages which do not check for data-flow dependencies and false output dependencies between the syllables when performing operations.
-
Publication number: US10324724B2
Publication date: 2019-06-18
Application number: US14971904
Filing date: 2015-12-16
Applicant: Intel Corporation
Inventor: Patrick P. Lai , Tyler N. Sondag , Sebastian Winkel , Polychronis Xekalakis , Ethan Schuchman , Jayesh Iyer
IPC: G06F9/30
Abstract: Methods and apparatuses relating to a fusion manager to fuse instructions are described. In one embodiment, a hardware processor includes a hardware binary translator to translate an instruction stream into a translated instruction stream, a hardware fusion manager to fuse multiple instructions of the translated instruction stream into a single fused instruction, a hardware decode unit to decode the single fused instruction into a decoded, single fused instruction, and a hardware execution unit to execute the decoded, single fused instruction.
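As a rough software analogy for the fusion step, the sketch below merges adjacent instruction pairs in a translated stream. The function `fuse`, the mnemonic strings, and the pairwise-only fusion rule are assumptions for illustration; the abstract does not say which instructions are fusible or how many may be combined.

```python
def fuse(stream, fusible_pairs):
    """Fuse adjacent instructions listed in fusible_pairs into single entries."""
    fused = []
    i = 0
    while i < len(stream):
        pair = tuple(stream[i:i + 2])
        if len(pair) == 2 and pair in fusible_pairs:
            fused.append(pair[0] + "+" + pair[1])  # single fused instruction
            i += 2
        else:
            fused.append(stream[i])
            i += 1
    return fused
```

Calling `fuse(["cmp", "jne", "add"], {("cmp", "jne")})` yields `["cmp+jne", "add"]`, so the downstream decode and execute stages see one entry where there were two.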
-