-
1.
Publication number: US20190042224A1
Publication date: 2019-02-07
Application number: US16128275
Filing date: 2018-09-11
Applicant: Intel Corporation
Inventor: Diego Luis Caballero de Gea, Hideki Ido, Eric N. Garcia
IPC: G06F8/41
Abstract: Methods, apparatus, systems, and articles of manufacture are disclosed to improve loop optimization with predictable recurring memory reads (PRMRs). An example apparatus includes an optimizer including an optimization scenario manager to generate an optimization plan associated with a loop and corresponding optimization parameters, the optimization plan including a set of one or more optimizations, an optimization scenario analyzer to identify the optimization plan as a candidate optimization plan when a quantity of PRMRs included in the loop is greater than a threshold, and a parameter calculator to determine the optimization parameters based on the candidate optimization plan, and a code generator to generate instructions to be executed by a processor, the instructions based on processing the loop with the one or more optimizations included in the candidate optimization plan.
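The threshold test in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not define a PRMR precisely, so the sketch assumes it means a read `a[i*stride + k]` whose address is read again by a later iteration of the same loop. The function names, the `offsets` representation of the loop body, and the `threshold` value are all hypothetical and are not taken from the patent.

```python
def count_recurring_reads(offsets, stride=1):
    """Count read offsets whose address is read again by a later iteration.

    Assumed loop shape (hypothetical): iteration i reads a[i*stride + k]
    for every k in `offsets`.
    """
    count = 0
    for k in offsets:
        # a[i*stride + k] equals a[(i+n)*stride + k2] when k - k2 == n*stride
        # for some n >= 1, i.e. a smaller offset k2 reaches it n iterations later.
        if any(k2 < k and (k - k2) % stride == 0 for k2 in offsets):
            count += 1
    return count


def is_candidate(offsets, threshold, stride=1):
    """Treat the plan as a candidate when the recurring-read count exceeds the threshold."""
    return count_recurring_reads(offsets, stride) > threshold
```

For a three-point stencil reading `a[i]`, `a[i+1]`, and `a[i+2]`, the reads at offsets 1 and 2 recur in later iterations, so `is_candidate([0, 1, 2], threshold=1)` holds under this assumed definition.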
-
2.
Publication number: US20210397454A1
Publication date: 2021-12-23
Application number: US16905914
Filing date: 2020-06-18
Applicant: Intel Corporation
Inventor: Mikhail Plotnikov, Hideki Ido, Ilya Burylov, Ruslan Arutyunyan
Abstract: Methods and apparatus relating to techniques for vectorizing loops with backward cross-iteration dependencies are described. In an embodiment, execution of one or more instructions resolves a cross-iteration dependency of one or more operations of a loop. The execution of the one or more instructions resolves the cross-iteration dependency of the one or more operations based at least in part on one or more distance count computations to a preceding iteration of the loop. Other embodiments are also disclosed and claimed.
-
3.
Publication number: US20210034344A1
Publication date: 2021-02-04
Application number: US17074336
Filing date: 2020-10-19
Applicant: Intel Corporation
Inventor: Diego Luis Caballero de Gea, Hideki Ido, Eric N. Garcia
IPC: G06F8/41
Abstract: Methods, apparatus, systems, and articles of manufacture are disclosed to improve loop optimization with predictable recurring memory reads (PRMRs). An example apparatus includes memory, and first processor circuitry to execute first instructions to at least identify one or more optimizations to convert a first loop into a second loop based on converting PRMRs of the first loop into loop-invariant PRMRs, the converting of the PRMRs in response to a quantity of the PRMRs satisfying a threshold, the second loop to execute in a single iteration corresponding to a quantity of iterations of the first loop, determine one or more optimization parameters based on the one or more optimizations, and compile second instructions based on the first processor circuitry processing the first loop based on the one or more optimization parameters associated with the one or more optimizations, the second instructions to be executed by the first or second processor circuitry.
-
4.
Publication number: US10303525B2
Publication date: 2019-05-28
Application number: US14582717
Filing date: 2014-12-24
Applicant: Intel Corporation
Inventor: Elmoustapha Ould-Ahmed-Vall, Christopher J. Hughes, Robert Valentine, Milind B. Girkar, Hideki Ido, Youfeng Wu, Cheng Wang
Abstract: Systems, methods, and apparatuses for data speculation execution (DSX) are described. In some embodiments, a hardware apparatus for performing DSX comprises a hardware decoder to decode an instruction, the instruction to include an opcode and an operand to store a portion of a fallback address, execution hardware to execute the decoded instruction to initiate a data speculative execution (DSX) region by activating DSX tracking hardware to track speculative memory accesses and detect ordering violations in the DSX region, and storing the fallback address.
-
5.
Publication number: US11853757B2
Publication date: 2023-12-26
Application number: US16811011
Filing date: 2020-03-06
Applicant: Intel Corporation
Inventor: Ilya Burylov, Mikhail Plotnikov, Hideki Ido, Ruslan Arutyunyan
CPC classification number: G06F9/30036, G06F8/4441, G06F8/4452, G06F9/30018, G06F9/30065, G06F9/321
Abstract: Systems, apparatuses and methods may provide for technology that identifies that an iterative loop includes a first code portion that executes in response to a condition being satisfied, generates a first vector mask that is to represent one or more instances of the condition being satisfied for one or more values of a first vector of values, and one or more instances of the condition being unsatisfied for the first vector of values, where the first vector of values is to correspond to one or more first iterations of the iterative loop, and conducts a vectorization process of the iterative loop based on the first vector mask.
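The mask-based approach described in this abstract can be illustrated with a small scalar emulation. This is a minimal sketch, not the patented method: the vector length `VL`, the condition `a[i] > 0`, and the function names are assumptions chosen for illustration, and plain Python lists stand in for vector registers.

```python
VL = 4  # hypothetical vector length (lanes processed per vector iteration)


def scalar_loop(a, b):
    """Reference loop: the conditional code runs only where a[i] > 0."""
    out = list(b)
    for i in range(len(a)):
        if a[i] > 0:
            out[i] = b[i] + a[i]
    return out


def masked_loop(a, b):
    """Same loop processed VL iterations at a time under a per-lane mask."""
    out = list(b)
    for start in range(0, len(a), VL):
        lanes = range(start, min(start + VL, len(a)))
        values = [a[i] for i in lanes]    # vector of values for these iterations
        mask = [v > 0 for v in values]    # True = condition satisfied, False = unsatisfied
        for lane, i in enumerate(lanes):
            if mask[lane]:                # masked-off lanes leave out[i] untouched
                out[i] = b[i] + values[lane]
    return out
```

Both functions return the same list for any equal-length inputs. The mask records, per lane, whether the conditional code should take effect, so the chunked version does not need a branch per original iteration.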
-
6.
Publication number: US10776093B2
Publication date: 2020-09-15
Application number: US16304644
Filing date: 2016-07-01
Applicant: Intel Corporation
Inventor: Mikhail Plotnikov, Hideki Ido, Xinmin Tian, Sergey Preis, Milind B. Girkar, Maxim Shutov
Abstract: Methods, apparatus, and system to optimize compilation of source code into vectorized compiled code, notwithstanding the presence of output dependencies which might otherwise preclude vectorization.
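To make the term concrete: an output dependency arises when two iterations write the same location, so the final value depends on write order. The abstract does not say how the patent handles this, so the sketch below shows only a generic, assumed approach (within each chunk, only the last lane targeting a given index performs the write). `VL` and all function names are hypothetical.

```python
VL = 4  # hypothetical vector length


def scalar_scatter(idx, val, size):
    """Reference loop: later iterations overwrite earlier ones (last writer wins)."""
    out = [0] * size
    for i in range(len(idx)):
        out[idx[i]] = val[i]
    return out


def chunked_scatter(idx, val, size):
    """Process VL iterations at a time while preserving last-writer-wins order."""
    out = [0] * size
    for start in range(0, len(idx), VL):
        lanes = list(range(start, min(start + VL, len(idx))))
        last_writer = {}
        for i in lanes:
            last_writer[idx[i]] = i       # later lanes replace earlier ones
        # Keep a lane only if no later lane in this chunk writes the same index.
        keep = [last_writer[idx[i]] == i for i in lanes]
        for lane, i in enumerate(lanes):
            if keep[lane]:
                out[idx[i]] = val[i]
    return out
```

With duplicate indices such as `idx = [2, 0, 2, 1, 0, 3]`, both versions produce the same result, because chunks are processed in order and conflicting lanes inside a chunk are suppressed.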
-
7.
Publication number: US09921966B2
Publication date: 2018-03-20
Application number: US14273649
Filing date: 2014-05-09
Applicant: INTEL CORPORATION
Inventor: Rakesh Krishnaiyer, Serge Preis, Hideki Ido, Anatoly Zvezdin
IPC: G06F12/08, G06F12/0862
CPC classification number: G06F12/0862, G06F2212/602, G06F2212/621
Abstract: The present application is directed to employing prefetch to reduce write overhead. A device may comprise a processor and a cache memory. The processor may determine if data to be written to the cache memory comprises multiple cache lines wherein at least one of the cache lines will be fully written. If the data comprises at least one cache line to be fully written, then the processor may perform a “prefetch” wherein the processor may write dummy data to sections of the cache memory corresponding to the data to be written in full cache lines. The processor may then write actual data to the sections containing the dummy data without the processor first having to verify ownership of the sections. Any remaining data that will not be written in full cache lines may then be written to the cache memory utilizing a standard write transaction.
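The first step in this abstract, deciding which part of a write covers whole cache lines, comes down to address arithmetic. The sketch below shows only that split; it does not model the dummy-data write or ownership checks. The 64-byte line size and the function name `split_write` are assumptions, not details from the patent.

```python
LINE = 64  # assumed cache-line size in bytes


def split_write(addr, length, line=LINE):
    """Split the byte range [addr, addr + length) into partial and full-line pieces.

    Returns (partial, full), each a list of half-open (start, end) byte ranges.
    """
    end = addr + length
    first_full = -(-addr // line) * line   # round addr up to a line boundary
    last_full = (end // line) * line       # round end down to a line boundary
    if first_full >= last_full:
        # No cache line is completely covered by this write.
        return ([(addr, end)] if length > 0 else []), []
    full = [(s, s + line) for s in range(first_full, last_full, line)]
    partial = []
    if addr < first_full:
        partial.append((addr, first_full))  # leading bytes before the first full line
    if last_full < end:
        partial.append((last_full, end))    # trailing bytes after the last full line
    return partial, full
```

For a 200-byte write starting at address 48, the ranges `(64, 128)` and `(128, 192)` are fully written lines, while `(48, 64)` and `(192, 248)` are the remainder that, per the abstract, would go through a standard write transaction.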
-
8.
Publication number: US20240202002A1
Publication date: 2024-06-20
Application number: US18067577
Filing date: 2022-12-16
Applicant: INTEL CORPORATION
Inventor: Hideki Ido
CPC classification number: G06F9/3842, G06F9/30145
Abstract: Techniques for implementing a branch instruction having a misprediction handling hint to prevent instructions on a mispredicted path from getting cancelled are described. In certain examples, a hardware processor core includes a retirement circuit; a branch predictor circuit to predict a predicted path for a branch, and cause a speculative processing of the predicted path; a decode circuit to decode a single instruction into a decoded instruction, the single instruction having a field to indicate the retirement circuit is to allow retirement of the predicted path for the branch that is a misprediction; and an execution circuit to execute the decoded instruction to cause: the retirement circuit to allow the retirement of the predicted path that is the misprediction for the branch when the field is set to a first value, and the retirement circuit to disallow the retirement of the predicted path that is the misprediction for the branch when the field is otherwise.
-
9.
Publication number: US11442713B2
Publication date: 2022-09-13
Application number: US17074336
Filing date: 2020-10-19
Applicant: Intel Corporation
Inventor: Diego Luis Caballero de Gea, Hideki Ido, Eric N. Garcia
Abstract: Methods, apparatus, systems, and articles of manufacture are disclosed to improve loop optimization with predictable recurring memory reads (PRMRs). An example apparatus includes memory, and first processor circuitry to execute first instructions to at least identify one or more optimizations to convert a first loop into a second loop based on converting PRMRs of the first loop into loop-invariant PRMRs, the converting of the PRMRs in response to a quantity of the PRMRs satisfying a threshold, the second loop to execute in a single iteration corresponding to a quantity of iterations of the first loop, determine one or more optimization parameters based on the one or more optimizations, and compile second instructions based on the first processor circuitry processing the first loop based on the one or more optimization parameters associated with the one or more optimizations, the second instructions to be executed by the first or second processor circuitry.
-
10.
Publication number: US10853043B2
Publication date: 2020-12-01
Application number: US16128275
Filing date: 2018-09-11
Applicant: Intel Corporation
Inventor: Diego Luis Caballero de Gea, Hideki Ido, Eric N. Garcia
IPC: G06F8/41
Abstract: Methods, apparatus, systems, and articles of manufacture are disclosed to improve loop optimization with predictable recurring memory reads (PRMRs). An example apparatus includes an optimizer including an optimization scenario manager to generate an optimization plan associated with a loop and corresponding optimization parameters, the optimization plan including a set of one or more optimizations, an optimization scenario analyzer to identify the optimization plan as a candidate optimization plan when a quantity of PRMRs included in the loop is greater than a threshold, and a parameter calculator to determine the optimization parameters based on the candidate optimization plan, and a code generator to generate instructions to be executed by a processor, the instructions based on processing the loop with the one or more optimizations included in the candidate optimization plan.