Patent search ap:("INTEL CORPORATION") AND inv:"Jayesh Gaur" Page 2

11.

Granted invention patent
Technology for dynamically tuning processor features
Legal status: In force

Publication (announcement) number: US11656971B2

Publication (announcement) date: 2023-05-23

Application number: US17582051

Application date: 2022-01-24

Applicant: Intel Corporation

Inventor: Adarsh Chauhan, Jayesh Gaur, Franck Sala, Lihu Rappoport, Zeev Sperber, Adi Yoaz, Sreenivas Subramoney

IPC: G06F11/34, G06F11/30, G06F15/78, G06F9/24, G06F9/38

CPC classification number: G06F11/3476, G06F9/24, G06F9/3836, G06F11/3024, G06F11/3055, G06F15/7875

Abstract: A processor comprises a microarchitectural feature and dynamic tuning unit (DTU) circuitry. The processor executes a program for first and second execution windows with the microarchitectural feature disabled and enabled, respectively. The DTU circuitry automatically determines whether the processor achieved worse performance in the second execution window. In response to determining that the processor achieved worse performance in the second execution window, the DTU circuitry updates a usefulness state for a selected address of the program to denote worse performance. In response to multiple consecutive determinations that the processor achieved worse performance with the microarchitectural feature enabled, the DTU circuitry automatically updates the usefulness state to denote a confirmed bad state. In response to the usefulness state denoting the confirmed bad state, the DTU circuitry automatically disables the microarchitectural feature for the selected address for execution windows after the second execution window. Other embodiments are described and claimed.
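The usefulness-state tracking described in this abstract can be pictured as a small per-address state machine. The sketch below is illustrative only and is not taken from the patent: the state names, the `CONFIRM_THRESHOLD` value, the use of cycle counts as the performance metric, and the reset on a non-worse window are all assumptions.

```python
from dataclasses import dataclass, field

# Hypothetical threshold: the abstract says only "multiple consecutive determinations".
CONFIRM_THRESHOLD = 2

# Hypothetical state names for the per-address usefulness state.
NEUTRAL, WORSE, CONFIRMED_BAD = "neutral", "worse", "confirmed_bad"


@dataclass
class UsefulnessEntry:
    state: str = NEUTRAL
    consecutive_worse: int = 0


@dataclass
class DynamicTuningUnit:
    # Maps a selected program address to its usefulness entry.
    table: dict = field(default_factory=dict)

    def record_windows(self, address, cycles_disabled, cycles_enabled):
        """Compare a feature-disabled window with a feature-enabled window.

        Performance is modeled as cycles taken (an assumption): more cycles
        with the feature enabled counts as worse performance.
        """
        entry = self.table.setdefault(address, UsefulnessEntry())
        if entry.state == CONFIRMED_BAD:
            return entry.state  # feature stays disabled for this address
        if cycles_enabled > cycles_disabled:
            entry.consecutive_worse += 1
            if entry.consecutive_worse >= CONFIRM_THRESHOLD:
                entry.state = CONFIRMED_BAD
            else:
                entry.state = WORSE
        else:
            # Assumption: a non-worse window resets the consecutive count.
            entry.consecutive_worse = 0
            entry.state = NEUTRAL
        return entry.state

    def feature_enabled(self, address):
        """The feature is disabled only for addresses in the confirmed bad state."""
        entry = self.table.get(address)
        return entry is None or entry.state != CONFIRMED_BAD
```

With these assumptions, two consecutive worse windows for the same address move it to the confirmed bad state, after which `feature_enabled` returns `False` for that address while other addresses are unaffected.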

12.

Invention patent application
Technology For Dynamically Tuning Processor Features
Legal status: In force

Publication (announcement) number: US20220206925A1

Publication (announcement) date: 2022-06-30

Application number: US17582051

Application date: 2022-01-24

Applicant: Intel Corporation

Inventor: Adarsh Chauhan, Jayesh Gaur, Franck Sala, Lihu Rappoport, Zeev Sperber, Adi Yoaz, Sreenivas Subramoney

IPC: G06F11/34, G06F9/24, G06F9/38, G06F11/30, G06F15/78

Abstract: A processor comprises a microarchitectural feature and dynamic tuning unit (DTU) circuitry. The processor executes a program for first and second execution windows with the microarchitectural feature disabled and enabled, respectively. The DTU circuitry automatically determines whether the processor achieved worse performance in the second execution window. In response to determining that the processor achieved worse performance in the second execution window, the DTU circuitry updates a usefulness state for a selected address of the program to denote worse performance. In response to multiple consecutive determinations that the processor achieved worse performance with the microarchitectural feature enabled, the DTU circuitry automatically updates the usefulness state to denote a confirmed bad state. In response to the usefulness state denoting the confirmed bad state, the DTU circuitry automatically disables the microarchitectural feature for the selected address for execution windows after the second execution window. Other embodiments are described and claimed.

13.

Invention patent application
APPLICATION PROGRAMMING INTERFACE FOR FINE GRAINED LOW LATENCY DECOMPRESSION WITHIN PROCESSOR CORE
Legal status: In force

Publication (announcement) number: US20220197813A1

Publication (announcement) date: 2022-06-23

Application number: US17133624

Application date: 2020-12-23

Applicant: Intel Corporation

Inventor: Jayesh Gaur, Adarsh Chauhan, Vinodh Gopal, Vedvyas Shanbhogue, Sreenivas Subramoney, Wajdi Feghali

IPC: G06F12/0875, G06F12/0813, G06F12/0811, G06F12/1045

Abstract: Methods and apparatus relating to techniques for increasing per core memory bandwidth by using forget store operations are described. In an embodiment, a cache stores a buffer. Execution circuitry executes an instruction. The instruction causes one or more cachelines in the cache to be marked based on a start address for the buffer and a size of the buffer. A marked cacheline in the cache is to be prevented from being written back to memory. Other embodiments are also disclosed and claimed.
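As a rough software analogy for the marking step in this abstract, the toy model below marks every cache line covered by a buffer so that evicting it does not write it back. It is a sketch under stated assumptions, not the patented mechanism: `ToyCache`, `forget`, the 64-byte `LINE_SIZE`, and the requirement that the buffer be line-aligned are all hypothetical.

```python
LINE_SIZE = 64  # assumed cache line size in bytes


class ToyCache:
    """Toy cache model that tracks only dirty and no-writeback line indices."""

    def __init__(self):
        self.dirty = set()          # lines modified since they were filled
        self.no_writeback = set()   # lines marked by forget()
        self.memory_writes = []     # lines actually written back to memory

    def store(self, addr):
        """A store dirties the cache line containing addr."""
        self.dirty.add(addr // LINE_SIZE)

    def forget(self, start, size):
        """Mark the lines covering [start, start + size) as not to be written back.

        Assumes the buffer is line-aligned, so no unrelated data shares a line.
        """
        if size <= 0:
            return
        first = start // LINE_SIZE
        last = (start + size - 1) // LINE_SIZE
        for line in range(first, last + 1):
            self.no_writeback.add(line)

    def evict(self, line):
        """Write back a dirty line on eviction unless it was marked."""
        if line in self.dirty and line not in self.no_writeback:
            self.memory_writes.append(line)
        self.dirty.discard(line)
        self.no_writeback.discard(line)
```

In this model, marking a 128-byte buffer at address 0 covers lines 0 and 1, so only other dirty lines reach `memory_writes` on eviction, which is where the memory bandwidth saving would come from.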

14.

Granted invention patent
Technology for dynamically tuning processor features
Legal status: In force

Publication (announcement) number: US10915421B1

Publication (announcement) date: 2021-02-09

Application number: US16575535

Application date: 2019-09-19

Applicant: Intel Corporation

Inventor: Adarsh Chauhan, Jayesh Gaur, Franck Sala, Lihu Rappoport, Zeev Sperber, Adi Yoaz, Sreenivas Subramoney

IPC: G06F11/34, G06F15/78, G06F11/30, G06F9/38, G06F9/24

Abstract: A processor comprises a microarchitectural feature and dynamic tuning unit (DTU) circuitry. The processor executes a program for first and second execution windows with the microarchitectural feature disabled and enabled, respectively. The DTU circuitry automatically determines whether the processor achieved worse performance in the second execution window. In response to determining that the processor achieved worse performance in the second execution window, the DTU circuitry updates a usefulness state for a selected address of the program to denote worse performance. In response to multiple consecutive determinations that the processor achieved worse performance with the microarchitectural feature enabled, the DTU circuitry automatically updates the usefulness state to denote a confirmed bad state. In response to the usefulness state denoting the confirmed bad state, the DTU circuitry automatically disables the microarchitectural feature for the selected address for execution windows after the second execution window. Other embodiments are described and claimed.

15.

Granted invention patent
System, apparatus and method for prefetch-aware replacement in a cache memory hierarchy of a processor
Legal status: In force

Publication (announcement) number: US10268600B2

Publication (announcement) date: 2019-04-23

Application number: US15701795

Application date: 2017-09-12

Applicant: Intel Corporation

Inventor: Jayesh Gaur, Sreenivas Subramoney, Sanjay Ganapathy

IPC: G06F12/0862, G06F12/126, G06F12/128, G06F12/0811

Abstract: In one embodiment, a processor includes: a first cache controller to control a first cache memory. This cache controller may include a replacement circuit to: associate a first priority indicator with a first cache line based on storage of demand data in the first cache line and first learning information associated with a set of demand-based categories of cache lines; and associate a second priority indicator with a second cache line based on storage of prefetch data in the second cache line and second learning information associated with a set of prefetch-based categories of cache lines. Other embodiments are described and claimed.

16.

Invention patent application
EFFICIENT HARDWARE-BASED EXTRACTION OF PROGRAM INSTRUCTIONS FOR CRITICAL PATHS
Legal status: Under examination (published)

Publication (announcement) number: US20180232235A1

Publication (announcement) date: 2018-08-16

Application number: US15433674

Application date: 2017-02-15

Applicant: INTEL CORPORATION

Inventor: Jayesh Gaur, Pooja Roy, Sreenivas Subramoney, Hong Wang, Ronak Singhal

IPC: G06F9/38, G06F9/30

CPC classification number: G06F9/3838, G06F8/41

Abstract: A processor includes a memory to hold a buffer to store data dependencies comprising nodes and edges for each of a plurality of micro-operations. The nodes include a first node for dispatch, a second node for execution, and a third node for commit. A detector circuit is to queue, in the buffer, the nodes of a micro-operation; add, to determine a node weight for each of the nodes of the micro-operation, an edge weight to a previous node weight of a connected micro-operation that yields a maximum node weight for the node, wherein the node weight comprises a number of execution cycles of an OOO pipeline of the processor and the edge weight comprises a number of execution cycles to execute the connected micro-operation; and identify, as a critical path, a path through the data dependencies that yields the maximum node weight for the micro-operation.
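The node-weight rule in this abstract (each node takes the maximum of predecessor weight plus edge weight) is a longest-path computation over a dependency graph. The following is a minimal software sketch of that idea, not the hardware detector itself: the function name `critical_path`, the `(uop, stage)` node labels, and the edge weights in the example are hypothetical, and the nodes are assumed to arrive in topological order.

```python
def critical_path(nodes, edges):
    """Longest-path pass over a dependency graph.

    nodes: node labels in topological order (assumption).
    edges: dict mapping a node to a list of (predecessor, edge_weight) pairs.
    Returns (maximum node weight, path that yields it).
    """
    weight = {}
    best_pred = {}
    for node in nodes:
        weight[node] = 0
        best_pred[node] = None
        for pred, edge_weight in edges.get(node, []):
            candidate = weight[pred] + edge_weight
            if candidate > weight[node]:
                # Keep the predecessor that yields the maximum node weight.
                weight[node] = candidate
                best_pred[node] = pred
    # Walk back from the heaviest node to recover the critical path.
    node = max(nodes, key=lambda n: weight[n])
    total = weight[node]
    path = []
    while node is not None:
        path.append(node)
        node = best_pred[node]
    path.reverse()
    return total, path


# Two micro-operations, each with dispatch (D), execute (E) and commit (C) nodes.
# All edge weights below are made-up cycle counts for illustration.
nodes = [(0, "D"), (0, "E"), (0, "C"), (1, "D"), (1, "E"), (1, "C")]
edges = {
    (0, "E"): [((0, "D"), 1)],
    (0, "C"): [((0, "E"), 3)],
    (1, "D"): [((0, "D"), 1)],                 # in-order dispatch
    (1, "E"): [((1, "D"), 1), ((0, "E"), 3)],  # data dependency on uop 0
    (1, "C"): [((1, "E"), 2), ((0, "C"), 1)],  # in-order commit
}
total, path = critical_path(nodes, edges)
print(total, path)  # 6 [(0, 'D'), (0, 'E'), (1, 'E'), (1, 'C')]
```

In the example, the data dependency from micro-operation 0 to micro-operation 1 dominates the dispatch and commit ordering edges, so the critical path runs through both execute nodes.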

17.

Invention patent application
MEMORY AWARE REORDERED SOURCE
Legal status: Under examination (published)

Publication (announcement) number: US20180181329A1

Publication (announcement) date: 2018-06-28

Application number: US15392638

Application date: 2016-12-28

Applicant: Intel Corporation

Inventor: Ishwar S. Bhati, Udit Dhawan, Jayesh Gaur, Sreenivas Subramoney

IPC: G06F3/06, G06F12/0897

CPC classification number: G06F12/0897, G06F9/3824, G06F12/0811, G06F2212/1041, G06F2212/1056, G06F2212/302

Abstract: Processor, apparatus, and method for reordering a stream of memory access requests to establish locality are described herein. One embodiment of a method includes: storing in a request queue memory access requests generated by a plurality of execution units, the memory access requests comprising a first request to access a first memory page in a memory and a second request to access a second memory page in the memory; maintaining a list of unique memory pages, each unique memory page associated with one or more memory access requests stored in the request queue and to be accessed by the one or more memory access requests; selecting a current memory page from the list of unique memory pages; and dispatching from the request queue to the memory, all memory access requests associated with the current memory page before any other memory access request in the request queue is dispatched.
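The reordering steps in this abstract can be sketched in software as grouping queued requests by memory page and draining one page at a time. This is an illustration only: the 4096-byte `PAGE_SIZE`, the function name `reorder_by_page`, and the oldest-page-first selection policy are assumptions, since the abstract does not say how the current page is chosen.

```python
from collections import OrderedDict

PAGE_SIZE = 4096  # assumed memory page size in bytes


def reorder_by_page(addresses, page_size=PAGE_SIZE):
    """Return the request addresses reordered so each page is drained in one burst."""
    # List of unique pages, each with the queued requests that target it.
    pages = OrderedDict()
    for addr in addresses:
        pages.setdefault(addr // page_size, []).append(addr)

    dispatched = []
    while pages:
        # Assumption: select the page whose first request arrived earliest.
        _, requests = pages.popitem(last=False)
        # Dispatch every request for the current page before any other request.
        dispatched.extend(requests)
    return dispatched


queue = [0x0000, 0x2000, 0x0040, 0x2040, 0x1000]
print([hex(a) for a in reorder_by_page(queue)])
# ['0x0', '0x40', '0x2000', '0x2040', '0x1000']
```

Requests that target the same page end up adjacent, which is the locality the abstract aims to establish, while the arrival order is preserved within each page.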

18.

Invention patent application
Data Compression In Processor Caches
Legal status: In force

Publication (announcement) number: US20150089126A1

Publication (announcement) date: 2015-03-26

Application number: US14036673

Application date: 2013-09-25

Applicant: Intel Corporation

Inventor: Sreenivas Subramoney, Jayesh Gaur, Alaa R. Alameldeen

IPC: G06F12/08

CPC classification number: G06F12/126, G06F12/0895

Abstract: In an embodiment, a processor includes a cache data array including a plurality of physical ways, each physical way to store a baseline way and a victim way; a cache tag array including a plurality of tag groups, each tag group associated with a particular physical way and including a first tag associated with the baseline way stored in the particular physical way, and a second tag associated with the victim way stored in the particular physical way; and cache control logic to: select a first baseline way based on a replacement policy, select a first victim way based on an available capacity of a first physical way including the first victim way, and move a first data element from the first baseline way to the first victim way. Other embodiments are described and claimed.

19.

Invention patent publication
SHORT PIPELINE FOR FAST RECOVERY FROM A BRANCH MISPREDICTION
Legal status: Under examination (published)

Publication (announcement) number: US20240103878A1

Publication (announcement) date: 2024-03-28

Application number: US17953184

Application date: 2022-09-26

Applicant: Intel Corporation

Inventor: Jayesh Gaur, Sufiyan Syed, Adithya Ranganathan, Sreenivas Subramoney

IPC: G06F9/38

CPC classification number: G06F9/3861, G06F9/3867, G06F9/3806

Abstract: An example of an integrated circuit may include a first execution cluster, a second execution cluster that is one or more of narrower and shallower as compared to the first execution cluster, and circuitry to selectively steer instructions to the first execution cluster and the second execution cluster based on branch misprediction information. Other embodiments are disclosed and claimed.

20.

Invention patent publication
INSTRUCTION ELIMINATION THROUGH HARDWARE DRIVEN MEMOIZATION OF LOOP INSTANCES
Legal status: Under examination (published)

Publication (announcement) number: US20240103874A1

Publication (announcement) date: 2024-03-28

Application number: US17951859

Application date: 2022-09-23

Applicant: Intel Corporation

Inventor: Niranjan Kumar Soundararajan, Sreenivas Subramoney, Jayesh Gaur

IPC: G06F9/38, G06F9/30, G06F9/32

CPC classification number: G06F9/381, G06F9/30065, G06F9/325

Abstract: Methods and apparatus for instruction elimination through hardware driven memoization of loop instances. A hardware-based loop memoization technique learns repeating sequences of loops and transparently removes instructions for the loop instructions from instruction sequences while making their output available to dependent instructions as if the loop instructions had been executed. A path-based predictor is implemented at the front-end to predict these loop instances and remove their instructions from instruction sequences. A novel memoization prediction micro-operation (Uop) is inserted into the instruction sequence for instances of loops that are predicted to be memoized. The memoization prediction Uop is used to compare the input signature (expected set of input values for the loop) with the actual signature to determine correct and incorrect predictions. The input signature learnt is based on all live-ins of a loop, both explicit register-based live-ins as well as loads to memory in the loop body that determine code path and outputs.
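The signature check in this abstract resembles ordinary memoization keyed on a loop's live-in values. The sketch below models only that comparison step and leaves out the path-based front-end predictor entirely; `LoopMemoizer`, `learn`, and `try_skip` are hypothetical names, and keeping a single signature per loop address is an assumption.

```python
class LoopMemoizer:
    """Toy model: one learned (input signature, outputs) pair per loop address."""

    def __init__(self):
        self.table = {}  # loop_pc -> (input_signature, outputs)

    def learn(self, loop_pc, live_ins, outputs):
        """Record the live-in values of a loop instance and the outputs it produced."""
        self.table[loop_pc] = (tuple(live_ins), tuple(outputs))

    def try_skip(self, loop_pc, actual_live_ins):
        """Return the memoized outputs if the actual signature matches, else None.

        A None result stands for a misprediction (or an unknown loop), in which
        case the loop body has to be executed normally.
        """
        entry = self.table.get(loop_pc)
        if entry is None:
            return None
        signature, outputs = entry
        if signature == tuple(actual_live_ins):
            return outputs
        return None


memo = LoopMemoizer()
memo.learn(0x10, live_ins=(3, 4), outputs=(12,))
print(memo.try_skip(0x10, (3, 4)))  # (12,)
print(memo.try_skip(0x10, (3, 5)))  # None
```

A match makes the loop's outputs available without executing its instructions, while a mismatch falls back to normal execution, mirroring the correct and incorrect prediction cases in the abstract.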
