INSTRUCTION PREFETCH BASED ON THREAD DISPATCH COMMANDS

    Publication (Announcement) Number: US20220083339A1

    Publication (Announcement) Date: 2022-03-17

    Application Number: US17509726

    Application Date: 2021-10-25

    Abstract: A graphics processing device comprises a set of compute units to execute multiple threads of a workload, a cache coupled with the set of compute units, and a prefetcher to prefetch instructions associated with the workload. The prefetcher is configured to use the thread dispatch command that dispatches threads to execute a kernel as a trigger to prefetch the instructions, parameters, and/or constants that will be used during execution of that kernel. Prefetch operations for the kernel can then occur concurrently with thread dispatch operations.
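
    As a rough illustration of the abstract above, the Python sketch below models the idea with invented names (DispatchCommand, Cache, and process_dispatch are not from the patent): the command that dispatches a kernel's threads also carries the addresses the prefetcher should warm, so the fills happen as a side effect of dispatch rather than as demand misses.

        from dataclasses import dataclass, field

        @dataclass
        class DispatchCommand:
            kernel_addr: int        # base address of the kernel's instructions
            param_addrs: list[int]  # parameter/constant addresses the kernel will read
            thread_count: int

        @dataclass
        class Cache:
            lines: set = field(default_factory=set)

            def prefetch(self, addr: int) -> None:
                self.lines.add(addr)  # model a fill that happens without a demand miss

            def hit(self, addr: int) -> bool:
                return addr in self.lines

        def process_dispatch(cmd: DispatchCommand, cache: Cache) -> list[int]:
            # The dispatch command itself drives the prefetch, so the fills can
            # proceed concurrently with thread dispatch instead of waiting for
            # the first demand miss during kernel execution.
            for addr in (cmd.kernel_addr, *cmd.param_addrs):
                cache.prefetch(addr)
            return list(range(cmd.thread_count))  # stand-in for dispatched thread IDs

        cache = Cache()
        threads = process_dispatch(DispatchCommand(0x1000, [0x2000, 0x2040], 64), cache)
        assert cache.hit(0x1000) and len(threads) == 64

    The ordering is the point of the sketch: the cache is populated by the time any dispatched thread issues its first instruction fetch.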

    POWER SAVINGS FOR NEURAL NETWORK ARCHITECTURE WITH ZERO ACTIVATIONS DURING INFERENCE

    Publication (Announcement) Number: US20190041961A1

    Publication (Announcement) Date: 2019-02-07

    Application Number: US16144538

    Application Date: 2018-09-27

    Abstract: Embodiments are generally directed to providing power savings for a neural network architecture with zero activations during inference. An embodiment of an apparatus includes one or more processors including one or more processor cores, and a memory to store data for processing, including neural network processing. The apparatus is to perform a fast clear operation to initialize activation buffers for a neural network by updating metadata to indicate zero values, the neural network including a plurality of layers. The apparatus is further to compare outputs of the neural network to the metadata values and to write an output to memory only if the output is non-zero.
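
    The following is a minimal Python sketch of the zero-skipping write path described above, assuming a per-entry metadata bitmap; ActivationBuffer and its methods are hypothetical names, not the claimed hardware interface.

        class ActivationBuffer:
            def __init__(self, size: int):
                self.data = [0.0] * size
                self.zero_meta = [True] * size  # metadata marks every entry "zero"

            def fast_clear(self) -> None:
                # Fast clear touches only the metadata, not the data itself.
                self.zero_meta = [True] * len(self.data)

            def write(self, idx: int, value: float) -> bool:
                # Issue a memory write only for non-zero outputs; zeros are
                # already represented by the metadata.
                if value == 0.0:
                    return False
                self.data[idx] = value
                self.zero_meta[idx] = False
                return True

            def read(self, idx: int) -> float:
                return 0.0 if self.zero_meta[idx] else self.data[idx]

        buf = ActivationBuffer(8)
        buf.fast_clear()
        outputs = [0.0, 1.5, 0.0, 0.0, 2.0, 0.0, 0.0, 0.0]
        writes = sum(buf.write(i, v) for i, v in enumerate(outputs))
        assert writes == 2 and buf.read(1) == 1.5 and buf.read(0) == 0.0

    In this model the eight-element layer output costs only two memory writes; the fast clear and the zero outputs are handled entirely in metadata.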

    INSTRUCTION PREFETCH BASED ON THREAD DISPATCH COMMANDS

    Publication (Announcement) Number: US20250077232A1

    Publication (Announcement) Date: 2025-03-06

    Application Number: US18882364

    Application Date: 2024-09-11

    Abstract: A graphics processing device is provided that includes a set of compute units to execute a workload, a cache coupled with the set of compute units, and circuitry coupled with the cache and the set of compute units. The circuitry is configured to, in response to a cache miss for a read from a first cache, broadcast an event within the graphics processing device to identify the instruction or data associated with the cache miss, receive the event at a second compute unit in the set of compute units, and prefetch the data identified by the event into a second cache that is local to the second compute unit before a second thread attempts to read that instruction or data.
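
    A sketch of the miss-broadcast pattern follows, again in Python with hypothetical names, and under the assumption that the event reaches every peer compute unit on the same fabric: one unit's cache miss becomes a prefetch hint for all of the others.

        class ComputeUnit:
            def __init__(self, cu_id: int, fabric: list["ComputeUnit"]):
                self.cu_id = cu_id
                self.local_cache: set[int] = set()
                self.fabric = fabric  # all compute units reachable by the broadcast

            def read(self, addr: int) -> str:
                if addr in self.local_cache:
                    return "hit"
                self.local_cache.add(addr)  # demand fill on the miss
                self.broadcast_miss(addr)   # tell the peers what just missed
                return "miss"

            def broadcast_miss(self, addr: int) -> None:
                # Peers prefetch the identified line into their local caches
                # before their own threads attempt the same read.
                for cu in self.fabric:
                    if cu is not self:
                        cu.local_cache.add(addr)

        fabric: list[ComputeUnit] = []
        fabric.extend(ComputeUnit(i, fabric) for i in range(4))
        assert fabric[0].read(0xBEEF) == "miss"  # the first thread misses...
        assert fabric[1].read(0xBEEF) == "hit"   # ...its peers now hit the same line

    Whether every peer should honor the hint, or only those about to run the same kernel, is a filtering decision the sketch leaves out.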
