Patent search ap:("INTEL CORPORATION") AND inv:"Altug Koker" Page 1

1.

Granted invention patent
Compute optimization mechanism for deep neural networks [In force]

Publication (announcement) number: US12198221B2

Publication (announcement) date: 2025-01-14

Application number: US18436494

Application date: 2024-02-08

Applicant: Intel Corporation

Inventor: Prasoonkumar Surti , Narayan Srinivasa , Feng Chen , Joydeep Ray , Ben J. Ashbaugh , Nicolas C. Galoppo Von Borries , Eriko Nurvitadhi , Balaji Vembu , Tsung-Han Lin , Kamal Sinha , Rajkishore Barik , Sara S. Baghsorkhi , Justin E. Gottschlich , Altug Koker , Nadathur Rajagopalan Satish , Farshad Akhbari , Dukhwan Kim , Wenyin Fu , Travis T. Schluessler , Josh B. Mastronarde , Linda L Hurd , John H. Feit , Jeffery S. Boles , Adam T. Lake , Karthik Vaidyanathan , Devan Burke , Subramaniam Maiyuran , Abhishek R. Appu

IPC: G06T1/20 , G06F8/41 , G06F9/455 , G06F9/50 , G06N3/044 , G06N3/045 , G06N3/063 , G06N3/084

Abstract: Embodiments provide mechanisms to facilitate compute operations for deep neural networks. One embodiment comprises a graphics processing unit comprising one or more multiprocessors, at least one of the one or more multiprocessors including a register file to store a plurality of different types of operands and a plurality of processing cores. The plurality of processing cores includes a first set of processing cores of a first type and a second set of processing cores of a second type. The first set of processing cores are associated with a first memory channel and the second set of processing cores are associated with a second memory channel.

2.

Invention application
COMPUTE OPTIMIZATION MECHANISM [In force]

Publication (announcement) number: US20250005703A1

Publication (announcement) date: 2025-01-02

Application number: US18773094

Application date: 2024-07-15

Applicant: Intel Corporation

Inventor: Abhishek R. Appu , Altug Koker , Linda L. Hurd , Dukhwan Kim , Mike B. Macpherson , John C. Weast , Feng Chen , Farshad Akhbari , Narayan Srinivasa , Nadathur Rajagopalan Satish , Joydeep Ray , Ping T. Tang , Michael S. Strickland , Xiaoming Chen , Anbang Yao , Tatiana Shpeisman

IPC: G06T1/20 , G06F3/14 , G06F9/30 , G06F9/38 , G06N3/044 , G06N3/045 , G06N3/063 , G06N3/084 , G06T15/00 , G06T15/04 , G09G5/36

Abstract: An apparatus to facilitate compute optimization is disclosed. The apparatus includes a mixed-precision core including mixed-precision execution circuitry to execute one or more mixed-precision instructions to perform a mixed-precision dot-product operation comprising a set of multiply and accumulate operations.

3.

Invention application
HARDWARE SOFTWARE COMMUNICATION CHANNEL TO SUPPORT DIRECT PROGRAMMING INTERFACE METHODS ON FPGA-BASED PROTOTYPE PLATFORMS [In force]

Publication (announcement) number: US20240427679A1

Publication (announcement) date: 2024-12-26

Application number: US18756550

Application date: 2024-06-27

Applicant: Intel Corporation

Inventor: Renu Patle , Hanmanthrao Patli , Rakesh Mehta , Hagay Spector , Ivan Herrera Mejia , Fylur Rahman Sathakathulla , Gowtham Raj Karnam , Mohsin Ali , Sahar Sharabi , Abraham Halevi Fraenkel , Eyal Pniel , Ehud Cohn , Raghav Ramesh Lakshmi , Altug Koker

IPC: G06F11/26 , G06F9/30

Abstract: Described herein is a generic hardware/software communication (HSC) channel that facilitates the re-use of pre-silicon DPI methods to enable FPGA-based post-silicon validation. The HSC channel translates a DPI interface into a hardware FIFO based mechanism. This translation allows the reuse of the methods without having to re-implement the entire flow in pure hardware. The core logic for the transactor remains the same, while only a small layer of the transactor is converted into the FIFO based mechanism.

4.

Granted invention patent
Systems and methods for error detection and control for embedded memory and compute elements [In force]

Publication (announcement) number: US12147302B2

Publication (announcement) date: 2024-11-19

Application number: US17095530

Application date: 2020-11-11

Applicant: Intel Corporation

Inventor: Vasanth Ranganathan , Joydeep Ray , Abhishek R. Appu , Nikos Kaburlasos , Lidong Xu , Subramaniam Maiyuran , Altug Koker , Naveen Matam , James Holland , Brent Insko , Sanjeev Jahagirdar , Scott Janus , Durgaprasad Bilagi , Xinmin Tian

IPC: G06F11/10 , G06F12/0802 , G06T1/20 , G06T1/60

Abstract: Apparatuses including a graphics processing unit, graphics multiprocessor, or graphics processor having an error detection correction logic for cache memory or shared memory are disclosed. In one embodiment, a graphics multiprocessor includes cache or local memory for storing data and error detection correction circuitry integrated with or coupled to the cache or local memory. The error detection correction circuitry is configured to perform a tag read for data of the cache or local memory to check error detection correction information.

5.

Granted invention patent
Systems and methods for cache optimization [In force]

Publication (announcement) number: US12124383B2

Publication (announcement) date: 2024-10-22

Application number: US17862739

Application date: 2022-07-12

Applicant: Intel Corporation

Inventor: Altug Koker , Joydeep Ray , Elmoustapha Ould-Ahmed-Vall , Abhishek Appu , Aravindh Anantaraman , Valentin Andrei , Durgaprasad Bilagi , Varghese George , Brent Insko , Sanjeev Jahagirdar , Scott Janus , Pattabhiraman K , SungYe Kim , Subramaniam Maiyuran , Vasanth Ranganathan , Lakshminarayanan Striramassarma , Xinmin Tian

IPC: G06F12/00 , G06F12/0875 , G06F12/0891 , G06F12/123 , G06T1/60

CPC classification number: G06F12/123 , G06F12/0875 , G06F12/0891 , G06T1/60 , G06F2212/302

Abstract: Systems and methods for improving cache efficiency and utilization are disclosed. In one embodiment, a graphics processor includes processing resources to perform graphics operations and a cache controller of a cache memory that is coupled to the processing resources. The cache controller is configured to set an initial aging policy using an aging field based on age of cache lines within the cache memory and to determine whether a hint or an instruction to indicate a level of aging has been received. In one embodiment, the cache memory is configured to be partitioned into multiple cache regions, wherein the multiple cache regions include a first cache region having a cache eviction policy with a configurable level of data persistence.

6.

Granted invention patent
Compression techniques [In force]

Publication (announcement) number: US12093210B2

Publication (announcement) date: 2024-09-17

Application number: US17430574

Application date: 2020-03-14

Applicant: Intel Corporation

Inventor: Abhishek R. Appu , Altug Koker , Aravindh Anantaraman , Elmoustapha Ould-Ahmed-Vall , Joydeep Ray , Mike Macpherson , Valentin Andrei , Nicolas Galoppo Von Borries , Varghese George , Subramaniam Maiyuran , Vasanth Ranganathan , Jayakrishna P S , K Pattabhiraman , Sudhakar Kamma

IPC: G06F15/78 , G06F7/544 , G06F7/575 , G06F7/58 , G06F9/30 , G06F9/38 , G06F9/50 , G06F12/02 , G06F12/06 , G06F12/0802 , G06F12/0804 , G06F12/0811 , G06F12/0862 , G06F12/0866 , G06F12/0871 , G06F12/0875 , G06F12/0882 , G06F12/0888 , G06F12/0891 , G06F12/0893 , G06F12/0895 , G06F12/0897 , G06F12/1009 , G06F12/128 , G06F15/80 , G06F17/16 , G06F17/18 , G06T1/20 , G06T1/60 , H03M7/46 , G06N3/08 , G06T15/06

CPC classification number: G06F15/7839 , G06F7/5443 , G06F7/575 , G06F7/588 , G06F9/3001 , G06F9/30014 , G06F9/30036 , G06F9/3004 , G06F9/30043 , G06F9/30047 , G06F9/30065 , G06F9/30079 , G06F9/3887 , G06F9/5011 , G06F9/5077 , G06F12/0215 , G06F12/0238 , G06F12/0246 , G06F12/0607 , G06F12/0802 , G06F12/0804 , G06F12/0811 , G06F12/0862 , G06F12/0866 , G06F12/0871 , G06F12/0875 , G06F12/0882 , G06F12/0888 , G06F12/0891 , G06F12/0893 , G06F12/0895 , G06F12/0897 , G06F12/1009 , G06F12/128 , G06F15/8046 , G06F17/16 , G06F17/18 , G06T1/20 , G06T1/60 , H03M7/46 , G06F9/3802 , G06F9/3818 , G06F9/3867 , G06F2212/1008 , G06F2212/1021 , G06F2212/1044 , G06F2212/302 , G06F2212/401 , G06F2212/455 , G06F2212/60 , G06N3/08 , G06T15/06

Abstract: Methods and apparatus relating to techniques for data compression are described. In an example, an apparatus comprises a processor to receive a data compression instruction for a memory segment; and in response to the data compression instruction, compress a sequence of identical memory values in response to a determination that the sequence of identical memory values has a length which exceeds a threshold. Other embodiments are also disclosed and claimed.
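The threshold rule in this abstract can be illustrated with a small software sketch. This is not the patented hardware mechanism: the function names `compress_runs` and `decompress_runs`, the `("run", value, length)` token layout, and the use of a Python list as the "memory segment" are all assumptions made for illustration. The only behavior taken from the abstract is that a sequence of identical values is compressed when its length exceeds a threshold.

```python
from typing import List, Tuple, Union

# Hypothetical token type: either a literal value or a ("run", value, length) tuple.
Token = Union[int, Tuple[str, int, int]]


def compress_runs(values: List[int], threshold: int) -> List[Token]:
    """Replace each run of identical values longer than `threshold` with one run token."""
    out: List[Token] = []
    i = 0
    while i < len(values):
        # Find the end of the run of values equal to values[i].
        j = i
        while j < len(values) and values[j] == values[i]:
            j += 1
        run_len = j - i
        if run_len > threshold:
            out.append(("run", values[i], run_len))  # run exceeds threshold: compress
        else:
            out.extend(values[i:j])  # short run: keep the literals unchanged
        i = j
    return out


def decompress_runs(tokens: List[Token]) -> List[int]:
    """Expand run tokens back into the original sequence of values."""
    out: List[int] = []
    for token in tokens:
        if isinstance(token, tuple):
            _, value, length = token
            out.extend([value] * length)
        else:
            out.append(token)
    return out
```

With a threshold of 3, a run of five zeros collapses to a single token while a run of two sevens stays as literals, and decompression restores the input exactly.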

7.

Invention publication
BARRIERS AND SYNCHRONIZATION FOR MACHINE LEARNING AT AUTONOMOUS MACHINES [Under examination, published]

Publication (announcement) number: US20240280987A1

Publication (announcement) date: 2024-08-22

Application number: US18595649

Application date: 2024-03-05

Applicant: Intel Corporation

Inventor: Abhishek R. Appu , Altug Koker , Joydeep Ray , Balaji Vembu , John C. Weast , Mike B. Macpherson , Dukhwan Kim , Linda L. Hurd , Sanjeev Jahagirdar , Vasanth Ranganathan

IPC: G05D1/00 , G06F9/46 , G06F9/48 , G06F9/52 , G06N3/044 , G06N3/045 , G06N3/063 , G06N3/084 , G06T1/20

CPC classification number: G05D1/0088 , G06F9/4881 , G06F9/522 , G06N3/044 , G06N3/045 , G06N3/063 , G06N3/084 , G06F9/46 , G06T1/20

Abstract: A mechanism is described for facilitating barriers and synchronization for machine learning at autonomous machines. A method of embodiments, as described herein, includes detecting thread groups relating to machine learning associated with one or more processing devices. The method may further include facilitating barrier synchronization of the thread groups across multiple dies such that each thread in a thread group is scheduled across a set of compute elements associated with the multiple dies, where each die represents a processing device of the one or more processing devices, the processing device including a graphics processor.

8.

Invention publication
DATA PREFETCHING FOR GRAPHICS DATA PROCESSING [Under examination, published]

Publication (announcement) number: US20240256456A1

Publication (announcement) date: 2024-08-01

Application number: US18391346

Application date: 2023-12-20

Applicant: Intel Corporation

Inventor: Vikranth Vemulapalli , Lakshminarayanan Striramassarma , Mike MacPherson , Aravindh Anantaraman , Ben Ashbaugh , Murali Ramadoss , William B. Sadler , Jonathan Pearce , Scott Janus , Brent Insko , Vasanth Ranganathan , Kamal Sinha , Arthur Hunter, Jr. , Prasoonkumar Surti , Nicolas Galoppo von Borries , Joydeep Ray , Abhishek R. Appu , ElMoustapha Ould-Ahmed-Vall , Altug Koker , Sungye Kim , Subramaniam Maiyuran , Valentin Andrei

IPC: G06F12/0862 , G06T1/20 , G06T1/60

CPC classification number: G06F12/0862 , G06T1/20 , G06T1/60 , G06F2212/602 , G06F2212/608

Abstract: Embodiments are generally directed to data prefetching for graphics data processing. An embodiment of an apparatus includes one or more processors including one or more graphics processing units (GPUs); and a plurality of caches to provide storage for the one or more GPUs, the plurality of caches including at least an L1 cache and an L3 cache, wherein the apparatus is to provide intelligent prefetching of data by a prefetcher of a first GPU of the one or more GPUs including measuring a hit rate for the L1 cache; upon determining that the hit rate for the L1 cache is equal to or greater than a threshold value, limiting a prefetch of data to storage in the L3 cache, and upon determining that the hit rate for the L1 cache is less than a threshold value, allowing the prefetch of data to the L1 cache.
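The hit-rate rule in this abstract reduces to a single comparison, sketched below. The function name `prefetch_target`, the hit and access counters, and the string return values are hypothetical; the abstract does not say how the hit rate is measured or what happens before any accesses have been observed, so the zero-access case is an assumption.

```python
def prefetch_target(l1_hits: int, l1_accesses: int, threshold: float) -> str:
    """Pick the cache level a prefetch may fill, based on the measured L1 hit rate."""
    if l1_accesses == 0:
        # Assumption (not in the abstract): with no history, treat the hit rate as low.
        hit_rate = 0.0
    else:
        hit_rate = l1_hits / l1_accesses
    # Hit rate at or above the threshold: limit the prefetch to the L3 cache.
    # Hit rate below the threshold: allow the prefetch into the L1 cache.
    return "L3" if hit_rate >= threshold else "L1"
```

One reading of the rule is that a high L1 hit rate means the L1 contents are already useful, so prefetched data is held in L3 rather than displacing them.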

9.

Invention publication
INSTRUCTIONS AND LOGIC TO PERFORM FLOATING POINT AND INTEGER OPERATIONS FOR MACHINE LEARNING [Under examination, published]

Publication (announcement) number: US20240184572A1

Publication (announcement) date: 2024-06-06

Application number: US18528340

Application date: 2023-12-04

Applicant: Intel Corporation

Inventor: Himanshu Kaul , Mark A. Anders , Sanu K. Mathew , Anbang Yao , Joydeep Ray , Ping T. Tang , Michael S. Strickland , Xiaoming Chen , Tatiana Shpeisman , Abhishek R. Appu , Altug Koker , Kamal Sinha , Balaji Vembu , Nicolas C. Galoppo Von Borries , Eriko Nurvitadhi , Rajkishore Barik , Tsung-Han Lin , Vasanth Ranganathan , Sanjeev Jahagirdar

IPC: G06F9/30 , G06F7/483 , G06F7/544 , G06F9/38 , G06F17/16 , G06N3/044 , G06N3/045 , G06N3/063 , G06N3/08 , G06N20/00 , G06T15/00 , G09G5/393

CPC classification number: G06F9/3001 , G06F7/483 , G06F7/5443 , G06F9/30014 , G06F9/30036 , G06F9/3851 , G06N3/044 , G06N3/045 , G06N3/063 , G06N3/08 , G09G5/393 , G06F9/30025 , G06F9/3013 , G06F17/16 , G06F2207/3824 , G06N20/00 , G06T15/005

Abstract: One embodiment provides for a graphics processing unit to accelerate machine-learning operations, the graphics processing unit comprising a multiprocessor having a single instruction, multiple thread (SIMT) architecture, the multiprocessor to execute at least one single instruction; and a first compute unit included within the multiprocessor, the at least one single instruction to cause the first compute unit to perform a two-dimensional matrix multiply and accumulate operation, wherein to perform the two-dimensional matrix multiply and accumulate operation includes to compute an intermediate product of 16-bit operands and to compute a 32-bit sum based on the intermediate product.
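As a rough software analogue of the operation named in this abstract, the sketch below multiplies 16-bit operands and accumulates into 32-bit sums. It uses signed integers for simplicity, although the title also covers floating point. The names `to_int32` and `matmul_accumulate`, the nested-list matrix layout, and the wrap-around behavior on 32-bit overflow are assumptions, not details from the patent.

```python
from typing import List

Matrix = List[List[int]]


def to_int32(x: int) -> int:
    """Wrap a Python int into the signed 32-bit two's-complement range."""
    x &= 0xFFFFFFFF
    return x - 0x100000000 if x >= 0x80000000 else x


def matmul_accumulate(a: Matrix, b: Matrix, c: Matrix) -> Matrix:
    """Return a @ b + c, with 16-bit operands in a and b and 32-bit accumulators in c."""
    for row in a + b:
        for v in row:
            if not -32768 <= v <= 32767:
                raise ValueError("operands must fit in a signed 16-bit integer")
    n, k, m = len(a), len(b), len(b[0])
    out = [[0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            acc = c[i][j]
            for p in range(k):
                # The product of two int16 values always fits in 32 bits;
                # the running sum is wrapped to 32 bits (assumed overflow behavior).
                acc = to_int32(acc + a[i][p] * b[p][j])
            out[i][j] = acc
    return out
```

The accumulator is wider than the operands because a single product of two 16-bit values can need up to 31 bits, so a 16-bit sum would overflow almost immediately.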

10.

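Invention publication
METHOD AND APPARATUS FOR SHARED VIRTUAL MEMORY TO MANAGE DATA COHERENCY IN A HETEROGENEOUS PROCESSING SYSTEM [Under examination, published]

Publication (announcement) number: US20240152457A1

Publication (announcement) date: 2024-05-09

Application number: US18531432

Application date: 2023-12-06

Applicant: Intel Corporation

Inventor: Altug Koker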

IPC: G06F12/0815 , G06F12/0804 , G06F12/0811 , G06F12/1009 , G06F12/1045

CPC classification number: G06F12/0815 , G06F12/0804 , G06F12/0811 , G06F12/1009 , G06F12/1045 , G06F12/1063 , G06F2212/1021 , G06F2212/1024 , G06F2212/281 , G06F2212/302 , G06F2212/608 , G06F2212/656 , G06F2212/657 , G06F2212/68 , G06F2212/682 , G06F2212/684

Abstract: Embodiments described herein provide a scalable coherency tracking implementation that utilizes shared virtual memory to manage data coherency. In one embodiment, coherency tracking granularity is reduced relative to existing coherency tracking solutions, with coherency tracking storage moved to memory as page table metadata. For example, in one embodiment, storage for coherency state is moved from dedicated hardware blocks to system memory, effectively providing a directory structure that is limitless in size.
