Patent search ap:("Intel Corporation") AND inv:"Guei-Yuan Lueh" Page 3

21.

Published invention application
INCREASING PROCESSING RESOURCES IN PROCESSING CORES OF A GRAPHICS ENVIRONMENT [Under examination, published]

Publication No.: US20240160478A1

Publication date: 2024-05-16

Application No.: US17987185

Filing date: 2022-11-15

Applicant: Intel Corporation

Inventors: Jiasheng Chen, Chunhui Mei, Ben J. Ashbaugh, Naveen Matam, Joydeep Ray, Timothy Bauer, Guei-Yuan Lueh, Vasanth Ranganathan, Prashant Chaudhari, Vikranth Vemulapalli, Nishanth Reddy Pendluru, Piotr Reiter, Jain Philip, Marek Rudniewski, Christopher Spencer, Parth Damani, Prathamesh Raghunath Shinde, John Wiegert, Fataneh Ghodrat

IPC classification: G06F9/50, G06F12/0875

CPC classification: G06F9/5016, G06F12/0875, G06F2212/452

Abstract: An apparatus to facilitate increasing processing resources in processing cores of a graphics environment is disclosed. The apparatus includes a plurality of processing resources to execute one or more execution threads; a plurality of message arbiter-processing resource (MA-PR) routers, wherein a respective MA-PR router of the plurality of MA-PR routers corresponds to a pair of processing resources of the plurality of processing resources and is to arbitrate routing of a thread control message from a message arbiter between the pair of processing resources; a plurality of local shared cache (LSC) sequencers to provide an interface between at least one LSC of the processing core and the plurality of processing resources; and a plurality of instruction caches (ICs) to store instructions of the one or more execution threads, wherein a respective IC of the plurality of ICs interfaces with a portion of the plurality of processing resources.

22.

Published invention application
INSTRUCTION PREFETCH MECHANISM [Under examination, published]

Publication No.: US20240086329A1

Publication date: 2024-03-14

Application No.: US18470553

Filing date: 2023-09-20

Applicant: Intel Corporation

Inventors: Vasileios Porpodas, Guei-Yuan Lueh, Subramaniam Maiyuran, Wei-Yu Chen

IPC classification: G06F12/0862, G06F8/41, G06F9/30, G06F12/0875

CPC classification: G06F12/0862, G06F8/41, G06F8/4442, G06F9/30047, G06F12/0875, G06F2201/885, G06F2212/1016, G06F2212/452, G06F2212/502, G06F2212/602, G06F2212/6028

Abstract: An apparatus to facilitate data prefetching is disclosed. The apparatus includes a cache, one or more execution units (EUs) to execute program code, prefetch logic to maintain tracking information of memory instructions in the program code that trigger a cache miss and compiler logic to receive the tracking information, insert one or more pre-fetch instructions in updated program code to prefetch data from a memory for execution of one or more of the memory instructions that triggered a cache miss and download the updated program code for execution by the one or more EUs.

23.

Published invention application
GRAPHICS PROCESSING UNIT PROCESSING AND CACHING IMPROVEMENTS [Under examination, published]

Publication No.: US20240078630A1

Publication date: 2024-03-07

Application No.: US18490593

Filing date: 2023-10-19

Applicant: Intel Corporation

Inventors: Subramaniam Maiyuran, Durgaprasad Bilagi, Joydeep Ray, Scott Janus, Sanjeev Jahagirdar, Brent Insko, Lidong Xu, Abhishek R. Appu, James Holland, Vasanth Ranganathan, Nikos Kaburlasos, Altug Koker, Xinmin Tian, Guei-Yuan Lueh, Changliang Wang

IPC classification: G06T1/60, G06F12/0802, G06N5/04, G06T1/20

CPC classification: G06T1/60, G06F12/0802, G06N5/04, G06T1/20, G06F2212/251

Abstract: Embodiments described herein are generally directed to improvements relating to power, latency, bandwidth and/or performance issues relating to GPU processing/caching. According to one embodiment, a system includes a producer intellectual property (IP) (e.g., a media IP), a compute core (e.g., a GPU or an AI-specific core of the GPU), a streaming buffer logically interposed between the producer IP and the compute core. The producer IP is operable to consume data from memory and output results to the streaming buffer. The compute core is operable to perform AI inference processing based on data consumed from the streaming buffer and output AI inference processing results to the memory.

24.

Published invention application
HYBRID LOW POWER HOMOGENOUS GRAPICS PROCESSING UNITS [Under examination, published]

Publication No.: US20240004713A1

Publication date: 2024-01-04

Application No.: US18363339

Filing date: 2023-08-01

Applicant: Intel Corporation

Inventors: Abhishek R. APPU, Altug KOKER, Balaji VEMBU, Joydeep RAY, Kamal SINHA, Prasoonkumar SURTI, Kiran C. VEERNAPU, Subramaniam MAIYURAN, Sanjeev S. Jahagirdar, Eric J. Asperheim, Guei-Yuan Lueh, David Puffer, Wenyin Fu, Nikos Kaburlasos, Bhushan M. Borole, Josh B. Mastronarde, Linda L. Hurd, Travis T. Schluessler, Tomasz Janczak, Abhishek Venkatesh, Kai Xiao, Slawomir Grajewski

IPC classification: G06F9/50, G06F1/329, G06F9/48, G06T1/20, G06T1/60, G06T15/00

CPC classification: G06F9/5016, G06F9/5044, G06F1/329, G06F9/4893, G06T1/20, G06T1/60, G06T15/005, Y02D10/00, G06T2200/28

Abstract: In an example, an apparatus comprises a plurality of execution units comprising at least a first type of execution unit and a second type of execution unit and logic, at least partially including hardware logic, to analyze a workload and assign the workload to one of the first type of execution unit or the second type of execution unit. Other embodiments are also disclosed and claimed.

25.

Granted invention patent
Instruction prefetch mechanism [In force]

Publication No.: US11803476B2

Publication date: 2023-10-31

Application No.: US17210867

Filing date: 2021-03-24

Applicant: Intel Corporation

Inventors: Vasileios Porpodas, Guei-Yuan Lueh, Subramaniam Maiyuran, Wei-Yu Chen

IPC classification: G06F12/0862, G06F12/0875, G06F9/30, G06F8/41

CPC classification: G06F12/0862, G06F8/41, G06F8/4442, G06F9/30047, G06F12/0875, G06F2201/885, G06F2212/1016, G06F2212/452, G06F2212/502, G06F2212/602, G06F2212/6028

Abstract: An apparatus to facilitate data prefetching is disclosed. The apparatus includes a cache, one or more execution units (EUs) to execute program code, prefetch logic to maintain tracking information of memory instructions in the program code that trigger a cache miss and compiler logic to receive the tracking information, insert one or more pre-fetch instructions in updated program code to prefetch data from a memory for execution of one or more of the memory instructions that triggered a cache miss and download the updated program code for execution by the one or more EUs.

26.

Granted invention patent
Data locality enhancement for graphics processing units [In force]

Publication No.: US11726793B2

Publication date: 2023-08-15

Application No.: US17095585

Filing date: 2020-11-11

Applicant: Intel Corporation

Inventors: Christopher J. Hughes, Prasoonkumar Surti, Guei-Yuan Lueh, Adam T. Lake, Jill Boyce, Subramaniam Maiyuran, Lidong Xu, James M. Holland, Vasanth Ranganathan, Nikos Kaburlasos, Altug Koker, Abhishek R. Appu

IPC classification: G06F9/38, G06F12/084, G06T1/60, G06F9/50, G06F9/54

CPC classification: G06F9/3891, G06F9/5066, G06F9/544, G06F12/084, G06T1/60

Abstract: Embodiments described herein provide an apparatus comprising a plurality of processing resources including a first processing resource and a second processing resource, a memory communicatively coupled to the first processing resource and the second processing resource, and a processor to receive data dependencies for one or more tasks comprising one or more producer tasks executing on the first processing resource and one or more consumer tasks executing on the second processing resource and move a data output from one or more producer tasks executing on the first processing resource to a cache memory communicatively coupled to the second processing resource. Other embodiments may be described and claimed.

27.

Granted invention patent
Graphics processors and graphics processing units having dot product accumulate instruction for hybrid floating point format [In force]

Publication No.: US11709793B2

Publication date: 2023-07-25

Application No.: US17827067

Filing date: 2022-05-27

Applicant: Intel Corporation

Inventors: Subramaniam Maiyuran, Shubra Marwaha, Ashutosh Garg, Supratim Pal, Jorge Parra, Chandra Gurram, Varghese George, Darin Starkey, Guei-Yuan Lueh

IPC classification: G06T15/06, G06F9/30, G06F15/78, G06F9/38, G06F17/18, G06F12/0802, G06F7/544, G06F7/575, G06F12/02, G06F12/0866, G06F12/0875, G06F12/0895, G06F12/128, G06F12/06, G06F12/1009, G06T1/20, G06T1/60, H03M7/46, G06F12/0811, G06F15/80, G06F17/16, G06F7/58, G06F12/0871, G06F12/0862, G06F12/0897, G06F9/50, G06F12/0804, G06F12/0882, G06F12/0891, G06F12/0893, G06F12/0888, G06N3/08

CPC classification: G06F15/7839, G06F7/5443, G06F7/575, G06F7/588, G06F9/3001, G06F9/3004, G06F9/30014, G06F9/30036, G06F9/30043, G06F9/30047, G06F9/30065, G06F9/30079, G06F9/3887, G06F9/5011, G06F9/5077, G06F12/0215, G06F12/0238, G06F12/0246, G06F12/0607, G06F12/0802, G06F12/0804, G06F12/0811, G06F12/0862, G06F12/0866, G06F12/0871, G06F12/0875, G06F12/0882, G06F12/0888, G06F12/0891, G06F12/0893, G06F12/0895, G06F12/0897, G06F12/1009, G06F12/128, G06F15/8046, G06F17/16, G06F17/18, G06T1/20, G06T1/60, H03M7/46, G06F9/3802, G06F9/3818, G06F9/3867, G06F2212/1008, G06F2212/1021, G06F2212/1044, G06F2212/302, G06F2212/401, G06F2212/455, G06F2212/60, G06N3/08, G06T15/06

Abstract: Described herein is a graphics processing unit (GPU) comprising a first processing cluster to perform parallel processing operations, the parallel processing operations including a ray tracing operation and a matrix multiply operation; and a second processing cluster coupled to the first processing cluster, wherein the first processing cluster includes a floating-point unit to perform floating point operations, the floating-point unit is configured to process an instruction using a bfloat16 (BF16) format with a multiplier to multiply second and third source operands while an accumulator adds a first source operand with output from the multiplier.

28.

Granted invention patent
Register sharing mechanism to equally allocate disabled thread registers to active threads [In force]

Publication No.: US11579878B2

Publication date: 2023-02-14

Application No.: US16881920

Filing date: 2020-05-22

Applicant: Intel Corporation

Inventors: Pratik J. Ashar, Supratim Pal, Subramaniam Maiyuran, Wei-Yu Chen, Guei-Yuan Lueh

IPC classification: G06F9/30, G06F9/38, G06F9/50, G06F8/41

Abstract: An apparatus is disclosed. The apparatus includes one or more processors comprising register sharing circuitry to receive meta-information indicating a number of threads that are to be disabled and provide an indication that an associated thread is disabled, a plurality of General Purpose Register Files (GRFs), wherein one or more of the plurality of GRFs is associated with one of the plurality of threads and a plurality of multiplexers coupled to the one or more GRFs to receive the indication from the register sharing circuitry and disable thread access to an associated GRF based on an indication that a thread is to be disabled.

29.

Granted invention patent
Hierarchical general register file (GRF) for execution block [In force]

Publication No.: US11507375B2

Publication date: 2022-11-22

Application No.: US17319056

Filing date: 2021-05-12

Applicant: Intel Corporation

Inventors: Abhishek R. Appu, Altug Koker, Joydeep Ray, Kamal Sinha, Kiran C. Veernapu, Subramaniam Maiyuran, Prasoonkumar Surti, Guei-Yuan Lueh, David Puffer, Supratim Pal, Eric J. Hoekstra, Travis T. Schluessler, Linda L. Hurd

IPC classification: G06F9/30, G06T15/00, G06T1/60, G06T1/20, G06F9/46, G09G5/36, G06F9/38

Abstract: In an example, an apparatus comprises a plurality of execution units, and a first general register file (GRF) communicatively couple to the plurality of execution units, wherein the first GRF is shared by the plurality of execution units. Other embodiments are also disclosed and claimed.

30.

Granted invention patent
Graphics processors and graphics processing units having dot product accumulate instruction for hybrid floating point format [In force]

Publication No.: US11361496B2

Publication date: 2022-06-14

Application No.: US17304092

Filing date: 2021-06-14

Applicant: Intel Corporation

Inventors: Subramaniam Maiyuran, Shubra Marwaha, Ashutosh Garg, Supratim Pal, Jorge Parra, Chandra Gurram, Varghese George, Darin Starkey, Guei-Yuan Lueh

IPC classification: G06T15/06, G06F9/30, G06F9/38, G06F17/18

Abstract: Described herein is a graphics processing unit (GPU) comprising a single instruction, multiple thread (SIMT) multiprocessor comprising an instruction cache, a shared memory coupled with the instruction cache, and circuitry coupled with the shared memory and the instruction cache, the circuitry including multiple texture units, a first core including hardware to accelerate matrix operations, and a second core configured to receive an instruction having multiple operands in a bfloat16 (BF16) number format, wherein the multiple operands include a first source operand, a second source operand, and a third source operand, and the BF16 number format is a sixteen-bit floating point format having an eight-bit exponent and process the instruction, wherein to process the instruction includes to multiply the second source operand by the third source operand and add a first source operand to a result of the multiply.
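
Note on the BF16 arithmetic described in the abstract above: the following is a minimal Python sketch, not the patented hardware. It assumes BF16 values are obtained by rounding an IEEE 754 single-precision value to its upper 16 bits (1 sign bit, 8 exponent bits, 7 mantissa bits) with round-to-nearest-even. The function names are hypothetical, NaN inputs are not handled, and the accumulation precision is an assumption, since the abstract does not specify it.

```python
import struct


def float_to_bf16_bits(x: float) -> int:
    """Round x to BF16 and return the 16-bit pattern (NaN is not handled)."""
    # Reinterpret the value as a 32-bit IEEE 754 single-precision pattern.
    bits32 = struct.unpack("<I", struct.pack("<f", x))[0]
    # Round to nearest, ties to even, on the 16 low bits that BF16 drops.
    rounding_bias = 0x7FFF + ((bits32 >> 16) & 1)
    return ((bits32 + rounding_bias) >> 16) & 0xFFFF


def bf16_bits_to_float(b: int) -> float:
    """Expand a 16-bit BF16 pattern back to a Python float."""
    return struct.unpack("<f", struct.pack("<I", (b & 0xFFFF) << 16))[0]


def bf16_multiply_add(src0: float, src1: float, src2: float) -> float:
    """Compute src0 + src1 * src2 with all three operands rounded to BF16.

    Assumption: the product and sum are kept in Python's double precision;
    the abstract does not say how wide the accumulator is.
    """
    a0 = bf16_bits_to_float(float_to_bf16_bits(src0))
    a1 = bf16_bits_to_float(float_to_bf16_bits(src1))
    a2 = bf16_bits_to_float(float_to_bf16_bits(src2))
    return a0 + a1 * a2


if __name__ == "__main__":
    print(hex(float_to_bf16_bits(1.0)))
    print(bf16_multiply_add(1.0, 2.0, 3.0))
```

Because BF16 keeps the same eight-bit exponent as single precision, the conversion is a 16-bit shift plus rounding, which is why the sketch needs no exponent rebiasing.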
