-
Publication Number: WO2023022806A1
Publication Date: 2023-02-23
Application Number: PCT/US2022/036112
Filing Date: 2022-07-05
Applicant: INTEL CORPORATION
Inventor: KIM, Sungye , VAIDYANATHAN, Karthik , LIKTOR, Gabor , THOMAS, Manu Mathew
Abstract: One embodiment provides a graphics processor comprising a set of processing resources configured to perform a supersampling operation via a mixed precision convolutional neural network, the set of processing resources including circuitry configured to receive, at an input block of a neural network model, history data, velocity data, and current frame data, pre-process the history data, velocity data, and current frame data to generate pre-processed data, provide the pre-processed data to a feature extraction network of the neural network model, process the pre-processed data at the feature extraction network via one or more encoder stages and one or more decoder stages, and generate an output image via an output block of the neural network model via direct reconstruction or kernel prediction.
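The input block described above warps history data by per-pixel velocity and combines it with the current frame before feature extraction. A minimal 1-D sketch of that pre-processing step, with illustrative names and integer motion offsets (the patent's actual pre-processing and network are not specified here):

```python
def preprocess(history, velocity, current):
    """Warp history by per-pixel velocity, then pair it with the current frame.

    A toy 1-D stand-in for the input block in the abstract: history/current
    are lists of pixel values; velocity gives integer offsets back into the
    history buffer. All names and the clamping policy are assumptions.
    """
    warped = []
    for i, v in enumerate(velocity):
        j = min(max(i - v, 0), len(history) - 1)  # clamp reprojected index
        warped.append(history[j])
    # Concatenate warped history with the current frame as per-pixel features
    # to be fed to the feature extraction network.
    return [(w, c) for w, c in zip(warped, current)]
```

In a real pipeline the paired features would flow through the encoder/decoder stages before direct reconstruction or kernel prediction produces the output image.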
-
Publication Number: WO2020190431A1
Publication Date: 2020-09-24
Application Number: PCT/US2020/017995
Filing Date: 2020-02-12
Applicant: INTEL CORPORATION , ASHBAUGH, Ben , PEARCE, Jonathan , RAMADOSS, Murali , VEMULAPALLI, Vikranth , SADLER, William B. , KIM, Sungye , PETRE, Marian Alin
Inventor: ASHBAUGH, Ben , PEARCE, Jonathan , RAMADOSS, Murali , VEMULAPALLI, Vikranth , SADLER, William B. , KIM, Sungye , PETRE, Marian Alin
Abstract: Embodiments are generally directed to thread group scheduling for graphics processing. An embodiment of an apparatus includes a plurality of processors including a plurality of graphics processors to process data; a memory; and one or more caches for storage of data for the plurality of graphics processors, wherein the one or more processors are to schedule a plurality of groups of threads for processing by the plurality of graphics processors, the scheduling of the plurality of groups of threads including the plurality of processors to apply a bias for scheduling the plurality of groups of threads according to a cache locality for the one or more caches.
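The cache-locality bias in the abstract can be sketched as a scheduler that prefers assigning a thread group to a processor whose cache already holds the group's working set, falling back to round-robin otherwise. The data model (tags per group, a tag set per cache) and the fallback policy are assumptions for illustration:

```python
def schedule_groups(groups, processors, cache_contents):
    """Assign thread groups to processors with a cache-locality bias.

    groups: list of (group_id, data_tag) pairs; cache_contents maps a
    processor to the set of data tags resident in its cache. A group is
    biased toward a processor that already caches its tag; otherwise
    assignment falls back to round-robin. Illustrative only.
    """
    assignment = {}
    rr = 0
    for gid, tag in groups:
        local = [p for p in processors if tag in cache_contents.get(p, set())]
        if local:
            chosen = local[0]  # bias: reuse warm cache
        else:
            chosen = processors[rr % len(processors)]
            rr += 1
        assignment[gid] = chosen
        # Model the side effect of running the group: its data becomes cached.
        cache_contents.setdefault(chosen, set()).add(tag)
    return assignment
```

Groups sharing a data tag end up co-scheduled on the processor whose cache is already warm for that tag.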
-
Publication Number: WO2020190425A1
Publication Date: 2020-09-24
Application Number: PCT/US2020/017743
Filing Date: 2020-02-11
Applicant: INTEL CORPORATION , RAY, Joydeep , ANANTARAMAN, Aravindh , APPU, Abhishek R. , KOKER, Altug , OULD-AHMED-VALL, ElMoustapha , ANDREI, Valentin , MAIYURAN, Subramaniam , GALOPPO VON BORRIES, Nicolas , MACPHERSON, Mike , ASHBAUGH, Ben , RAMADOSS, Murali , VEMULAPALLI, Vikranth , SADLER, William , PEARCE, Jonathan , KIM, Sungye , GEORGE, Varghese
Inventor: RAY, Joydeep , ANANTARAMAN, Aravindh , APPU, Abhishek R. , KOKER, Altug , OULD-AHMED-VALL, ElMoustapha , ANDREI, Valentin , MAIYURAN, Subramaniam , GALOPPO VON BORRIES, Nicolas , MACPHERSON, Mike , ASHBAUGH, Ben , RAMADOSS, Murali , VEMULAPALLI, Vikranth , SADLER, William , PEARCE, Jonathan , KIM, Sungye , GEORGE, Varghese
Abstract: Methods and apparatus relating to scalar core integration in a graphics processor. In an example, an apparatus comprises a processor to receive a set of operations for a graphics workload from a host complex, determine a first subset of operations in the set of operations that is suitable for execution by a scalar processor complex of the graphics processing device and a second subset of operations in the set of operations that is suitable for execution by a vector processor complex of the graphics processing device, assign the first subset of operations to the scalar processor complex for execution to generate a first set of outputs, and assign the second subset of operations to the vector processor complex for execution to generate a second set of outputs. Other embodiments are also disclosed and claimed.
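The scalar/vector split described above can be sketched as a classifier over the incoming operations. The classification rule used here (data width of 1 goes scalar, wider goes vector) is a hypothetical stand-in, not the patent's actual heuristic:

```python
def partition_ops(ops):
    """Split a workload into scalar-suitable and vector-suitable subsets.

    Each op is a (name, width) pair: width-1 ops are routed to the scalar
    processor complex, wider ops to the vector processor complex. The
    width-based rule is an assumption for illustration.
    """
    scalar = [op for op in ops if op[1] == 1]
    vector = [op for op in ops if op[1] > 1]
    return scalar, vector
```

Each subset would then be dispatched to its complex, and the two output sets merged downstream.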
-
Publication Number: WO2023081565A1
Publication Date: 2023-05-11
Application Number: PCT/US2022/077598
Filing Date: 2022-10-05
Applicant: INTEL CORPORATION
Inventor: THOMAS, Manu Mathew , VAIDYANATHAN, Karthik , KAPLANYAN, Anton , KIM, Sungye , LIKTOR, Gabor
Abstract: Joint denoising and supersampling of graphics data is described. An example of a graphics processor includes multiple processing resources, including at least a first processing resource including a pipeline to perform a supersampling operation; and the pipeline including circuitry to jointly perform denoising and supersampling of received ray tracing input data, the circuitry including first circuitry to receive input data associated with an input block for a neural network, second circuitry to perform operations associated with a feature extraction and kernel prediction network of the neural network, and third circuitry to perform operations associated with a filtering block of the neural network.
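The filtering block driven by a kernel-prediction network applies per-pixel predicted weights to the noisy input. A toy 1-D version, assuming 3-tap normalized kernels and clamped borders (both assumptions; the actual kernel shape and border handling are not given in the abstract):

```python
def apply_predicted_kernels(noisy, kernels):
    """Filter a noisy 1-D signal with per-pixel predicted kernels.

    kernels[i] is a 3-tap weight list (assumed normalized) applied over
    pixel i's neighborhood with clamped borders -- a toy stand-in for
    the filtering block fed by a kernel-prediction network.
    """
    n = len(noisy)
    out = []
    for i, w in enumerate(kernels):
        taps = [noisy[min(max(i + d, 0), n - 1)] for d in (-1, 0, 1)]
        out.append(sum(t * wt for t, wt in zip(taps, w)))
    return out
```

In the joint design, the same network features would drive both the denoising kernels and the supersampled reconstruction.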
-
Publication Number: WO2020190429A1
Publication Date: 2020-09-24
Application Number: PCT/US2020/017897
Filing Date: 2020-02-12
Applicant: INTEL CORPORATION , VEMULAPALLI, Vikranth , STRIRAMASSARMA, Lakshminarayanan , MACPHERSON, Mike , ANANTARAMAN, Aravindh , ASHBAUGH, Ben , RAMADOSS, Murali , SADLER, William B. , PEARCE, Jonathan , JANUS, Scott , INSKO, Brent , RANGANATHAN, Vasanth , SINHA, Kamal , HUNTER, Arthur , SURTI, Prasoonkumar , GALOPPO VON BORRIES, Nicolas , RAY, Joydeep , APPU, Abhisek R. , OULD-AHMED-VALL, ElMoustapha , KOKER, Altug , KIM, Sungye , MAIYURAN, Subramaniam , ANDREI, Valentin
Inventor: VEMULAPALLI, Vikranth , STRIRAMASSARMA, Lakshminarayanan , MACPHERSON, Mike , ANANTARAMAN, Aravindh , ASHBAUGH, Ben , RAMADOSS, Murali , SADLER, William B. , PEARCE, Jonathan , JANUS, Scott , INSKO, Brent , RANGANATHAN, Vasanth , SINHA, Kamal , HUNTER, Arthur , SURTI, Prasoonkumar , GALOPPO VON BORRIES, Nicolas , RAY, Joydeep , APPU, Abhisek R. , OULD-AHMED-VALL, ElMoustapha , KOKER, Altug , KIM, Sungye , MAIYURAN, Subramaniam , ANDREI, Valentin
IPC: G06F12/0862 , G06F12/0897 , G06F12/0888 , G06F9/38
Abstract: Embodiments are generally directed to data prefetching for graphics data processing. An embodiment of an apparatus includes one or more processors including one or more graphics processing units (GPUs); and a plurality of caches to provide storage for the one or more GPUs, the plurality of caches including at least an L1 cache and an L3 cache, wherein the apparatus to provide intelligent prefetching of data by a prefetcher of a first GPU of the one or more GPUs including measuring a hit rate for the L1 cache; upon determining that the hit rate for the L1 cache is equal to or greater than a threshold value, limiting a prefetch of data to storage in the L3 cache, and upon determining that the hit rate for the L1 cache is less than a threshold value, allowing the prefetch of data to the L1 cache.
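The prefetch policy above reduces to a simple gate on the measured L1 hit rate: prefetch only into L3 when L1 is already effective, otherwise allow prefetch into L1. A sketch of that decision, with an illustrative threshold value:

```python
def prefetch_target(l1_hits, l1_accesses, threshold=0.9):
    """Choose the prefetch destination cache from the L1 hit rate.

    Mirrors the rule in the abstract: when the L1 hit rate is at or above
    the threshold, limit prefetch to L3 (avoiding L1 pollution); below the
    threshold, prefetch into L1. The 0.9 threshold is an assumption.
    """
    hit_rate = l1_hits / l1_accesses if l1_accesses else 0.0
    return "L3" if hit_rate >= threshold else "L1"
```

Keeping prefetched lines out of a well-performing L1 preserves its working set while still hiding L3 latency for future misses.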
-
Publication Number: WO2020190422A1
Publication Date: 2020-09-24
Application Number: PCT/US2020/017521
Filing Date: 2020-02-10
Applicant: INTEL CORPORATION , RAMADOSS, Murali , VEMULAPALLI, Vikranth , COORAY, Niran , SADLER, William B. , PEARCE, Jonathan D. , PETRE, Marian Alin , ASHBAUGH, Ben , OULD-AHMED-VALL, ElMoustapha , GALOPPO VON BORRIES, Nicolas , KOKER, Altug , ANANTARAMAN, Aravindh , MAIYURAN, Subramaniam , GEORGE, Varghese , KIM, Sungye , VALENTIN, Andrei
Inventor: RAMADOSS, Murali , VEMULAPALLI, Vikranth , COORAY, Niran , SADLER, William B. , PEARCE, Jonathan D. , PETRE, Marian Alin , ASHBAUGH, Ben , OULD-AHMED-VALL, ElMoustapha , GALOPPO VON BORRIES, Nicolas , KOKER, Altug , ANANTARAMAN, Aravindh , MAIYURAN, Subramaniam , GEORGE, Varghese , KIM, Sungye , VALENTIN, Andrei
Abstract: Methods and apparatus relating to predictive page fault handling. In an example, an apparatus comprises a processor to receive a virtual address that triggered a page fault for a compute process, check a virtual memory space for a virtual memory allocation for the compute process that triggered the page fault and manage the page fault according to one of a first protocol in response to a determination that the virtual address that triggered the page fault is a last page in the virtual memory allocation for the compute process, or a second protocol in response to a determination that the virtual address that triggered the page fault is not a last page in the virtual memory allocation for the compute process. Other embodiments are also disclosed and claimed.
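The protocol selection in the abstract hinges on one predicate: whether the faulting virtual address falls in the last page of the compute process's allocation. A sketch of that check, where the 4 KiB page size and the protocol labels are assumptions (the abstract does not name the two protocols):

```python
PAGE = 4096  # assumed page size

def handle_fault(fault_addr, alloc_base, alloc_size):
    """Pick a page-fault handling protocol per the abstract's rule.

    If the faulting address lies in the allocation's last page, use the
    first protocol; otherwise use the second (e.g. one that can map the
    remaining pages predictively). Labels are illustrative placeholders.
    """
    last_page_start = alloc_base + alloc_size - PAGE
    if fault_addr >= last_page_start:
        return "protocol-1"
    return "protocol-2"
```

The point of the split is that a fault before the last page predicts more faults in the same allocation, so the handler can treat it differently from a fault at the allocation's end.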