-
公开(公告)号:US20220398101A1
公开(公告)日:2022-12-15
申请号:US17848559
申请日:2022-06-24
Applicant: Intel Corporation
Inventor: Balaji Vembu , Abhishek R. Appu , Joydeep Ray , Altug Koker
IPC: G06F9/38 , G06F9/46 , G06T1/20 , G06F9/52 , G06F9/48 , G06F9/54 , G06F15/16 , G06F9/50 , G06F15/76 , G06F12/0897 , G06F12/0866 , G06T1/60
Abstract: An apparatus to facilitate thread scheduling is disclosed. The apparatus includes logic to store barrier usage data based on a magnitude of barrier messages in an application kernel and a scheduler to schedule execution of threads across a plurality of multiprocessors based on the barrier usage data.
-
公开(公告)号:US20220358618A1
公开(公告)日:2022-11-10
申请号:US17870169
申请日:2022-07-21
Applicant: Intel Corporation
Inventor: Balaji Vembu , Altug Koker , Joydeep Ray
Abstract: One embodiment provides a graphics processor including a plurality of processing clusters, each processing cluster including a plurality of multiprocessors and a data interconnect coupled to the plurality of multiprocessors. At least one multiprocessor of the plurality of multiprocessors is configured to share data with another multiprocessor over the data interconnect.
-
193.
公开(公告)号:US20220357945A1
公开(公告)日:2022-11-10
申请号:US17834482
申请日:2022-06-07
Applicant: Intel Corporation
Inventor: Himanshu Kaul , Mark A. Anders , Sanu K. Mathew , Anbang Yao , Joydeep Ray , Ping T. Tang , Michael S. Strickland , Xiaoming Chen , Tatiana Shpeisman , Abhishek R. Appu , Altug Koker , Kamal Sinha , Balaji Vembu , Nicolas C. Galoppo Von Borries , Eriko Nurvitadhi , Rajkishore Barik , Tsung-Han Lin , Vasanth Ranganathan , Sanjeev Jahagirdar
Abstract: One embodiment provides a graphics processor comprising a memory controller and a graphics processing resource coupled with the memory controller. The graphics processing resource includes circuitry configured to execute an instruction to perform a matrix operation on first input including weight data and second input including input activation data, generate intermediate data based on a result of the matrix operation, quantize the intermediate data to a floating-point format determined based on a statistical distribution of first output data, and output, as second output data, quantized intermediate data in a determined floating-point format.
-
公开(公告)号:US11494967B2
公开(公告)日:2022-11-08
申请号:US17173892
申请日:2021-02-11
Applicant: Intel Corporation
Inventor: Balaji Vembu , Murali Ramadoss , David I. Standring , Shruti A. Sethi , Jeffrey S. Frizzell , Alan M. Curtis , Abhishek R. Appu , Joydeep Ray , Altug Koker
IPC: G06T15/00 , G06F12/084 , G06F12/10 , G06F12/02 , G06F12/0811 , G06F12/0813 , G06F13/16 , G06F12/0831
Abstract: In an example, an apparatus comprises a plurality of execution units, and logic, at least partially including hardware logic, to create a scatter gather list in memory and collect a plurality of operating statistics for the plurality of execution units using the scatter gather list. Other embodiments are also disclosed and claimed.
-
公开(公告)号:US11494232B2
公开(公告)日:2022-11-08
申请号:US17103626
申请日:2020-11-24
Applicant: Intel Corporation
Inventor: Altug Koker , Joydeep Ray , Balaji Vembu , James A. Valerio , Abhishek R. Appu
Abstract: A mechanism is described for facilitating memory-based software barriers to emulate hardware barriers at graphics processors in computing devices. A method of embodiments, as described herein, includes facilitating converting thread scheduling at a processor from hardware barriers to software barriers, where the software barriers emulate the hardware barriers.
-
公开(公告)号:US20220351325A1
公开(公告)日:2022-11-03
申请号:US17749266
申请日:2022-05-20
Applicant: Intel Corporation
Inventor: Altug Koker , Ingo Wald , David Puffer , Subramaniam M. Maiyuran , Prasoonkumar Surti , Balaji Vembu , Guei-Yuan Lueh , Murali Ramadoss , Abhishek R. Appu , Joydeep Ray
Abstract: One embodiment provides a parallel processor comprising a memory interface and a processing array coupled with the memory interface. The processing array is configured to address memory accessed via the memory interface via a virtual address mapping and includes circuitry to resolve a page fault for the virtual address mapping, wherein each of the multiple compute blocks is separately preemptable.
-
公开(公告)号:US20220334977A1
公开(公告)日:2022-10-20
申请号:US17715734
申请日:2022-04-07
Applicant: Intel Corporation
Inventor: Altug Koker , Balaji Vembu , Joydeep Ray , Abhishek R. Appu
IPC: G06F12/0895 , G06F12/126 , G06F12/02 , G06T1/60
Abstract: A mechanism is described for facilitating optimization of cache associated with graphics processors at computing devices. A method of embodiments, as described herein, includes introducing coloring bits to contents of a cache associated with a processor including a graphics processor, wherein the coloring bits to represent a signal identifying one or more caches available for use, while avoiding explicit invalidations and flushes.
-
公开(公告)号:US11461959B2
公开(公告)日:2022-10-04
申请号:US16865587
申请日:2020-05-04
Applicant: Intel Corporation
Inventor: Prasoonkumar Surti , Karthik Vaidyanathan , Atsuo Kuwahara , Hugues Labbe , Sameer Kp , Jonathan Kennedy , Abhishek R. Appu , Jeffery S. Boles , Balaji Vembu , Michael Apodaca , Slawomir Grajewski , Gabor Liktor , David M. Cimini , Andrew T. Lauritzen , Travis T. Schluessler , Murali Ramadoss , Abhishek Venkatesh , Joydeep Ray , Kai Xiao , Ankur N. Shah , Altug Koker
Abstract: The systems, apparatuses and methods may provide a way to adaptively process and aggressively cull geometry data. Systems, apparatuses and methods may provide for processing, by a positional only shading pipeline (POSH), geometry data including surface triangles for a digital representation of a scene. More particularly, systems, apparatuses and methods may provide a way to identify surface triangles in one or more exclusion zones and non-exclusion zones, and cull surface triangles in one or more exclusion zones.
-
公开(公告)号:US11430082B2
公开(公告)日:2022-08-30
申请号:US17143805
申请日:2021-01-07
Applicant: Intel Corporation
Inventor: Abhishek R. Appu , Altug Koker , John C. Weast , Mike B. Macpherson , Linda L. Hurd , Sara S. Baghsorkhi , Justin E. Gottschlich , Prasoonkumar Surti , Chandrasekaran Sakthivel , Liwei Ma , Elmoustapha Ould-Ahmed-Vall , Kamal Sinha , Joydeep Ray , Balaji Vembu , Sanjeev Jahagirdar , Vasanth Ranganathan , Dukhwan Kim
Abstract: A mechanism is described for facilitating inference coordination and processing utilization for machine learning at autonomous machines. A method of embodiments, as described herein, includes detecting, at training time, information relating to one or more tasks to be performed according to a training dataset relating to a processor including a graphics processor. The method may further include analyzing the information to determine one or more portions of hardware relating to the processor capable of supporting the one or more tasks, and configuring the hardware to pre-select the one or more portions to perform the one or more tasks, while other portions of the hardware remain available for other tasks.
-
公开(公告)号:US20220230269A1
公开(公告)日:2022-07-21
申请号:US17591152
申请日:2022-02-02
Applicant: Intel Corporation
Inventor: Balaji Vembu , Altug Koker , Joydeep Ray
Abstract: One embodiment provides an apparatus comprising an interconnect fabric comprising one or more fabric switches, a plurality of memory interfaces coupled to the interconnect fabric to provide access to a plurality of memory devices, an input/output (IO) interface coupled to the interconnect fabric to provide access to IO devices, an array of multiprocessors coupled to the interconnect fabric, scheduling circuitry to distribute a plurality of thread groups across the array of multiprocessors, each thread group comprising a plurality of threads and each thread comprising a plurality of instructions to be executed by at least one of the multiprocessors, and a first multiprocessor of the array of multiprocessors to be assigned to process a first thread group comprising a first plurality of threads, the first multiprocessor comprising a plurality of parallel execution circuits.
-
-
-
-
-
-
-
-
-