-
Publication No.: US20200327635A1
Publication Date: 2020-10-15
Application No.: US16714862
Filing Date: 2019-12-16
Applicant: Intel Corporation
Inventor: Altug Koker , Balaji Vembu , Joydeep Ray , James A. Valerio , Abhishek R. Appu
Abstract: One embodiment provides for a general-purpose graphics processing unit comprising a processing array including multiple compute blocks, each compute block including multiple processing clusters and a thread dispatch unit to dispatch threads of a workload to the multiple compute blocks based on a parallelism metric, wherein the thread dispatch unit, based on the parallelism metric, is to perform one of a first operation and a second operation, the first operation to distribute threads across the multiple compute blocks and the second operation is to concentrate threads within one of the multiple compute blocks.
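The distribute-versus-concentrate choice in this abstract can be illustrated with a small sketch. Everything in it is hypothetical: the name `dispatch_threads`, the `threshold` and `target_block` parameters, and the rule that a metric at or above the threshold spreads threads round-robin while a lower metric packs them into one block. The abstract defines neither the metric nor which direction triggers which operation.

```python
from typing import List


def dispatch_threads(num_threads: int, num_blocks: int,
                     parallelism_metric: float, threshold: float = 0.5,
                     target_block: int = 0) -> List[int]:
    """Return the compute-block index assigned to each thread.

    Assumption (not from the abstract): a metric >= threshold means the
    workload is parallel enough to spread out; otherwise it is packed.
    """
    if num_blocks <= 0:
        raise ValueError("num_blocks must be positive")
    if parallelism_metric >= threshold:
        # First operation: distribute threads across all compute blocks.
        return [t % num_blocks for t in range(num_threads)]
    # Second operation: concentrate threads within a single compute block.
    return [target_block] * num_threads
```

For example, `dispatch_threads(6, 3, 0.9)` assigns threads round-robin over three blocks, while `dispatch_threads(4, 3, 0.1)` places all four threads in block 0.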
-
Publication No.: US10783084B2
Publication Date: 2020-09-22
Application No.: US16702073
Filing Date: 2019-12-03
Applicant: Intel Corporation
Inventor: Abhishek R. Appu , Altug Koker , Joydeep Ray , David Puffer , Prasoonkumar Surti , Lakshminarayanan Striramassarma , Vasanth Ranganathan , Kiran C. Veernapu , Balaji Vembu , Pattabhiraman K
IPC: G06F12/0877 , G06F12/0868 , G06F12/0846 , G06F12/0855 , G06F12/0802 , G06F12/0806 , G06F12/0893 , G06F12/126 , G06T1/60
Abstract: In an example, an apparatus comprises a plurality of execution units, and a cache memory communicatively coupled to the plurality of execution units, wherein the cache memory is structured into a plurality of sectors, wherein each sector in the plurality of sectors comprises at least two cache lines. Other embodiments are also disclosed and claimed.
-
Publication No.: US20200241622A1
Publication Date: 2020-07-30
Application No.: US16782791
Filing Date: 2020-02-05
Applicant: Intel Corporation
Inventor: Abhishek R. Appu , Altug Koker , Eric J. Hoekstra , Kiran C. Veernapu , Prasoonkumar Surti , Vasanth Ranganathan , Kamal Sinha , Balaji Vembu , Eric J. Asperheim , Sanjeev S. Jahagirdar , Joydeep Ray
IPC: G06F1/3225 , G06F1/3234
Abstract: Methods and apparatus relating to techniques for avoiding cache lookup for cold cache. In an example, an apparatus comprises logic, at least partially comprising hardware logic, to receive data for a current write operation to a memory, determine a number of bits in the received data for the current write operation to the memory which have changed from a previous write operation to the memory and in response to a determination that the number of bits in the received data for the current write operation to the memory which have changed from a previous write operation to the memory exceeds a threshold, to toggle a plurality of bits in the data for the current write operation to create an encoded data set and set an indicator bit to a value which indicates that the plurality of bits have been toggled. Other embodiments are also disclosed and claimed.
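The write-encoding step described in this abstract (count the bits that changed since the previous write, and if the count exceeds a threshold, toggle bits and set an indicator) can be sketched as follows. The function names, the 8-bit default `width`, the default threshold of half the width, and the choice to toggle every bit of the word are assumptions made for illustration; the abstract says only that "a plurality of bits" is toggled.

```python
from typing import Optional, Tuple


def encode_write(prev: int, cur: int, width: int = 8,
                 threshold: Optional[int] = None) -> Tuple[int, int]:
    """Return (encoded_word, indicator_bit) for the current write."""
    mask = (1 << width) - 1
    if threshold is None:
        threshold = width // 2  # assumed default, not from the abstract
    # Count the bits that changed relative to the previous write.
    changed = bin((prev ^ cur) & mask).count("1")
    if changed > threshold:
        # Toggle the bits (here: all of them) and flag the word as toggled.
        return (~cur) & mask, 1
    return cur & mask, 0


def decode_write(encoded: int, indicator: int, width: int = 8) -> int:
    """Undo the toggle when the indicator bit is set."""
    mask = (1 << width) - 1
    return (~encoded) & mask if indicator else encoded & mask
```

With a previous word of `0x00` and a current word of `0xFE`, seven bits change, which exceeds the assumed threshold of four, so the sketch stores `0x01` with the indicator set; `decode_write` recovers `0xFE`.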
-
Publication No.: US10599438B2
Publication Date: 2020-03-24
Application No.: US16388444
Filing Date: 2019-04-18
Applicant: Intel Corporation
Inventor: Balaji Vembu , Abhishek R. Appu , Joydeep Ray , Altug Koker
IPC: G06T1/20 , G06F9/50 , G06F9/48 , G06F9/38 , G06F9/46 , G06F9/52 , G06F9/54 , G06F15/16 , G06F15/76 , G06F12/0897 , G06F12/0866 , G06T1/60
Abstract: An apparatus to facilitate thread scheduling is disclosed. The apparatus includes logic to store barrier usage data based on a magnitude of barrier messages in an application kernel and a scheduler to schedule execution of threads across a plurality of multiprocessors based on the barrier usage data.
-
Publication No.: US20200051203A1
Publication Date: 2020-02-13
Application No.: US16417132
Filing Date: 2019-05-20
Applicant: Intel Corporation
Inventor: Eriko Nurvitadhi , Balaji Vembu , Tsung-Han Lin , Kamal Sinha , Rajkishore Barik , Nicolas C. Galoppo Von Borries
IPC: G06T1/20 , G06F9/30 , G06F9/38 , G06F12/0811 , G06F12/0815 , G06F12/0831 , G06F12/0888 , G06F9/48 , G06F17/16 , G06N3/04 , G06N3/08 , G06T1/60 , G06T15/00
Abstract: An apparatus to facilitate processing of a sparse matrix is disclosed. The apparatus includes a plurality of processing units each comprising one or more processing elements, including logic to read operands, a multiplication unit to multiply two or more operands and a scheduler to identify operands having a zero value and prevent scheduling of the operands having the zero value at the multiplication unit.
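The zero-skipping idea in this abstract can be modeled in software as a dot product that never issues a multiplication when either operand is zero. This is only an illustrative sketch: the function name `zero_skipping_dot` and the returned count of issued multiplications are hypothetical, and the patent describes a hardware scheduler rather than a Python loop.

```python
from typing import Sequence, Tuple


def zero_skipping_dot(a: Sequence[float],
                      b: Sequence[float]) -> Tuple[float, int]:
    """Return (dot product, number of multiplications actually issued)."""
    if len(a) != len(b):
        raise ValueError("operand vectors must have equal length")
    total = 0.0
    issued = 0
    for x, y in zip(a, b):
        # Scheduler step: a zero operand makes the product zero, so skip it.
        if x == 0 or y == 0:
            continue
        total += x * y  # only non-zero pairs reach the "multiplication unit"
        issued += 1
    return total, issued
```

For the vectors `[1, 0, 3, 0]` and `[4, 5, 0, 7]`, only the first pair has two non-zero operands, so a single multiplication is issued instead of four.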
-
Publication No.: US10558254B2
Publication Date: 2020-02-11
Application No.: US15477042
Filing Date: 2017-04-01
Applicant: Intel Corporation
Inventor: Abhishek R. Appu , Altug Koker , Eric J. Hoekstra , Kiran C. Veernapu , Prasoonkumar Surti , Vasanth Ranganathan , Kamal Sinha , Balaji Vembu , Eric J. Asperheim , Sanjeev S. Jahagirdar , Joydeep Ray
IPC: G06F3/06 , G06F1/3225 , G06F1/3234
Abstract: Methods and apparatus relating to techniques for avoiding cache lookup for cold cache. In an example, an apparatus comprises logic, at least partially comprising hardware logic, to receive data for a current write operation to a memory, determine a number of bits in the received data for the current write operation to the memory which have changed from a previous write operation to the memory and in response to a determination that the number of bits in the received data for the current write operation to the memory which have changed from a previous write operation to the memory exceeds a threshold, to toggle a plurality of bits in the data for the current write operation to create an encoded data set and set an indicator bit to a value which indicates that the plurality of bits have been toggled. Other embodiments are also disclosed and claimed.
-
Publication No.: US20200042417A1
Publication Date: 2020-02-06
Application No.: US16526069
Filing Date: 2019-07-30
Applicant: Intel Corporation
Inventor: Nikos Kaburlasos , Balaji Vembu , Josh B. Mastronarde , Altug Koker , Eric C. Samson , Abhishek R. Appu , Kiran C. Veernapu , Joydeep Ray , Vasanth Ranganathan , Sanjeev S. Jahagirdar
IPC: G06F11/30 , G06F1/324 , G06F1/3206 , G06F1/3296 , G05F1/10 , G05F1/571 , G06F11/32 , G06F11/34
Abstract: Methods and apparatus relating to techniques for power management. In an example, an apparatus comprises logic, at least partially comprising hardware logic, to generate a voltage/frequency curve for at least one of a core or a sub-core in a processor and manage an operating voltage level of the at least one of a core or a sub-core using the voltage/frequency curve. Other embodiments are also disclosed and claimed.
-
Publication No.: US10522114B2
Publication Date: 2019-12-31
Application No.: US15992642
Filing Date: 2018-05-30
Applicant: Intel Corporation
Inventor: Jeffery S. Boles , Hema C. Nalluri , Balaji Vembu , Michael Apodaca , Altug Koker , Lalit K. Saptarshi
IPC: G06F12/0846 , G06F3/06 , G06T1/60 , G09G5/36 , G06T1/20 , G06F12/0895 , G06F12/0875
Abstract: In accordance with some embodiments, a command streamer may use a cache of programmable size to cache commands to improve memory bandwidth and reduce latency. The size of the command cache may be programmably set by the command streamer.
-
Publication No.: US10482028B2
Publication Date: 2019-11-19
Application No.: US15493757
Filing Date: 2017-04-21
Applicant: Intel Corporation
Inventor: Altug Koker , Balaji Vembu , Joydeep Ray , Abhishek R. Appu
IPC: G06F12/0895 , G06F12/126 , G06F12/02 , G06T1/60
Abstract: A mechanism is described for facilitating optimization of cache associated with graphics processors at computing devices. A method of embodiments, as described herein, includes introducing coloring bits to contents of a cache associated with a processor including a graphics processor, wherein the coloring bits to represent a signal identifying one or more caches available for use, while avoiding explicit invalidations and flushes.
-
Publication No.: US10453427B2
Publication Date: 2019-10-22
Application No.: US15477030
Filing Date: 2017-04-01
Applicant: Intel Corporation
Inventor: Joydeep Ray , Altug Koker , Balaji Vembu , Murali Ramadoss , Guei-Yuan Lueh , James A. Valerio , Prasoonkumar Surti , Abhishek R. Appu , Vasanth Ranganathan , Kalyan Bhairavabhatla , Arthur D. Hunter, Jr. , Wei-Yu Chen , Subramaniam M. Maiyuran
IPC: G09G5/36 , G06F9/46 , G06F12/0875 , G09G5/00 , G06F12/084 , G06F12/0811
Abstract: A mechanism is described for facilitating using of a shared local memory for register spilling/filling relating to graphics processors at computing devices. A method of embodiments, as described herein, includes reserving one or more spaces of a shared local memory (SLM) to perform one or more of spilling and filling relating to registers associated with a graphics processor of a computing device.