-
171.
公开(公告)号:US10026142B2
公开(公告)日:2018-07-17
申请号:US14686476
申请日:2015-04-14
Applicant: INTEL CORPORATION
Inventor: Hema Chand Nalluri , Balaji Vembu , Jeffery S. Boles
Abstract: A mechanism is described for facilitating multi-level nesting of batch buffers at computing devices. A method of embodiments, as described herein, includes facilitating a hardware extension to accommodate a plurality of batch buffers to engage in a multi-level nesting, where the plurality of batch buffers are associated with a graphics processor of a computing device. The method may further include facilitating the multi-level nesting of the plurality of batch buffers, where the multi-level nesting is spread over a plurality of levels associated with the plurality of batch buffers, where the plurality of levels include more than two levels of nesting associated with more than two batch buffers of the plurality of batch buffers.
-
公开(公告)号:US10013734B1
公开(公告)日:2018-07-03
申请号:US15476985
申请日:2017-04-01
Applicant: Intel Corporation
Inventor: Jeffery S. Boles , Hema C. Nalluri , Balaji Vembu , Michael Apodaca , Altug Koker , Lalit K. Saptarshi
IPC: G06T1/60 , G06T1/20 , G06F12/0846 , G06F3/06 , G09G5/36
CPC classification number: G09G5/363 , G06F3/0659 , G06F12/0848 , G06F12/0875 , G06F12/0895 , G06F2212/455 , G06T1/20 , G06T1/60
Abstract: In accordance with some embodiments, a command streamer may use a cache of programmable size to cache commands to improve memory bandwidth and reduce latency. The size of the command cache may be programmably set by the command streamer.
-
公开(公告)号:US20170337656A1
公开(公告)日:2017-11-23
申请号:US15159897
申请日:2016-05-20
Applicant: Intel Corporation
Inventor: Balaji Vembu , Peter L. Doyle , Michael Apodaca , Hema C. Nalluri , Jeffery S. Boles
Abstract: The same set of render commands can be re-executed for each of a plurality of tiles making up a graphic scene to be rendered. Each time the list of commands is executed, the way the commands are executed may be modified based on information received from tile pre-processing. Specifically, a jump if command may be inserted into the command list. When this command is encountered, a determination is made, based on information received from tile pre-processing pipeline, whether to execute the command for the next primitive or not. If the next primitive is to be culled then the command for the next primitive is not executed and the flow moves past that command. If the next primitive is to be executed then the jump is not implemented. This enables avoiding reloading the same list of commands over and over for every tile.
-
174.
公开(公告)号:US20170256019A1
公开(公告)日:2017-09-07
申请号:US15062691
申请日:2016-03-07
Applicant: Intel Corporation
Inventor: Balaji Vembu , Kritika Bala , Murali Ramadoss , Hema Nalluri , Jeffery Boles , Jeffrey Frizzell , Joseph Koston
IPC: G06T1/20
CPC classification number: G06T1/20 , G06F9/4881 , G06F2009/45579
Abstract: Embodiments provide for an apparatus comprising a graphics processing subsystem including one or more graphics engines and a graphics scheduler to schedule a submission queue of multiple work items for execution on the one or more graphics engines of the graphics processing subsystem. The graphics scheduler can be configured to build the submission queue via a write to a memory mapped address that is mapped to logic within the graphics processing subsystem and to explicitly submit the submission queue to the graphics engine after the build of the submission queue.
-
公开(公告)号:US20160117497A1
公开(公告)日:2016-04-28
申请号:US14523884
申请日:2014-10-25
Applicant: Intel Corporation
Inventor: Paritosh Saxena , Adrian M.M.T. Dunbar , Michael S. Hughes , John Teddy , David Michael Durham , Balaji Vembu , Prashant Dewan , Debra Cablao , Nicholas D. Triantafillou , Craig D. Schmugar , Jason M. Surprise
CPC classification number: G06F21/566 , G06F21/52 , G06F21/74
Abstract: Computing platform security methods and apparatus are disclosed. An example apparatus includes a security application to configure a security task, the security task to detect a malicious element on a computing platform, the computing platform including a central processing unit and a graphics processing unit; and an offloader to determine whether the central processing unit or the graphics processing unit is to execute the security task; and when the graphics processing unit is to execute the security task, offload the security task to the graphics processing unit for execution.
-
公开(公告)号:US09223603B2
公开(公告)日:2015-12-29
申请号:US13932963
申请日:2013-07-01
Applicant: INTEL CORPORATION
Inventor: Balaji Vembu , Aditya Navale , Wishwesh Gandhi
CPC classification number: G06F12/1009 , G06F9/4401 , G06F9/4403 , G06F9/4411 , G06F9/45533 , G06F9/45558 , G06F12/109 , G06F13/28 , G06F2009/45579 , G06F2009/45583 , G06F2212/152 , G06T1/60
Abstract: A method and apparatus for creating, updating, and using guest physical address (GPA) to host physical address (HPA) shadow translation tables for translating GPAs of graphics data direct memory access (DMA) requests of a computing environment implementing a virtual machine monitor to support virtual machines. The requests may be sent through a render or display path of the computing environment from one or more virtual machines, transparently with respect to the virtual machine monitor. The creating, updating, and using may be performed by a memory controller detecting entries sent to existing global and page directory tables, forking off shadow table entries from the detected entries, and translating GPAs to HPAs for the shadow table entries.
Abstract translation: 一种用于创建,更新和使用访客物理地址(GPA)以主机物理地址(HPA)影子转换表的方法和装置,用于将实现虚拟机监视器的计算环境的图形数据直接存储器访问(DMA)请求的GPA转换为 支持虚拟机。 可以通过虚拟机监视器透明地从一个或多个虚拟机通过计算环境的呈现或显示路径发送请求。 创建,更新和使用可以由存储器控制器执行,该存储器控制器检测发送到现有全局和页目录表的条目,从检测到的条目中分离影子表条目,以及将影子表条目的GPA转换为HPA。
-
公开(公告)号:US20250117873A1
公开(公告)日:2025-04-10
申请号:US18906790
申请日:2024-10-04
Applicant: Intel Corporation
Inventor: Eriko Nurvitadhi , Balaji Vembu , Tsung-Han Lin , Kamal Sinha , Rajkishore Barik , Nicolas C. Galoppo Von Borries
Abstract: Techniques to improve performance of matrix multiply operations are described in which a compute kernel can specify one or more element-wise operations to perform on output of the compute kernel before the output is transferred to higher levels of a processor memory hierarchy.
-
178.
公开(公告)号:US20250061534A1
公开(公告)日:2025-02-20
申请号:US18819073
申请日:2024-08-29
Applicant: Intel Corporation
Inventor: Eriko Nurvitadhi , Balaji Vembu , Nicolas C. Galoppo Von Borries , Rajkishore Barik , Tsung-Han Lin , Kamal Sinha , Nadathur Rajagopalan Satish , Jeremy Bottleson , Farshad Akhbari , Altug Koker , Narayan Srinivasa , Dukhwan Kim , Sara S. Baghsorkhi , Justin E. Gottschlich , Feng Chen , Elmoustapha Ould-Ahmed-Vall , Kevin Nealis , Xiaoming Chen , Anbang Yao
IPC: G06T1/20 , G06F9/30 , G06F9/38 , G06N3/04 , G06N3/044 , G06N3/045 , G06N3/063 , G06N3/08 , G06N3/084
Abstract: One embodiment provides a parallel processor comprising a hardware scheduler to schedule pipeline commands for compute operations to one or more of multiple types of compute units, a plurality of processing resources including a first sparse compute unit configured for input at a first level of sparsity and hybrid memory circuitry including a memory controller, a memory interface, and a second sparse compute unit configured for input at a second level of sparsity that is greater than the first level of sparsity.
-
179.
公开(公告)号:US12217053B2
公开(公告)日:2025-02-04
申请号:US18528340
申请日:2023-12-04
Applicant: Intel Corporation
Inventor: Himanshu Kaul , Mark A. Anders , Sanu K. Mathew , Anbang Yao , Joydeep Ray , Ping T. Tang , Michael S. Strickland , Xiaoming Chen , Tatiana Shpeisman , Abhishek R. Appu , Altug Koker , Kamal Sinha , Balaji Vembu , Nicolas C. Galoppo Von Borries , Eriko Nurvitadhi , Rajkishore Barik , Tsung-Han Lin , Vasanth Ranganathan , Sanjeev Jahagirdar
IPC: G06F9/30 , G06F7/483 , G06F7/544 , G06F9/38 , G06N3/044 , G06N3/045 , G06N3/063 , G06N3/08 , G09G5/393 , G06F1/16 , G06F17/16 , G06N20/00 , G06T15/00
Abstract: One embodiment provides for a graphics processing unit to accelerate machine-learning operations, the graphics processing unit comprising a multiprocessor having a single instruction, multiple thread (SIMT) architecture, the multiprocessor to execute at least one single instruction; and a first compute unit included within the multiprocessor, the at least one single instruction to cause the first compute unit to perform a two-dimensional matrix multiply and accumulate operation, wherein to perform the two-dimensional matrix multiply and accumulate operation includes to compute an intermediate product of 16-bit operands and to compute a 32-bit sum based on the intermediate product.
-
公开(公告)号:US20250036451A1
公开(公告)日:2025-01-30
申请号:US18792866
申请日:2024-08-02
Applicant: Intel Corporation
Inventor: Balaji Vembu , Abhishek R. Appu , Joydeep Ray , Altug Koker
IPC: G06F9/46 , G06F9/38 , G06F9/48 , G06F9/50 , G06F9/52 , G06F9/54 , G06F12/0842 , G06F12/0866 , G06F12/0897 , G06F15/16 , G06F15/76 , G06T1/20 , G06T1/60
Abstract: An apparatus to facilitate thread scheduling is disclosed. The apparatus includes logic to store barrier usage data based on a magnitude of barrier messages in an application kernel and a scheduler to schedule execution of threads across a plurality of multiprocessors based on the barrier usage data.
-
-
-
-
-
-
-
-
-