-
Publication No.: US20190391937A1
Publication Date: 2019-12-26
Application No.: US16453995
Filing Date: 2019-06-26
Applicant: Intel Corporation
Inventor: NIRANJAN L. COORAY , ABHISHEK R. APPU , ALTUG KOKER , JOYDEEP RAY , BALAJI VEMBU , PATTABHIRAMAN K , DAVID PUFFER , DAVID J. COWPERTHWAITE , RAJESH M. SANKARAN , SATYESHWAR SINGH , SAMEER KP , ANKUR N. SHAH , KUN TIAN
IPC: G06F13/16 , G06F13/40 , G06F12/0802 , G06F12/1036 , G06F12/1027 , G06F12/1009
Abstract: An apparatus and method are described for implementing memory management in a graphics processing system. For example, one embodiment of an apparatus comprises: a first plurality of graphics processing resources to execute graphics commands and process graphics data; a first memory management unit (MMU) to communicatively couple the first plurality of graphics processing resources to a system-level MMU to access a system memory; a second plurality of graphics processing resources to execute graphics commands and process graphics data; a second MMU to communicatively couple the second plurality of graphics processing resources to the first MMU; wherein the first MMU is configured as a master MMU having a direct connection to the system-level MMU and the second MMU comprises a slave MMU configured to send memory transactions to the first MMU, the first MMU either servicing a memory transaction or sending the memory transaction to the system-level MMU on behalf of the second MMU.
-
Publication No.: US20190332869A1
Publication Date: 2019-10-31
Application No.: US16379176
Filing Date: 2019-04-09
Applicant: Intel Corporation
Inventor: MAYURESH M. VARERKAR , BARNAN DAS , NARAYAN BISWAL , STANLEY J. BARAN , GOKCEN CILINGIR , NILESH V. SHAH , ARCHIE SHARMA , SHERINE ABDELHAK , SACHIN GODSE , FARSHAD AKHBARI , NARAYAN SRINIVASA , ALTUG KOKER , NADATHUR RAJAGOPALAN SATISH , DUKHWAN KIM , FENG CHEN , ABHISHEK R. APPU , JOYDEEP RAY , PING T. TANG , MICHAEL S. STRICKLAND , XIAOMING CHEN , ANBANG YAO , TATIANA SHPEISMAN , VASANTH RANGANATHAN , SANJEEV JAHAGIRDAR
Abstract: A mechanism is described for facilitating person tracking and data security in machine learning at autonomous machines. A method of embodiments, as described herein, includes detecting, by a camera associated with one or more trackers, a person within a physical vicinity, where detecting includes capturing one or more images of the person. The method may further include tracking, by the one or more trackers, the person based on the one or more images of the person, where tracking includes collecting tracking data relating to the person. The method may further include selecting a tracker of the one or more trackers as a preferred tracker based on the tracking data.
-
Publication No.: US20180300964A1
Publication Date: 2018-10-18
Application No.: US15488914
Filing Date: 2017-04-17
Applicant: Intel Corporation
Inventor: BARATH LAKSHAMANAN , LINDA L. HURD , BEN J. ASHBAUGH , ELMOUSTAPHA OULD-AHMED-VALL , LIWEI MA , JINGYI JIN , JUSTIN E. GOTTSCHLICH , CHANDRASEKARAN SAKTHIVEL , MICHAEL S. STRICKLAND , BRIAN T. LEWIS , LINDSEY KUPER , ALTUG KOKER , ABHISHEK R. APPU , PRASOONKUMAR SURTI , JOYDEEP RAY , BALAJI VEMBU , JAVIER S. TUREK , NAILA FAROOQUI
CPC classification number: G07C5/008 , B60W30/00 , G01C21/34 , G01S19/13 , G05D1/0088 , G05D2201/0213 , G06F9/5027 , G06F2209/509 , G06N20/00 , G08G1/0112 , G08G1/012 , G08G1/052 , H04L43/0852 , H04L67/12 , H04W28/08
Abstract: One embodiment provides for a computing device within an autonomous vehicle, the computing device comprising a wireless network device to enable a wireless data connection with an autonomous vehicle network, a set of multiple processors including a general-purpose processor and a general-purpose graphics processor, the set of multiple processors to execute a compute manager to manage execution of compute workloads associated with the autonomous vehicle, the compute workloads associated with autonomous operations of the autonomous vehicle, and offload logic configured to execute on the set of multiple processors, the offload logic to determine to offload one or more of the compute workloads to one or more autonomous vehicles within range of the wireless network device.
-
Publication No.: US20180293690A1
Publication Date: 2018-10-11
Application No.: US15482685
Filing Date: 2017-04-07
Applicant: Intel Corporation
Inventor: JOYDEEP RAY , ABHISHEK R. APPU , ALTUG KOKER , BALAJI VEMBU
IPC: G06T1/20 , G06T1/60 , G06F12/0875
CPC classification number: G06T1/20 , G06F12/0811 , G06F12/0815 , G06F12/0831 , G06F12/0875 , G06F12/0888 , G06F2212/1024 , G06F2212/302 , G06F2212/455 , G06F2212/621 , G06T1/60
Abstract: An apparatus and method are described for managing data which is biased towards a processor or a GPU. For example, one embodiment of an apparatus comprises: a processor comprising one or more cores to execute instructions and process data, one or more cache levels, and cache coherence controllers to maintain coherent data in the one or more cache levels; a graphics processing unit (GPU) to execute graphics instructions and process graphics data, wherein the GPU and processor cores are to share a virtual address space for accessing a system memory; a GPU memory coupled to the GPU, the GPU memory addressable through the virtual address space shared by the processor cores and GPU; and bias management circuitry to store an indication, for each of a plurality of blocks of data, whether the data has a processor bias or a GPU bias, wherein if the data has a GPU bias, then the data is to be accessed by the GPU from the GPU memory without necessarily accessing the processor's cache coherence controllers and wherein requests for the data from the processor cores are processed as uncached requests, preventing the data from being cached in the one or more cache levels of the processor.
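Illustrative note: the per-block bias lookup described in this abstract can be sketched roughly as follows. This is only a minimal model, not the patented implementation. The block size, the default bias, the names `BiasTable` and `route_access`, and the handling of processor-biased data are all assumptions made for illustration; the abstract specifies only the GPU-bias behavior.

```python
from enum import Enum


class Bias(Enum):
    PROCESSOR = "processor"
    GPU = "gpu"


BLOCK_SIZE = 4096  # hypothetical block granularity; the abstract does not give one


class BiasTable:
    """Stores one bias indication per block of data (hypothetical structure)."""

    def __init__(self):
        self._bias = {}

    def set_bias(self, addr, bias):
        self._bias[addr // BLOCK_SIZE] = bias

    def get_bias(self, addr):
        # Assumption: blocks with no recorded bias default to processor bias.
        return self._bias.get(addr // BLOCK_SIZE, Bias.PROCESSOR)


def route_access(table, requester, addr):
    """Return a label for the path an access takes, given the block's bias."""
    if table.get_bias(addr) is Bias.GPU:
        if requester == "gpu":
            # GPU reads its own memory without consulting the
            # processor's cache coherence controllers.
            return "gpu_memory_direct"
        # Processor requests for GPU-biased data are handled as uncached.
        return "uncached_request"
    # Assumption: processor-biased data follows the normal coherent path.
    return "coherent_path"
```

For example, after `table.set_bias(0x2000, Bias.GPU)`, a GPU access to `0x2010` is routed directly to GPU memory, while a processor access to the same block is treated as an uncached request.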
-
Publication No.: US20250028675A1
Publication Date: 2025-01-23
Application No.: US18791963
Filing Date: 2024-08-01
Applicant: Intel Corporation
Inventor: JOYDEEP RAY , SELVAKUMAR PANNEER , SAURABH TANGRI , BEN ASHBAUGH , SCOTT JANUS , ABHISHEK APPU , VARGHESE GEORGE , RAVISHANKAR IYER , NILESH JAIN , PATTABHIRAMAN K , ALTUG KOKER , MIKE MACPHERSON , JOSH MASTRONARDE , ELMOUSTAPHA OULD-AHMED-VALL , JAYAKRISHNA P. S , ERIC SAMSON
IPC: G06F15/78 , G06F7/544 , G06F7/575 , G06F7/58 , G06F9/30 , G06F9/38 , G06F9/50 , G06F12/02 , G06F12/06 , G06F12/0802 , G06F12/0804 , G06F12/0811 , G06F12/0862 , G06F12/0866 , G06F12/0871 , G06F12/0875 , G06F12/0882 , G06F12/0888 , G06F12/0891 , G06F12/0893 , G06F12/0895 , G06F12/0897 , G06F12/1009 , G06F12/128 , G06F15/80 , G06F17/16 , G06F17/18 , G06N3/08 , G06T1/20 , G06T1/60 , G06T15/06 , H03M7/46
Abstract: Embodiments described herein include software, firmware, and hardware that provide techniques to enable deterministic scheduling across multiple general-purpose graphics processing units. One embodiment provides a multi-GPU architecture with uniform latency. One embodiment provides techniques to distribute memory output based on memory chip thermals. One embodiment provides techniques to enable thermally aware workload scheduling. One embodiment provides techniques to enable end-to-end contracts for workload scheduling on multiple GPUs.
-
Publication No.: US20230377209A1
Publication Date: 2023-11-23
Application No.: US18322194
Filing Date: 2023-05-23
Applicant: Intel Corporation
Inventor: ABHISHEK R. APPU , PRASOONKUMAR SURTI , JILL BOYCE , SUBRAMANIAM MAIYURAN , MICHAEL APODACA , ADAM T. LAKE , JAMES HOLLAND , VASANTH RANGANATHAN , ALTUG KOKER , LIDONG XU , NIKOS KABURLASOS
CPC classification number: G06T9/002 , G06T9/007 , G06T15/005 , G06T9/008 , G06N3/045
Abstract: Embodiments described herein provide for an instruction and associated logic to enable a processing resource including a tensor accelerator to perform optimized computation of sparse submatrix operations. One embodiment provides a parallel processor comprising a processing cluster coupled with a cache memory. The processing cluster includes a plurality of multiprocessors coupled with a data interconnect, where a multiprocessor of the plurality of multiprocessors includes a tensor core configured to load tensor data and metadata associated with the tensor data from the cache memory, wherein the metadata indicates a first numerical transform applied to the tensor data, perform an inverse transform of the first numerical transform, perform a tensor operation on the tensor data after the inverse transform is performed, and write output of the tensor operation to a memory coupled with the processing cluster.
-
Publication No.: US20220122215A1
Publication Date: 2022-04-21
Application No.: US17428216
Filing Date: 2020-03-14
Applicant: Intel Corporation
Inventor: JOYDEEP RAY , SELVAKUMAR PANNEER , SAURABH TANGRI , BEN ASHBAUGH , SCOTT JANUS , ABHISHEK APPU , VARGHESE GEORGE , RAVISHANKAR IYER , NILESH JAIN , PATTABHIRAMAN K , ALTUG KOKER , MIKE MACPHERSON , JOSH MASTRONARDE , ELMOUSTAPHA OULD-AHMED-VALL , JAYAKRISHNA P. S , ERIC SAMSON
IPC: G06T1/60 , G06F12/06 , G06F12/1009 , G06T1/20 , G06F12/0875 , G06F9/38
Abstract: Embodiments described herein include software, firmware, and hardware that provide techniques to enable deterministic scheduling across multiple general-purpose graphics processing units. One embodiment provides a multi-GPU architecture with uniform latency. One embodiment provides techniques to distribute memory output based on memory chip thermals. One embodiment provides techniques to enable thermally aware workload scheduling. One embodiment provides techniques to enable end-to-end contracts for workload scheduling on multiple GPUs.
-
Publication No.: US20220066931A1
Publication Date: 2022-03-03
Application No.: US17310540
Filing Date: 2020-03-14
Applicant: Intel Corporation
Inventor: JOYDEEP RAY , NIRANJAN COORAY , SUBRAMANIAM MAIYURAN , ALTUG KOKER , PRASOONKUMAR SURTI , VARGHESE GEORGE , VALENTIN ANDREI , ABHISHEK APPU , GUADALUPE GARCIA , PATTABHIRAMAN K , SUNGYE KIM , SANJAY KUMAR , PRATIK MAROLIA , ELMOUSTAPHA OULD-AHMED-VALL , VASANTH RANGANATHAN , WILLIAM SADLER , LAKSHMINARAYANAN STRIRAMASSARMA
IPC: G06F12/0802
Abstract: Embodiments described herein provide techniques to enable the dynamic reconfiguration of memory on a general-purpose graphics processing unit. One embodiment described herein enables dynamic reconfiguration of cache memory bank assignments based on hardware statistics. One embodiment enables virtual memory address translation using mixed four-kilobyte and sixty-four-kilobyte pages within the same page table hierarchy and under the same page directory. One embodiment provides for a graphics processor and associated heterogeneous processing system having near and far regions of the same level of a cache hierarchy.
-
Publication No.: US20200310973A1
Publication Date: 2020-10-01
Application No.: US16366266
Filing Date: 2019-03-27
Applicant: Intel Corporation
Inventor: NIRANJAN L. COORAY , ALTUG KOKER , VIDHYA KRISHNAN , RONALD W. SILVAS , JOHN H. FEIT , PRASOONKUMAR SURTI , JOYDEEP RAY , ABHISHEK R. APPU
IPC: G06F12/0837 , G06F9/38 , H04L9/06 , G06F16/907
Abstract: Embodiments described herein provide an apparatus comprising a processor to allocate a first memory space for data for a graphics workload, the first memory space comprising a first plurality of addressable memory locations, allocate a second memory space for compression metadata relating to the data for the graphics workload, the second memory space comprising a second plurality of addressable memory locations and having an amount of memory corresponding to a predetermined ratio of the amount of memory allocated to the first memory space, and configure a direct memory mapping between the first plurality of addressable memory locations and the second plurality of addressable memory locations. Other embodiments may be described and claimed.
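Illustrative note: a direct mapping from a data address to its compression-metadata address, with the metadata space sized at a fixed ratio of the data space, can be sketched roughly as follows. The ratio of 256 data bytes per metadata byte, the function names, and the base addresses are hypothetical; the abstract states only that the ratio is predetermined.

```python
DATA_BYTES_PER_METADATA_BYTE = 256  # hypothetical ratio, not taken from the patent


def metadata_size(data_size, ratio=DATA_BYTES_PER_METADATA_BYTE):
    """Size of the metadata space for a data space, rounded up to whole bytes."""
    return -(-data_size // ratio)  # ceiling division


def metadata_address(data_addr, data_base, meta_base,
                     ratio=DATA_BYTES_PER_METADATA_BYTE):
    """Map a data address directly to the address of its metadata byte."""
    if data_addr < data_base:
        raise ValueError("address lies below the data allocation")
    # The offset into the data space, scaled down by the fixed ratio,
    # gives the offset into the metadata space; no lookup table is needed.
    return meta_base + (data_addr - data_base) // ratio
```

Under these assumptions, a 1 MiB data allocation needs `metadata_size(1 << 20)` = 4096 bytes of metadata, and every data address within one 256-byte span maps to the same metadata byte.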
-
Publication No.: US20200293488A1
Publication Date: 2020-09-17
Application No.: US16354782
Filing Date: 2019-03-15
Applicant: Intel Corporation
Inventor: JOYDEEP RAY , ARAVINDH ANANTARAMAN , ABHISHEK R. APPU , ALTUG KOKER , ELMOUSTAPHA OULD-AHMED-VALL , VALENTIN ANDREI , SUBRAMANIAM MAIYURAN , NICOLAS GALAPPO VON BORRIES , VARGHESE GEORGE , MIKE MACPHERSON , BEN ASHBAUGH , MURALI RAMADOSS , VIKRANTH VEMULAPALLI , WILLIAM SADLER , JONATHAN PEARCE , SUNGYE KIM
Abstract: Methods and apparatus relating to scalar core integration in a graphics processor. In an example, an apparatus comprises a processor to receive a set of workload instructions for a graphics workload from a host complex, determine a first subset of operations in the set of operations that is suitable for execution by a scalar processor complex of the graphics processing device and a second subset of operations in the set of operations that is suitable for execution by a vector processor complex of the graphics processing device, assign the first subset of operations to the scalar processor complex for execution to generate a first set of outputs, and assign the second subset of operations to the vector processor complex for execution to generate a second set of outputs. Other embodiments are also disclosed and claimed.
-