-
公开(公告)号:US12197601B2
公开(公告)日:2025-01-14
申请号:US17560193
申请日:2021-12-22
Applicant: Intel Corporation
Inventor: Ren Wang , Sameh Gobriel , Somnath Paul , Yipeng Wang , Priya Autee , Abhirupa Layek , Shaman Narayana , Edwin Verplanke , Mrittika Ganguli , Jr-Shian Tsai , Anton Sorokin , Suvadeep Banerjee , Abhijit Davare , Desmond Kirkpatrick , Rajesh M. Sankaran , Jaykant B. Timbadiya , Sriram Kabisthalam Muthukumar , Narayan Ranganathan , Nalini Murari , Brinda Ganesh , Nilesh Jain
Abstract: Examples described herein relate to offload circuitry comprising one or more compute engines that are configurable to perform a workload offloaded from a process executed by a processor based on a descriptor particular to the workload. In some examples, the offload circuitry is configurable to perform the workload, among multiple different workloads. In some examples, the multiple different workloads include one or more of: data transformation (DT) for data format conversion, Locality Sensitive Hashing (LSH) for neural network (NN), similarity search, sparse general matrix-matrix multiplication (SpGEMM) acceleration of hash based sparse matrix multiplication, data encode, data decode, or embedding lookup.
-
公开(公告)号:US12135981B2
公开(公告)日:2024-11-05
申请号:US18207870
申请日:2023-06-09
Applicant: Intel Corporation
Inventor: Rajesh M. Sankaran , Gilbert Neiger , Narayan Ranganathan , Stephen R. Van Doren , Joseph Nuzman , Niall D. McDonnell , Michael A. O'Hanlon , Lokpraveen B. Mosur , Tracy Garrett Drysdale , Eriko Nurvitadhi , Asit K. Mishra , Ganesh Venkatesh , Deborah T. Marr , Nicholas P. Carter , Jonathan D. Pearce , Edward T. Grochowski , Richard J. Greco , Robert Valentine , Jesus Corbal , Thomas D. Fletcher , Dennis R. Bradford , Dwight P. Manley , Mark J. Charney , Jeffrey J. Cook , Paul Caprioli , Koichi Yamada , Kent D. Glossop , David B. Sheffield
Abstract: Embodiments of systems, methods, and apparatuses for heterogeneous computing are described. In some embodiments, a hardware heterogeneous scheduler dispatches instructions for execution on one or more plurality of heterogeneous processing elements, the instructions corresponding to a code fragment to be processed by the one or more of the plurality of heterogeneous processing elements, wherein the instructions are native instructions to at least one of the one or more of the plurality of heterogeneous processing elements.
-
公开(公告)号:US11416281B2
公开(公告)日:2022-08-16
申请号:US16474978
申请日:2016-12-31
Applicant: Intel Corporation
Inventor: Rajesh M. Sankaran , Gilbert Neiger , Narayan Ranganathan , Stephen R. Van Doren , Joseph Nuzman , Niall D. McDonnell , Michael A. O'Hanlon , Lokpraveen B. Mosur , Tracy Garrett Drysdale , Eriko Nurvitadhi , Asit K. Mishra , Ganesh Venkatesh , Deborah T. Marr , Nicholas P. Carter , Jonathan D. Pearce , Edward T. Grochowski , Richard J. Greco , Robert Valentine , Jesus Corbal , Thomas D. Fletcher , Dennis R. Bradford , Dwight P. Manley , Mark J. Charney , Jeffrey J. Cook , Paul Caprioli , Koichi Yamada , Kent D. Glossop , David B. Sheffield
Abstract: Embodiments of systems, methods, and apparatuses for heterogeneous computing are described. In some embodiments, a hardware heterogeneous scheduler dispatches instructions for execution on one or more plurality of heterogeneous processing elements, the instructions corresponding to a code fragment to be processed by the one or more of the plurality of heterogeneous processing elements, wherein the instructions are native instructions to at least one of the one or more of the plurality of heterogeneous processing elements.
-
公开(公告)号:US11163682B2
公开(公告)日:2021-11-02
申请号:US14983052
申请日:2015-12-29
Applicant: Intel Corporation
Inventor: Francesc Guim Bernet , Narayan Ranganathan , Karthik Kumar , Raj K. Ramanujan , Robert G. Blankenship
IPC: G06F12/0831 , G06F12/0813
Abstract: Systems, methods and apparatuses for distributed consistency memory. In some embodiments, the apparatus comprises at least one monitoring circuit to monitor for memory accesses to an address space; at least one a monitoring table to store an identifier of the address space; and at least one hardware core to execute an instruction to enable the monitoring circuit.
-
公开(公告)号:US20210294292A1
公开(公告)日:2021-09-23
申请号:US17330738
申请日:2021-05-26
Applicant: Intel Corporation
Inventor: Nicolas A. Salhuana , Karthik Kumar , Thomas Willhalm , Francesc Guim Bernat , Narayan Ranganathan
IPC: G05B19/042 , H03K19/17732 , G06F8/41 , H03K19/17728
Abstract: In one embodiment, an apparatus comprises a fabric controller of a first computing node. The fabric controller is to receive, from a second computing node via a network fabric that couples the first computing node to the second computing node, a request to execute a kernel on a field-programmable gate array (FPGA) of the first computing node; instruct the FPGA to execute the kernel; and send a result of the execution of the kernel to the second computing node via the network fabric.
-
公开(公告)号:US20210149815A1
公开(公告)日:2021-05-20
申请号:US17129496
申请日:2020-12-21
Applicant: Intel Corporation
Inventor: Saurabh Gayen , Philip R. Lantz , Dhananjay A. Joshi , Rupin H. Vakharwala , Rajesh M. Sankaran , Narayan Ranganathan , Sanjay Kumar
IPC: G06F12/10 , G06F12/0875 , G06F13/28 , G06F13/40 , G06F13/42
Abstract: Techniques for offload device address translation fetching are disclosed. In the illustrative embodiment, a processor of a compute device sends a translation fetch descriptor to an offload device before sending a corresponding work descriptor to the offload device. The offload device can request translations for virtual memory address and cache the corresponding physical addresses for later use. While the offload device is fetching virtual address translations, the compute device can perform other tasks before sending the corresponding work descriptor, including operations that modify the contents of the memory addresses whose translation are being cached. Even if the offload device does not cache the translations, the fetching can warm up the cache in a translation lookaside buffer. Such an approach can reduce the latency overhead that the offload device may otherwise incur in sending memory address translation requests that would be required to execute the work descriptor.
-
公开(公告)号:US20190155239A1
公开(公告)日:2019-05-23
申请号:US16314401
申请日:2016-06-30
Applicant: Intel Corporation
Inventor: Nicolas A. Salhuana , Karthik Kumar , Thomas Willhalm , Francesc Guim Bernat , Narayan Ranganathan
IPC: G05B19/042 , H03K19/177 , G06F8/41
Abstract: In one embodiment, an apparatus comprises a fabric controller of a first computing node. The fabric controller is to receive, from a second computing node via a network fabric that couples the first computing node to the second computing node, a request to execute a kernel on a field-programmable gate array (FPGA) of the first computing node; instruct the FPGA to execute the kernel; and send a result of the execution of the kernel to the second computing node via the network fabric.
-
28.
公开(公告)号:US10223187B2
公开(公告)日:2019-03-05
申请号:US15372734
申请日:2016-12-08
Applicant: INTEL CORPORATION
Inventor: Ashok Raj , Narayan Ranganathan , Mohan J. Kumar , Vincent J. Zimmer
Abstract: A processor includes an instruction decoder to receive an instruction to perform a machine check operation, the instruction having a first operand and a second operand. The processor further includes a machine check logic coupled to the instruction decoder to determine that the instruction is to determine a type of a machine check bank based on a command value stored in a first storage location indicated by the first operand, to determine a type of a machine check bank identified by a machine check bank identifier (ID) stored in a second storage location indicated by the second operand, and to store the determined type of the machine check bank in the first storage location indicated by the first operand.
-
29.
公开(公告)号:US09686143B2
公开(公告)日:2017-06-20
申请号:US14494892
申请日:2014-09-24
Applicant: INTEL CORPORATION
Inventor: Ramamurthy Krithivas , Narayan Ranganathan , Mohan J. Kumar , John C. Leung
IPC: H04L12/24 , H04L29/12 , H04L12/855
CPC classification number: H04L41/12 , H04L41/044 , H04L47/2466 , H04L61/2015 , H04L61/2507
Abstract: Mechanisms to enable management controllers to learn the control plane hierarchy in data center environments. The data center is configured in a physical hierarchy including multiple pods, racks, trays, and sleds and associated switches. Management controllers at various levels in a control plane hierarchy and associated with switches in the physical hierarchy are configured to add their IP addresses to DHCP (Dynamic Host Control Protocol) responses that are generated by a DCHP server in response to DCHP requests for IP address requests initiated by DHCP clients including manageability controllers, compute nodes and storage nodes in the data center. As the DCHP response traverses each of multiple switches along a forwarding path from the DCHP server to the DHCP client, an IP address of the manageability controller associated with the switch is inserted. Upon receipt at the DHCP client, the inserted IP addresses are extracted and used to automate learning of the control plane hierarchy.
-
公开(公告)号:US11693691B2
公开(公告)日:2023-07-04
申请号:US17381521
申请日:2021-07-21
Applicant: Intel Corporation
Inventor: Rajesh M. Sankaran , Gilbert Neiger , Narayan Ranganathan , Stephen R. Van Doren , Joseph Nuzman , Niall D. McDonnell , Michael A. O'Hanlon , Lokpraveen B. Mosur , Tracy Garrett Drysdale , Eriko Nurvitadhi , Asit K. Mishra , Ganesh Venkatesh , Deborah T. Marr , Nicholas P. Carter , Jonathan D. Pearce , Edward T. Grochowski , Richard J. Greco , Robert Valentine , Jesus Corbal , Thomas D. Fletcher , Dennis R. Bradford , Dwight P. Manley , Mark J. Charney , Jeffrey J. Cook , Paul Caprioli , Koichi Yamada , Kent D. Glossop , David B. Sheffield
CPC classification number: G06F9/48 , G06F9/3001 , G06F9/3004 , G06F9/30036 , G06F9/383
Abstract: Embodiments of systems, methods, and apparatuses for heterogeneous computing are described. In some embodiments, a hardware heterogeneous scheduler dispatches instructions for execution on one or more plurality of heterogeneous processing elements, the instructions corresponding to a code fragment to be processed by the one or more of the plurality of heterogeneous processing elements, wherein the instructions are native instructions to at least one of the one or more of the plurality of heterogeneous processing elements.
-
-
-
-
-
-
-
-
-