-
公开(公告)号:US20210382836A1
公开(公告)日:2021-12-09
申请号:US17410063
申请日:2021-08-24
Applicant: Intel Corporation
Inventor: Philip R. Lantz , Sanjay Kumar , Rajesh M. Sankaran , Saurabh Gayen
IPC: G06F13/364 , G06F13/24 , G06F9/50
Abstract: Embodiments of apparatuses, methods, and systems for highly scalable accelerators are described. In an embodiment, an apparatus includes an interface to receive a plurality of work requests from a plurality of clients and a plurality of engines to perform the plurality of work requests. The work requests are to be dispatched to the plurality of engines from a plurality of work queues. The work queues are to store a work descriptor per work request. Each work descriptor is to include all information needed to perform a corresponding work request.
-
公开(公告)号:US11106613B2
公开(公告)日:2021-08-31
申请号:US15940128
申请日:2018-03-29
Applicant: Intel Corporation
Inventor: Philip R. Lantz , Sanjay Kumar , Rajesh M. Sankaran , Saurabh Gayen
IPC: G06F13/364 , G06F13/24 , G06F9/50
Abstract: Embodiments of apparatuses, methods, and systems for highly scalable accelerators are described. In an embodiment, an apparatus includes an interface to receive a plurality of work requests from a plurality of clients and a plurality of engines to perform the plurality of work requests. The work requests are to be dispatched to the plurality of engines from a plurality of work queues. The work queues are to store a work descriptor per work request. Each work descriptor is to include all information needed to perform a corresponding work request.
-
公开(公告)号:US20210149815A1
公开(公告)日:2021-05-20
申请号:US17129496
申请日:2020-12-21
Applicant: Intel Corporation
Inventor: Saurabh Gayen , Philip R. Lantz , Dhananjay A. Joshi , Rupin H. Vakharwala , Rajesh M. Sankaran , Narayan Ranganathan , Sanjay Kumar
IPC: G06F12/10 , G06F12/0875 , G06F13/28 , G06F13/40 , G06F13/42
Abstract: Techniques for offload device address translation fetching are disclosed. In the illustrative embodiment, a processor of a compute device sends a translation fetch descriptor to an offload device before sending a corresponding work descriptor to the offload device. The offload device can request translations for virtual memory address and cache the corresponding physical addresses for later use. While the offload device is fetching virtual address translations, the compute device can perform other tasks before sending the corresponding work descriptor, including operations that modify the contents of the memory addresses whose translation are being cached. Even if the offload device does not cache the translations, the fetching can warm up the cache in a translation lookaside buffer. Such an approach can reduce the latency overhead that the offload device may otherwise incur in sending memory address translation requests that would be required to execute the work descriptor.
-
公开(公告)号:US20240054011A1
公开(公告)日:2024-02-15
申请号:US18233308
申请日:2023-08-12
Applicant: Intel Corporation
Inventor: Rajesh M. Sankaran , Philip R. Lantz , Narayan Ranganathan , Saurabh Gayen , Sanjay Kumar , Nikhil Rao , Dhananjay A. Joshi , Hai Ming Khor , Utkarsh Y. Kakaiya
IPC: G06F9/48 , G06F9/50 , G06F12/0802
CPC classification number: G06F9/4881 , G06F9/5027 , G06F12/0802
Abstract: Methods and apparatus relating to data streaming accelerators are described. In an embodiment, a hardware accelerator such as a Data Streaming Accelerator (DSA) logic circuitry performs data movement and/or data transformation for data to be transferred between a processor (having one or more processor cores) and a storage device. Other embodiments are also disclosed and claimed.
-
公开(公告)号:US20230251986A1
公开(公告)日:2023-08-10
申请号:US18296875
申请日:2023-04-06
Applicant: Intel Corporation
Inventor: Philip R. Lantz , Sanjay Kumar , Rajesh M. Sankaran , Saurabh Gayen
IPC: G06F13/364 , G06F13/24 , G06F9/50
CPC classification number: G06F13/364 , G06F9/5027 , G06F13/24
Abstract: Embodiments of apparatuses, methods, and systems for highly scalable accelerators are described. In an embodiment, an apparatus includes an interface to receive a plurality of work requests from a plurality of clients and a plurality of engines to perform the plurality of work requests. The work requests are to be dispatched to the plurality of engines from a plurality of work queues. The work queues are to store a work descriptor per work request. Each work descriptor is to include all information needed to perform a corresponding work request.
-
公开(公告)号:US10969992B2
公开(公告)日:2021-04-06
申请号:US16236473
申请日:2018-12-29
Applicant: Intel Corporation
Inventor: Saurabh Gayen , Dhananjay A. Joshi , Philip R. Lantz , Rajesh M. Sankaran
IPC: G06F3/06 , G06F12/0862 , G06F12/1027 , G06F9/455
Abstract: Systems, methods, and devices can include a processing engine implemented at least partially in hardware, the processing engine to process memory transactions; a memory element to index physical address and virtual address translations; and a memory controller logic implemented at least partially in hardware, the memory controller logic to receive an index from the processing engine, the index corresponding to a physical address and a virtual address; identify a physical address based on the received index; and provide the physical address to the processing engine. The processing engine can use the physical address for memory transactions in response to a streaming workload job request.
-
公开(公告)号:US20250053530A1
公开(公告)日:2025-02-13
申请号:US18749130
申请日:2024-06-20
Applicant: Intel Corporation
Inventor: Philip R. Lantz , Sanjay Kumar , Rajesh M. Sankaran , Saurabh Gayen
IPC: G06F13/364 , G06F9/50 , G06F13/24
Abstract: Embodiments of apparatuses, methods, and systems for highly scalable accelerators are described. In an embodiment, an apparatus includes an interface to receive a plurality of work requests from a plurality of clients and a plurality of engines to perform the plurality of work requests. The work requests are to be dispatched to the plurality of engines from a plurality of work queues. The work queues are to store a work descriptor per work request. Each work descriptor is to include all information needed to perform a corresponding work request.
-
8.
公开(公告)号:US20230185603A1
公开(公告)日:2023-06-15
申请号:US17551166
申请日:2021-12-14
Applicant: Intel Corporation
Inventor: Saurabh Gayen , Philip Lantz , Narayan Ranganathan , Dhananjay Joshi , Rajesh Sankaran , Utkarsh Kakaiya
CPC classification number: G06F9/4881 , G06Q30/0283
Abstract: Methods and apparatus relating to dynamic capability discovery and enforcement for accelerators and devices in multi-tenant systems are described. In an embodiment, a hardware accelerator device advertises one or more available operations and/or capabilities of the hardware accelerator device to one or more tenants. Logic circuitry controls access to the one or more available operations and/or capabilities of the one or more work queues on a per-tenant basis. Other embodiments are also disclosed and claimed.
-
公开(公告)号:US11650947B2
公开(公告)日:2023-05-16
申请号:US17410063
申请日:2021-08-24
Applicant: Intel Corporation
Inventor: Philip R. Lantz , Sanjay Kumar , Rajesh M. Sankaran , Saurabh Gayen
IPC: G06F13/364 , G06F13/24 , G06F9/50
CPC classification number: G06F13/364 , G06F9/5027 , G06F13/24
Abstract: Embodiments of apparatuses, methods, and systems for highly scalable accelerators are described. In an embodiment, an apparatus includes an interface to receive a plurality of work requests from a plurality of clients and a plurality of engines to perform the plurality of work requests. The work requests are to be dispatched to the plurality of engines from a plurality of work queues. The work queues are to store a work descriptor per work request. Each work descriptor is to include all information needed to perform a corresponding work request.
-
公开(公告)号:US20230032236A1
公开(公告)日:2023-02-02
申请号:US17875198
申请日:2022-07-27
Applicant: Intel Corporation
Inventor: Rajesh M. Sankaran , Philip R. Lantz , Narayan Ranganathan , Saurabh Gayen , Sanjay Kumar , Nikhil Rao , Dhananjay A. Joshi , Hai Ming Khor , Utkarsh Y. Kakaiya
IPC: G06F3/06
Abstract: Methods and apparatus relating to data streaming accelerators are described. In an embodiment, a hardware accelerator such as a Data Streaming Accelerator (DSA) logic circuitry provides high-performance data movement and/or data transformation for data to be transferred between a processor (having one or more processor cores) and a storage device. Other embodiments are also disclosed and claimed.
-
-
-
-
-
-
-
-
-