-
Publication number: US20210271481A1
Publication date: 2021-09-02
Application number: US17253053
Application date: 2018-12-21
Applicant: Intel Corporation
Inventor: Kun Tian , Sanjay Kumar , Ashok Raj , Yi Liu , Rajesh M. Sankaran , Philip R. Lantz
IPC: G06F9/34 , G06F9/455 , G06F9/38 , G06F13/42 , G06F12/1009
Abstract: Process address space identifier (PASID) virtualization using a hardware paging hint. A processing device (100) comprises a processing core (110) and a translation circuit coupled to the processing core. The translation circuit is to: receive a workload instruction from a guest application being executed by the processing device, the workload instruction comprising an untranslated guest process address space identifier (gPASID), a workload for an input/output (I/O) target device, and an identifier of a submission register on the I/O target device (410); access a paging data structure (PDS) associated with the guest application to retrieve a page table entry corresponding to the gPASID and the identifier of the submission register (420); determine a value of an I/O hint bit of the page table entry corresponding to the gPASID and the identifier of the submission register (430); responsive to determining that the I/O hint bit is enabled, keep the untranslated gPASID in the workload instruction (440); and provide the workload instruction to a work queue of the I/O target device (450).
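As a rough illustration of the decision this abstract describes, the C sketch below checks an I/O hint bit in a page table entry and keeps the gPASID untranslated when the bit is set. The structure layouts, the bit position, and the helpers pds_lookup, translate_pasid, and enqueue_work are illustrative assumptions, not the patented hardware interface.

/* Minimal sketch, assuming an I/O hint bit at an arbitrary position. */
#include <stdint.h>
#include <stdbool.h>

#define IO_HINT_BIT (1u << 11)   /* assumed position of the I/O hint bit */

struct pte {
    uint64_t flags;              /* page-table entry control bits */
};

struct workload_insn {
    uint32_t pasid;              /* gPASID as submitted by the guest */
    uint64_t submission_reg;     /* identifier of the submission register */
    const void *payload;         /* opaque workload for the I/O device */
};

/* Hypothetical lookups standing in for the paging-data-structure walk
 * and a host-managed gPASID -> host PASID table. */
extern struct pte *pds_lookup(uint32_t gpasid, uint64_t submission_reg);
extern uint32_t    translate_pasid(uint32_t gpasid);
extern void        enqueue_work(uint64_t submission_reg,
                                const struct workload_insn *w);

void submit_workload(struct workload_insn *w)
{
    struct pte *e = pds_lookup(w->pasid, w->submission_reg);
    bool hint_enabled = e && (e->flags & IO_HINT_BIT);

    if (!hint_enabled)
        w->pasid = translate_pasid(w->pasid); /* replace gPASID before submission */
    /* else: hint set, keep the untranslated gPASID in the instruction */

    enqueue_work(w->submission_reg, w);
}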
-
Publication number: US11106613B2
Publication date: 2021-08-31
Application number: US15940128
Application date: 2018-03-29
Applicant: Intel Corporation
Inventor: Philip R. Lantz , Sanjay Kumar , Rajesh M. Sankaran , Saurabh Gayen
IPC: G06F13/364 , G06F13/24 , G06F9/50
Abstract: Embodiments of apparatuses, methods, and systems for highly scalable accelerators are described. In an embodiment, an apparatus includes an interface to receive a plurality of work requests from a plurality of clients and a plurality of engines to perform the plurality of work requests. The work requests are to be dispatched to the plurality of engines from a plurality of work queues. The work queues are to store a work descriptor per work request. Each work descriptor is to include all information needed to perform a corresponding work request.
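A minimal C sketch of the work-queue model in this abstract: each work descriptor is self-contained, so any free engine can execute it without first loading per-client state. The descriptor fields, the ring-buffer queue, and the engine_execute hook are assumptions for illustration only.

#include <stdint.h>
#include <stddef.h>

struct work_descriptor {
    uint32_t client_id;      /* identifies the submitting client */
    uint32_t opcode;         /* operation the engine should perform */
    uint64_t src, dst;       /* source/destination addresses */
    uint32_t length;         /* transfer size in bytes */
    uint64_t completion;     /* address for the completion record */
};

struct work_queue {
    struct work_descriptor *ring;
    size_t head, tail, size;
};

/* Hypothetical engine entry point. */
extern void engine_execute(int engine_id, const struct work_descriptor *d);

/* Dispatch one descriptor from a queue to a free engine; because the
 * descriptor carries all information needed for the request, no
 * per-client context has to be installed in the engine first. */
int dispatch(struct work_queue *wq, int engine_id)
{
    if (wq->head == wq->tail)
        return -1;                          /* queue empty */
    struct work_descriptor *d = &wq->ring[wq->head];
    wq->head = (wq->head + 1) % wq->size;
    engine_execute(engine_id, d);
    return 0;
}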
-
Publication number: US20210149815A1
Publication date: 2021-05-20
Application number: US17129496
Application date: 2020-12-21
Applicant: Intel Corporation
Inventor: Saurabh Gayen , Philip R. Lantz , Dhananjay A. Joshi , Rupin H. Vakharwala , Rajesh M. Sankaran , Narayan Ranganathan , Sanjay Kumar
IPC: G06F12/10 , G06F12/0875 , G06F13/28 , G06F13/40 , G06F13/42
Abstract: Techniques for offload device address translation fetching are disclosed. In the illustrative embodiment, a processor of a compute device sends a translation fetch descriptor to an offload device before sending a corresponding work descriptor to the offload device. The offload device can request translations for virtual memory addresses and cache the corresponding physical addresses for later use. While the offload device is fetching virtual address translations, the compute device can perform other tasks before sending the corresponding work descriptor, including operations that modify the contents of the memory addresses whose translations are being cached. Even if the offload device does not cache the translations, the fetching can warm up the cache in a translation lookaside buffer. Such an approach can reduce the latency overhead that the offload device may otherwise incur in sending memory address translation requests that would be required to execute the work descriptor.
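On the host side, the ordering the abstract describes might look like the C sketch below: a translation-fetch descriptor goes out first, the CPU keeps working (it may even modify the buffer, since only translations were prefetched), and the real work descriptor follows. The descriptor layout, the opcodes, and the submit_to_offload/prepare_buffer helpers are hypothetical.

#include <stdint.h>

struct desc {
    uint32_t opcode;         /* e.g. OP_XLATE_FETCH or OP_MEMMOVE (assumed) */
    uint64_t addr;           /* virtual address the device will touch */
    uint32_t length;
};

enum { OP_XLATE_FETCH = 1, OP_MEMMOVE = 2 };

extern void submit_to_offload(const struct desc *d);   /* hypothetical */
extern void prepare_buffer(void *buf, uint32_t len);   /* hypothetical */

void offload_copy(void *buf, uint32_t len)
{
    /* 1. Ask the offload device to fetch (and possibly cache) the
     *    virtual-to-physical translations it will need. */
    struct desc fetch = { OP_XLATE_FETCH, (uint64_t)(uintptr_t)buf, len };
    submit_to_offload(&fetch);

    /* 2. The CPU may keep working, even modifying the buffer contents;
     *    only the translations, not the data, were prefetched. */
    prepare_buffer(buf, len);

    /* 3. Submit the real work descriptor; translation latency is now
     *    hidden because the device/IOTLB cache is warm. */
    struct desc work = { OP_MEMMOVE, (uint64_t)(uintptr_t)buf, len };
    submit_to_offload(&work);
}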
-
Publication number: US20200310993A1
Publication date: 2020-10-01
Application number: US16370587
Application date: 2019-03-29
Applicant: Intel Corporation
Inventor: Sanjay Kumar , David Koufaty , Philip Lantz , Pratik Marolia , Rajesh Sankaran , Koen Koning
IPC: G06F13/16 , G06F12/1027 , G06F3/06
Abstract: The present disclosure is directed to systems and methods for sharing memory circuitry between processor memory circuitry and accelerator memory circuitry in each of a plurality of peer-to-peer connected accelerator units. Each of the accelerator units includes physical-to-virtual address translation circuitry and migration circuitry. The physical-to-virtual address translation circuitry in each accelerator unit includes pages for each of at least some of the plurality of accelerator units. The migration circuitry causes the transfer of data between the processor memory circuitry and the accelerator memory circuitry in each of the plurality of accelerator units. The migration circuitry migrates and evicts data to/from accelerator memory circuitry based on statistical information associated with accesses to at least one of: processor memory circuitry or accelerator memory circuitry in one or more peer accelerator units. Thus, the processor memory circuitry and accelerator memory circuitry may be dynamically allocated to advantageously minimize system latency attributable to data access operations.
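The statistics-driven placement described here can be pictured with the toy C sketch below, in which a page migrates toward whichever side (processor or accelerator memory) dominates its recent accesses. The counters, the hysteresis ratio, and the migrate/evict primitives are assumptions, not the patented circuitry.

#include <stdint.h>
#include <stdbool.h>

struct page_stats {
    uint64_t cpu_accesses;     /* accesses from the processor memory side */
    uint64_t accel_accesses;   /* accesses from this or peer accelerators */
    bool     in_accel_memory;  /* current placement */
};

#define MIGRATE_RATIO 4        /* assumed hysteresis factor */

/* Hypothetical primitives standing in for the migration circuitry. */
extern void migrate_to_accelerator(uint64_t page);
extern void evict_to_processor(uint64_t page);

void rebalance_page(uint64_t page, struct page_stats *s)
{
    if (!s->in_accel_memory &&
        s->accel_accesses > MIGRATE_RATIO * s->cpu_accesses) {
        migrate_to_accelerator(page);      /* hot on the accelerator side */
        s->in_accel_memory = true;
    } else if (s->in_accel_memory &&
               s->cpu_accesses > MIGRATE_RATIO * s->accel_accesses) {
        evict_to_processor(page);          /* hot on the processor side */
        s->in_accel_memory = false;
    }
}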
-
Publication number: US10203910B2
Publication date: 2019-02-12
Application number: US15589653
Application date: 2017-05-08
Applicant: Intel Corporation
Inventor: Subramanya R. Dulloor , Rajesh M. Sankaran , Sanjay Kumar
IPC: G06F3/06 , G06F12/0891 , G06F12/02 , G06F12/0868
Abstract: Hardware apparatuses and methods for distributed durable and atomic transactions in non-volatile memory are described. In one embodiment, a hardware apparatus includes a hardware processor, a plurality of hardware memory controllers for each of a plurality of non-volatile data storage devices, and a plurality of staging buffers with a staging buffer for each of the plurality of hardware memory controllers, wherein each of the plurality of hardware memory controllers is to: write data of a data set that is to be written to the plurality of non-volatile data storage devices to their staging buffer, send confirmation to the hardware processor that the data is written to their staging buffer, and write the data from their staging buffer to their non-volatile data storage device on receipt of a commit command.
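The stage-then-commit protocol in this abstract can be sketched per controller roughly as follows in C; the staging-buffer size, structure layout, and nvm_write primitive are illustrative assumptions. Each controller acknowledges once its share of the data set is staged, and writes through to its non-volatile device only on the commit command, so the data set becomes durable across devices as one transaction.

#include <stdint.h>
#include <string.h>

#define STAGING_BYTES 4096     /* assumed staging-buffer size */

struct memory_controller {
    uint8_t  staging[STAGING_BYTES];  /* per-controller staging buffer */
    uint32_t staged_len;
};

/* Hypothetical low-level write to the controller's non-volatile device. */
extern void nvm_write(struct memory_controller *mc,
                      const uint8_t *data, uint32_t len);

/* Step 1: stage this controller's share of the data set and confirm. */
int stage_write(struct memory_controller *mc,
                const uint8_t *data, uint32_t len)
{
    if (len > STAGING_BYTES)
        return -1;
    memcpy(mc->staging, data, len);
    mc->staged_len = len;
    return 0;                  /* confirmation returned to the processor */
}

/* Step 2: on the commit command, write staged data to the device. */
void commit(struct memory_controller *mc)
{
    nvm_write(mc, mc->staging, mc->staged_len);
    mc->staged_len = 0;
}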
-
Publication number: US20170132128A1
Publication date: 2017-05-11
Application number: US15411658
Application date: 2017-01-20
Applicant: Intel Corporation
Inventor: Sanjay Kumar , Rajesh M. Sankaran , Subramanya R. Dulloor , Andrew V. Anderson
IPC: G06F12/0804 , G06F11/14
CPC classification number: G06F12/0804 , G06F9/467 , G06F11/07 , G06F11/073 , G06F11/0778 , G06F11/0793 , G06F11/14 , G06F11/1482 , G06F12/0868 , G06F2201/805 , G06F2201/82 , G06F2212/1032 , G06F2212/608
Abstract: A processor includes a memory management unit and a front end including a decoder. The decoder includes logic to receive a flush-on-commit (FoC) instruction to flush dirty data from a volatile cache to a persistent memory upon commitment of a store associated with the FoC instruction. The memory management unit includes logic to, based upon a flush-on-fail (FoF) mode, skip execution of the flush-on-commit instruction and to flush the dirty data from the volatile cache upon a subsequent FoF operation.
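In C-like pseudocode, the FoC/FoF choice the abstract describes might look like the sketch below: when the FoF mode is set, the flush that would normally accompany a committing store is skipped and deferred to a later flush-on-fail event. The mode query and flush helpers (fof_mode_enabled, flush_cache_line, fof_flush_all_dirty) are hypothetical names, not architectural instructions.

#include <stdbool.h>

extern bool fof_mode_enabled(void);            /* hypothetical mode bit    */
extern void flush_cache_line(const void *p);   /* e.g. a CLWB-like flush   */
extern void fof_flush_all_dirty(void);         /* deferred FoF flush path  */

/* Conceptually executed when a store carrying FoC semantics commits. */
void on_store_commit(const void *addr)
{
    if (fof_mode_enabled())
        return;                 /* skip the FoC flush; FoF will handle it */
    flush_cache_line(addr);     /* normal FoC behavior: push dirty data to
                                   persistent memory immediately */
}

/* Invoked on the subsequent flush-on-fail operation. */
void on_fof_event(void)
{
    fof_flush_all_dirty();      /* flush dirty data from the volatile cache */
}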
-
Publication number: US09645939B2
Publication date: 2017-05-09
Application number: US14752783
Application date: 2015-06-26
Applicant: Intel Corporation
Inventor: Subramanya R. Dulloor , Rajesh M. Sankaran , Sanjay Kumar
IPC: G06F12/0891 , G06F3/06
CPC classification number: G06F3/0659 , G06F3/0604 , G06F3/061 , G06F3/0619 , G06F3/065 , G06F3/0656 , G06F3/0679 , G06F3/0688 , G06F12/0246 , G06F12/0868 , G06F12/0891 , G06F2212/214 , G06F2212/7201
Abstract: Hardware apparatuses and methods for distributed durable and atomic transactions in non-volatile memory are described. In one embodiment, a hardware apparatus includes a hardware processor, a plurality of hardware memory controllers for each of a plurality of non-volatile data storage devices, and a plurality of staging buffers with a staging buffer for each of the plurality of hardware memory controllers, wherein each of the plurality of hardware memory controllers is to: write data of a data set that is to be written to the plurality of non-volatile data storage devices to their staging buffer, send confirmation to the hardware processor that the data is written to their staging buffer, and write the data from their staging buffer to their non-volatile data storage device on receipt of a commit command.
-
Publication number: US20160170645A1
Publication date: 2016-06-16
Application number: US14567662
Application date: 2014-12-11
Applicant: Intel Corporation
Inventor: Sanjay Kumar , Rajesh M. Sankaran , Subramanya R. Dulloor , Dheeraj R. Subbareddy , Andrew V. Anderson
IPC: G06F3/06
CPC classification number: G06F12/0842 , G06F12/0238 , G06F12/06 , G06F12/0897 , G06F12/1009 , G06F2212/225 , G06F2212/601 , G06F2212/7201
Abstract: Computer-readable storage media, computing apparatuses and methods associated with persistent memory are discussed herein. In embodiments, a computing apparatus may include one or more processors, along with a plurality of persistent storage modules that may be coupled with the one or more processors. The computing apparatus may further include system software, to be operated by the one or more processors, to receive volatile memory allocation requests and persistent storage allocation requests from one or more applications that may be executed by the one or more processors. The system software may then dynamically allocate memory pages of the persistent storage modules as: volatile type memory pages, in response to the volatile memory allocation requests, and persistent type memory pages, in response to the persistent storage allocation requests. Other embodiments may be described and/or claimed.
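A compact C sketch of the dual allocation path described above: the same pool of pages in the persistent storage modules backs both volatile and persistent allocations, tagged at allocation time. The take_pm_page primitive and the enum stand in for the system software's internal allocator and are assumptions.

enum page_use { PAGE_VOLATILE, PAGE_PERSISTENT };

/* Hypothetical primitive that takes a page from the persistent storage
 * modules and records how it is being used. */
extern void *take_pm_page(enum page_use use);

/* Volatile allocation request: the page is treated as ordinary RAM and
 * its contents need not survive a restart. */
void *alloc_volatile_page(void)
{
    return take_pm_page(PAGE_VOLATILE);
}

/* Persistent allocation request: the page keeps its contents across
 * power cycles and is tracked so it can be found again after reboot. */
void *alloc_persistent_page(void)
{
    return take_pm_page(PAGE_PERSISTENT);
}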
-
Publication number: US20240338319A1
Publication date: 2024-10-10
Application number: US18745603
Application date: 2024-06-17
Applicant: Intel Corporation
Inventor: Utkarsh Y. Kakaiya , Sanjay Kumar , Rajesh M. Sankaran , Philip R. Lantz , Ashok Raj , Kun Tian
IPC: G06F12/1009 , G06F9/455 , G06F12/06 , G06F12/1081
CPC classification number: G06F12/1009 , G06F9/45558 , G06F12/063 , G06F12/1081 , G06F2009/45579 , G06F2009/45583 , G06F2009/45591
Abstract: Embodiments of apparatuses, methods, and systems for unified address translation for virtualization of input/output devices are described. In an embodiment, an apparatus includes first circuitry to use at least an identifier of a device to locate a context entry and second circuitry to use at least a process address space identifier (PASID) to locate a PASID entry. The context entry is to include at least one of a page-table pointer to a page-table translation structure and a PASID. The PASID entry is to include at least one of a first-level page-table pointer to a first-level translation structure and a second-level page-table pointer to a second-level translation structure. The PASID is to be supplied by the device. At least one of the apparatus, the context entry, and the PASID entry is to include one or more control fields to indicate whether the first-level page-table pointer or the second-level page-table pointer is to be used.
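The lookup chain in this abstract (device identifier to context entry, PASID to PASID entry, then a first- or second-level page table selected by control fields) is sketched below in C. The field layout, the mode encoding, and the page-table walkers are illustrative assumptions, not the architectural table format.

#include <stdint.h>

struct pasid_entry {
    uint64_t flpt_ptr;       /* first-level page-table pointer  */
    uint64_t slpt_ptr;       /* second-level page-table pointer */
    uint8_t  mode;           /* control field: which pointer to use */
};

struct context_entry {
    uint64_t pt_ptr;         /* page-table pointer (fallback path) */
    struct pasid_entry *pasid_table;
};

enum { MODE_FIRST_LEVEL = 1, MODE_SECOND_LEVEL = 2 };

/* Hypothetical walkers for each translation structure. */
extern uint64_t walk_first_level(uint64_t flpt, uint64_t va);
extern uint64_t walk_second_level(uint64_t slpt, uint64_t va);

uint64_t translate(const struct context_entry *ctx,
                   uint32_t pasid, uint64_t va)
{
    const struct pasid_entry *pe = &ctx->pasid_table[pasid];

    switch (pe->mode) {
    case MODE_FIRST_LEVEL:
        return walk_first_level(pe->flpt_ptr, va);   /* control field selects
                                                        first-level tables */
    case MODE_SECOND_LEVEL:
        return walk_second_level(pe->slpt_ptr, va);  /* control field selects
                                                        second-level tables */
    default:
        return walk_second_level(ctx->pt_ptr, va);   /* assumed fallback via
                                                        the context entry */
    }
}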
-
Publication number: US12045185B2
Publication date: 2024-07-23
Application number: US18296875
Application date: 2023-04-06
Applicant: Intel Corporation
Inventor: Philip R. Lantz , Sanjay Kumar , Rajesh M. Sankaran , Saurabh Gayen
IPC: G06F13/364 , G06F9/50 , G06F13/24
CPC classification number: G06F13/364 , G06F9/5027 , G06F13/24
Abstract: Embodiments of apparatuses, methods, and systems for highly scalable accelerators are described. In an embodiment, an apparatus includes an interface to receive a plurality of work requests from a plurality of clients and a plurality of engines to perform the plurality of work requests. The work requests are to be dispatched to the plurality of engines from a plurality of work queues. The work queues are to store a work descriptor per work request. Each work descriptor is to include all information needed to perform a corresponding work request.