PROCESS ADDRESS SPACE IDENTIFIER VIRTUALIZATION USING HARDWARE PAGING HINT

    Publication number: US20210271481A1

    Publication date: 2021-09-02

    Application number: US17253053

    Application date: 2018-12-21

    Abstract: Process address space identifier virtualization using a hardware paging hint. A processing device (100) comprises: a processing core (110); and a translation circuit coupled to the processing core, the translation circuit to: receive a workload instruction from a guest application being executed by the processing device, the workload instruction comprising an untranslated guest process address space identifier (gPASID), a workload for an input/output (I/O) target device, and an identifier of a submission register on the I/O target device (410); access a paging data structure (PDS) associated with the guest application to retrieve a page table entry corresponding to the gPASID and the identifier of the submission register (420); determine a value of an I/O hint bit of that page table entry (430); responsive to determining that the I/O hint bit is enabled, keep the untranslated gPASID in the workload instruction (440); and provide the workload instruction to a work queue of the I/O target device (450).
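The claimed translation-circuit flow can be sketched as a small software model. Everything here is an illustrative assumption, not the patent's implementation: the bit position of the I/O hint, the table shapes, and all names are hypothetical.

```python
# Hypothetical model of the claimed flow: on workload submission, look up the
# page-table entry for the (gPASID, submission-register) pair; if the I/O hint
# bit is enabled, forward the untranslated gPASID to the device work queue,
# otherwise translate it to a host PASID first.
from dataclasses import dataclass

IO_HINT_BIT = 1 << 0  # assumed bit position, for illustration only

@dataclass
class WorkloadInstruction:
    gpasid: int          # untranslated guest PASID
    workload: str        # the work for the I/O target device
    submission_reg: int  # identifier of the submission register

def submit(instr, paging_structure, pasid_table, work_queue):
    """Route a guest workload, honoring the I/O hint bit (sketch only)."""
    entry = paging_structure[(instr.gpasid, instr.submission_reg)]
    if entry & IO_HINT_BIT:
        # Hint enabled: keep the untranslated gPASID in the instruction.
        work_queue.append(instr)
    else:
        # Hint disabled: replace the gPASID with its host translation.
        instr.gpasid = pasid_table[instr.gpasid]
        work_queue.append(instr)
```

The hint bit thus lets the hardware skip the gPASID-to-host-PASID lookup for devices that can consume the guest identifier directly.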

    Highly scalable accelerator
    2.
    Invention grant

    Publication number: US11106613B2

    Publication date: 2021-08-31

    Application number: US15940128

    Application date: 2018-03-29

    Abstract: Embodiments of apparatuses, methods, and systems for highly scalable accelerators are described. In an embodiment, an apparatus includes an interface to receive a plurality of work requests from a plurality of clients and a plurality of engines to perform the plurality of work requests. The work requests are to be dispatched to the plurality of engines from a plurality of work queues. The work queues are to store a work descriptor per work request. Each work descriptor is to include all information needed to perform a corresponding work request.
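The dispatch scheme described above can be illustrated with a minimal software model (not the patented hardware): because each work descriptor is self-contained, any engine can execute any descriptor without further context.

```python
# Minimal sketch: shared work queues each hold one self-contained descriptor
# per work request; descriptors are drained round-robin onto the engines.
from collections import deque

def dispatch(work_queues, engines):
    """Drain the queues round-robin, handing each descriptor to the next engine."""
    results, i = [], 0
    while any(work_queues):
        for q in work_queues:
            if q:
                desc = q.popleft()
                # Any engine can run any descriptor: it carries all it needs.
                results.append(engines[i % len(engines)](desc))
                i += 1
    return results
```

The scalability claim rests on this decoupling: adding engines requires no change to clients or queues, since no per-client state lives in the engines.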

    TECHNOLOGIES FOR OFFLOAD DEVICE FETCHING OF ADDRESS TRANSLATIONS

    Publication number: US20210149815A1

    Publication date: 2021-05-20

    Application number: US17129496

    Application date: 2020-12-21

    Abstract: Techniques for offload device address translation fetching are disclosed. In the illustrative embodiment, a processor of a compute device sends a translation fetch descriptor to an offload device before sending a corresponding work descriptor to the offload device. The offload device can request translations for virtual memory addresses and cache the corresponding physical addresses for later use. While the offload device is fetching virtual address translations, the compute device can perform other tasks before sending the corresponding work descriptor, including operations that modify the contents of the memory addresses whose translations are being cached. Even if the offload device does not cache the translations, the fetching can warm up the cache in a translation lookaside buffer. Such an approach can reduce the latency overhead that the offload device would otherwise incur in sending the memory address translation requests required to execute the work descriptor.

    SHARED ACCELERATOR MEMORY SYSTEMS AND METHODS

    Publication number: US20200310993A1

    Publication date: 2020-10-01

    Application number: US16370587

    Application date: 2019-03-29

    Abstract: The present disclosure is directed to systems and methods for sharing memory circuitry between processor memory circuitry and accelerator memory circuitry in each of a plurality of peer-to-peer connected accelerator units. Each of the accelerator units includes physical-to-virtual address translation circuitry and migration circuitry. The physical-to-virtual address translation circuitry in each accelerator unit includes pages for each of at least some of the plurality of accelerator units. The migration circuitry causes the transfer of data between the processor memory circuitry and the accelerator memory circuitry in each of the plurality of accelerator units. The migration circuitry migrates and evicts data to/from accelerator memory circuitry based on statistical information associated with accesses to at least one of: processor memory circuitry or accelerator memory circuitry in one or more peer accelerator units. Thus, the processor memory circuitry and accelerator memory circuitry may be dynamically allocated to advantageously minimize system latency attributable to data access operations.
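One way to picture statistics-driven migration is the toy policy below. The threshold, eviction rule, and data structures are illustrative assumptions only; the patent does not specify this policy.

```python
# Illustrative-only model: pages whose access counts cross a threshold are
# migrated into accelerator memory; when capacity is reached, the
# least-accessed resident page is evicted back to processor memory.

def rebalance(access_counts, accel_resident, capacity, threshold):
    """Return the updated set of pages resident in accelerator memory."""
    resident = set(accel_resident)
    hot = [p for p, n in access_counts.items() if n >= threshold]
    for page in sorted(hot, key=lambda p: -access_counts[p]):
        if page in resident:
            continue
        if len(resident) >= capacity:
            # Evict the coldest resident page to processor memory.
            victim = min(resident, key=lambda p: access_counts.get(p, 0))
            resident.remove(victim)
        resident.add(page)
    return resident
```

Running such a policy per accelerator unit, fed with access statistics from peers, is one way the described dynamic allocation could reduce data-access latency.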

    Hardware apparatuses and methods for distributed durable and atomic transactions in non-volatile memory

    Publication number: US10203910B2

    Publication date: 2019-02-12

    Application number: US15589653

    Application date: 2017-05-08

    Abstract: Hardware apparatuses and methods for distributed durable and atomic transactions in non-volatile memory are described. In one embodiment, a hardware apparatus includes a hardware processor, a plurality of hardware memory controllers for each of a plurality of non-volatile data storage devices, and a plurality of staging buffers with a staging buffer for each of the plurality of hardware memory controllers, wherein each of the plurality of hardware memory controllers is to: write data of a data set that is to be written to the plurality of non-volatile data storage devices to its staging buffer, send confirmation to the hardware processor that the data is written to its staging buffer, and write the data from its staging buffer to its non-volatile data storage device on receipt of a commit command.
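The stage-confirm-commit sequence above is a two-phase protocol, and can be sketched in software. All class and method names are assumptions for illustration:

```python
# Sketch of the claimed two-phase flow: each controller first stages its slice
# of the data set and confirms; only after all confirmations does the
# processor issue the commit that moves staged data into non-volatile storage.

class MemoryController:
    def __init__(self):
        self.staging = []   # staging buffer
        self.nvm = []       # non-volatile data storage device

    def stage(self, data):
        self.staging.append(data)
        return True         # confirmation back to the processor

    def commit(self):
        self.nvm.extend(self.staging)
        self.staging.clear()

def durable_write(controllers, slices):
    """Atomically write one slice per controller (illustrative model)."""
    confirmations = [c.stage(s) for c, s in zip(controllers, slices)]
    if all(confirmations):          # commit only once every slice is staged
        for c in controllers:
            c.commit()
```

Because no controller commits before every controller has confirmed staging, the data set reaches the non-volatile devices either in full or not at all, which is the durability-plus-atomicity property the claim targets.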

    COMPUTING METHOD AND APPARATUS WITH PERSISTENT MEMORY
    8.
    Invention application (in force)

    Publication number: US20160170645A1

    Publication date: 2016-06-16

    Application number: US14567662

    Application date: 2014-12-11

    Abstract: Computer-readable storage media, computing apparatuses and methods associated with persistent memory are discussed herein. In embodiments, a computing apparatus may include one or more processors, along with a plurality of persistent storage modules that may be coupled with the one or more processors. The computing apparatus may further include system software, to be operated by the one or more processors, to receive volatile memory allocation requests and persistent storage allocation requests from one or more applications that may be executed by the one or more processors. The system software may then dynamically allocate memory pages of the persistent storage modules as: volatile type memory pages, in response to the volatile memory allocation requests, and persistent type memory pages, in response to the persistent storage allocation requests. Other embodiments may be described and/or claimed.
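A minimal sketch of such an allocator, assuming a shared free-page pool (the class, tags, and methods are hypothetical, not the patented system software):

```python
# Minimal sketch: system software serves both volatile and persistent
# allocation requests out of the same persistent storage modules, tagging
# each page with the type it was allocated as.

class PersistentMemoryPool:
    def __init__(self, num_pages):
        self.free = list(range(num_pages))
        self.page_type = {}   # page number -> "volatile" | "persistent"

    def alloc(self, kind):
        # Any free page can satisfy either request type; only the tag differs.
        page = self.free.pop()
        self.page_type[page] = kind
        return page
```

The key point the abstract makes is this dynamic split: the volatile/persistent boundary is decided per request at allocation time rather than fixed at boot.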


    Highly scalable accelerator
    10.
    Invention grant

    Publication number: US12045185B2

    Publication date: 2024-07-23

    Application number: US18296875

    Application date: 2023-04-06

    CPC classification number: G06F13/364 G06F9/5027 G06F13/24

    Abstract: Embodiments of apparatuses, methods, and systems for highly scalable accelerators are described. In an embodiment, an apparatus includes an interface to receive a plurality of work requests from a plurality of clients and a plurality of engines to perform the plurality of work requests. The work requests are to be dispatched to the plurality of engines from a plurality of work queues. The work queues are to store a work descriptor per work request. Each work descriptor is to include all information needed to perform a corresponding work request.
