Abstract:
A method for steering data for an I/O write operation (144) includes, in response to receiving the I/O write operation, identifying, at an interconnect fabric (102), a cache (122, 123, 124, 126) as a target cache for steering the data based on at least one of: a software-provided steering indicator, a steering configuration (156) implemented at boot initialization, and coherency information for a cacheline associated with the data. The method further includes directing the identified target cache to cache the data from the I/O write operation. The data is temporarily buffered at the interconnect fabric, and if the target cache attempts to fetch the data via a fetch operation (152) while the data is still buffered at the interconnect fabric, the interconnect fabric provides a copy of the buffered data in response to the fetch operation instead of initiating a memory access operation to obtain the data from memory.
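The buffering behavior described above can be illustrated with a small model. This is only a sketch under stated assumptions: the class `Fabric`, its method names, the priority order among the three steering inputs, and the `drain` step that retires buffered data to memory are all hypothetical and are not taken from the abstract.

```python
class Fabric:
    """Toy interconnect fabric that steers I/O writes and buffers them briefly."""

    def __init__(self, memory, boot_default_cache):
        self.memory = memory                    # dict: address -> data
        self.boot_default_cache = boot_default_cache  # steering configuration set at boot
        self.buffer = {}                        # address -> data still held at the fabric
        self.owner = {}                         # coherency info: address -> cache name
        self.memory_reads = 0                   # counts memory access operations

    def pick_target(self, address, hint=None):
        # Assumed priority: software hint, then coherency info, then boot default.
        # The abstract only says "at least one of" these inputs is used.
        if hint is not None:
            return hint
        if address in self.owner:
            return self.owner[address]
        return self.boot_default_cache

    def io_write(self, address, data, hint=None):
        target = self.pick_target(address, hint)
        self.buffer[address] = data             # temporarily buffered at the fabric
        self.owner[address] = target
        return target                           # the cache directed to take the data

    def fetch(self, address):
        if address in self.buffer:
            return self.buffer[address]         # served from the buffer, no memory access
        self.memory_reads += 1
        return self.memory[address]

    def drain(self, address):
        # Hypothetical step: retire the buffered data to memory.
        self.memory[address] = self.buffer.pop(address)
```

In this model, a fetch that arrives before `drain` is answered from the fabric buffer and leaves `memory_reads` unchanged; a fetch after `drain` goes to memory.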
Abstract:
Methods, systems, and apparatuses provide support for multiple address spaces in order to facilitate data movement. One system includes a host processor; a memory; a data fabric coupled to the host processor and to the memory; a plurality of input/output memory management units (IOMMUs), each of the plurality of IOMMUs coupled to the data fabric; a plurality of root ports, each of the root ports coupled to a corresponding IOMMU of the plurality of IOMMUs; and a plurality of peripheral component endpoints, each of the plurality of peripheral component endpoints coupled to a corresponding root port of the plurality of root ports, wherein each of the root ports comprises hardware control logic operative to: synchronize the plurality of root ports; receive, from the corresponding peripheral component endpoint, a direct memory access (DMA) request; and provide the DMA request to the corresponding IOMMU of the plurality of IOMMUs.
Abstract:
The present system enables an input/output (I/O) device to request memory for performing a direct memory access (DMA) of system memory. Further, the system uses an input/output memory management unit (IOMMU) to determine whether or not the system memory is available. The IOMMU notifies an operating system associated with the system memory if the system memory is not available, such that the operating system allocates non-system memory for use by the I/O device to perform the DMA.
Abstract:
Methods and apparatus for providing page migration of pages among tiered memories identify frequently accessed memory pages in each memory tier and generate page hotness ranking information indicating how frequently memory pages are being accessed. The page hotness ranking information is provided to an operating system or a hypervisor, depending on which is used in the system. Based on the page hotness ranking information, the operating system or hypervisor issues a page move command to a hardware data mover, and the hardware data mover moves a memory page to a different memory tier in response to the page move command.
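The ranking and move-command flow can be sketched as follows. This is a simplified illustration, not the patented mechanism: the function names, the use of a flat access log, and the tier labels "fast" and "slow" are assumptions, and the sketch ranks pages globally rather than per tier.

```python
from collections import Counter


def rank_hot_pages(access_log, top_n):
    """Return the top_n (page, access_count) pairs, hottest first."""
    counts = Counter(access_log)
    return counts.most_common(top_n)


def plan_moves(ranking, page_tier, fast_tier="fast"):
    """Play the OS/hypervisor role: issue a move command for each hot page
    that is not already in the fast tier. A command is a (page, tier) pair."""
    return [(page, fast_tier) for page, _count in ranking
            if page_tier[page] != fast_tier]


def data_mover(commands, page_tier):
    """Play the hardware data mover role: apply each move command."""
    for page, tier in commands:
        page_tier[page] = tier
    return page_tier
```

A typical use would pass the output of `rank_hot_pages` to `plan_moves` and then hand the resulting commands to `data_mover`.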
Abstract:
A networked input/output memory management unit (IOMMU) includes a plurality of IOMMUs. The networked IOMMU receives a memory access request that includes a domain physical address generated by a first address translation layer. The networked IOMMU selectively translates the domain physical address into a physical address in a system memory using one of the plurality of IOMMUs that is selected based on a type of a device that generated the memory access request. In some cases, the networked IOMMU is connected to a graphics processing unit (GPU), at least one peripheral device, and the memory. The networked IOMMU includes a command queue to receive the memory access requests, a primary IOMMU to selectively translate the domain physical address in memory access requests from the GPU, and a secondary IOMMU to translate the domain physical address in memory requests from the peripheral device.
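The selection by device type can be sketched as a small dispatcher. The class names, the 4 KiB page size, the dictionary page tables, and the device-type strings are assumptions made for illustration; the command queue mentioned in the abstract is omitted.

```python
PAGE_SHIFT = 12                      # assumed 4 KiB pages
PAGE_MASK = (1 << PAGE_SHIFT) - 1


class Iommu:
    """One translation unit with a page table: domain page -> system frame."""

    def __init__(self, table):
        self.table = table

    def translate(self, dpa):
        frame = self.table[dpa >> PAGE_SHIFT]
        return (frame << PAGE_SHIFT) | (dpa & PAGE_MASK)


class NetworkedIommu:
    """Routes each request to the primary or secondary unit by device type."""

    def __init__(self, primary, secondary):
        self.primary = primary       # handles requests from the GPU
        self.secondary = secondary   # handles requests from peripheral devices

    def translate(self, device_type, dpa):
        unit = self.primary if device_type == "gpu" else self.secondary
        return unit.translate(dpa)
```

The same domain physical address can therefore resolve to different system addresses depending on which unit handles the request.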
Abstract:
Bus protocol features are provided for chaining memory access requests on a high speed interconnect bus, allowing for reduced signaling overhead. Multiple memory request messages are received over a bus. A first message has a source identifier, a target identifier, a first address, and first payload data. The first payload data is stored in a memory at locations indicated by the first address. Within a selected second one of the request messages, a chaining indicator is received associated with the first request message and second payload data. The second request message does not include an address. Based on the chaining indicator, a second address for which memory access is requested is calculated based on the first address. The second payload data is stored in the memory at locations indicated by the second address.
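A minimal sketch of chained requests follows. The abstract says only that the second address is calculated based on the first; the rule used here (the chained payload lands immediately after the previous payload) is an assumption, as are the dictionary message layout and the function name.

```python
def apply_requests(memory, requests):
    """Store each request's payload into memory (a bytearray).

    A request with "chained": True carries no address; its address is
    derived from the previous request (assumed: previous address plus
    previous payload length).
    """
    next_address = None
    for req in requests:
        if req.get("chained"):
            if next_address is None:
                raise ValueError("chained request has no predecessor")
            address = next_address
        else:
            address = req["address"]
        payload = req["payload"]
        end = address + len(payload)
        if end > len(memory):
            raise ValueError("write past end of memory")
        memory[address:end] = payload
        next_address = end
    return memory
```

Omitting the address from the second message is what reduces signaling overhead: the receiver reconstructs it from state it already holds.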
Abstract:
Embodiments of the present invention provide methods, systems, and computer readable media for input/output memory management unit (IOMMU) two-layer addressing in the context of memory address translations for I/O devices. According to an embodiment, a method includes translating a guest virtual address (GVA) to a corresponding guest physical address (GPA) using a guest address translation table according to a process address space identifier associated with an address translation transaction associated with an I/O device, and translating the GPA to a corresponding system physical address (SPA) using a system address translation table according to a device identifier associated with the address translation transaction.
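The two translation layers can be sketched as two table walks in sequence. The single-level dictionary tables, the 4 KiB page size, and all function and parameter names are assumptions for illustration; real IOMMU tables are multi-level structures.

```python
PAGE_SHIFT = 12                      # assumed 4 KiB pages
PAGE_MASK = (1 << PAGE_SHIFT) - 1


def walk(table, address):
    """Map the page number through the table and keep the page offset."""
    frame = table[address >> PAGE_SHIFT]
    return (frame << PAGE_SHIFT) | (address & PAGE_MASK)


def translate(gva, pasid, device_id, guest_tables, system_tables):
    """GVA -> GPA via the guest table selected by the process address space
    identifier, then GPA -> SPA via the system table selected by the device
    identifier."""
    gpa = walk(guest_tables[pasid], gva)
    return walk(system_tables[device_id], gpa)
```

For example, with a guest table mapping page 0x10 to 0x20 and a system table mapping page 0x20 to 0x30, the address 0x10ABC translates to 0x30ABC.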
Abstract:
A processor [102] employs a hardware encryption module [120] in the memory access path between an input/output device [106] and memory [104] to cryptographically isolate secure information. In some embodiments, the encryption module is located at a memory controller [116] of the processor, and each memory access request provided to the memory controller includes a VM tag value identifying the source of the memory access request. The VM tag is determined based on a requestor ID identifying the source of the memory access request. The encryption module performs encryption (for write accesses) or decryption (for read accesses) of the data associated with the memory access based on an encryption key associated with the VM tag.
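The lookup chain from requestor ID to VM tag to key can be sketched as below. The XOR routine is a deliberately trivial stand-in, not a real cipher and not the algorithm used by the hardware; the class name, the dictionary lookups, and the method names are likewise assumptions.

```python
def xor_bytes(data, key):
    """Placeholder cipher for illustration only; NOT secure encryption."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


class MemoryController:
    """Encrypts on write and decrypts on read using the requestor's VM key."""

    def __init__(self, tag_of_requestor, key_of_tag):
        self.tag_of_requestor = tag_of_requestor   # requestor ID -> VM tag
        self.key_of_tag = key_of_tag               # VM tag -> key bytes
        self.memory = {}                           # address -> stored ciphertext

    def _key(self, requestor_id):
        return self.key_of_tag[self.tag_of_requestor[requestor_id]]

    def write(self, requestor_id, address, data):
        self.memory[address] = xor_bytes(data, self._key(requestor_id))

    def read(self, requestor_id, address):
        return xor_bytes(self.memory[address], self._key(requestor_id))
```

A device belonging to a different VM reads the same location through a different key, so it does not recover the original data.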
Abstract:
The present invention provides for page table access and dirty bit management in hardware via a new atomic test[0] and OR and Mask. The present invention also provides for a gasket that enables ACE to CCI translations. This gasket further provides request translation between ACE and CCI, deadlock avoidance for victim and probe collision, ARM barrier handling, and power management interactions. The present invention also provides a solution for ARM victim/probe collision handling which deadlocks the unified northbridge. These solutions include a dedicated writeback virtual channel, probes for IO requests using a 4-hop protocol, and a WrBack Reorder Ability in MCT where victims update older requests with data as they pass the requests.
Abstract:
The present system enables passing a pointer, associated with accessing data in a memory, to an input/output (I/O) device via an input/output memory management unit (IOMMU). The I/O device accesses the data in the memory via the IOMMU without copying the data into a local I/O device memory. The I/O device can perform an operation on the data in the memory based on the pointer, such that the I/O device accesses the memory without expensive copies.