DYNAMIC PINNING OF VIRTUAL PAGES SHARED BETWEEN DIFFERENT TYPE PROCESSORS OF A HETEROGENEOUS COMPUTING PLATFORM
    4.
    Invention application
    DYNAMIC PINNING OF VIRTUAL PAGES SHARED BETWEEN DIFFERENT TYPE PROCESSORS OF A HETEROGENEOUS COMPUTING PLATFORM (in force)

    Publication No.: US20160154742A1

    Publication date: 2016-06-02

    Application No.: US14862745

    Filing date: 2015-09-23

    IPC classes: G06F12/10 G06F13/16

    Abstract: A computer system may support one or more techniques to allow dynamic pinning of memory pages accessed by a non-CPU device, such as a graphics processing unit (GPU). The non-CPU device may support virtual-to-physical address mapping and may thus be aware of memory pages that are not pinned but are accessed by the non-CPU device. The non-CPU device may notify or send such information to a run-time component such as a device driver associated with the CPU. The device driver may dynamically pin such memory pages, and may unpin memory pages that are no longer accessed by the non-CPU device. Such an approach may make the memory pages no longer accessed by the non-CPU device available for allocation to other CPUs and/or non-CPU devices.

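    The idea can be illustrated with a minimal Python sketch (not the patented implementation): a hypothetical driver-side tracker that pins exactly the pages the device reports as in use and unpins the rest. The class and method names are invented for illustration.

```python
class PinningDriver:
    """Hypothetical run-time component (e.g. a device driver) that pins
    exactly the pages a non-CPU device such as a GPU reports as in use."""

    def __init__(self):
        self.pinned = set()  # pages currently pinned in physical memory

    def update_from_device(self, accessed_pages):
        """Device reports its working set; pin new pages, unpin stale ones."""
        accessed = set(accessed_pages)
        to_pin = accessed - self.pinned
        to_unpin = self.pinned - accessed
        self.pinned |= to_pin    # a real driver would use mlock()/get_user_pages()
        self.pinned -= to_unpin  # unpinned pages become allocatable again
        return to_pin, to_unpin


driver = PinningDriver()
driver.update_from_device({0x1000, 0x2000})               # GPU touches two pages
pinned, unpinned = driver.update_from_device({0x2000, 0x3000})
```

    After the second report, only the pages the device still touches remain pinned; the page it dropped is immediately available for reallocation.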

    Translation lookaside buffer for multiple context compute engine
    5.
    Invention grant
    Translation lookaside buffer for multiple context compute engine (in force)

    Publication No.: US09152572B2

    Publication date: 2015-10-06

    Application No.: US13993800

    Filing date: 2011-12-30

    IPC classes: G06F12/00 G06F12/10 G06F12/08

    Abstract: Some implementations disclosed herein provide techniques and arrangements for a specialized logic engine that includes a translation lookaside buffer to support multiple threads executing on multiple cores. The translation lookaside buffer enables the specialized logic engine to directly access a virtual address of a thread executing on one of the plurality of processing cores. For example, an acceleration compute engine may receive one or more instructions from a thread executed by a processing core. The acceleration compute engine may retrieve, based on an address space identifier associated with the one or more instructions, a physical address associated with the one or more instructions from the translation lookaside buffer to execute the one or more instructions using the physical address.

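    A toy model of the mechanism, assuming entries keyed by an address-space identifier (ASID) so one shared engine can translate addresses for threads from different cores; the class and its methods are hypothetical names, not the patent's design:

```python
class TLB:
    """Toy translation lookaside buffer keyed by (ASID, virtual page number),
    so a shared compute engine can serve threads from multiple contexts."""

    def __init__(self):
        self.entries = {}  # (asid, vpn) -> pfn

    def insert(self, asid, vpn, pfn):
        self.entries[(asid, vpn)] = pfn

    def translate(self, asid, virtual_addr, page_size=4096):
        vpn, offset = divmod(virtual_addr, page_size)
        pfn = self.entries.get((asid, vpn))
        if pfn is None:
            raise KeyError("TLB miss")  # real hardware would walk page tables
        return pfn * page_size + offset


tlb = TLB()
tlb.insert(asid=7, vpn=1, pfn=42)  # thread in address space 7: page 1 -> frame 42
phys = tlb.translate(asid=7, virtual_addr=1 * 4096 + 12)
```

    Because the ASID is part of the key, the same virtual page number from a different thread's address space misses instead of aliasing to the wrong frame.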

    2-D gather instruction and a 2-D cache
    6.
    Invention grant
    2-D gather instruction and a 2-D cache (in force)

    Publication No.: US09001138B2

    Publication date: 2015-04-07

    Application No.: US13220402

    Filing date: 2011-08-29

    IPC classes: G09G5/36 G06T1/60 G06F12/08

    Abstract: A processor may support a two-dimensional (2-D) gather instruction and a 2-D cache. The processor may perform the 2-D gather instruction to access one or more sub-blocks of data from a two-dimensional (2-D) image stored in a memory coupled to the processor. The two-dimensional (2-D) cache may store the sub-blocks of data in multiple cache lines. Further, the 2-D cache may support access to more than one cache line while preserving the two-dimensional structure of the 2-D image.

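    The gather semantics can be sketched in a few lines of Python, with each image row standing in for a cache line; this is an illustrative assumption, not the hardware behavior:

```python
def gather_2d(image, x, y, width, height):
    """Hypothetical 2-D gather: pull a width x height sub-block out of a
    row-major image, touching one row segment ('cache line') per image row.
    The result keeps its 2-D structure instead of being flattened."""
    return [row[x:x + width] for row in image[y:y + height]]


image = [[10 * r + c for c in range(8)] for r in range(8)]  # 8x8 test image
block = gather_2d(image, x=2, y=1, width=3, height=2)
```

    A single gather spans several rows at once, which is exactly the multi-cache-line access pattern the abstract describes.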

    AGGREGATED PAGE FAULT SIGNALING AND HANDLING
    8.
    Invention application
    AGGREGATED PAGE FAULT SIGNALING AND HANDLING (in force)

    Publication No.: US20140304559A1

    Publication date: 2014-10-09

    Application No.: US13977106

    Filing date: 2011-12-29

    IPC classes: G06F11/07

    Abstract: A processor of an aspect includes an instruction pipeline to process a multiple memory address instruction that indicates multiple memory addresses. The processor also includes multiple page fault aggregation logic coupled with the instruction pipeline. The multiple page fault aggregation logic is to aggregate page fault information for multiple page faults that are each associated with one of the multiple memory addresses of the instruction. The multiple page fault aggregation logic is to provide the aggregated page fault information to a page fault communication interface. Other processors, apparatus, methods, and systems are also disclosed.

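    A rough sketch of the aggregation idea in Python: instead of signalling one fault per faulting address of a multi-address instruction, all faults are collected and delivered to the fault interface in a single record. The record layout and function names are assumptions for illustration.

```python
from dataclasses import dataclass, field

PAGE_SIZE = 4096

@dataclass
class AggregatedFaultInfo:
    """Hypothetical record handed to the page-fault communication interface:
    one entry per faulting address of a multiple-memory-address instruction."""
    instruction_ptr: int
    faults: list = field(default_factory=list)  # (address, reason) pairs

def execute_multi_address(fault_iface, ip, addresses, present_pages):
    """Check every address of the instruction, aggregate all faults, and
    signal the interface once; returns True if no address faulted."""
    info = AggregatedFaultInfo(instruction_ptr=ip)
    for addr in addresses:
        if addr // PAGE_SIZE not in present_pages:
            info.faults.append((addr, "not-present"))
    if info.faults:
        fault_iface.append(info)  # single aggregated signal to the handler
    return not info.faults
```

    The handler can then resolve every missing page in one pass before the instruction is retried, rather than faulting repeatedly.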

    METHOD AND APPARATUS FOR PERFORMANCE EFFICIENT ISA VIRTUALIZATION USING DYNAMIC PARTIAL BINARY TRANSLATION
    9.
    Invention application
    METHOD AND APPARATUS FOR PERFORMANCE EFFICIENT ISA VIRTUALIZATION USING DYNAMIC PARTIAL BINARY TRANSLATION (in force)

    Publication No.: US20140095832A1

    Publication date: 2014-04-03

    Application No.: US13632089

    Filing date: 2012-09-30

    IPC classes: G06F9/30 G06F9/312

    Abstract: Methods, apparatus, and systems for virtualization of a native instruction set are disclosed. Embodiments include a processor core executing the native instructions and a second core, or alternatively only the second processor core, consuming less power while executing a second instruction set that excludes portions of the native instruction set. The second core's decoder detects invalid opcodes of the second instruction set. A microcode layer disassembler determines whether opcodes should be translated. A translation runtime environment identifies an executable region containing an invalid opcode, other invalid opcodes, and interjacent valid opcodes of the second instruction set. An analysis unit determines an initial machine state prior to execution of the invalid opcode. A partial translation of the executable region, including encapsulations of the translations of invalid opcodes and state recoveries of the machine states, is generated and saved to a translation cache memory.

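    The core flow of partial translation can be sketched as follows, with made-up opcode names and a lookup-table "translation" standing in for real binary lowering; only the shape of the technique (translate a whole region once, pass valid opcodes through, cache the result) comes from the abstract:

```python
VALID = {"add", "mov", "ret"}                 # opcodes the second core decodes
TRANSLATIONS = {"fancy_op": ["mov", "add"]}   # hypothetical lowering of invalid ops

def translate_region(region, cache):
    """Sketch of dynamic partial binary translation: when an invalid opcode
    is found, the whole region (invalid opcodes plus interjacent valid ones)
    is translated once and saved to a translation cache for reuse."""
    key = tuple(region)
    if key in cache:                  # reuse a previously translated region
        return cache[key]
    out = []
    for op in region:
        if op in VALID:
            out.append(op)            # valid opcodes pass through unchanged
        else:
            out.extend(TRANSLATIONS[op])  # encapsulated translation of invalid op
    cache[key] = out
    return out
```

    Re-executing the same region hits the translation cache, so the translation cost is paid only on first encounter.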

    Extension of CPU Context-State Management for Micro-Architecture State
    10.
    Invention application
    Extension of CPU Context-State Management for Micro-Architecture State (in force)

    Publication No.: US20140006758A1

    Publication date: 2014-01-02

    Application No.: US13538252

    Filing date: 2012-06-29

    IPC classes: G06F9/312

    Abstract: A processor saves micro-architectural contexts to increase the efficiency of code execution and power management. A save instruction is executed to store a micro-architectural state and an architectural state of the processor in a common buffer of a memory upon a context switch that suspends the execution of a process. The micro-architectural state contains performance data resulting from the execution of the process. A restore instruction is executed to retrieve the micro-architectural state and the architectural state from the common buffer upon a resumed execution of the process. Power management hardware then uses the micro-architectural state as an intermediate starting point for the resumed execution.

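    A minimal Python model of the save/restore pair, loosely echoing XSAVE/XRSTOR-style instructions (the names are borrowed for illustration; the real extension and buffer layout are the patent's, not shown here):

```python
def xsave(arch_state, micro_state):
    """Save architectural and micro-architectural state into one common
    buffer on a context switch that suspends the process."""
    return {"arch": dict(arch_state), "micro": dict(micro_state)}

def xrstor(buffer):
    """Restore both states when the process resumes; power-management
    logic can seed itself from the saved performance counters."""
    return buffer["arch"], buffer["micro"]


buf = xsave({"rip": 0x400000}, {"retired_instructions": 12345})
arch, micro = xrstor(buf)
# micro-architectural performance data survives the context switch
```

    Keeping both states in one common buffer means a single save/restore pair carries everything the resumed execution and the power-management hardware need.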