REDUCING COLD TLB MISSES IN A HETEROGENEOUS COMPUTING SYSTEM
    1.
    Invention application
    REDUCING COLD TLB MISSES IN A HETEROGENEOUS COMPUTING SYSTEM (Pending, published)

    Publication number: WO2014055264A1

    Publication date: 2014-04-10

    Application number: PCT/US2013/060826

    Filing date: 2013-09-20

    Abstract: Methods and apparatuses are provided for avoiding cold translation lookaside buffer (TLB) misses in a computer system. A typical system is configured as a heterogeneous computing system having at least one central processing unit (CPU) and one or more graphics processing units (GPUs) that share a common memory address space. Each processing unit (CPU and GPU) has an independent TLB. When offloading a task from a particular CPU to a particular GPU, translation information is sent along with the task assignment. The translation information allows the GPU to load the address translation data into its TLB before executing the task. Preloading the GPU's TLB reduces or avoids the cold TLB misses that would otherwise occur.

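    The preloading idea in this abstract can be illustrated with a minimal sketch. All class and function names below are illustrative, not taken from the patent: a shared page table stands in for the common address space, and the CPU-side offload routine packages the task's translations so the GPU-side TLB is warm before execution.

    ```python
    class TLB:
        """Toy translation lookaside buffer: virtual page -> physical frame."""
        def __init__(self):
            self.entries = {}
            self.misses = 0

        def translate(self, vpage):
            if vpage not in self.entries:
                self.misses += 1                     # cold miss: entry not cached yet
                self.entries[vpage] = page_table[vpage]  # fall back to a page-table walk
            return self.entries[vpage]

        def preload(self, translations):
            self.entries.update(translations)        # warm the TLB up front

    # Page table for the common address space shared by CPU and GPU.
    page_table = {0x1000: 0xA000, 0x2000: 0xB000, 0x3000: 0xC000}

    def offload_task(task_pages, gpu_tlb):
        # Send translation information along with the task assignment.
        translations = {vp: page_table[vp] for vp in task_pages}
        gpu_tlb.preload(translations)

    gpu_tlb = TLB()
    offload_task([0x1000, 0x2000], gpu_tlb)
    for vp in [0x1000, 0x2000]:
        gpu_tlb.translate(vp)
    print(gpu_tlb.misses)  # 0: the preloaded entries avoid cold misses
    ```

    Without the `preload` step, each first access to a page would increment the miss counter, which is exactly the cold-miss cost the abstract describes avoiding.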

    METHOD AND APPARATUS FOR TIME-BASED SCHEDULING OF TASKS
    2.
    Invention application
    METHOD AND APPARATUS FOR TIME-BASED SCHEDULING OF TASKS (Pending, published)

    Publication number: WO2017099863A1

    Publication date: 2017-06-15

    Application number: PCT/US2016/052504

    Filing date: 2016-09-19

    CPC classification number: G06F9/5038 G06F9/4881 G06F2209/483

    Abstract: A computing device is disclosed. The computing device includes an Accelerated Processing Unit (APU) having at least a first Heterogeneous System Architecture (HSA) computing device and at least a second HSA computing device of a different type than the first, and an HSA Memory Management Unit (HMMU) that allows the APU to communicate with at least one memory. A computing task is enqueued on an HSA-managed queue that is set to run on the first or the second HSA computing device. The computing task is re-enqueued on the HSA-managed queue based on a repetition field that specifies the number of times the task is re-enqueued. The repetition field is decremented each time the computing task is re-enqueued, and it may hold a special value (e.g., -1) to allow the task to be re-enqueued indefinitely.

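    The repetition-field semantics described above (decrement on each re-enqueue; -1 as an indefinite sentinel) can be sketched as a simple queue loop. The names and the `max_steps` cap are assumptions for the demo, not from the patent:

    ```python
    from collections import deque

    def run_queue(queue, max_steps=10):
        """Dequeue and run tasks; re-enqueue while the repetition field is nonzero.

        A repetition value of -1 (the special sentinel mentioned in the
        abstract) re-enqueues indefinitely, so the demo caps total steps.
        """
        executed = []
        steps = 0
        while queue and steps < max_steps:
            name, reps = queue.popleft()
            executed.append(name)                 # "run" the task
            steps += 1
            if reps == -1:
                queue.append((name, -1))          # indefinite re-enqueue
            elif reps > 0:
                queue.append((name, reps - 1))    # decrement the repetition field
        return executed

    q = deque([("kernelA", 2), ("kernelB", 0)])
    # kernelA runs three times (initial run + 2 re-enqueues); kernelB runs once.
    print(run_queue(q))
    ```

    Note that a repetition field of N yields N+1 total executions: the initial run plus N re-enqueues, with the field reaching 0 on the final copy.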

    GPU REMOTE COMMUNICATION WITH TRIGGERED OPERATIONS
    3.
    Invention application
    GPU REMOTE COMMUNICATION WITH TRIGGERED OPERATIONS (Pending, published)

    Publication number: WO2018075182A1

    Publication date: 2018-04-26

    Application number: PCT/US2017/052250

    Filing date: 2017-09-19

    Abstract: Methods, devices, and systems for transmitting data over a computer communications network are disclosed. A queue of communications commands can be pre-generated using a central processing unit (CPU) and stored in a device memory of a network interface controller (NIC). Thereafter, if a graphics processing unit (GPU) has data to communicate to a remote GPU, it can store the data in a send buffer, at a location pointed to by a pre-generated command. The GPU can then signal to the NIC that the data is ready, triggering execution of the pre-generated command to send the data.

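    The triggered-operation flow can be sketched in a few lines: the CPU pre-generates send commands into NIC memory ahead of time; later the GPU deposits its payload in the buffer slot a command points at and raises the trigger, and the NIC executes the command without further CPU involvement. All class and method names here are illustrative stand-ins, not the patent's terminology:

    ```python
    class NIC:
        """Toy network interface controller with pre-generated send commands."""
        def __init__(self):
            self.command_queue = []   # commands pre-generated by the CPU
            self.wire = []            # stand-in for the network link

        def pregenerate(self, buffer_slot):
            # CPU step: queue a send command pointing at a send-buffer location.
            self.command_queue.append(buffer_slot)

        def trigger(self, send_buffer):
            # GPU step: signal that the data is ready; the NIC executes the
            # oldest pre-generated command and puts the payload on the wire.
            slot = self.command_queue.pop(0)
            self.wire.append(send_buffer[slot])

    nic = NIC()
    send_buffer = {}

    # Ahead of time: the CPU pre-generates a command for buffer slot 0.
    nic.pregenerate(0)

    # Later: the GPU writes its payload and triggers the send.
    send_buffer[0] = b"gpu-to-remote-gpu payload"
    nic.trigger(send_buffer)
    print(nic.wire)  # the payload is transmitted via the triggered command
    ```

    The point of the design is on the critical path: the GPU's only work is a buffer write plus a trigger signal, since command construction was paid for earlier by the CPU.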
