MESSAGE AGGREGATION, COMBINING AND COMPRESSION FOR EFFICIENT DATA COMMUNICATIONS IN GPU-BASED CLUSTERS
    12.
    Type: Invention application
    Status: Under examination (published)

    Publication (announcement) number: US20160352598A1

    Publication (announcement) date: 2016-12-01

    Application number: US15165953

    Application date: 2016-05-26

    CPC classification number: H04L47/365

    Abstract: A system and method for efficient management of network traffic in highly data-parallel computing. A processing node includes one or more processors capable of generating network messages. A network interface is used to receive and send network messages across a network. The processing node reduces at least one of the number or the storage size of the original network messages by converting them into one or more new network messages. The new network messages are passed to the network interface, which sends them across the network.
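
    The reduction step described in this abstract can be illustrated with a minimal sketch. It is not the patented implementation: the (destination, key, value) message layout, the summing rule used for combining, and the function names are all assumptions made for illustration, and JSON plus zlib merely stand in for whatever encoding and compression a real node would use.

```python
import json
import zlib
from collections import defaultdict


def aggregate_and_compress(messages):
    """Turn many small messages into one compressed payload per destination.

    `messages` is a list of (destination, key, value) tuples; this layout
    is hypothetical and chosen only for illustration.
    """
    per_dest = defaultdict(dict)
    for dest, key, value in messages:
        # Combining: messages with the same destination and key are summed,
        # so several original messages collapse into one entry.
        per_dest[dest][key] = per_dest[dest].get(key, 0) + value

    packets = {}
    for dest, combined in per_dest.items():
        # Aggregation: one payload per destination instead of one per message.
        raw = json.dumps(combined, sort_keys=True).encode("utf-8")
        # Compression: aims to reduce the stored size of the payload
        # (very small payloads may not actually shrink).
        packets[dest] = zlib.compress(raw)
    return packets


def unpack(packet):
    """Inverse of the packing step, as a receiving node might run it."""
    return json.loads(zlib.decompress(packet).decode("utf-8"))


original = [("node1", "a", 1), ("node1", "a", 2),
            ("node1", "b", 5), ("node2", "a", 7)]
packets = aggregate_and_compress(original)
```

    Here four original messages become two packets, one per destination, which is the "reduce the number" half of the claim; the compression call addresses the "storage size" half.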

    PROCESSOR AND METHODS FOR REMOTE SCOPED SYNCHRONIZATION
    13.
    Type: Invention application
    Status: In force

    Publication (announcement) number: US20160139624A1

    Publication (announcement) date: 2016-05-19

    Application number: US14542042

    Application date: 2014-11-14

    Abstract: Described herein are an apparatus and method for remote scoped synchronization, a new semantic that allows a work-item to order memory accesses with a scope instance outside its scope hierarchy. More precisely, remote synchronization expands visibility at a particular scope to all scope instances encompassed by that scope. Remote scoped synchronization allows smaller scopes to be used more frequently and defers the added cost until larger-scope synchronization is required. This lets programmers optimize the scope at which memory operations are performed for important communication patterns such as work stealing. Executing memory operations at the optimum scope reduces both execution time and energy. In particular, remote synchronization allows a work-item to communicate with a scope that it otherwise could not access. Specifically, work-items can pull valid data from, and push updates to, scopes that do not (hierarchically) contain them.

    Conditional notification mechanism
    14.
    Type: Granted invention patent
    Status: In force

    Publication (announcement) number: US09256535B2

    Publication (announcement) date: 2016-02-09

    Application number: US13856728

    Application date: 2013-04-04

    Abstract: The described embodiments comprise a computing device with a first processor core and a second processor core. In some embodiments, during operations, the first processor core receives, from the second processor core, an indication of a memory location and a flag. The first processor core then stores the flag in a first cache line in a cache in the first processor core and stores the indication of the memory location separately in a second cache line in the cache. Upon encountering a predetermined result when evaluating a condition for the indicated memory location, the first processor core updates the flag in the first cache line. Based on the update of the flag, the first processor core causes the second processor core to perform an operation.

    Wavefront Resource Virtualization
    15.
    Type: Invention application
    Status: Under examination (published)

    Publication (announcement) number: US20150363903A1

    Publication (announcement) date: 2015-12-17

    Application number: US14304483

    Application date: 2014-06-13

    CPC classification number: G06T1/20

    Abstract: A processor comprising hardware logic configured to execute a first wavefront in a hardware resource and to stop execution of the first wavefront before the first wavefront completes. The processor schedules a second wavefront for execution in the hardware resource.
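
    As a rough software analogy for the behavior in this abstract, the sketch below models a wavefront as a Python generator and a hardware resource as a function that runs a bounded number of steps. The names `wavefront` and `run_slice` are hypothetical, and resuming the first wavefront afterwards is an assumption that the abstract itself does not state.

```python
def wavefront(name, steps):
    """A wavefront modeled as a generator that yields one label per step."""
    for i in range(steps):
        yield f"{name}:{i}"


def run_slice(wf, max_steps):
    """Run up to max_steps of a wavefront; return (trace, finished)."""
    trace = []
    for _ in range(max_steps):
        try:
            trace.append(next(wf))
        except StopIteration:
            return trace, True   # the wavefront ran to completion
    return trace, False          # stopped before the wavefront completed


first = wavefront("wf0", 4)
second = wavefront("wf1", 2)

trace0, done0 = run_slice(first, 2)   # wf0 is stopped before it completes
trace1, done1 = run_slice(second, 3)  # wf1 is scheduled on the same resource
trace2, done2 = run_slice(first, 3)   # assumption: wf0 is later resumed
```

    The generator keeps the suspended wavefront's state, which loosely mirrors the idea that a stopped wavefront's context has to be preserved somewhere while another wavefront occupies the resource.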

    Conditional Notification Mechanism
    16.
    Type: Invention application
    Status: In force

    Publication (announcement) number: US20140304474A1

    Publication (announcement) date: 2014-10-09

    Application number: US13856728

    Application date: 2013-04-04

    Abstract: The described embodiments comprise a computing device with a first processor core and a second processor core. In some embodiments, during operations, the first processor core receives, from the second processor core, an indication of a memory location and a flag. The first processor core then stores the flag in a first cache line in a cache in the first processor core and stores the indication of the memory location separately in a second cache line in the cache. Upon encountering a predetermined result when evaluating a condition for the indicated memory location, the first processor core updates the flag in the first cache line. Based on the update of the flag, the first processor core causes the second processor core to perform an operation.

    Conditional Notification Mechanism
    17.
    Type: Invention application
    Status: In force

    Publication (announcement) number: US20140250312A1

    Publication (announcement) date: 2014-09-04

    Application number: US13782117

    Application date: 2013-03-01

    Abstract: The described embodiments comprise a first hardware context. The first hardware context receives, from a second hardware context, an indication of a memory location and a condition to be met by the memory location. The first hardware context then sends a signal to the second hardware context when the memory location meets the condition.
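
    The register-then-signal pattern in this abstract can be sketched in software. This is only an analogy under stated assumptions: the `MonitorContext` class, its `register` and `store` methods, and the use of a dict as "memory" are all hypothetical, and `threading.Event` stands in for the hardware signal.

```python
import threading


class MonitorContext:
    """Stands in for the 'first hardware context'; a software analogy only."""

    def __init__(self, memory):
        self.memory = memory          # shared dict standing in for memory
        self.watches = []             # (location, condition, event) triples
        self.lock = threading.Lock()

    def register(self, location, condition):
        """Called by the waiting context: pass a location and a condition."""
        event = threading.Event()
        with self.lock:
            if condition(self.memory.get(location)):
                event.set()           # the condition already holds
            else:
                self.watches.append((location, condition, event))
        return event

    def store(self, location, value):
        """Write to memory and signal any waiter whose condition is now met."""
        with self.lock:
            self.memory[location] = value
            remaining = []
            for loc, cond, event in self.watches:
                if loc == location and cond(value):
                    event.set()       # the signal sent to the waiting context
                else:
                    remaining.append((loc, cond, event))
            self.watches = remaining


memory = {0x10: 0}
monitor = MonitorContext(memory)
ready = monitor.register(0x10, lambda v: v is not None and v >= 3)

monitor.store(0x10, 1)
signalled_early = ready.is_set()   # condition not met yet

monitor.store(0x10, 3)
signalled_late = ready.is_set()    # condition met, waiter signalled
```

    The waiting side never polls the location itself; it only checks or waits on the returned event, which is the point of delegating the condition to another context.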

    Fragmented Channels
    18.
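    Type: Invention application
    Status: Under examination (published)

    Publication (announcement) number: US20140181822A1

    Publication (announcement) date: 2014-06-26

    Application number: US13721219

    Application date: 2012-12-20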

    CPC classification number: G06F9/5016 G06F9/544

    Abstract: A system, a method and a computer-readable medium for task scheduling using fragmented channels are provided. A plurality of fragmented channels are stored in memory accessible to a plurality of compute units. Each fragmented channel is associated with a particular compute unit. Each fragmented channel also stores a plurality of data items from tasks scheduled for processing on the associated compute unit and links to another fragmented channel in the plurality of fragmented channels.
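
    The data structure described in this abstract might look roughly like the sketch below: one fragment per compute unit, each holding its own data items and a link to another fragment. The `Fragment` class, the ring-shaped linking in `build_channel`, and the link-following lookup in `take` are assumptions for illustration; the abstract only says that each fragment links to another one.

```python
from collections import deque


class Fragment:
    """One fragmented channel: tied to a compute unit, linked to another."""

    def __init__(self, compute_unit):
        self.compute_unit = compute_unit
        self.items = deque()   # data items from tasks scheduled on this unit
        self.next = None       # link to another fragment


def build_channel(num_units):
    """Create one fragment per compute unit and link them in a ring."""
    fragments = [Fragment(cu) for cu in range(num_units)]
    for i, frag in enumerate(fragments):
        frag.next = fragments[(i + 1) % num_units]
    return fragments


def take(fragment):
    """Pop from the local fragment, else follow links to find an item."""
    current = fragment
    while True:
        if current.items:
            return current.compute_unit, current.items.popleft()
        current = current.next
        if current is fragment:
            return None        # every fragment is empty


fragments = build_channel(3)
fragments[0].items.extend(["t0", "t1"])
fragments[2].items.append("t2")

first = take(fragments[0])    # served from the unit's own fragment
stolen = take(fragments[1])   # fragment 1 is empty, so the link is followed
```

    Keeping items in per-unit fragments means a compute unit normally touches only its own queue, and the links give it a path to other units' items when its own fragment is empty.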
