System and Method for Performing Dynamic Request Routing Based on Broadcast Queue Depths
    71.
    Patent application
    Status: Lapsed

    Publication No.: US20090198957A1

    Publication date: 2009-08-06

    Application No.: US12024514

    Filing date: 2008-02-01

    IPC classification: G06F15/76 G06F9/06

    CPC classification: G06F15/17

    Abstract: A system and method for performing dynamic request routing based on broadcast queue depth information are provided. Each processor chip in the system may use a synchronized heartbeat signal it generates to provide queue depth information to each of the other processor chips in the system. The queue depth information identifies the number of requests or the amount of data in each of the queues of the processor chip that originated the heartbeat signal. The queue depth information from each of the processor chips in the system may be used by the processor chips in determining optimal routing paths for data from a source processor chip to a destination processor chip. As a result, the congestion of data for processing at each of the processor chips along each possible routing path may be taken into account when selecting the processor chip to which data is forwarded.
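
The routing decision described in this abstract can be illustrated with a minimal Python sketch. It is not the patented mechanism: the abstract does not say how queue depths are combined into a path cost, so summing the depths along a path is an assumption, and `pick_route`, the chip names, and the depth values are all hypothetical.

```python
from typing import Dict, List, Sequence


def pick_route(paths: Sequence[List[str]], queue_depth: Dict[str, int]) -> List[str]:
    """Return the candidate path with the least total queued work.

    Assumption: the cost of a path is the sum of the queue depths of the
    chips it traverses. The abstract only says congestion is taken into account.
    """
    return min(paths, key=lambda path: sum(queue_depth[chip] for chip in path))


# Depths as last reported in each chip's heartbeat (made-up values).
depths = {"B": 7, "C": 2, "D": 1}
# Two candidate paths from a source chip to destination chip D.
routes = [["B", "D"], ["C", "D"]]
print(pick_route(routes, depths))
```

With these values the path through chip C is chosen, because chip B reports a much deeper queue.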

    Method and Apparatus for Handling Multiple Memory Requests Within a Multiprocessor System
    73.
    Patent application
    Status: In force

    Publication No.: US20090198933A1

    Publication date: 2009-08-06

    Application No.: US12024181

    Filing date: 2008-02-01

    IPC classification: G06F12/14

    CPC classification: G06F9/526

    Abstract: A method for handling multiple memory requests within a multi-processor system is disclosed. A lock control section is initially assigned to a data block within a system memory. In response to a request for accessing the data block by a processing unit, a determination is made whether or not the lock control section of the data block has been set. If the lock control section has been set, another determination is made whether or not the requesting processing unit is located beyond a predetermined distance from a memory controller. If the requesting processing unit is located beyond the predetermined distance from the memory controller, the requesting processing unit is invited to perform other functions; otherwise, the number of the requesting processing unit is placed in a queue table. However, if the lock control section has not been set, the lock control section of the data block is set, and the access request is allowed.
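
The decision flow in this abstract can be sketched in Python as follows. This is an illustrative model only: `DataBlock`, `request_access`, the returned strings, and the integer distance measure are hypothetical names chosen here, and the abstract does not say how distance to the memory controller is measured.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class DataBlock:
    locked: bool = False                           # the "lock control section"
    queue_table: deque = field(default_factory=deque)


def request_access(block: DataBlock, unit_id: int, distance: int, max_distance: int) -> str:
    """Apply the three outcomes described in the abstract to one request."""
    if not block.locked:
        block.locked = True                        # set the lock and allow the access
        return "granted"
    if distance > max_distance:
        return "do_other_work"                     # far unit is told to do something else
    block.queue_table.append(unit_id)              # near unit waits in the queue table
    return "queued"


block = DataBlock()
print(request_access(block, unit_id=1, distance=1, max_distance=2))
print(request_access(block, unit_id=2, distance=5, max_distance=2))
print(request_access(block, unit_id=3, distance=1, max_distance=2))
```

The first request finds the block unlocked and is granted; the second comes from a distant unit and is turned away; the third comes from a nearby unit and is queued.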

    System and Method for Performing Setup Operations for Receiving Different Amounts of Data While Processors are Performing Message Passing Interface Tasks
    74.
    Patent application
    Status: In force

    Publication No.: US20090064167A1

    Publication date: 2009-03-05

    Application No.: US11846154

    Filing date: 2007-08-28

    IPC classification: G06F9/46

    CPC classification: G06F9/522 G06F9/5083

    Abstract: A system and method are provided for performing setup operations for receiving a different amount of data while processors are performing message passing interface (MPI) tasks. Mechanisms for adjusting the balance of processing workloads of the processors are provided so as to minimize the time spent waiting for all of the processors to call a synchronization operation. An MPI load balancing controller maintains a history that provides a profile of the tasks with regard to their calls to synchronization operations. From this information, it can be determined which processors should have their processing loads lightened and which processors are able to handle additional processing loads without significantly degrading the overall operation of the parallel execution system. As a result, setup operations may be performed while processors are performing MPI tasks to prepare for receiving differently sized portions of data in a subsequent computation cycle based on the history.
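
One simple way to turn such a history into differently sized data portions is sketched below in Python. The abstract does not give a rebalancing formula, so the rule used here (shares inversely proportional to the time each processor took to reach the last synchronization call) is an assumption, and `next_cycle_shares` is a hypothetical name.

```python
from typing import List


def next_cycle_shares(times_to_sync: List[float], total_items: int) -> List[int]:
    """Split total_items so that processors that reached the barrier sooner get more.

    Assumption: work is divided in proportion to 1 / (time to reach the
    synchronization call) observed in the previous cycle.
    """
    speeds = [1.0 / t for t in times_to_sync]
    total_speed = sum(speeds)
    shares = [int(total_items * s / total_speed) for s in speeds]
    # Hand any rounding remainder to the fastest processor.
    shares[speeds.index(max(speeds))] += total_items - sum(shares)
    return shares


# Processor 0 reached the synchronization call twice as fast as the other two.
print(next_cycle_shares([1.0, 2.0, 2.0], 100))
```

In this example the early processor receives half of the items and the two slower processors a quarter each, which shortens the wait at the next synchronization point if per-item cost stays the same.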

    System for Providing a Cluster-Wide System Clock in a Multi-Tiered Full-Graph Interconnect Architecture
    75.
    Patent application
    Status: In force

    Publication No.: US20090063886A1

    Publication date: 2009-03-05

    Application No.: US11848440

    Filing date: 2007-08-31

    IPC classification: G06F1/12

    Abstract: A system for providing a cluster-wide system clock in a multi-tiered full-graph (MTFG) interconnect architecture is provided. Heartbeat signals transmitted by each of the processor chips in the computing cluster are synchronized. Internal system clock signals are generated in each of the processor chips based on the synchronized heartbeat signals. As a result, the internal system clock signals of the processor chips are synchronized, since the heartbeat signals on which they are based are synchronized. Mechanisms are provided for performing such synchronization using direct couplings of processor chips within the same processor book, different processor books in the same supernode, and different processor books in different supernodes of the MTFG interconnect architecture.

    System and Method for Providing Full Hardware Support of Collective Operations in a Multi-Tiered Full-Graph Interconnect Architecture
    76.
    Patent application
    Status: Lapsed

    Publication No.: US20090063815A1

    Publication date: 2009-03-05

    Application No.: US11845223

    Filing date: 2007-08-27

    IPC classification: G06F15/76 G06F9/02

    CPC classification: G06F15/17381

    Abstract: A method, computer program product, and system are provided for performing collective operations. Hardware of a parent processor in a first processor book determines the number of other processors, in the same or a different processor book of the data processing system, that are needed to execute the collective operation, thereby establishing a plurality of processors comprising the parent processor and the other processors. The hardware of the parent processor logically arranges the plurality of processors as a plurality of nodes in a hierarchical structure. The collective operation is transmitted to the plurality of processors based on the hierarchical structure. The hardware of the parent processor receives results of the execution of the collective operation from the other processors, generates a final result of the collective operation based on the received results, and outputs the final result.
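
The hierarchical gathering of results can be modeled in software with a short Python sketch. The patent performs these steps in processor hardware; the code below is only a software analogy, and the implicit binary tree (children of node i at 2i+1 and 2i+2) and the name `tree_reduce` are assumptions made for illustration.

```python
from typing import Callable, List


def tree_reduce(values: List[int], op: Callable[[int, int], int], node: int = 0) -> int:
    """Combine per-processor results up a hierarchy rooted at the parent (node 0).

    Assumption: processors are arranged as an implicit binary tree, so the
    children of node i are nodes 2*i + 1 and 2*i + 2.
    """
    result = values[node]
    for child in (2 * node + 1, 2 * node + 2):
        if child < len(values):
            # Each child first combines its own subtree, then reports upward.
            result = op(result, tree_reduce(values, op, child))
    return result


# Five processors each hold one partial result; the parent produces the sum.
print(tree_reduce([1, 2, 3, 4, 5], lambda a, b: a + b))
```

Arranging the processors as a tree means the parent combines results from only its direct children rather than from every participant.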

    Binding a process to a special purpose processing element having characteristics of a processor
    78.
    Granted patent
    Status: In force

    Publication No.: US08893126B2

    Publication date: 2014-11-18

    Application No.: US12024220

    Filing date: 2008-02-01

    IPC classification: G06F9/00 G06F13/12

    CPC classification: G06F13/12

    Abstract: A heterogeneous processing element model is provided where I/O devices look and act like processors. In order to be treated like a processor, an I/O processing element, or other special purpose processing element, must follow some rules and have some characteristics of a processor, such as address translation, security, interrupt handling, and exception processing, for example. The heterogeneous processing element model puts special purpose processing elements on the same playing field as processors, from a programming perspective, operating system perspective, and power perspective. The operating system can get work to a security engine, for example, in the same way it does to a processor.

    Claiming coherency ownership of a partial cache line of data
    79.
    Granted patent
    Status: In force

    Publication No.: US08255635B2

    Publication date: 2012-08-28

    Application No.: US12024392

    Filing date: 2008-02-01

    IPC classification: G06F12/04

    CPC classification: G06F12/0831

    Abstract: According to a method of data processing in a multiprocessor data processing system, in response to a processor request to modify a target granule of a target cache line of data containing multiple granules, a processing unit originates on an interconnect of the multiprocessor data processing system a data-claim-partial request that requests permission to promote only the target granule of the target cache line to a unique copy with an intent to modify the target granule. In response to a combined response to the data-claim-partial request indicating success (the combined response representing a system-wide response to the data-claim-partial request), the processing unit promotes only the target granule of the target cache line to a unique copy by updating a coherency state of the target granule and retaining a coherency state of at least one other granule of the target cache line.
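
The per-granule state update can be illustrated with a small Python sketch. It models only the bookkeeping: the state labels `"S"` (shared) and `"M"` (unique copy held with intent to modify) are assumptions borrowed from common coherence terminology, not names from the patent, and `claim_partial` is a hypothetical function.

```python
from typing import List


def claim_partial(granule_states: List[str], target: int, combined_response_ok: bool) -> List[str]:
    """Return the cache line's per-granule states after a data-claim-partial request.

    Only the target granule is promoted, and only if the system-wide
    combined response indicates success; the other granules keep their state.
    """
    new_states = list(granule_states)
    if combined_response_ok:
        new_states[target] = "M"   # assumed label for a unique, modifiable copy
    return new_states


line = ["S", "S", "S", "S"]        # a cache line with four granules, all shared
print(claim_partial(line, target=2, combined_response_ok=True))
print(claim_partial(line, target=2, combined_response_ok=False))
```

Because only one granule changes state, other caches can keep their copies of the remaining granules.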

    Method for data processing using a multi-tiered full-graph interconnect architecture
    80.
    Granted patent
    Status: Lapsed

    Publication No.: US08185896B2

    Publication date: 2012-05-22

    Application No.: US11845207

    Filing date: 2007-08-27

    IPC classification: G06F9/46

    CPC classification: G06F9/5061 G06F2209/5012

    Abstract: A method is provided for implementing a multi-tiered full-graph interconnect architecture. A plurality of processors are coupled to one another to create a plurality of processor books. The plurality of processor books are coupled together to create a plurality of supernodes. Then, the plurality of supernodes are coupled together to create the multi-tiered full-graph interconnect architecture. Data is then transmitted from one processor to another within the multi-tiered full-graph interconnect architecture based on an addressing scheme that specifies at least a supernode and a processor book associated with a target processor to which the data is to be transmitted.
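
A tiered address of this kind can be sketched in Python as follows. The abstract states only that the address specifies at least a supernode and a processor book; the extra `chip` field, the `Address` type, and the `tier_to_cross` function are assumptions added for illustration.

```python
from typing import NamedTuple


class Address(NamedTuple):
    supernode: int
    book: int
    chip: int      # assumed third field; the abstract names only supernode and book


def tier_to_cross(src: Address, dst: Address) -> str:
    """Name the highest tier at which the source and target addresses differ."""
    if src.supernode != dst.supernode:
        return "supernode"   # data must leave the source supernode
    if src.book != dst.book:
        return "book"        # same supernode, different processor book
    if src.chip != dst.chip:
        return "chip"        # same book, different processor
    return "local"           # the target is the source itself


print(tier_to_cross(Address(0, 1, 3), Address(2, 0, 1)))
print(tier_to_cross(Address(0, 1, 3), Address(0, 1, 5)))
```

Comparing the fields from the top tier down tells a processor which kind of link the data has to take first.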
