Techniques for Data Prefetching Using Indirect Addressing with Offset
    121.
    Type: Patent application
    Status: Active

    Publication No.: US20090198904A1

    Publication date: 2009-08-06

    Application No.: US12024246

    Filing date: 2008-02-01

    IPC classification: G06F12/08

    Abstract: A technique for performing data prefetching using indirect addressing includes determining a first memory address of a pointer associated with a data prefetch instruction. Content at the first memory address, which is included in a first data block (e.g., a first cache line) of a memory, is then fetched. An offset is then added to the content of the memory at the first memory address to provide a first offset memory address. A second memory address is then determined based on the first offset memory address. A second data block (e.g., a second cache line) that includes data at the second memory address is then fetched (e.g., from the memory or another memory). A data prefetch instruction may be indicated by a unique operational code (opcode), a unique extended opcode, or a field (including one or more bits) in an instruction.
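
The four steps in the abstract can be traced with a toy model. This is only a sketch under stated assumptions: the 128-byte cache line, the `ToyMemory` class, and the identity mapping from the offset address to the second address are all hypothetical and are not taken from the patent.

```python
CACHE_LINE_SIZE = 128  # bytes per cache line (assumed value)


class ToyMemory:
    """Hypothetical memory model: stored words plus the set of fetched lines."""

    def __init__(self, words):
        self.words = dict(words)      # address -> stored value
        self.cached_lines = set()     # indices of cache lines fetched so far

    def fetch_line(self, address):
        """Fetch the cache line that contains `address`; return its index."""
        line = address // CACHE_LINE_SIZE
        self.cached_lines.add(line)
        return line


def indirect_prefetch(memory, pointer_address, offset):
    # Step 1: fetch the first data block and read the pointer's content.
    memory.fetch_line(pointer_address)
    content = memory.words[pointer_address]
    # Step 2: add the offset to that content.
    offset_address = content + offset
    # Step 3: derive the second address. Identity mapping is assumed here;
    # a real design might apply address translation at this point.
    second_address = offset_address
    # Step 4: fetch the second data block.
    return memory.fetch_line(second_address)


memory = ToyMemory({0x1000: 0x8000})   # the pointer at 0x1000 holds 0x8000
line = indirect_prefetch(memory, pointer_address=0x1000, offset=0x40)
```

After the call, both the line holding the pointer and the line holding the target address `0x8040` are marked as fetched.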

    System and Method to Use Cache that is Embedded in a Memory Hub to Replace Failed Memory Cells in a Memory Subsystem
    122.
    Type: Patent application
    Status: Active

    Publication No.: US20090193290A1

    Publication date: 2009-07-30

    Application No.: US12019141

    Filing date: 2008-01-24

    IPC classification: G06F11/20

    Abstract: A memory system, data processing system, and method are provided for using cache that is embedded in a memory hub device to replace failed memory cells. A memory module comprises an integrated memory hub device. The memory hub device comprises an integrated memory device data interface that communicates with a set of memory devices coupled to the memory hub device and a cache integrated in the memory hub device. The memory hub device also comprises an integrated memory hub controller that controls the data that is read or written by the memory device data interface to the cache based on a determination of whether one or more memory cells within the set of memory devices have failed.
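
The controller's decision can be pictured as a redirect on every access. The sketch below is a simplified illustration, not the patented design: the `MemoryHub` class, its per-address failure set, and the default value of 0 for unwritten cells are assumptions made for the example.

```python
class MemoryHub:
    """Hypothetical hub that redirects accesses to failed cells into its cache."""

    def __init__(self, size):
        self.devices = [0] * size   # stand-in for the attached memory devices
        self.failed = set()         # addresses whose cells are known to have failed
        self.cache = {}             # hub-embedded cache used as replacement storage

    def mark_failed(self, address):
        self.failed.add(address)

    def write(self, address, value):
        # Writes to failed cells go to the embedded cache instead of the device.
        if address in self.failed:
            self.cache[address] = value
        else:
            self.devices[address] = value

    def read(self, address):
        # Reads of failed cells are served from the embedded cache.
        if address in self.failed:
            return self.cache.get(address, 0)
        return self.devices[address]


hub = MemoryHub(size=8)
hub.write(3, 7)        # healthy cell: stored in the device
hub.mark_failed(5)
hub.write(5, 9)        # failed cell: stored in the hub cache
```

The failed cell at address 5 is never touched after it is marked, yet reads still return the value written.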

    Method for Providing a Cluster-Wide System Clock in a Multi-Tiered Full-Graph Interconnect Architecture
    123.
    Type: Patent application
    Status: Active

    Publication No.: US20090070617A1

    Publication date: 2009-03-12

    Application No.: US11853522

    Filing date: 2007-09-11

    IPC classification: G06F1/12

    CPC classification: G06F1/10 G06F1/12

    Abstract: A method for providing a cluster-wide system clock in a multi-tiered full-graph (MTFG) interconnect architecture is provided. Heartbeat signals transmitted by each of the processor chips in the computing cluster are synchronized. Internal system clock signals are generated in each of the processor chips based on the synchronized heartbeat signals. As a result, the internal system clock signals of each of the processor chips are synchronized, since the heartbeat signals, which are the basis for the internal system clock signals, are synchronized. Mechanisms are provided for performing such synchronization using direct couplings of processor chips within the same processor book, different processor books in the same supernode, and different processor books in different supernodes of the MTFG interconnect architecture.

    System and Method for Hardware Based Dynamic Load Balancing of Message Passing Interface Tasks
    124.
    Type: Patent application
    Status: Pending (published)

    Publication No.: US20090064166A1

    Publication date: 2009-03-05

    Application No.: US11846141

    Filing date: 2007-08-28

    IPC classification: G06F9/46

    CPC classification: G06F9/5083 G06F9/522

    Abstract: A system and method for providing hardware-based dynamic load balancing of message passing interface (MPI) tasks are provided. Mechanisms for adjusting the balance of processing workloads of the processors executing tasks of an MPI job are provided so as to minimize the periods spent waiting for all of the processors to call a synchronization operation. Each processor has an associated hardware-implemented MPI load balancing controller. The MPI load balancing controller maintains a history that provides a profile of the tasks with regard to their calls to synchronization operations. From this information, it can be determined which processors should have their processing loads lightened and which processors are able to handle additional processing loads without significantly negatively affecting the overall operation of the parallel execution system. As a result, operations may be performed to shift workloads from the slowest processor to one or more of the faster processors.
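
The history-driven rebalancing can be illustrated in software, although the patent describes a hardware controller. In this sketch the `SyncHistory` class, the use of average arrival time at the synchronization call as the profile, and the fixed number of work units moved are all assumptions chosen for the example.

```python
from collections import defaultdict


class SyncHistory:
    """Hypothetical profile: when each processor reached its synchronization calls."""

    def __init__(self):
        self.arrivals = defaultdict(list)   # processor id -> arrival times

    def record(self, processor, arrival_time):
        self.arrivals[processor].append(arrival_time)

    def average_arrival(self):
        return {p: sum(t) / len(t) for p, t in self.arrivals.items()}


def rebalance(workloads, history, units=1):
    """Move `units` of work from the slowest processor to the fastest one."""
    averages = history.average_arrival()
    slowest = max(averages, key=averages.get)   # arrives last at the barrier
    fastest = min(averages, key=averages.get)   # arrives first, then waits
    balanced = dict(workloads)
    if slowest != fastest:
        moved = min(units, balanced[slowest])
        balanced[slowest] -= moved
        balanced[fastest] += moved
    return balanced


history = SyncHistory()
for processor, times in {0: [1.0, 1.2], 1: [3.0, 3.4], 2: [2.0, 2.0]}.items():
    for t in times:
        history.record(processor, t)

balanced = rebalance({0: 10, 1: 10, 2: 10}, history, units=2)
```

Processor 1 reaches the synchronization call last on average, so two units of its work move to processor 0, which reaches it first.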

    Method for Data Processing Using a Multi-Tiered Full-Graph Interconnect Architecture
    125.
    Type: Patent application
    Status: Lapsed

    Publication No.: US20090064139A1

    Publication date: 2009-03-05

    Application No.: US11845207

    Filing date: 2007-08-27

    IPC classification: G06F9/46

    CPC classification: G06F9/5061 G06F2209/5012

    Abstract: A method is provided for implementing a multi-tiered full-graph interconnect architecture. In order to implement a multi-tiered full-graph interconnect architecture, a plurality of processors are coupled to one another to create a plurality of processor books. The plurality of processor books are coupled together to create a plurality of supernodes. Then, the plurality of supernodes are coupled together to create the multi-tiered full-graph interconnect architecture. Data is then transmitted from one processor to another within the multi-tiered full-graph interconnect architecture based on an addressing scheme that specifies at least a supernode and a processor book associated with a target processor to which the data is to be transmitted.
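
A three-part address makes the tiered structure concrete. The sketch below assumes an `Address` tuple of (supernode, book, processor) and a hypothetical `routing_tier` helper that reports which tier a transfer has to cross; the abstract only states that the address specifies at least a supernode and a processor book, so the comparison order is an illustration rather than the patented routing logic.

```python
from typing import NamedTuple


class Address(NamedTuple):
    """Hypothetical hierarchical address of a processor."""
    supernode: int
    book: int
    processor: int


def routing_tier(source: Address, target: Address) -> str:
    """Return the highest tier at which the two addresses differ."""
    if source.supernode != target.supernode:
        return "supernode"   # must cross a link between supernodes
    if source.book != target.book:
        return "book"        # same supernode, different processor book
    if source.processor != target.processor:
        return "processor"   # same book, different processor
    return "local"           # source and target are the same processor


here = Address(supernode=0, book=1, processor=2)
tier_far = routing_tier(here, Address(3, 1, 2))
tier_near = routing_tier(here, Address(0, 1, 5))
```

Here `tier_far` is `"supernode"` and `tier_near` is `"processor"`.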

    System for Data Processing Using a Multi-Tiered Full-Graph Interconnect Architecture
    126.
    Type: Patent application
    Status: Lapsed

    Publication No.: US20090063811A1

    Publication date: 2009-03-05

    Application No.: US11845206

    Filing date: 2007-08-27

    IPC classification: G06F15/80

    CPC classification: G06F15/16

    Abstract: A system is provided for implementing a multi-tiered full-graph interconnect architecture. In order to implement a multi-tiered full-graph interconnect architecture, a plurality of processors are coupled to one another to create a plurality of processor books. The plurality of processor books are coupled together to create a plurality of supernodes. Then, the plurality of supernodes are coupled together to create the multi-tiered full-graph interconnect architecture. Data is then transmitted from one processor to another within the multi-tiered full-graph interconnect architecture based on an addressing scheme that specifies at least a supernode and a processor book associated with a target processor to which the data is to be transmitted.

    System and Method for Handling Indirect Routing of Information Between Supernodes of a Multi-Tiered Full-Graph Interconnect Architecture
    127.
    Type: Patent application
    Status: Active

    Publication No.: US20090063445A1

    Publication date: 2009-03-05

    Application No.: US11845221

    Filing date: 2007-08-27

    IPC classification: G06F7/06 G06F17/30

    CPC classification: H04L45/00 H04L45/22

    Abstract: A method, computer program product, and system are provided for selecting, from a plurality of routes through the data processing system, an indirect route for transmitting data. Data that includes address information and is to be transmitted to a destination processor is received at a first processor. Using routing table data structures, indirect route entries are identified that correspond to indirect routes for transmitting data. An accessed priority table data structure comprises a priority entry for each entry in the routing table data structures. The priority entry specifies a priority of a corresponding entry in the routing table data structures. An indirect route entry is selected that corresponds to an indirect route from the routing table data structures, based on specified priorities. Then the data is transmitted from the first processor to the destination processor using a path corresponding to the selected indirect route entry.
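
The selection step pairs each routing table entry with a priority entry and picks among the indirect candidates. The following sketch makes several assumptions that the abstract does not state: the `RouteEntry` fields, a priority table stored as a list parallel to the routing table, and the convention that a lower number means a higher priority.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RouteEntry:
    """Hypothetical routing table entry."""
    destination: str
    path: tuple       # hops from the first processor to the destination
    indirect: bool    # True if the route passes through an intermediate node


def select_indirect_route(routing_table, priority_table, destination):
    """Pick the indirect entry for `destination` with the best priority."""
    candidates = [
        index
        for index, entry in enumerate(routing_table)
        if entry.destination == destination and entry.indirect
    ]
    if not candidates:
        return None
    # Assumed convention: the smallest priority value wins.
    best = min(candidates, key=lambda index: priority_table[index])
    return routing_table[best]


routing_table = [
    RouteEntry("D", ("A", "D"), indirect=False),
    RouteEntry("D", ("A", "B", "D"), indirect=True),
    RouteEntry("D", ("A", "C", "D"), indirect=True),
    RouteEntry("E", ("A", "B", "E"), indirect=True),
]
priority_table = [0, 2, 1, 0]   # one priority entry per routing table entry

chosen = select_indirect_route(routing_table, priority_table, "D")
```

The direct entry is ignored even though it has the best priority value, and the route through `C` is chosen over the route through `B`.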

    System and Method for Providing Multiple Redundant Direct Routes Between Supernodes of a Multi-Tiered Full-Graph Interconnect Architecture
    128.
    Type: Patent application
    Status: Active

    Publication No.: US20090063444A1

    Publication date: 2009-03-05

    Application No.: US11845217

    Filing date: 2007-08-27

    IPC classification: G06F7/06 G06F17/30

    CPC classification: G06F15/17381 H04L67/327

    Abstract: A method, computer program product, and system are provided for selecting, from a plurality of routes through the data processing system, a direct route for transmitting data. Data that includes address information and is to be transmitted to a destination processor is received at a first processor. Using routing table data structures, direct route entries are identified that correspond to direct routes for transmitting data. An accessed priority table data structure comprises a priority entry for each entry in the routing table data structures. The priority entry specifies a priority of a corresponding entry in the routing table data structures. A direct route entry is selected that corresponds to a direct route from the routing table data structures, based on specified priorities. Then the data is transmitted from the first processor to the destination processor using a path corresponding to the selected direct route entry.

    Intelligent cache management mechanism via processor access sequence analysis
    129.
    Type: Granted patent
    Status: Lapsed

    Publication No.: US06629210B1

    Publication date: 2003-09-30

    Application No.: US09696888

    Filing date: 2000-10-26

    IPC classification: G06F12/08

    CPC classification: G06F12/121 G06F12/0815

    Abstract: In addition to an address tag, a coherency state and an LRU position, each cache directory entry includes historical processor access information for the corresponding cache line. The historical processor access information includes different subentries for each different processor which has accessed the corresponding cache line, with subentries being “pushed” along the stack when a new processor accesses the subject cache line. Each subentry contains the processor identifier for the corresponding processor which accessed the cache line, one or more opcodes identifying the operations which were performed by the processor, and timestamps associated with each opcode. This historical processor access information may then be utilized by the cache controller to influence victim selection, coherency state transitions, LRU state transitions, deallocation timing, and other cache management functions so that smaller caches are given the effectiveness of very large caches through more intelligent cache management.
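
The directory entry layout can be sketched as a small data structure. The field names, the fixed stack depth, and the rule that repeated accesses by the most recent processor extend its existing subentry are assumptions made for illustration; the abstract does not specify them.

```python
from dataclasses import dataclass, field


@dataclass
class AccessSubentry:
    """One processor's recorded accesses to a cache line."""
    processor_id: int
    operations: list = field(default_factory=list)   # (opcode, timestamp) pairs


@dataclass
class DirectoryEntry:
    """Hypothetical cache directory entry with an access-history stack."""
    tag: int
    coherency_state: str
    lru_position: int
    history: list = field(default_factory=list)      # newest processor first
    max_depth: int = 4                                # assumed stack depth

    def record_access(self, processor_id, opcode, timestamp):
        # Same processor as the newest subentry: append to its opcode list.
        if self.history and self.history[0].processor_id == processor_id:
            self.history[0].operations.append((opcode, timestamp))
            return
        # New processor: push older subentries along and drop any overflow.
        self.history.insert(0, AccessSubentry(processor_id, [(opcode, timestamp)]))
        del self.history[self.max_depth:]


entry = DirectoryEntry(tag=0x1A, coherency_state="S", lru_position=0, max_depth=2)
entry.record_access(1, "READ", 10)
entry.record_access(1, "WRITE", 12)
entry.record_access(2, "READ", 15)
entry.record_access(3, "READ", 20)

recent_processors = [sub.processor_id for sub in entry.history]
```

With a depth of two, the subentry for processor 1 is pushed off the stack once processors 2 and 3 have accessed the line. A cache controller could consult `entry.history` when choosing a victim, but that policy is outside this sketch.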

Digital clock pulse positioning circuit for delaying a signal input by a fist time duration and a second time duration to provide a positioned clock signal
    130.
    Type: Granted patent
    Status: Lapsed

    Publication No.: US5548797A

    Publication date: 1996-08-20

    Application No.: US316976

    Filing date: 1994-10-03

    CPC classification: G06F9/3869 H03K5/15026

    Abstract: An input/output channel controller includes a storage array for temporarily storing data and multiple clocks to access or update the data. One or more array clock signals are generated from a system clock combined with other clock signals to generate a single clock signal which is positioned in time by a clock positioning circuit to accommodate circuit throughput delay variations and to effectively reduce hold time to zero. Storage arrays may be clocked at significantly higher frequencies, and arrays may have multiple gated clocks, without incurring hold-time problems.
