Novel snoop filter for filtering snoop requests
    2.
    Patent application
    Novel snoop filter for filtering snoop requests (status: in force)

    Publication number: US20060224838A1

    Publication date: 2006-10-05

    Application number: US11093152

    Filing date: 2005-03-29

    IPC classification: G06F13/28

    Abstract: A method and apparatus for supporting cache coherency in a multiprocessor computing environment having multiple processing units, each processing unit having one or more local cache memories associated and operatively connected therewith. The method comprises providing a snoop filter device associated with each processing unit, each snoop filter device having a plurality of dedicated input ports for receiving snoop requests from dedicated memory writing sources in the multiprocessor computing environment. Each snoop filter device includes a plurality of parallel operating port snoop filters in correspondence with the plurality of dedicated input ports, each port snoop filter implementing one or more parallel operating sub-filter elements that are adapted to concurrently filter snoop requests received from respective dedicated memory writing sources and forward a subset of those requests to its associated processing unit.

    Snoop filtering system in a multiprocessor system
    3.
    Patent application
    Snoop filtering system in a multiprocessor system (status: in force)

    Publication number: US20060224835A1

    Publication date: 2006-10-05

    Application number: US11093127

    Filing date: 2005-03-29

    IPC classification: G06F13/28

    Abstract: A system and method for supporting cache coherency in a computing environment having multiple processing units, each unit having an associated cache memory system operatively coupled therewith. The system includes a plurality of interconnected snoop filter units, each snoop filter unit corresponding to and in communication with a respective processing unit, with each snoop filter unit comprising a plurality of devices for receiving asynchronous snoop requests from respective memory writing sources in the computing environment; a point-to-point interconnect comprising communication links for directly connecting memory writing sources to corresponding receiving devices; and a plurality of parallel operating filter devices coupled in one-to-one correspondence with each receiving device for processing snoop requests received thereat and either forwarding requests or preventing forwarding of requests to its associated processing unit. Each of the plurality of parallel operating filter devices comprises parallel operating sub-filter elements, each simultaneously receiving an identical snoop request and implementing one or more different snoop filter algorithms for identifying those snoop requests for data that is determined not to be cached locally at the associated processing unit and preventing forwarding of those requests to the processing unit. In this manner, the number of snoop requests forwarded to a processing unit is reduced, thereby increasing performance of the computing environment.
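The filtering decision described in this abstract can be sketched in a few lines of Python. This is an illustrative model only, not the patented design: the two sub-filter algorithms (`make_range_filter`, `make_presence_filter`), the 64-byte line size, and the function name `port_filter` are all assumptions. The only property taken from the abstract is that several sub-filters see the same request and the request is dropped when any of them determines the data is not cached locally.

```python
from typing import Callable, Iterable, Set

# A sub-filter returns True when it can show the address is NOT cached locally.
SubFilter = Callable[[int], bool]


def make_range_filter(lo: int, hi: int) -> SubFilter:
    """Hypothetical sub-filter: the local cache only holds addresses in [lo, hi)."""
    return lambda addr: not (lo <= addr < hi)


def make_presence_filter(cached_lines: Set[int], line_bytes: int = 64) -> SubFilter:
    """Hypothetical sub-filter: an exact record of cached line numbers."""
    return lambda addr: (addr // line_bytes) not in cached_lines


def port_filter(addr: int, sub_filters: Iterable[SubFilter]) -> bool:
    """Return True if the snoop request must be forwarded to the processing unit.

    Hardware would evaluate the sub-filters in parallel; here they run in turn.
    A single "not cached" verdict is enough to drop the request.
    """
    return not any(f(addr) for f in sub_filters)


filters = [make_range_filter(0x1000, 0x2000), make_presence_filter({0x1000 // 64})]
forwarded = [hex(a) for a in (0x1000, 0x1040, 0x3000) if port_filter(a, filters)]
```

With these assumed filters only the request for `0x1000` is forwarded: `0x1040` is rejected by the presence filter and `0x3000` by the range filter.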

    LOW LATENCY MEMORY ACCESS AND SYNCHRONIZATION
    4.
    Patent application
    LOW LATENCY MEMORY ACCESS AND SYNCHRONIZATION (status: lapsed)

    Publication number: US20070204112A1

    Publication date: 2007-08-30

    Application number: US11617276

    Filing date: 2006-12-28

    IPC classification: G06F12/14

    Abstract: A low latency memory system access is provided in association with a weakly-ordered multiprocessor system. Each processor in the multiprocessor shares resources, and each shared resource has an associated lock within a locking device that provides support for synchronization between the multiple processors in the multiprocessor and the orderly sharing of the resources. A processor only has permission to access a resource when it owns the lock associated with that resource, and an attempt by a processor to own a lock requires only a single load operation, rather than a traditional atomic load followed by store, such that the processor only performs a read operation and the hardware locking device performs a subsequent write operation rather than the processor. A simple prefetching for non-contiguous data structures is also disclosed. A memory line is redefined so that in addition to the normal physical memory data, every line includes a pointer that is large enough to point to any other line in the memory, wherein the pointers are used to determine which memory line to prefetch rather than some other predictive algorithm. This enables hardware to effectively prefetch memory access patterns that are non-contiguous, but repetitive.
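The single-load lock acquisition can be illustrated with a toy software model. The class name `LockDevice`, its methods, and the release path are assumptions made for illustration; the abstract only states that the processor issues one read and the locking device performs the subsequent write itself.

```python
class LockDevice:
    """Toy model of a hardware locking device (all names are hypothetical)."""

    def __init__(self, num_locks: int) -> None:
        self._held = [False] * num_locks

    def load(self, lock_id: int) -> bool:
        """Model of the processor's single load.

        Returns True if the caller now owns the lock. The device, not the
        processor, performs the write that marks the lock as held.
        """
        was_held = self._held[lock_id]
        self._held[lock_id] = True  # the device-side write
        return not was_held

    def release(self, lock_id: int) -> None:
        """Assumed release path; the abstract does not describe it."""
        self._held[lock_id] = False


device = LockDevice(num_locks=4)
first_attempt = device.load(0)   # lock was free, so the caller owns it
second_attempt = device.load(0)  # lock already held, so the attempt fails
```

In this sketch a failed attempt leaves the lock held by its current owner, so a contending processor simply repeats the load until it returns True.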

    Multidimensional switch network
    5.
    Patent application
    Multidimensional switch network (status: lapsed)

    Publication number: US20050195808A1

    Publication date: 2005-09-08

    Application number: US10793068

    Filing date: 2004-03-04

    IPC classification: H04L12/26

    CPC classification: H04L49/1576 H04L45/06

    Abstract: Multidimensional switch data networks are disclosed, such as are used by a distributed-memory parallel computer, as applied for example to computations in the field of life sciences. A distributed memory parallel computing system comprises a number of parallel compute nodes and a message passing data network connecting the compute nodes together. The data network connecting the compute nodes comprises a multidimensional switch data network of compute nodes having N dimensions, and a number/array of compute nodes Ln in each of the N dimensions. Each compute node includes an N port routing element having a port for each of the N dimensions. Each compute node of an array of Ln compute nodes in each of the N dimensions connects through a port of its routing element to an Ln port crossbar switch having Ln ports. Several embodiments are disclosed of a 4 dimensional computing system having 65,536 compute nodes.
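The node and switch counts implied by this topology can be worked out directly. The sketch below assumes a 16 x 16 x 16 x 16 shape, which is one layout consistent with 65,536 nodes in 4 dimensions; the abstract does not say which shapes the embodiments use. The function names are hypothetical.

```python
from math import prod
from typing import List, Sequence, Tuple


def network_size(dims: Sequence[int]) -> Tuple[int, List[int]]:
    """Return (total nodes, crossbar count per dimension).

    Along dimension d, every line of dims[d] nodes shares one crossbar with
    dims[d] ports, so that dimension needs total // dims[d] crossbars.
    """
    total = prod(dims)
    return total, [total // length for length in dims]


def crossbar_peers(coord: Tuple[int, ...], dims: Sequence[int], axis: int) -> List[Tuple[int, ...]]:
    """All nodes attached to the same crossbar as `coord` along `axis`."""
    return [coord[:axis] + (i,) + coord[axis + 1:] for i in range(dims[axis])]


shape = (16, 16, 16, 16)  # assumed shape
total_nodes, crossbars_per_dim = network_size(shape)
```

Under this assumption the system has 16^4 = 65,536 nodes, each with a 4-port routing element, and 4,096 sixteen-port crossbars per dimension (16,384 in total).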

    Methods and apparatus using commutative error detection values for fault isolation in multiple node computers
    6.
    Patent application
    Methods and apparatus using commutative error detection values for fault isolation in multiple node computers (status: lapsed)

    Publication number: US20060248370A1

    Publication date: 2006-11-02

    Application number: US11106069

    Filing date: 2005-04-14

    IPC classification: G06F11/00

    CPC classification: G06F11/1633

    Abstract: The present invention concerns methods and apparatus for performing fault isolation in multiple node computing systems using commutative error detection values (for example, checksums) to identify and to isolate faulty nodes. In the present invention nodes forming the multiple node computing system are networked together and during program execution communicate with one another by transmitting information through the network. When information associated with a reproducible portion of a computer program is injected into the network by a node, a commutative error detection value is calculated and stored in commutative error detection apparatus associated with the node. At intervals, node fault detection apparatus associated with the multiple node computer system retrieves commutative error detection values saved in the commutative error detection apparatus associated with the node and stores them in memory. When the computer program is executed again by the multiple node computer system, new commutative error detection values are created; the node fault detection apparatus retrieves them and stores them in memory. The node fault detection apparatus identifies faulty nodes by comparing commutative error detection values associated with reproducible portions of the application program generated by a particular node from different runs of the application program. Differences in commutative error detection values indicate that the node may be faulty.
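A commutative error detection value is one whose result does not depend on the order in which packets are injected. The sketch below assumes one possible construction, the sum of per-packet CRC-32 values modulo 2^32; the abstract names checksums only as an example and does not specify this scheme. `CommutativeChecksum` and `suspect_nodes` are hypothetical names.

```python
import zlib
from typing import Dict, Iterable, List


class CommutativeChecksum:
    """Order-independent accumulator: sum of per-packet CRC-32 values mod 2**32."""

    def __init__(self) -> None:
        self.value = 0

    def inject(self, packet: bytes) -> None:
        # Addition is commutative, so packet order does not affect the result.
        self.value = (self.value + zlib.crc32(packet)) & 0xFFFFFFFF


def checksum_of(packets: Iterable[bytes]) -> int:
    acc = CommutativeChecksum()
    for packet in packets:
        acc.inject(packet)
    return acc.value


def suspect_nodes(run_a: Dict[str, int], run_b: Dict[str, int]) -> List[str]:
    """Nodes whose values differ between two runs of the same program."""
    return sorted(node for node in run_a if run_a[node] != run_b.get(node))


run1 = {"node0": checksum_of([b"x", b"y"]), "node1": checksum_of([b"p", b"q"])}
run2 = {"node0": checksum_of([b"y", b"x"]), "node1": checksum_of([b"p", b"r"])}
suspects = suspect_nodes(run1, run2)
```

Here `node0` sends the same packets in a different order and is not flagged, while `node1` sends different data in the second run and is reported as a suspect.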

    Deterministic error recovery protocol
    7.
    Patent application
    Deterministic error recovery protocol (status: lapsed)

    Publication number: US20050081078A1

    Publication date: 2005-04-14

    Application number: US10674952

    Filing date: 2003-09-30

    Abstract: Disclosed are an error recovery method and system for use with a communication system having first and second nodes, each of said nodes having a receiver and a sender, the sender of the first node being connected to the receiver of the second node by a first cable, and the sender of the second node being connected to the receiver of the first node by a second cable. The method comprises the step of, after one of the nodes detects an error, both of the nodes entering the same defined state. In particular, the receiver of the first node enters an error state, stays in the error state for a defined period of time T, and, after said defined period of time T, enters a wait state. Also, the sender of the first node sends to the receiver of the second node an error message for a defined period of time Te, and after the defined period of time Te, the sender of the first node enters an idle state.
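The timed transitions of the first node can be modelled as a small state machine. This is a sketch under stated assumptions: time is counted in discrete ticks, `T` and `Te` are at least one tick, and the `NORMAL` states and all class names are invented for illustration. The second node's reaction to the error message is not modelled.

```python
from enum import Enum, auto


class Rx(Enum):
    NORMAL = auto()  # assumed pre-error state, not named in the abstract
    ERROR = auto()
    WAIT = auto()


class Tx(Enum):
    NORMAL = auto()  # assumed pre-error state, not named in the abstract
    SENDING_ERROR = auto()
    IDLE = auto()


class Node:
    """First node of the link: receiver and sender with their own timers."""

    def __init__(self, t: int, te: int) -> None:
        self.t, self.te = t, te
        self.rx, self.tx = Rx.NORMAL, Tx.NORMAL
        self._rx_left = 0
        self._tx_left = 0

    def detect_error(self) -> None:
        # Receiver enters the error state for T ticks; sender emits the
        # error message for Te ticks.
        self.rx, self._rx_left = Rx.ERROR, self.t
        self.tx, self._tx_left = Tx.SENDING_ERROR, self.te

    def tick(self) -> None:
        if self.rx is Rx.ERROR:
            self._rx_left -= 1
            if self._rx_left == 0:
                self.rx = Rx.WAIT
        if self.tx is Tx.SENDING_ERROR:
            self._tx_left -= 1
            if self._tx_left == 0:
                self.tx = Tx.IDLE


node = Node(t=3, te=2)
node.detect_error()
for _ in range(3):
    node.tick()
```

After three ticks with `T = 3` and `Te = 2`, the receiver is in the wait state and the sender is idle, regardless of when within normal operation the error was detected.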

    Method and apparatus for detecting a cache wrap condition
    8.
    Patent application
    Method and apparatus for detecting a cache wrap condition (status: in force)

    Publication number: US20060230239A1

    Publication date: 2006-10-12

    Application number: US11093132

    Filing date: 2005-03-29

    IPC classification: G06F13/28

    CPC classification: G06F12/0822 G06F12/0831

    Abstract: A method and apparatus for detecting a cache wrap condition in a computing environment having a processor and a cache. A cache wrap condition is detected when the entire contents of a cache have been replaced, relative to a particular starting state. A set-associative cache is considered to have wrapped when all of the sets within the cache have been replaced. The starting point for cache wrap detection is the state of the cache sets at the time of the previous cache wrap. The method and apparatus are preferably implemented in a snoop filter having filter mechanisms that rely upon detecting the cache wrap condition. These snoop filter mechanisms requiring this information are operatively coupled with cache wrap detection logic adapted to detect the cache wrap event, and perform an indication step to the snoop filter mechanisms. In the various embodiments, cache wrap detection logic is implemented using registers and comparators, loadable counters, or a scoreboard data structure.
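A counter-per-set scoreboard gives a rough feel for wrap detection. The sketch below assumes round-robin replacement, so that `ways` replacements in a set evict every line the set held at the starting state; the abstract does not state the replacement policy or the exact scoreboard layout, and `WrapDetector` is a hypothetical name.

```python
from typing import List


class WrapDetector:
    """Scoreboard of replacements per set since the previous cache wrap."""

    def __init__(self, num_sets: int, ways: int) -> None:
        self.ways = ways
        self.replaced: List[int] = [0] * num_sets

    def on_replace(self, set_index: int) -> bool:
        """Record one line replacement; return True if it completes a wrap.

        Assumes round-robin replacement, so a set is fully replaced once it
        has seen `ways` replacements.
        """
        self.replaced[set_index] += 1
        if all(count >= self.ways for count in self.replaced):
            # The wrap becomes the new starting state for the next detection.
            self.replaced = [0] * len(self.replaced)
            return True
        return False


detector = WrapDetector(num_sets=2, ways=2)
events = [detector.on_replace(s) for s in (0, 0, 1, 1, 0)]
```

For this two-set, two-way example the wrap is signalled on the fourth replacement, after which counting restarts from the new state.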

    Extended write combining using a write continuation hint flag
    9.
    Granted patent
    Extended write combining using a write continuation hint flag (status: lapsed)

    Publication number: US08458282B2

    Publication date: 2013-06-04

    Application number: US11768593

    Filing date: 2007-06-26

    Abstract: A computing apparatus for reducing the amount of processing in a network computing system which includes a network system device of a receiving node for receiving electronic messages comprising data. The electronic messages are transmitted from a sending node. The network system device determines when more data of a specific electronic message is being transmitted. A memory device stores the electronic message data and communicates with the network system device. A memory subsystem communicates with the memory device. The memory subsystem stores a portion of the electronic message when more data of the specific message will be received, and the buffer combines the portion with later received data and moves the data to the memory device for accessible storage.
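The buffering behaviour can be sketched as follows. The class `WriteCombiner`, the `more_coming` parameter standing in for the continuation hint flag, and the dictionary used as the memory device are all assumptions for illustration; the abstract does not describe how the flag is encoded or how large the buffer is.

```python
from typing import Dict


class WriteCombiner:
    """Holds partial message data while the hint says more is on the way."""

    def __init__(self) -> None:
        self.pending: Dict[int, bytearray] = {}  # buffered partial data
        self.memory: Dict[int, bytes] = {}       # stand-in for the memory device
        self.writes = 0                          # number of writes to memory

    def receive(self, msg_id: int, data: bytes, more_coming: bool) -> None:
        buf = self.pending.setdefault(msg_id, bytearray())
        buf += data  # combine with any earlier portion of the same message
        if not more_coming:
            # Last portion: move the combined data to memory in one write.
            self.memory[msg_id] = bytes(self.pending.pop(msg_id))
            self.writes += 1


combiner = WriteCombiner()
combiner.receive(1, b"ab", more_coming=True)
combiner.receive(1, b"cd", more_coming=True)
combiner.receive(1, b"ef", more_coming=False)
```

Three arriving portions result in a single write of `b"abcdef"` to the memory stand-in, which is the reduction in processing the abstract refers to.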

    EXTENDED WRITE COMBINING USING A WRITE CONTINUATION HINT FLAG
    10.
    Patent application
    EXTENDED WRITE COMBINING USING A WRITE CONTINUATION HINT FLAG (status: lapsed)

    Publication number: US20090006605A1

    Publication date: 2009-01-01

    Application number: US11768593

    Filing date: 2007-06-26

    IPC classification: G06F17/30 G06F15/173

    Abstract: A computing apparatus for reducing the amount of processing in a network computing system which includes a network system device of a receiving node for receiving electronic messages comprising data. The electronic messages are transmitted from a sending node. The network system device determines when more data of a specific electronic message is being transmitted. A memory device stores the electronic message data and communicates with the network system device. A memory subsystem communicates with the memory device. The memory subsystem stores a portion of the electronic message when more data of the specific message will be received, and the buffer combines the portion with later received data and moves the data to the memory device for accessible storage.
