Method and apparatus for equalizing a bandwidth impedance mismatch between a client and an interface
    11.
    Granted Patent
    Method and apparatus for equalizing a bandwidth impedance mismatch between a client and an interface (In force)

    Publication No.: US08683089B1

    Publication Date: 2014-03-25

    Application No.: US12650371

    Filing Date: 2009-12-30

    IPC Class: G06F3/00 G06F13/00

    Abstract: One or more client engines issue write transactions to system memory or peer parallel processor (PP) memory across a Peripheral Component Interconnect Express (PCIe) interface. The client engines may issue write transactions faster than the PCIe interface can transport them, causing write transactions to accumulate within the PCIe interface. To prevent this accumulation, an arbiter throttles write transactions received from the client engines based on the number of write transactions currently being transported across the PCIe interface.
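    A minimal sketch of the counting rule this abstract describes, assuming a hypothetical in-flight limit (MAX_IN_FLIGHT) and illustrative names; it is not the patented arbiter, only the throttling idea.

    ```c
    #include <stdbool.h>

    #define MAX_IN_FLIGHT 32   /* assumed capacity of the PCIe write path */

    typedef struct {
        int in_flight;         /* writes currently crossing the interface */
    } write_arbiter_t;

    /* A client asks to issue a write; grant only while the count of
     * in-flight writes stays below the assumed limit. */
    bool arbiter_grant_write(write_arbiter_t *arb)
    {
        if (arb->in_flight >= MAX_IN_FLIGHT)
            return false;      /* throttle: the client retries later */
        arb->in_flight++;
        return true;
    }

    /* Called when the interface reports that one write has completed. */
    void arbiter_write_done(write_arbiter_t *arb)
    {
        if (arb->in_flight > 0)
            arb->in_flight--;
    }
    ```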


    COMPUTE THREAD ARRAY GRANULARITY EXECUTION PREEMPTION
    12.
    Patent Application
    COMPUTE THREAD ARRAY GRANULARITY EXECUTION PREEMPTION (Pending, published)

    Publication No.: US20130132711A1

    Publication Date: 2013-05-23

    Application No.: US13302962

    Filing Date: 2011-11-22

    IPC Class: G06F9/38

    CPC Class: G06F9/461

    Abstract: One embodiment of the present invention sets forth a technique for instruction-level and compute thread array granularity execution preemption. Preempting at the instruction level does not require draining the processing pipeline: no new instructions are issued, and the context state is unloaded from the processing pipeline. When preemption is performed at a compute thread array boundary, the amount of context state to be stored is reduced because execution units within the processing pipeline complete execution of in-flight instructions and become idle. If the amount of time needed to complete execution of the in-flight instructions exceeds a threshold, the preemption may dynamically change to be performed at the instruction level instead of at compute thread array granularity.
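    A hedged sketch of the fallback decision described above; the cycle threshold, the names, and the idea of polling a drain timer are assumptions for illustration.

    ```c
    typedef enum {
        PREEMPT_CTA_BOUNDARY,      /* wait for in-flight work to drain */
        PREEMPT_INSTRUCTION_LEVEL  /* stop issue and unload context now */
    } preempt_mode_t;

    #define DRAIN_TIMEOUT_CYCLES 10000u  /* assumed threshold */

    /* Prefer preempting at a compute thread array boundary, since an idle
     * pipeline has far less context state to save; if draining takes longer
     * than the threshold, fall back to instruction-level preemption. */
    preempt_mode_t choose_preempt_mode(unsigned cycles_waiting_for_idle)
    {
        if (cycles_waiting_for_idle <= DRAIN_TIMEOUT_CYCLES)
            return PREEMPT_CTA_BOUNDARY;
        return PREEMPT_INSTRUCTION_LEVEL;
    }
    ```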


    Shadow unit for shadowing circuit status
    13.
    Granted Patent
    Shadow unit for shadowing circuit status (In force)

    Publication No.: US07937606B1

    Publication Date: 2011-05-03

    Application No.: US11437112

    Filing Date: 2006-05-18

    IPC Class: G06F1/04 G06F15/177

    Abstract: Generally, the present disclosure concerns systems and methods for shadowing the status of a circuit with a shadow unit. In one aspect, a system comprises a first circuit in a first dynamic clock domain of a plurality of dynamic clock domains, a processor configured to execute software instructions to generate a request for a status of the first circuit, and a second circuit coupled to the first circuit and to the processor. The second circuit, outside the first dynamic clock domain, is configured to shadow a status of the first circuit and to respond to the request for the status of the first circuit with the shadowed status.
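    A software model of the shadowing arrangement, under stated assumptions: the gated domain is represented by a struct, and the shadow samples it only while that domain's clock runs. Names are illustrative, not the patented circuit.

    ```c
    #include <stdbool.h>
    #include <stdint.h>

    typedef struct {
        uint32_t status;        /* register inside the dynamic clock domain */
        bool     clock_running; /* the domain may be gated off */
    } gated_circuit_t;

    typedef struct {
        uint32_t shadowed_status;  /* copy held outside the clock domain */
    } shadow_unit_t;

    /* Sample the circuit's status whenever its clock domain is active. */
    void shadow_update(shadow_unit_t *sh, const gated_circuit_t *c)
    {
        if (c->clock_running)
            sh->shadowed_status = c->status;
    }

    /* Status requests are answered from the shadow, so they can be served
     * even while the first circuit's clock domain is gated. */
    uint32_t shadow_read_status(const shadow_unit_t *sh)
    {
        return sh->shadowed_status;
    }
    ```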


    Block data mover adapted to contain faults in a partitioned multiprocessor system
    14.
    Granted Patent
    Block data mover adapted to contain faults in a partitioned multiprocessor system (In force)

    Publication No.: US06826653B2

    Publication Date: 2004-11-30

    Application No.: US10068427

    Filing Date: 2002-02-06

    IPC Class: G06F12/00

    CPC Class: G06F12/0817 G06F2212/621

    Abstract: A system and method are provided for moving information between cache-coherent memory systems of a partitioned multiprocessor computer system while containing faults to a single partition. The multiprocessor computer system includes a plurality of processors, memory subsystems, and input/output (I/O) subsystems that can be divided into a plurality of partitions. Each I/O subsystem includes at least one I/O bridge for interfacing between one or more I/O devices and the multiprocessor system. The I/O bridge has a data mover configured to retrieve information from a "source" partition and to store that information within its own "destination" partition. When activated, the data mover issues a request to the source partition for a non-coherent copy of the information. The home memory subsystem in the source partition preferably responds to the request by sending the data mover a valid but non-coherent copy of the information, e.g., a "snapshot" of the information as of the time of the request. Upon receiving the information, the data mover may copy it into the memory subsystem of the destination partition.
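    A rough sketch, under assumptions, of the two-step move the abstract describes: fetch a point-in-time, non-coherent snapshot from the source partition, then commit it into the destination partition's memory. The byte-array model and the names are illustrative, not the patented hardware.

    ```c
    #include <stddef.h>
    #include <string.h>

    /* Step 1: request a non-coherent copy from the source partition. Here
     * the cross-partition request is modeled as a copy into a private
     * staging buffer, i.e. a "snapshot" as of the time of the request. */
    void data_mover_fetch_snapshot(void *staging,
                                   const void *source_block, size_t len)
    {
        memcpy(staging, source_block, len);
    }

    /* Step 2: store the snapshot into the destination partition's own
     * memory, so later faults in the source partition stay contained. */
    void data_mover_commit(void *destination_block,
                           const void *staging, size_t len)
    {
        memcpy(destination_block, staging, len);
    }
    ```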


    Anti-starvation interrupt protocol
    16.
    Granted Patent
    Anti-starvation interrupt protocol (In force)

    Publication No.: US06920516B2

    Publication Date: 2005-07-19

    Application No.: US09944516

    Filing Date: 2001-08-31

    IPC Class: G06F13/24 G06F13/40 H03K5/19

    Abstract: An anti-starvation interrupt protocol for use in avoiding livelock in a multiprocessor computer system is provided. At least one processor is configured to include first and second control status registers (CSRs). The first CSR buffers information, such as interrupts, received by the processor, while the second CSR keeps track of the priority level of the interrupts. When an interrupt controller receives an interrupt, it issues a write transaction to the first CSR at the processor. If the first CSR has room to accept the write transaction, the processor returns an acknowledgment; if the first CSR is already full, the processor returns a no acknowledgment. In response to a no acknowledgment, the interrupt controller increments an interrupt starvation counter and checks whether the counter exceeds a threshold. If not, the interrupt controller waits a preset time and reposts the write transaction. If it does, the interrupt controller issues a write transaction having a higher priority to the second CSR. In response, the processor copies all of the pending interrupts from the first CSR into the memory subsystem, thereby freeing the first CSR to accept additional write transactions.
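    A hedged sketch of the retry-then-escalate loop described above. The CSR accessors are declared as stand-ins for hardware operations, and the threshold value is an assumption.

    ```c
    #include <stdbool.h>

    #define STARVATION_THRESHOLD 8  /* assumed retry limit */

    /* Stand-ins for hardware operations (illustrative, not a real API). */
    bool post_to_first_csr(unsigned interrupt);   /* true = ACK, false = NACK */
    void post_to_second_csr(unsigned interrupt);  /* higher-priority write */
    void wait_preset_time(void);

    /* Interrupt controller side: repost on NACK, and escalate to the second
     * CSR once the starvation counter exceeds the threshold; the processor
     * then spills the first CSR's pending interrupts to memory. */
    void deliver_interrupt(unsigned interrupt)
    {
        unsigned starvation = 0;
        while (!post_to_first_csr(interrupt)) {
            if (++starvation > STARVATION_THRESHOLD) {
                post_to_second_csr(interrupt);
                return;
            }
            wait_preset_time();
        }
    }
    ```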


    System and method for providing forward progress and avoiding starvation and livelock in a multiprocessor computer system
    17.
    Granted Patent
    System and method for providing forward progress and avoiding starvation and livelock in a multiprocessor computer system (Expired)

    Publication No.: US06647453B1

    Publication Date: 2003-11-11

    Application No.: US09652984

    Filing Date: 2000-08-31

    IPC Class: G06F13/36

    CPC Class: G06F12/0835

    Abstract: A system and method avoid "livelock" and "starvation" among two or more input/output (I/O) devices of a symmetrical multiprocessor (SMP) computer system competing for the same data. The SMP computer system includes a plurality of interconnected processors, one or more memories that are shared by the processors, and a plurality of I/O bridges to which the I/O devices are coupled. A cache coherency protocol is executed by the I/O bridges, which requires the I/O bridges to obtain "exclusive" (not shared) ownership of all data stored by the bridges. In response to a request for data currently stored by an I/O bridge, the bridge first copies at least a portion of that data to a non-coherent buffer before invalidating the data. The bridge then takes the largest amount of the data saved in its non-coherent buffer that it knows to be coherent, releases only that known coherent amount to the I/O device, and then discards all of the saved data.
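    A rough model, under stated assumptions, of the snapshot-before-invalidate behavior: the bridge saves what it holds into a non-coherent buffer, hands the device only the prefix it knows was coherent, then discards everything. The buffer size and names are illustrative.

    ```c
    #include <stddef.h>
    #include <string.h>

    #define BRIDGE_BUF_SIZE 256  /* assumed non-coherent buffer size */

    typedef struct {
        unsigned char noncoherent[BRIDGE_BUF_SIZE];
        size_t        coherent_len;  /* prefix known to be coherent */
    } bridge_buffer_t;

    /* Another agent requests the data: copy a snapshot before invalidating. */
    void bridge_on_invalidate(bridge_buffer_t *b, const unsigned char *owned,
                              size_t owned_len, size_t known_coherent)
    {
        size_t n = owned_len < BRIDGE_BUF_SIZE ? owned_len : BRIDGE_BUF_SIZE;
        memcpy(b->noncoherent, owned, n);
        b->coherent_len = known_coherent < n ? known_coherent : n;
    }

    /* Release only the known-coherent amount to the I/O device, then drop
     * all of the saved data so no stale copy survives. */
    size_t bridge_release_to_device(bridge_buffer_t *b, unsigned char *out)
    {
        size_t n = b->coherent_len;
        memcpy(out, b->noncoherent, n);
        b->coherent_len = 0;
        return n;
    }
    ```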


    Fast and highly scalable quota-based weighted arbitration
    18.
    Granted Patent
    Fast and highly scalable quota-based weighted arbitration (In force)

    Publication No.: US08667200B1

    Publication Date: 2014-03-04

    Application No.: US12712109

    Filing Date: 2010-02-24

    IPC Class: G06F12/00 H04L12/28

    Abstract: One embodiment of the present invention sets forth a technique for arbitrating between a set of requesters that transmit data transmission requests to a weighted LRU arbiter. Each data transmission request is associated with a specific amount of data to be transmitted over the crossbar unit. Based on the priority state associated with each requester, the weighted LRU arbiter selects the requester in the set with the highest priority. The weighted LRU arbiter then decrements the weight associated with the selected requester, stored in a corresponding weight store, based on the size of the data to be transmitted. If the decremented weight is equal to or less than zero, the priority associated with the selected requester is set to a lowest priority. If, however, the decremented weight is greater than zero, the priority associated with the selected requester is not changed.
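    A minimal sketch of the quota update rule stated in the abstract; the requester table, priority encoding, and quota-refill policy are assumptions for illustration, not the patented arbiter.

    ```c
    #define NUM_REQUESTERS 4

    typedef struct {
        int weight;    /* remaining quota, in units of data size */
        int priority;  /* larger value wins arbitration */
    } requester_t;

    /* Pick the pending requester with the highest priority. */
    int arbiter_select(const requester_t req[], const int pending[])
    {
        int best = -1;
        for (int i = 0; i < NUM_REQUESTERS; i++)
            if (pending[i] && (best < 0 || req[i].priority > req[best].priority))
                best = i;
        return best;
    }

    /* Charge the granted transfer against the winner's quota: demote the
     * requester to the lowest priority only when its weight reaches zero
     * or goes negative; otherwise leave its priority unchanged. */
    void arbiter_update(requester_t req[], int winner, int data_size)
    {
        req[winner].weight -= data_size;
        if (req[winner].weight <= 0)
            req[winner].priority = 0;
    }
    ```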


    Scalable efficient I/O port protocol
    19.
    Granted Patent
    Scalable efficient I/O port protocol (In force)

    Publication No.: US08364851B2

    Publication Date: 2013-01-29

    Application No.: US10677583

    Filing Date: 2003-10-02

    IPC Class: G06F3/00

    Abstract: A system that supports a high-performance, scalable, and efficient I/O port protocol to connect to I/O devices is disclosed. A distributed multiprocessing computer system contains a number of processors, each coupled to an I/O bridge ASIC implementing the I/O port protocol. One or more I/O devices are coupled to the I/O bridge ASIC, each I/O device capable of accessing machine resources in the computer system by transmitting and receiving message packets. Machine resources in the computer system include data blocks, registers, and interrupt queues. Each processor in the computer system is coupled to a memory module capable of storing data blocks shared between the processors. Coherence of the shared data blocks in this shared-memory system is maintained using a directory-based coherence protocol. Coherence of data blocks transferred during I/O device read and write accesses is maintained using the same coherence protocol as for the memory system. Data blocks transferred during an I/O device read or write access may be buffered in a cache by the I/O bridge ASIC only if the I/O bridge ASIC has exclusive copies of the data blocks. The I/O bridge ASIC includes a DMA device that supports both in-order and out-of-order DMA read and write streams of data blocks. An in-order stream of reads of data blocks performed by the DMA device always results in the DMA device receiving coherent data blocks that do not have to be written back to the memory module.
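    A small sketch, under assumptions, of the "buffer only exclusive copies" rule the abstract calls out; the state names and line structure are illustrative, not the ASIC's actual design.

    ```c
    #include <stdbool.h>
    #include <stdint.h>

    typedef enum { BLK_INVALID, BLK_SHARED, BLK_EXCLUSIVE } blk_state_t;

    typedef struct {
        uint64_t    addr;   /* data block address */
        blk_state_t state;  /* directory-protocol state of the bridge's copy */
    } io_bridge_line_t;

    /* The I/O bridge may buffer a data block in its cache only when it
     * holds an exclusive copy; shared copies are never buffered. */
    bool io_bridge_may_cache(const io_bridge_line_t *line)
    {
        return line->state == BLK_EXCLUSIVE;
    }
    ```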
