Method and apparatus for filtering snoop requests using multiple snoop caches

    Publication No.: US20060224839A1

    Publication date: 2006-10-05

    Application No.: US11093154

    Filing date: 2005-03-29

    IPC classification: G06F13/28

    Abstract: A method and apparatus for implementing a snoop filter unit associated with a single processor in a multiprocessor system. The snoop filter unit has a plurality of ports, each port receiving snoop requests from exactly one dedicated source. Associated with each port is a snoop cache filter that processes each snoop request and records the addresses of the most recent snoop requests from exactly one source. The snoop cache filter uses vector encoding to record the occurrence of snoop requests for a sequence of consecutive cache lines. If a received snoop request matches an entry present in the associated snoop cache, the snoop request is discarded. Otherwise, its address is added to the snoop cache and the request is enqueued for forwarding to the single processor. Information is removed from all snoop cache filters assigned to all ports in the snoop filter unit when data corresponding to any one of the memory addresses contained in a snoop cache filter is loaded into the cache hierarchy of the processor to which the snoop cache filter is assigned.
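
The per-port snoop cache described in this abstract can be sketched in software as follows. This is a minimal illustration, not the patented hardware: the class names, the entry width of 8 lines, the 4-entry capacity, and the LRU replacement are all assumptions, and on a processor load the sketch clears only the matching line's bit in every port, which is one conservative reading of "information is removed".

```python
from collections import OrderedDict

LINES_PER_ENTRY = 8  # assumption: one entry covers 8 consecutive cache lines


class SnoopCache:
    """Per-port record of recently snooped lines, one bit per line."""

    def __init__(self, capacity=4):
        self.capacity = capacity      # assumption: small, LRU-replaced
        self.entries = OrderedDict()  # tag -> bit vector of snooped lines

    def should_forward(self, line_addr):
        """Return True to forward the snoop, False to discard it."""
        tag, bit = divmod(line_addr, LINES_PER_ENTRY)
        vector = self.entries.get(tag, 0)
        if vector & (1 << bit):
            self.entries.move_to_end(tag)
            return False              # already snooped, not reloaded since
        self.entries[tag] = vector | (1 << bit)
        self.entries.move_to_end(tag)
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict least recently used
        return True

    def forget(self, line_addr):
        """Drop the record for a line the processor has just loaded."""
        tag, bit = divmod(line_addr, LINES_PER_ENTRY)
        if tag in self.entries:
            self.entries[tag] &= ~(1 << bit)
            if not self.entries[tag]:
                del self.entries[tag]


class SnoopFilterUnit:
    """One snoop cache per port; each port has exactly one source."""

    def __init__(self, num_ports):
        self.ports = [SnoopCache() for _ in range(num_ports)]
        self.forward_queue = []

    def snoop(self, port, line_addr):
        forward = self.ports[port].should_forward(line_addr)
        if forward:
            self.forward_queue.append(line_addr)
        return forward

    def processor_load(self, line_addr):
        # The line may now be cached, so no port may filter it any longer.
        for cache in self.ports:
            cache.forget(line_addr)
```

A repeated snoop for the same line on the same port is discarded until the processor loads that line again, at which point the next snoop is forwarded.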

    Method and apparatus for filtering snoop requests using stream registers
    3.
    Patent application
    Status: In force

    Publication No.: US20060224836A1

    Publication date: 2006-10-05

    Application No.: US11093130

    Filing date: 2005-03-29

    IPC classification: G06F13/28

    Abstract: A method and apparatus for supporting cache coherency in a multiprocessor computing environment having multiple processing units, each processing unit having a local cache memory associated therewith. A snoop filter device is associated with each processing unit and includes at least one snoop filter primitive implementing a filtering method based on stream register sets and associated stream register comparison logic. Of the plurality of stream register sets, at least one is active and at least one is labeled historic at any point in time. In addition, the snoop filter block is operatively coupled with cache wrap detection logic whereby, upon detection of a cache wrap condition, the content of the active stream register set is switched into a historic stream register set and the content of at least one active stream register set is reset. Each filter primitive implements stream register comparison logic that determines whether a received snoop request is to be forwarded to the processor or discarded.
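
The active/historic switch described in this abstract can be sketched as below. The abstract does not specify the register format, so the base/mask encoding (the mask keeps only the address bits shared by every address recorded in a register), the register-selection policy, and all names are assumptions made for illustration.

```python
ADDR_BITS = 32                      # assumption: 32-bit line addresses
FULL_MASK = (1 << ADDR_BITS) - 1


class StreamRegister:
    """Base/mask pair summarising a group of loaded addresses (assumed format)."""

    def __init__(self):
        self.valid = False
        self.base = 0
        self.mask = FULL_MASK

    def cost(self, addr):
        """Number of mask bits that recording addr would clear."""
        if not self.valid:
            return 0
        return bin(self.mask & (self.base ^ addr)).count("1")

    def record(self, addr):
        if not self.valid:
            self.valid, self.base, self.mask = True, addr, FULL_MASK
        else:
            # Keep only the bits on which all recorded addresses agree.
            self.mask &= ~(self.base ^ addr) & FULL_MASK

    def may_contain(self, addr):
        return self.valid and (addr & self.mask) == (self.base & self.mask)


class StreamRegisterFilter:
    def __init__(self, num_registers=4):
        self.num_registers = num_registers
        self.active = self._fresh_set()
        self.historic = self._fresh_set()

    def _fresh_set(self):
        return [StreamRegister() for _ in range(self.num_registers)]

    def on_cache_load(self, addr):
        # Assumed policy: update the register that loses the least precision.
        min(self.active, key=lambda r: r.cost(addr)).record(addr)

    def on_cache_wrap(self):
        # Every line now cached was loaded since the previous wrap, so the
        # active set covers the cache; it becomes historic and active resets.
        self.historic = self.active
        self.active = self._fresh_set()

    def snoop(self, addr):
        """Return True to forward the snoop, False to discard it."""
        return any(r.may_contain(addr) for r in self.active + self.historic)
```

The filter may forward a snoop for a line that is not cached (a false positive), but in this sketch it never discards a snoop for a line recorded since the last two wraps.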

    Method and apparatus for detecting a cache wrap condition
    4.
    Patent application
    Status: In force

    Publication No.: US20060230239A1

    Publication date: 2006-10-12

    Application No.: US11093132

    Filing date: 2005-03-29

    IPC classification: G06F13/28

    CPC classification: G06F12/0822 G06F12/0831

    Abstract: A method and apparatus for detecting a cache wrap condition in a computing environment having a processor and a cache. A cache wrap condition is detected when the entire contents of a cache have been replaced, relative to a particular starting state. A set-associative cache is considered to have wrapped when all of the sets within the cache have been replaced. The starting point for cache wrap detection is the state of the cache sets at the time of the previous cache wrap. The method and apparatus are preferably implemented in a snoop filter having filter mechanisms that rely upon detecting the cache wrap condition. The snoop filter mechanisms requiring this information are operatively coupled with cache wrap detection logic adapted to detect the cache wrap event and to indicate it to the snoop filter mechanisms. In the various embodiments, cache wrap detection logic is implemented using registers and comparators, loadable counters, or a scoreboard data structure.
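
Of the embodiments listed, the counter-based one is the easiest to sketch in software. The version below is a hypothetical model, assuming a round-robin (or similar) replacement policy so that `ways` fills into a set are enough to replace every line in it; the class and method names are illustrative only.

```python
class CacheWrapDetector:
    """Counter-per-set wrap detector (illustrative, not the patented logic)."""

    def __init__(self, num_sets, ways):
        self.num_sets = num_sets
        self.ways = ways
        self._reload()

    def _reload(self):
        # The new starting point is the cache state at the previous wrap.
        self.remaining = [self.ways] * self.num_sets
        self.sets_pending = self.num_sets

    def on_line_replaced(self, set_index):
        """Record one fill; return True when this fill completes a wrap."""
        if self.remaining[set_index] > 0:
            self.remaining[set_index] -= 1
            if self.remaining[set_index] == 0:
                self.sets_pending -= 1   # this set is now fully replaced
        if self.sets_pending == 0:
            self._reload()
            return True
        return False
```

The return value plays the role of the indication sent to the snoop filter mechanisms: a `True` result is the moment at which a filter would switch its active state into its historic state.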

    Method and apparatus for filtering snoop requests using a scoreboard
    5.
    Patent application
    Status: Lapsed

    Publication No.: US20060224840A1

    Publication date: 2006-10-05

    Application No.: US11093160

    Filing date: 2005-03-29

    IPC classification: G06F13/28

    Abstract: An apparatus for implementing snooping cache coherence that locally reduces the number of snoop requests presented to each cache in a multiprocessor system. A snoop filter device associated with a single processor includes one or more “scoreboard” data structures that make snoop determinations, i.e., determine for each snoop request from another processor whether the request is to be forwarded to the processor or discarded. At least one scoreboard is active, and at least one scoreboard is determined to be historic at any point in time. A snoop determination of the queue indicates that an entry may be in the cache, but does not indicate its actual residence status. In addition, the snoop filter block implementing scoreboard data structures is operatively coupled with cache wrap detection logic whereby, upon detection of a cache wrap condition, the content of the active scoreboard is copied into a historic scoreboard and the content of at least one active scoreboard is reset.
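
A scoreboard that says only "may be cached" behaves much like a one-hash presence bitmap. The sketch below assumes exactly that: the abstract does not give the scoreboard layout, so the modulo indexing, the bitmap size, and the names are hypothetical.

```python
class ScoreboardFilter:
    """Active/historic presence bitmaps (assumed scoreboard layout)."""

    def __init__(self, num_bits=64):
        self.num_bits = num_bits
        self.active = 0      # lines loaded since the last cache wrap
        self.historic = 0    # lines loaded during the previous wrap period

    def _bit(self, line_addr):
        # Assumption: index by the low-order bits of the line address.
        return 1 << (line_addr % self.num_bits)

    def on_cache_load(self, line_addr):
        self.active |= self._bit(line_addr)

    def on_cache_wrap(self):
        # Copy active into historic, then reset active.
        self.historic = self.active
        self.active = 0

    def snoop(self, line_addr):
        """True: the line may be cached, forward. False: discard."""
        return bool((self.active | self.historic) & self._bit(line_addr))
```

Two addresses that share an index alias to the same bit, which is why a positive determination means "may be in the cache" rather than "is in the cache".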

    Novel snoop filter for filtering snoop requests
    6.
    Patent application
    Status: In force

    Publication No.: US20060224838A1

    Publication date: 2006-10-05

    Application No.: US11093152

    Filing date: 2005-03-29

    IPC classification: G06F13/28

    Abstract: A method and apparatus for supporting cache coherency in a multiprocessor computing environment having multiple processing units, each processing unit having one or more local cache memories associated and operatively connected therewith. The method comprises providing a snoop filter device associated with each processing unit, each snoop filter device having a plurality of dedicated input ports for receiving snoop requests from dedicated memory writing sources in the multiprocessor computing environment. Each snoop filter device includes a plurality of parallel operating port snoop filters in correspondence with the plurality of dedicated input ports, each port snoop filter implementing one or more parallel operating sub-filter elements that are adapted to concurrently filter snoop requests received from respective dedicated memory writing sources and forward a subset of those requests to its associated processing unit.

    Snoop filtering system in a multiprocessor system
    7.
    Patent application
    Status: In force

    Publication No.: US20060224835A1

    Publication date: 2006-10-05

    Application No.: US11093127

    Filing date: 2005-03-29

    IPC classification: G06F13/28

    Abstract: A system and method for supporting cache coherency in a computing environment having multiple processing units, each unit having an associated cache memory system operatively coupled therewith. The system includes a plurality of interconnected snoop filter units, each snoop filter unit corresponding to and in communication with a respective processing unit, with each snoop filter unit comprising a plurality of devices for receiving asynchronous snoop requests from respective memory writing sources in the computing environment; a point-to-point interconnect comprising communication links for directly connecting memory writing sources to corresponding receiving devices; and a plurality of parallel operating filter devices coupled in one-to-one correspondence with each receiving device for processing snoop requests received thereat and either forwarding requests or preventing forwarding of requests to its associated processing unit. Each of the plurality of parallel operating filter devices comprises parallel operating sub-filter elements, each simultaneously receiving an identical snoop request and implementing one or more different snoop filter algorithms for identifying those snoop requests for data that is not cached locally at the associated processing unit and preventing forwarding of those requests to the processing unit. In this manner, the number of snoop requests forwarded to a processing unit is reduced, thereby increasing performance of the computing environment.
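
The way several sub-filters combine into one forwarding decision can be sketched as follows. The combining rule used here (discard as soon as any one sub-filter determines that the data is not cached locally) is an assumption inferred from the abstract, and `PortSnoopFilter`, `SnoopFilterUnit`, and the sub-filter signature are hypothetical names.

```python
from typing import Callable, List, Sequence, Tuple

# A sub-filter returns True if the line MAY be cached locally,
# and False only if it can rule that out.
SubFilter = Callable[[int], bool]


class PortSnoopFilter:
    """Filter for one dedicated input port, built from several sub-filters."""

    def __init__(self, sub_filters: Sequence[SubFilter]):
        self.sub_filters = list(sub_filters)

    def should_forward(self, line_addr: int) -> bool:
        # Hardware would evaluate all sub-filters at once on the same
        # request; all() evaluates them in turn but gives the same result.
        return all(f(line_addr) for f in self.sub_filters)


class SnoopFilterUnit:
    """One port filter per memory-writing source, one shared forward queue."""

    def __init__(self, port_filters: Sequence[PortSnoopFilter]):
        self.ports = list(port_filters)
        self.forward_queue: List[Tuple[int, int]] = []

    def receive(self, port: int, line_addr: int) -> bool:
        if self.ports[port].should_forward(line_addr):
            self.forward_queue.append((port, line_addr))
            return True
        return False  # filtered out: never reaches the processor
```

Because each sub-filter may only err toward "may be cached", adding a sub-filter can reduce the number of forwarded requests but cannot cause a needed snoop to be dropped.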

    Hardware support for collecting performance counters directly to memory
    8.
    Granted patent
    Status: Lapsed

    Publication No.: US08275964B2

    Publication date: 2012-09-25

    Application No.: US12684172

    Filing date: 2010-01-08

    IPC classification: G06F12/00

    CPC classification: G06F11/348 G06F2201/88

    Abstract: Hardware support for collecting performance counters directly to memory, in one aspect, may include a plurality of performance counters operable to collect one or more counts of one or more selected activities. A first storage element may be operable to store an address of a memory location. A second storage element may be operable to store a value indicating whether the hardware should begin copying. A state machine may be operable to detect the value in the second storage element and trigger hardware copying of data in selected one or more of the plurality of performance counters to the memory location whose address is stored in the first storage element.
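
The interaction between the two storage elements and the state machine can be modelled in a few lines. This is a behavioural sketch only: memory is a plain list of words, and the `select` list and the clearing of the start flag after the copy are assumptions not stated in the abstract.

```python
class CounterCopyUnit:
    """Behavioural model of counter-to-memory copying (illustrative)."""

    def __init__(self, num_counters, memory):
        self.counters = [0] * num_counters
        self.memory = memory                     # list standing in for memory
        self.dest_addr = 0                       # first storage element
        self.start = False                       # second storage element
        self.select = list(range(num_counters))  # assumption: counters to copy

    def count(self, index, amount=1):
        self.counters[index] += amount

    def step(self):
        """One state-machine evaluation; returns the number of words copied."""
        if not self.start:
            return 0                             # idle: nothing requested
        for offset, index in enumerate(self.select):
            self.memory[self.dest_addr + offset] = self.counters[index]
        self.start = False                       # assumption: cleared when done
        return len(self.select)
```

Software would write `dest_addr`, set `start`, and continue with other work; the copy happens the next time the state machine observes the flag.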

    Using DMA for copying performance counter data to memory
    9.
    Granted patent
    Status: Lapsed

    Publication No.: US08275954B2

    Publication date: 2012-09-25

    Application No.: US12684367

    Filing date: 2010-01-08

    IPC classification: G06F12/00

    Abstract: A device for copying performance counter data includes a hardware path that connects a direct memory access (DMA) unit to a plurality of hardware performance counters and a memory device. Software prepares an injection packet for the DMA unit to perform copying, while the software can perform other tasks. In one aspect, the software that prepares the injection packet runs on a processing core other than the core that gathers the hardware performance counter data.

    MULTIPROCESSOR SWITCH WITH SELECTIVE PAIRING
    10.
    Patent application
    Status: Lapsed

    Publication No.: US20120210172A1

    Publication date: 2012-08-16

    Application No.: US13027882

    Filing date: 2011-02-15

    IPC classification: G06F11/07

    Abstract: System, method and computer program product for a multiprocessing system to offer selective pairing of processor cores for increased processing reliability. A selective pairing facility is provided that selectively connects, i.e., pairs, multiple microprocessors or processor cores to provide one highly reliable thread (or thread group). The paired microprocessors or processor cores that provide one highly reliable thread connect with system components such as a memory “nest” (or memory hierarchy), an optional system controller, an optional interrupt controller, optional I/O or peripheral devices, etc. The memory nest is attached to the selective pairing facility via a switch or a bus.
