Data processing system with fully interconnected system architecture (FISA)
    31.
    Granted invention patent
    Status: Expired

    Publication number: US06553447B1

    Publication date: 2003-04-22

    Application number: US09437194

    Filing date: 1999-11-09

    IPC classification: G06F13/00

    CPC classification: G06F13/4273

    Abstract: A Fully Interconnected System Architecture (FISA) for an improved data processing system. The data processing system topology has a processor chip and components external to the processor chip, such as memory, input/output (I/O), and other processor chips. The processor chip is interconnected to the external components via a point-to-point bus topology controlled by an intra-chip integrated, distributed switch (IDS) controller. The IDS controller lets the chip provide a single bus to each external component and provides a total bandwidth greater than that of traditional topologies while reducing latencies between the processor and the external components. The design of the processor chip with the intra-chip IDS controller provides a pseudo “distributed switch” which may separately access distributed external components, such as memory and I/O.

    Multiprocessor system with a high performance integrated distributed switch (IDS) controller
    32.
    Granted invention patent
    Status: Expired

    Publication number: US06415424B1

    Publication date: 2002-07-02

    Application number: US09437195

    Filing date: 1999-11-09

    IPC classification: G06F17/50

    CPC classification: G06F15/7832

    Abstract: A data processing system having a modified processor chip and components external to the processor chip. The processor chip is interconnected to the external components via point-to-point bus connections controlled by an integrated distributed switch (IDS) controller. The IDS controller is placed, during chip design, in the upper layer metals of the processor chip. When the data processing system is a multi-chip multiprocessor data processing system, the IDS controller operates to provide a pseudo switching effect whereby the processor is directly connected to each external component. The IDS controller permits the processor to have greater communication bandwidth and reduced latencies with the external components. It also allows for a connection to distributed external components, such as memory and I/O, with fewer system components overall.

    Software-managed programmable associativity caching mechanism monitoring cache misses to selectively implement multiple associativity levels
    33.
    Granted invention patent
    Status: Expired

    Publication number: US6026470A

    Publication date: 2000-02-15

    Application number: US839546

    Filing date: 1997-04-14

    IPC classification: G06F12/08 G06F13/00

    CPC classification: G06F12/0864

    Abstract: A method of providing programmable associativity in a cache used by a processor of a computer system is disclosed. A congruence class of a memory block is defined using a first mapping function, providing a first associativity level of the cache. Program instructions in the processor select a second associativity level of a known appropriate level, and implement the second associativity level in the cache using a second mapping function. Application software may provide the program instructions, wherein the application software has procedures that may result in cache "strides" at particular associativity levels, and the known appropriate level is chosen to lessen memory latencies due to strides. Alternatively, the program instructions may be part of an operating system which monitors memory address requests, determines how efficiently a procedure will operate at different associativity levels, and selects the most efficient level as the known appropriate level. The program instructions may select the associativity level by setting a value in a bit facility corresponding to the desired mapping function.
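
    The relationship between a mapping function and an associativity level can be sketched as follows. This is only an illustration under stated assumptions, not the patented mechanism: the block size, the cache size, the modulo mapping, and the two-bit `ASSOC_BY_MODE` encoding of the bit facility are all hypothetical.

```python
from collections import Counter

BLOCK_SIZE = 64     # bytes per cache block (assumed)
NUM_BLOCKS = 1024   # total blocks in the cache (assumed)

# Hypothetical bit-facility encoding: register value -> associativity (ways).
ASSOC_BY_MODE = {0b00: 1, 0b01: 2, 0b10: 4, 0b11: 8}


def congruence_class(address, mode):
    """Map a byte address to a congruence class under the selected mode."""
    ways = ASSOC_BY_MODE[mode]
    num_classes = NUM_BLOCKS // ways   # fewer, larger classes as ways grow
    return (address // BLOCK_SIZE) % num_classes


def fits(addresses, mode):
    """True if no congruence class receives more distinct blocks than it has ways."""
    ways = ASSOC_BY_MODE[mode]
    blocks = {a // BLOCK_SIZE for a in addresses}
    per_class = Counter(congruence_class(b * BLOCK_SIZE, mode) for b in blocks)
    return all(count <= ways for count in per_class.values())


# Four addresses one full cache size apart: a classic conflicting stride.
stride = [i * BLOCK_SIZE * NUM_BLOCKS for i in range(4)]
print(fits(stride, 0b00))  # direct-mapped: all four collide in one block
print(fits(stride, 0b10))  # 4-way: the same four blocks coexist
```

    In this toy model, software that knows its access stride would pick the smallest mode for which `fits` holds.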

    Hardware-managed programmable associativity caching mechanism monitoring cache misses to selectively implement multiple associativity levels
    34.
    Granted invention patent
    Status: Expired

    Publication number: US5978888A

    Publication date: 1999-11-02

    Application number: US839550

    Filing date: 1997-04-14

    IPC classification: G06F12/08

    CPC classification: G06F12/0864

    Abstract: A method of providing programmable associativity in a cache used by a processor of a computer system is disclosed. A congruence class of a memory block is defined using a first mapping function, providing a first associativity level of the cache. A logic unit connected to the cache monitors cache misses as the cache uses the first associativity level, and selects other associativity levels based on the cache misses, using other mapping functions. The logic unit has incorporated therein means for selecting the other associativity levels based on a rate of the cache misses in a particular congruence class. The congruence class may be defined by associating the memory block with a particular set of cache blocks in the cache, based on a first portion of an address of the memory block, and the other mapping functions may be implemented by dividing the particular set into subsets and selecting a subset for the memory block based on a second portion of the address.
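
    The last sentence of the abstract (a set chosen by a first address portion, then divided into subsets chosen by a second portion) can be sketched as below. The field widths, the choice of the bits just above the index as the "second portion", and the function name `placement` are assumptions made for illustration; the miss-rate policy of the logic unit is not modeled.

```python
BLOCK_BITS = 6   # 64-byte blocks (assumed)
INDEX_BITS = 7   # 128 congruence classes (assumed)
WAYS = 8         # blocks per class at the first associativity level (assumed)


def placement(address, subset_bits):
    """Return (class_index, allowed_ways) for the block holding `address`.

    subset_bits = 0 keeps the full 8-way set; each extra bit halves the
    subset, so subset_bits = 3 leaves a single allowed way.
    """
    if not 0 <= subset_bits <= WAYS.bit_length() - 1:
        raise ValueError("subset_bits out of range")
    block = address >> BLOCK_BITS
    # First address portion: selects the set (congruence class).
    class_index = block & ((1 << INDEX_BITS) - 1)
    # Second address portion: selects one subset inside that set.
    second = (block >> INDEX_BITS) & ((1 << subset_bits) - 1)
    subset_size = WAYS >> subset_bits
    start = second * subset_size
    return class_index, list(range(start, start + subset_size))


print(placement(0, 0))                        # full set
print(placement(1 << 13, 1))                  # upper half of set 0
print(placement((3 << 13) | (5 << 6), 2))     # one quarter of set 5
```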

    Optimizing a cache eviction mechanism by selectively introducing different levels of randomness into a replacement algorithm
    35.
    Granted invention patent
    Status: Expired

    Publication number: US5974507A

    Publication date: 1999-10-26

    Application number: US837512

    Filing date: 1997-04-14

    IPC classification: G06F12/08 G06F12/12 G06F12/02

    Abstract: A method of improving operation of a cache used by a processor of a computer system by introducing a level of randomness into a replacement algorithm used by the cache in order to lessen "strides" within the cache is disclosed. Different levels of randomness may be introduced into the replacement algorithm at different times to optimize the cache for different procedures running on the processor. The level of randomness can be selectively introduced by using a basic replacement algorithm to select a subset of a congruence class, and one or more random bits are then used to select a specific cache block within the subset for eviction. The basic replacement algorithm can be a least recently used algorithm. There may be three levels of randomness for a 4-way set associative cache, and there may be four levels of randomness for an 8-way set associative cache.
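
    One possible reading of the subset-plus-random-bits scheme is sketched here. It assumes that the basic algorithm is LRU and that the subset is the 2^k least recently used blocks for k random bits; the abstract does not fix how the subset is formed, so `choose_victim` and `randomness_levels` are hypothetical helpers. Under this reading the level counts in the abstract fall out directly: k ranges from 0 to log2(ways).

```python
import random


def randomness_levels(ways):
    """Number of distinct levels: 0..log2(ways) random bits."""
    return ways.bit_length()   # 4 -> 3 levels, 8 -> 4 levels (ways is a power of two)


def choose_victim(lru_order, random_bits, rng=random):
    """Pick a way to evict.

    lru_order   -- way numbers ordered from least to most recently used
    random_bits -- level of randomness; 0 means plain LRU
    rng         -- object providing getrandbits (module `random` by default)
    """
    subset_size = 1 << random_bits
    if subset_size > len(lru_order):
        raise ValueError("too many random bits for this associativity")
    candidates = lru_order[:subset_size]      # subset chosen by LRU
    if random_bits == 0:
        return candidates[0]                  # no randomness: pure LRU victim
    return candidates[rng.getrandbits(random_bits)]  # random pick inside subset


order = [2, 0, 3, 1]                 # way 2 is least recently used
print(choose_victim(order, 0))       # always way 2
print(choose_victim(order, 1))       # way 2 or way 0
```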

    Communication method for integrated circuit chips on a multi-chip module
    36.
    Granted invention patent
    Status: Expired

    Publication number: US06463497B1

    Publication date: 2002-10-08

    Application number: US09364697

    Filing date: 1999-07-30

    IPC classification: G06F13/14

    CPC classification: G06F15/16

    Abstract: A signal is transmitted from a sending chip to a first receiving chip in a communications ring via a first I/O set of the sending chip. A signal from the sending chip to a second receiving chip in the communications ring is transmitted via a second I/O set of the sending chip. The first I/O set corresponds to a first direction for the sending chip transmitting around the ring, and the second I/O set corresponds to a second direction for the sending chip transmitting around the ring. The first I/O set is used when the number of chips interposed in the ring between the sending and receiving chips in the first direction is not greater than the number of chips interposed in the second direction. The second I/O set is used when that number is greater. For a chip interposed between the sending and receiving chips, the transmitting includes traversing from one of the first and second I/O sets of the at least one interposed chip to the other one of the first and second I/O sets of the at least one interposed chip. The signal traversing the at least one interposed chip is regenerated by the interposed chip.
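
    The direction rule in the abstract amounts to counting interposed chips each way around the ring. A minimal sketch follows, assuming chips are numbered 0..N-1 and that the "first direction" is increasing chip number; the numbering and the function name `route` are illustrative choices, not taken from the patent.

```python
def route(sender, receiver, num_chips):
    """Return (direction, interposed_chips) for a transfer around the ring.

    The first direction is used when it has no more interposed chips than
    the second direction; otherwise the second direction is used.
    """
    if sender == receiver:
        raise ValueError("sender and receiver must differ")
    first = (receiver - sender) % num_chips - 1    # chips in between, first direction
    second = (sender - receiver) % num_chips - 1   # chips in between, second direction
    if first <= second:
        direction, step, hops = "first", 1, first
    else:
        direction, step, hops = "second", -1, second
    # Each interposed chip receives on one I/O set and regenerates on the other.
    interposed = [(sender + step * k) % num_chips for k in range(1, hops + 1)]
    return direction, interposed


print(route(0, 1, 4))   # neighbor in the first direction
print(route(0, 3, 4))   # neighbor in the second direction
print(route(0, 5, 8))   # two interposed chips going the second way
```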

    Integrated cache and directory structure for multi-level caches
    37.
    Granted invention patent
    Status: Expired

    Publication number: US06473833B1

    Publication date: 2002-10-29

    Application number: US09364570

    Filing date: 1999-07-30

    IPC classification: G06F12/00

    CPC classification: G06F12/0897

    Abstract: A method of operating a multi-level memory hierarchy of a computer system and an apparatus embodying the method, wherein multiple levels of storage subsystems are used to improve the performance of the computer system, each next higher level generally having a faster access time but a smaller amount of storage. Values within a level are indexed by a directory that provides an indexing of information relating the values in that level to the next lower level. In a preferred embodiment of the invention, the directories for the various levels of storage are contained within the next higher level, providing faster access to the directory information. Cache memories are used as the highest levels of storage, and one or more sets are allocated out of that cache memory for containing a directory of the next lower level of storage. An address comparator which is used to compare entries in a directory to address values is directly coupled to the set or sets used for the directory, reducing the time needed to compare addresses in determining whether an address is present in the cache.

    Dynamically configurable memory bus and scalability ports via hardware monitored bus utilizations
    38.
    Granted invention patent
    Status: Expired

    Publication number: US06535939B1

    Publication date: 2003-03-18

    Application number: US09436418

    Filing date: 1999-11-09

    IPC classification: G06F13/14

    CPC classification: G06F13/36

    Abstract: A data processing system with configurable processor chip buses. The processor chip is designed with a bus allocation unit and has a plurality of extended buses, a number of which are configurable buses (i.e., buses that may be dynamically allocated to any one of several external components, particularly memory and other SMPs). A priority determination of the bandwidth requirements of the external components is made during system processing. The configurable buses are then dynamically allocated to the external components based on their bandwidth requirements and/or the configuration which provides the best overall system efficiency.
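
    A bandwidth-driven allocation of configurable buses could look like the sketch below. The abstract does not state the allocation policy, so the greedy rule used here (give each configurable bus to the component with the highest demand per bus already attached) is an assumption, as are the function name `allocate_buses` and the demand units.

```python
def allocate_buses(demand, fixed, configurable):
    """Distribute configurable buses among external components.

    demand       -- component name -> measured bandwidth need (arbitrary units)
    fixed        -- component name -> buses permanently attached (at least 1 each)
    configurable -- number of buses that may be assigned to any component
    """
    buses = dict(fixed)
    for _ in range(configurable):
        # Assumed policy: serve the component with the most demand per bus.
        neediest = max(buses, key=lambda name: demand[name] / buses[name])
        buses[neediest] += 1
    return buses


fixed = {"memory": 1, "smp": 1}
print(allocate_buses({"memory": 90, "smp": 30}, fixed, 2))  # memory-heavy phase
print(allocate_buses({"memory": 40, "smp": 60}, fixed, 2))  # balanced phase
```

    Re-running the allocation whenever the monitored demand changes mimics the dynamic reallocation the abstract describes.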

    L2 cache array topology for large cache with different latency domains
    40.
    Granted invention patent
    Status: In force

    Publication number: US07783834B2

    Publication date: 2010-08-24

    Application number: US11947742

    Filing date: 2007-11-29

    IPC classification: G06F12/00

    Abstract: A cache memory logically associates a cache line with at least two cache sectors of a cache array wherein different sectors have different output latencies and, for a load hit, selectively enables the cache sectors based on their latency to output the cache line over successive clock cycles. Larger wires having a higher transmission speed are preferably used to output the cache line corresponding to the requested memory block. In the illustrative embodiment, the cache is arranged with rows and columns of the cache sectors, and a given cache line is spread across sectors in different columns, with at least one portion of the given cache line being located in a first column having a first latency, and another portion of the given cache line being located in a second column having a second latency greater than the first latency. One set of wires oriented along a horizontal direction may be used to output the cache line, while another set of wires oriented along a vertical direction may be used for maintenance of the cache sectors. A given cache line is further preferably spread across sectors in different rows or cache ways. For example, a cache line can be 128 bytes and spread across four sectors in four different columns, each sector containing 32 bytes of the cache line, and the cache line is output over four successive clock cycles with one sector being transmitted during each of the four cycles.
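
    The 128-byte example at the end of the abstract can be modeled as a four-entry output schedule. The sketch assumes the sector in the lowest-latency column is transmitted first; the abstract says only that sectors are enabled based on latency, so this ordering, the latency values, and the names `output_schedule` and `reassemble` are illustrative.

```python
LINE_BYTES = 128
SECTOR_BYTES = 32


def output_schedule(line, column_latency):
    """Return [(cycle, sector_index, data)] with one sector per clock cycle.

    column_latency[i] is the latency of the column holding sector i.
    Assumed ordering: lower-latency sectors are output first.
    """
    if len(line) != LINE_BYTES:
        raise ValueError("expected a 128-byte cache line")
    sectors = [line[i:i + SECTOR_BYTES] for i in range(0, LINE_BYTES, SECTOR_BYTES)]
    if len(column_latency) != len(sectors):
        raise ValueError("one latency value per sector is required")
    order = sorted(range(len(sectors)), key=lambda i: column_latency[i])
    return [(cycle, idx, sectors[idx]) for cycle, idx in enumerate(order)]


def reassemble(schedule):
    """Put the received sectors back in address order."""
    return b"".join(data for _, _, data in sorted(schedule, key=lambda e: e[1]))


line = bytes(range(128))
schedule = output_schedule(line, [3, 1, 4, 2])
print([(cycle, idx) for cycle, idx, _ in schedule])
print(reassemble(schedule) == line)
```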
