Programmable architecture fast packet switch
    1.
    Granted invention patent
    Programmable architecture fast packet switch (status: lapsed)

    Publication No.: US06275491B1

    Publication date: 2001-08-14

    Application No.: US09086779

    Filing date: 1998-05-28

    IPC classification: H04J 3/24

    Abstract: A programmable fast packet switch testbed (10) for use in the evaluation of prototype architectures and traffic management algorithms is disclosed. The programmable switch (10) is arranged as an add-on peripheral to a conventional computer system including a host central processing unit (CPU) (2). The switch (10) includes a plurality of port processors (14) in communication with port interfaces (12); each of the port interfaces (12) is a conventional interface for high data rate communication, while the port processors (14) are programmable logic devices. The switch fabric is realized in a multiple slice fashion, by multiple programmable logic devices (18). A central arbiter (30), also realized in programmable logic, controls routing of cells within the switch (10). Programming of the port processors (14), fabric slices (18), and arbiter (30) is effected by downloading, into these devices, bit-streams supplied by the host CPU (2) that define the switch architecture, including selection of input or output queuing and the fabric type, along with the implementation of traffic management algorithms in the port processors (14), fabric slices (18), and arbiter (30). Each of the port processors (14), fabric slices (18), and arbiter (30) also contains memory locations for storing results of operation, which are read by the management port (24) over a management bus (COMET), and may then be forwarded to the host CPU (2), without interfering with switch traffic. The programmable switch (10) is therefore capable of full speed operation as a fast packet switch, thus providing accurate evaluation results.
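
    As a rough illustration of the arbitration described in the abstract, the Python sketch below models cell slots of an input-queued switch with a round-robin central arbiter. The round-robin policy, the names `PortProcessor`, `arbitrate`, `run_slot`, and `read_results`, and the per-port counter are assumptions made for this example. The patent leaves the queuing discipline, fabric type, and arbitration policy to the downloaded bit-streams, and the real devices are programmable logic rather than software.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class PortProcessor:
    """Hypothetical software stand-in for one port processor.

    `queue` holds the destination port of each waiting cell (input queuing).
    `cells_sent` plays the role of an on-device result memory location.
    """
    queue: deque = field(default_factory=deque)
    cells_sent: int = 0


def arbitrate(ports, start):
    """Assumed round-robin arbiter: grant at most one input per output.

    Returns a dict mapping output port index -> granted input port index.
    """
    grants = {}
    n = len(ports)
    for k in range(n):
        i = (start + k) % n
        if ports[i].queue:
            dest = ports[i].queue[0]  # destination of the head-of-line cell
            if dest not in grants:
                grants[dest] = i
    return grants


def run_slot(ports, start):
    """Advance the switch by one cell slot and update the result counters."""
    grants = arbitrate(ports, start)
    for i in grants.values():
        ports[i].queue.popleft()
        ports[i].cells_sent += 1
    return grants


def read_results(ports):
    """Read the counters separately from the cell path (management-style read)."""
    return [p.cells_sent for p in ports]


ports = [PortProcessor(deque([2, 1])), PortProcessor(deque([2])), PortProcessor()]
run_slot(ports, start=0)  # ports 0 and 1 both want output 2; only port 0 is granted
run_slot(ports, start=1)  # port 1 now sends to output 2, port 0 sends to output 1
print(read_results(ports))
```

    In the first slot, port 1 is blocked behind port 0 because both head-of-line cells target the same output. Contention of this kind is one of the effects a testbed like this could be used to measure under different queuing choices.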


    Scalable multistage interconnection network architecture and method for performing in-service upgrade thereof
    3.
    Granted invention patent
    Scalable multistage interconnection network architecture and method for performing in-service upgrade thereof (status: lapsed)

    Publication No.: US6049542A

    Publication date: 2000-04-11

    Application No.: US2042

    Filing date: 1997-12-31

    Applicant: Sharat C. Prasad

    Inventor: Sharat C. Prasad

    Abstract: There is disclosed a scalable switch fabric architecture comprising: 1) an input switching stage having N inputs and N outputs operable to connect selected ones of the N inputs to selected ones of the N outputs; 2) an output switching stage having M inputs and M outputs operable to connect selected ones of the M inputs to selected ones of the M outputs; 3) a multiplexer stage having a plurality of W-bit input channels and a W-bit output channel, wherein the output channel is coupled to the M inputs of the output switching stage; and 4) a removable core switching stage having N inputs adapted for coupling to the N outputs of the input switching stage and having M outputs adapted for coupling to a first input channel of the multiplexer stage.
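
    The four stages listed in the abstract can be sketched as a chain of permutations with a channel selector in the middle. The Python below is a minimal model under stated assumptions: N equals M, each stage is a plain permutation, and the multiplexer's second input channel is a direct bypass from the input stage. The abstract does not say what the other multiplexer channels carry, so the bypass and the names `apply_stage` and `route` are hypothetical.

```python
def apply_stage(mapping, signals):
    """Route signals through one switching stage.

    mapping[i] is the output index that input i is connected to.
    """
    out = [None] * len(mapping)
    for i, o in enumerate(mapping):
        out[o] = signals[i]
    return out


def route(cells, input_map, core_map, output_map, select):
    """Pass cells through input stage -> (core | bypass) -> mux -> output stage.

    core_map is None when the removable core stage is not installed.
    select picks the multiplexer input channel: 0 = core, 1 = assumed bypass.
    """
    stage1 = apply_stage(input_map, cells)
    core_out = apply_stage(core_map, stage1) if core_map is not None else None
    channels = [core_out, stage1]  # channel 1 as a bypass is an assumption
    chosen = channels[select]
    if chosen is None:
        raise RuntimeError("channel 0 selected but no core stage is installed")
    return apply_stage(output_map, chosen)


cells = ["a", "b", "c"]
input_map = [1, 2, 0]
output_map = [0, 1, 2]

# Core installed and selected on the first multiplexer channel.
print(route(cells, input_map, [2, 0, 1], output_map, select=0))

# Core removed: traffic continues on the assumed bypass channel.
print(route(cells, input_map, None, output_map, select=1))
```

    Under these assumptions, an in-service upgrade would amount to moving the selector to the bypass channel, exchanging the core stage, and moving the selector back.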


    On-chip and system-area multi-processor interconnection networks in advanced processes for maximizing performance minimizing cost and energy

    Publication No.: US10733350B1

    Publication date: 2020-08-04

    Application No.: US16375684

    Filing date: 2019-04-04

    Abstract: A chip design environment is disclosed which accepts application specific processing, memory and IO elements and declarative specification of function, cost and performance of peripheral, low-level and infrastructural elements and of overall design and generates synthesizable module RTLs and relevant place-and-route constraints. The generated elements include the network interconnecting all the elements, a programming memory consistency model and its coherence protocol, allocation and scheduling processes realizing run-time inference of optimal parallel execution and processes for control of coherence action and prefetch intensity, task-data migration, voltage-frequency scaling and power-clock gating. The environment employs knowledge bases and models to predict performance and to assign confidence scores to predictions and, in turn, uses the predictions to explore the space of topology, architecture, composition, and other options. The environment generates synthesizable module RTLs to complete the design and relevant place-and-route constraints. A user may simulate the synthesized design. If a user shares simulation results, the environment may evaluate the predicted performance against performance determined by simulation and use the results to update its knowledge and models.
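
    The predict, explore, and update loop in the last sentences of the abstract can be illustrated with a toy model. The Python sketch below is not the patented method. The linear latency model, the confidence formula, the 0.5 learning rate, and the names `LatencyModel` and `explore` are all assumptions chosen to keep the example small.

```python
from dataclasses import dataclass


@dataclass
class LatencyModel:
    """Toy predictor: latency = scale * hop_count (an assumed model form)."""
    scale: float = 1.0
    mean_rel_error: float = 0.0  # running mean of relative prediction error
    samples: int = 0

    def predict(self, hops):
        """Return (predicted latency, confidence score in (0, 1])."""
        confidence = 1.0 / (1.0 + self.mean_rel_error)
        return self.scale * hops, confidence

    def update(self, hops, simulated):
        """Compare a prediction with a shared simulation result and adapt."""
        predicted, _ = self.predict(hops)
        rel_error = abs(predicted - simulated) / simulated
        self.samples += 1
        self.mean_rel_error += (rel_error - self.mean_rel_error) / self.samples
        # Move the scale halfway toward the ratio observed in simulation.
        self.scale += 0.5 * (simulated / hops - self.scale)


def explore(model, candidates):
    """Pick the candidate topology with the lowest predicted latency.

    candidates maps a topology name to its (hypothetical) average hop count.
    """
    return min(candidates, key=lambda name: model.predict(candidates[name])[0])


model = LatencyModel()
best = explore(model, {"mesh": 6, "torus": 4, "ring": 8})
print(best, model.predict(4))

# A user shares a simulated latency for the chosen design; the model adapts.
model.update(hops=4, simulated=8.0)
print(model.predict(4))
```

    After the update, the prediction for the same design moves toward the simulated value and its confidence score drops, reflecting the error just observed.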