System and method for initiating a serial data transfer between two clock domains
    31.
    Granted Patent
    System and method for initiating a serial data transfer between two clock domains (In force)

    Publication No.: US06393502B1

    Publication Date: 2002-05-21

    Application No.: US09386650

    Filing Date: 1999-08-31

    IPC Class: G06F13/14

    CPC Class: G06F13/4059

    Abstract: A system and method for transferring a data stream between devices having different clock domains. The method initiates a serial data stream between a transmitter and a receiver. The transmitter operates according to a first clock having a first clock rate, and the receiver operates according to a second clock having a second clock rate. A ratio between the second clock rate and the first clock rate is an integer number greater than or equal to one. A first state is provided over a serial line between the transmitter and the receiver. One or more start bits are provided over the serial line. The start bits indicate a second state different from the first state. One or more ratio bits are provided over the serial line after the start bit. The ratio bits indicate the ratio between the second clock rate and the first clock rate. The start bits are received. Using a transition between the first state and the second state evident in receiving each of the start bits, the ratio bits are received. The remainder of the serial data stream is received at appropriate intervals of the second clock rate.

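    As a rough illustration of the receiver side described in the abstract, here is a small C simulation. The framing used here (one start bit, a 3-bit MSB-first ratio field whose leading bit returns the line to the idle state, a 4-bit payload) and all names are assumptions made for this sketch, not the patent's encoding; the receiver recovers the bit time from the start-bit edges and then samples the remaining bits once every `ratio` of its own clock ticks.

    #include <stdio.h>

    #define IDLE        1   /* "first state" the line rests in                  */
    #define RATIO_WIDTH 3   /* assumed width of the ratio field (MSB first)     */

    /* Stretch each transmitter-rate bit over `ratio` receiver-clock ticks.     */
    static int drive_line(int *line, int cap, int ratio, const int *bits, int n)
    {
        int t = 0;
        for (int i = 0; i < n && t + ratio <= cap; i++)
            for (int r = 0; r < ratio; r++)
                line[t++] = bits[i];
        return t;
    }

    /* Receiver: find the idle->start edge, recover the bit time from the start
     * bit (this sketch assumes the first ratio bit returns the line to IDLE so
     * the start bit's trailing edge is visible), cross-check it against the
     * decoded ratio field, then sample the payload every `ratio` ticks mid-bit. */
    static int receive(const int *line, int len, int payload_bits, int *out)
    {
        int t = 0;
        while (t < len && line[t] == IDLE) t++;          /* wait for start edge */
        if (t == len) return -1;

        int ratio = 1;                                   /* ticks per tx bit    */
        while (t + ratio < len && line[t + ratio] != IDLE) ratio++;

        int pos = t + ratio + ratio / 2;                 /* mid first ratio bit */
        int field = 0;
        for (int i = 0; i < RATIO_WIDTH; i++, pos += ratio)
            field = (field << 1) | line[pos];
        if (field != ratio) return -1;                   /* framing mismatch    */

        for (int i = 0; i < payload_bits; i++, pos += ratio)
            out[i] = line[pos];
        return ratio;
    }

    int main(void)
    {
        /* Transmitter frame: idle, start bit, ratio field 100 (= 4), payload 1011. */
        int tx_bits[] = { IDLE, IDLE, 0, 1, 0, 0, 1, 0, 1, 1 };
        int line[128], payload[4];
        int len = drive_line(line, 128, 4, tx_bits, 10);

        int ratio = receive(line, len, 4, payload);
        printf("decoded clock ratio: %d, payload: %d%d%d%d\n",
               ratio, payload[0], payload[1], payload[2], payload[3]);
        return 0;
    }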

    Method and apparatus for a dedicated physically indexed copy of the data cache tag arrays
    32.
    Granted Patent
    Method and apparatus for a dedicated physically indexed copy of the data cache tag arrays (Expired)

    Publication No.: US06253301B1

    Publication Date: 2001-06-26

    Application No.: US09061626

    Filing Date: 1998-04-16

    IPC Class: G06F12/15

    Abstract: A data caching system and method includes a data store for caching data from a main memory, a primary tag array for holding tags associated with data cached in the data store, and a duplicate tag array which holds copies of the tags held in the primary tag array. The duplicate tag array is accessible by functions, such as external memory cache probes, such that the primary tag remains available to the processor core. An address translator maps virtual page addresses to physical page addresses. In order to allow a data caching system which is larger than a page size, a portion of the virtual page address is used to index the tag arrays and data store. However, because of the virtual to physical mapping, the data may reside in any of a number of physical locations. During an internally-generated memory access, the virtual address is used to look up the cache. If there is a miss, other combinations of values are substituted for the virtual bits of the tag array index. For external probes which provide physical addresses to the duplicate tag array, combinations of values are appended to the index portion of the physical address. Tag array lookups can be performed either sequentially, or in parallel.

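    A minimal C sketch of the synonym problem and the two lookup paths described above, with made-up sizes (4 KB pages, 64-byte lines, a 16 KB direct-mapped data store, hence 2 synonym bits): internal accesses index with the virtual address and, on a miss, substitute the other combinations of the index bits above the page offset, while external probes append every combination to the physical index and search the duplicate tag copy.

    #include <stdint.h>
    #include <stdio.h>
    #include <stdbool.h>

    #define PAGE_BITS    12   /* 4 KB pages                                     */
    #define LINE_BITS     6   /* 64-byte cache lines                            */
    #define INDEX_BITS    8   /* 256 sets, direct mapped -> 16 KB data store    */
    #define OFFSET_INDEX  (PAGE_BITS - LINE_BITS)              /* index bits from the page offset */
    #define SYNONYM_BITS  (LINE_BITS + INDEX_BITS - PAGE_BITS) /* index bits above the page offset */

    typedef struct { bool valid; uint64_t tag; } tag_entry_t;

    static tag_entry_t primary_tags[1 << INDEX_BITS];     /* used by the core   */
    static tag_entry_t duplicate_tags[1 << INDEX_BITS];   /* used by probes     */

    static uint64_t phys_tag(uint64_t paddr) { return paddr >> PAGE_BITS; }

    /* Internal access: index with the virtual address first; on a miss, try the
     * other combinations of the synonym bits (sequentially in this sketch).     */
    static bool cpu_lookup(uint64_t vaddr, uint64_t paddr, unsigned *set_out)
    {
        unsigned low  = (vaddr >> LINE_BITS) & ((1u << OFFSET_INDEX) - 1);
        unsigned vsyn = (vaddr >> PAGE_BITS) & ((1u << SYNONYM_BITS) - 1);
        for (unsigned c = 0; c < (1u << SYNONYM_BITS); c++) {
            unsigned set = ((vsyn ^ c) << OFFSET_INDEX) | low;  /* c == 0: virtual guess */
            if (primary_tags[set].valid && primary_tags[set].tag == phys_tag(paddr)) {
                *set_out = set;
                return true;
            }
        }
        return false;
    }

    /* External probe: only a physical address is available, so every combination
     * of the synonym bits is appended to the physical index; the duplicate array
     * is searched so the primary array stays available to the processor core.   */
    static bool external_probe(uint64_t paddr, unsigned *set_out)
    {
        unsigned low = (paddr >> LINE_BITS) & ((1u << OFFSET_INDEX) - 1);
        for (unsigned syn = 0; syn < (1u << SYNONYM_BITS); syn++) {
            unsigned set = (syn << OFFSET_INDEX) | low;
            if (duplicate_tags[set].valid && duplicate_tags[set].tag == phys_tag(paddr)) {
                *set_out = set;
                return true;
            }
        }
        return false;
    }

    int main(void)
    {
        uint64_t vaddr = 0x12345040, paddr = 0x00abc040;   /* same page offset   */
        unsigned low  = (vaddr >> LINE_BITS) & ((1u << OFFSET_INDEX) - 1);
        unsigned vsyn = (vaddr >> PAGE_BITS) & ((1u << SYNONYM_BITS) - 1);
        unsigned set  = (vsyn << OFFSET_INDEX) | low;
        primary_tags[set]   = (tag_entry_t){ true, phys_tag(paddr) };   /* fill  */
        duplicate_tags[set] = (tag_entry_t){ true, phys_tag(paddr) };

        unsigned hit = 0;
        bool ok = cpu_lookup(vaddr, paddr, &hit);
        printf("internal lookup: %s (set %u)\n", ok ? "hit" : "miss", hit);
        ok = external_probe(paddr, &hit);
        printf("external probe : %s (set %u)\n", ok ? "hit" : "miss", hit);
        return 0;
    }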

    Apparatus for exchanging two stack registers
    33.
    Granted Patent
    Apparatus for exchanging two stack registers (Expired)

    Publication No.: US6112018A

    Publication Date: 2000-08-29

    Application No.: US992804

    Filing Date: 1997-12-18

    IPC Class: G06F9/30 G06F9/38 G06F12/00

    Abstract: A floating point unit capable of executing multiple instructions in a single clock cycle using a central window and a register map is disclosed. The floating point unit comprises: a plurality of translation units, a future file, a central window, a plurality of functional units, a result queue, and a plurality of physical registers. The floating point unit receives speculative instructions, decodes them, and then stores them in the central window. Speculative top of stack values are generated for each instruction during decoding. Top of stack relative operands are computed to physical registers using a register map. Register stack exchange operations are performed during decoding. Instructions are then stored in the central window, which selects the oldest stored instructions to be issued to each functional pipeline and issues them. Conversion units convert the instruction's operands to an internal format, and normalization units detect and normalize any denormal operands. Finally, the functional pipelines execute the instructions.

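    The register-stack exchange can be illustrated with a small C sketch of the decode-time bookkeeping (names and sizes are illustrative, not taken from the patent): the stack is a map from stack slots to physical registers plus a speculative top-of-stack pointer, so an exchange is just a swap of two map entries and involves no data movement.

    #include <stdio.h>

    #define STACK_SLOTS 8

    typedef struct {
        int map[STACK_SLOTS];   /* stack slot -> physical register            */
        int top;                /* speculative top-of-stack pointer           */
    } reg_map_t;

    /* Physical register currently backing ST(i), relative to the speculative
     * top of stack.                                                           */
    static int phys_of(const reg_map_t *m, int sti)
    {
        return m->map[(m->top + sti) % STACK_SLOTS];
    }

    /* Exchange ST(0) and ST(i): swap the two map entries.  Because no data
     * moves, the operation can be resolved entirely during decoding.          */
    static void exchange(reg_map_t *m, int sti)
    {
        int a = m->top % STACK_SLOTS;
        int b = (m->top + sti) % STACK_SLOTS;
        int tmp = m->map[a];
        m->map[a] = m->map[b];
        m->map[b] = tmp;
    }

    int main(void)
    {
        reg_map_t m = { .map = {10, 11, 12, 13, 14, 15, 16, 17}, .top = 0 };
        printf("before: ST(0)->p%d ST(3)->p%d\n", phys_of(&m, 0), phys_of(&m, 3));
        exchange(&m, 3);
        printf("after : ST(0)->p%d ST(3)->p%d\n", phys_of(&m, 0), phys_of(&m, 3));
        return 0;
    }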

    Method and apparatus for maximizing utilization of an internal processor bus in the context of external transactions running at speeds fractionally greater than internal transaction times
    34.
    Granted Patent
    Method and apparatus for maximizing utilization of an internal processor bus in the context of external transactions running at speeds fractionally greater than internal transaction times (Expired)

    Publication No.: US5924120A

    Publication Date: 1999-07-13

    Application No.: US18320

    Filing Date: 1998-02-03

    CPC Class: G06F12/0897 G06F9/3869

    Abstract: Use of an internal processor data bus is maximized in a system where external transactions may occur at a rate which is fractionally slower than the rate of the internal transactions. The technique inserts a selectable delay element in the signal path during an external operation such as a cache fill operation. The one-cycle delay provides a time slot in which an internal operation, such as a load from an internal cache, may be performed. This technique therefore permits full use of the time slots on the internal data bus. It can, for example, allow load operations to begin at a much earlier time than would otherwise be possible in architectures where fill operations can consume multiple bus time slots.

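    A toy C schedule of the effect described above (the ratio and cycle counts are invented for illustration): fill beats paced by a fractionally slower external clock do not need every internal bus slot, and a pending internal load can use a slot in which no fill data is ready.

    #include <stdio.h>

    #define INT_CYCLES    12
    #define FILL_BEATS    8
    #define EXT_RATIO_NUM 3   /* external period = 3/2 of the internal period */
    #define EXT_RATIO_DEN 2

    int main(void)
    {
        int pending_load_at = 4;              /* a load is ready at cycle 4    */
        int beat = 0;
        for (int cyc = 0; cyc < INT_CYCLES; cyc++) {
            /* internal cycle at which external fill beat `beat` is available  */
            int ready = (beat * EXT_RATIO_NUM) / EXT_RATIO_DEN;
            if (beat < FILL_BEATS && cyc >= ready) {
                printf("cycle %2d: fill beat %d on internal bus\n", cyc, beat);
                beat++;
            } else if (pending_load_at >= 0 && cyc >= pending_load_at) {
                printf("cycle %2d: internal load uses the free slot\n", cyc);
                pending_load_at = -1;
            } else {
                printf("cycle %2d: idle slot\n", cyc);
            }
        }
        return 0;
    }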

    Error transition mode for multi-processor system
    35.
    Granted Patent
    Error transition mode for multi-processor system (Expired)

    Publication No.: US5155843A

    Publication Date: 1992-10-13

    Application No.: US547597

    Filing Date: 1990-06-29

    IPC Class: F02B75/02 G06F12/08

    Abstract: A pipelined CPU executing instructions of variable length, and referencing memory using various data widths. Macroinstruction pipelining is employed (instead of microinstruction pipelining), with queueing between units of the CPU to allow flexibility in instruction execution times. A wide bandwidth is available for memory access, fetching 64-bit data blocks on each cycle. A hierarchical cache arrangement has an improved method of cache set selection, increasing the likelihood of a cache hit. A writeback cache is used (instead of writethrough) and writeback is allowed to proceed even though other accesses are suppressed due to queues being full. A branch prediction method employs a branch history table which records the taken vs. not-taken history of branch opcodes recently used, and uses an empirical algorithm to predict which way the next occurrence of this branch will go, based upon the history table. A floating point processor function is integrated on-chip, with enhanced speed due to a bypass technique; a trial mini-rounding is done on low-order bits of the result, and if correct, the last stage of the floating point processor can be bypassed, saving one cycle of latency. For CALL type instructions, a method for determining which registers need to be saved is executed in a minimum number of cycles, examining groups of register mask bits at one time. Internal processor registers are accessed with short (byte width) addresses instead of full physical addresses as used for memory and I/O references, but off-chip processor registers are memory-mapped and accessed by the same busses using the same controls as the memory and I/O. In a non-recoverable error detected by ECC circuits in the cache, an error transition mode is entered wherein the cache operates under limited access rules, allowing a maximum of access by the system for data blocks owned by the cache, but yet minimizing changes to the cache data so that diagnostics may be run. Separate queues are provided for the return data from memory and cache invalidates, yet the order of bus transactions is maintained by a pointer arrangement. The bus protocol used by the CPU to communicate with the system bus is of the pended type, with transactions on the bus identified by an ID field specifying the originator, and arbitration for bus grant goes on simultaneously with address/data transactions on the bus.
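
    The branch-prediction element of the abstract can be sketched in C. The table sizes are invented and a simple majority-of-recent-outcomes rule stands in for the patent's empirical algorithm; the point is only the structure: a per-branch taken/not-taken history register indexing a pattern table that supplies the prediction.

    #include <stdint.h>
    #include <stdio.h>
    #include <stdbool.h>

    #define HISTORY_BITS 4
    #define BHT_ENTRIES  256

    static uint8_t history[BHT_ENTRIES];                 /* per-branch taken/not-taken history     */
    static bool    pattern_predict[1 << HISTORY_BITS];   /* history pattern -> predicted direction */

    static unsigned bht_index(uint32_t pc) { return (pc >> 2) % BHT_ENTRIES; }

    /* Stand-in for the "empirical algorithm": predict taken when the majority
     * of the last HISTORY_BITS outcomes were taken.                            */
    static void build_pattern_table(void)
    {
        for (unsigned p = 0; p < (1u << HISTORY_BITS); p++) {
            int taken = 0;
            for (unsigned b = 0; b < HISTORY_BITS; b++)
                taken += (p >> b) & 1;
            pattern_predict[p] = (2 * taken >= HISTORY_BITS);
        }
    }

    static bool predict(uint32_t pc)
    {
        return pattern_predict[history[bht_index(pc)]];
    }

    static void update(uint32_t pc, bool taken)
    {
        unsigned i = bht_index(pc);
        history[i] = (uint8_t)(((history[i] << 1) | taken) & ((1u << HISTORY_BITS) - 1));
    }

    int main(void)
    {
        build_pattern_table();
        uint32_t pc = 0x1000;                            /* a single branch    */
        bool outcome[] = { 1, 1, 1, 0, 1, 1, 1, 0,       /* mostly-taken loop  */
                           1, 1, 1, 0, 1, 1, 1, 0 };
        int correct = 0;
        for (int i = 0; i < 16; i++) {
            correct += (predict(pc) == outcome[i]);
            update(pc, outcome[i]);
        }
        printf("predicted %d of 16 outcomes correctly\n", correct);
        return 0;
    }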

    Method and mechanism for generating a clock signal with a relatively linear increase or decrease in clock frequency
    36.
    Granted Patent
    Method and mechanism for generating a clock signal with a relatively linear increase or decrease in clock frequency (In force)

    Publication No.: US06988217B1

    Publication Date: 2006-01-17

    Application No.: US10084566

    Filing Date: 2002-02-27

    IPC Class: G06F1/04

    CPC Class: G06F1/08

    Abstract: A method and mechanism for generating a clock signal with a relatively linear increase or decrease in clock frequency. A first clock signal is generated with a first frequency which is then used to generate a second clock signal with a second frequency. The second frequency is generated by dropping selected pulses of the first clock signal. Particular patterns of bits are stored in a storage element. Bits are then selected and conveyed from the storage element at a frequency determined by the first clock signal. The conveyed bits are used to construct the second clock signal. By selecting the particular pattern of bits selected and conveyed, the frequency of the second clock signal may be determined. Further, by changing the patterns of bits within the registers at selected times, the frequency of the second clock signal may be made to change in a relatively linear manner.

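    A small C model of the pulse-dropping idea (pattern length and values are illustrative): a stored bit pattern is scanned at the first clock rate, each 1 passes a first-clock pulse to the second clock and each 0 drops one, so stepping through progressively denser patterns ramps the second clock's frequency roughly linearly.

    #include <stdio.h>
    #include <stdint.h>

    #define PATTERN_BITS 16

    /* Scan one pattern at the first clock rate, passing a pulse only where the
     * pattern holds a 1; returns how many pulses reach the second clock.       */
    static int run_pattern(uint16_t pattern)
    {
        int pulses = 0;
        for (int i = 0; i < PATTERN_BITS; i++) {
            int pass = (pattern >> i) & 1;
            putchar(pass ? '|' : '.');        /* '|' = pulse kept, '.' = dropped */
            pulses += pass;
        }
        return pulses;
    }

    int main(void)
    {
        /* Patterns with 4, 8, 12 and 16 evenly spaced 1s out of 16 slots.      */
        uint16_t ramp[] = { 0x1111, 0x5555, 0x7777, 0xFFFF };
        for (int step = 0; step < 4; step++) {
            int pulses = run_pattern(ramp[step]);
            printf("  -> %2d/%d pulses pass, ~%d%% of the first clock\n",
                   pulses, PATTERN_BITS, pulses * 100 / PATTERN_BITS);
        }
        return 0;
    }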

    Virtual channels and corresponding buffer allocations for deadlock-free computer system operation
    37.
    Granted Patent
    Virtual channels and corresponding buffer allocations for deadlock-free computer system operation (In force)

    Publication No.: US06938094B1

    Publication Date: 2005-08-30

    Application No.: US09399281

    Filing Date: 1999-09-17

    CPC Class: G06F15/17381

    Abstract: A computer system employs virtual channels and allocates different resources to the virtual channels. Packets which do not have logical/protocol-related conflicts are grouped into a virtual channel. Accordingly, logical conflicts occur between packets in separate virtual channels. The packets within a virtual channel may share resources (and hence experience resource conflicts), but the packets within different virtual channels may not share resources. Since packets which may experience resource conflicts do not experience logical conflicts, and since packets which may experience logical conflicts do not experience resource conflicts, deadlock-free operation may be achieved. Additionally, each virtual channel may be assigned control packet buffers and data packet buffers. Control packets may be substantially smaller in size, and may occur more frequently than data packets. By providing separate buffers, buffer space may be used efficiently. If a control packet which does not specify a data packet is received, no data packet buffer space is allocated. If a control packet which does specify a data packet is received, both control packet buffer space and data packet buffer space are allocated.

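    A compact C sketch of the buffer accounting described above, with hypothetical channel names and pool sizes: each virtual channel owns separate control-packet and data-packet buffer pools, a packet is accepted only when its own channel has the required buffers free, and a data buffer is allocated only when the control packet actually specifies a data packet.

    #include <stdio.h>
    #include <stdbool.h>

    enum vc { VC_REQUEST, VC_RESPONSE, VC_POSTED, VC_COUNT };

    typedef struct {
        int free_ctl;    /* free control-packet buffers for this channel        */
        int free_data;   /* free data-packet buffers for this channel           */
    } vc_bufs_t;

    static vc_bufs_t bufs[VC_COUNT] = {
        [VC_REQUEST]  = { 4, 2 },
        [VC_RESPONSE] = { 4, 2 },
        [VC_POSTED]   = { 2, 2 },
    };

    /* Accept a control packet on channel `ch`, allocating a data buffer only if
     * the control packet actually specifies a data packet.                      */
    static bool accept(enum vc ch, bool has_data)
    {
        if (bufs[ch].free_ctl == 0) return false;
        if (has_data && bufs[ch].free_data == 0) return false;
        bufs[ch].free_ctl--;
        if (has_data) bufs[ch].free_data--;
        return true;
    }

    int main(void)
    {
        printf("request w/o data: %s\n", accept(VC_REQUEST, false) ? "accepted" : "stalled");
        printf("response w/ data: %s\n", accept(VC_RESPONSE, true) ? "accepted" : "stalled");
        /* Exhaust the posted channel's data buffers; a third posted packet with
         * data stalls without blocking the other virtual channels.              */
        accept(VC_POSTED, true);
        accept(VC_POSTED, true);
        printf("posted w/ data:   %s\n", accept(VC_POSTED, true) ? "accepted" : "stalled");
        printf("request w/o data: %s\n", accept(VC_REQUEST, false) ? "accepted" : "stalled");
        return 0;
    }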

    System and method of increasing bandwidth for issuing ordered transactions into a distributed communication system
    38.
    Granted Patent
    System and method of increasing bandwidth for issuing ordered transactions into a distributed communication system (In force)

    Publication No.: US06745272B2

    Publication Date: 2004-06-01

    Application No.: US09826262

    Filing Date: 2001-04-04

    IPC Class: G06F13/00

    CPC Class: H04L1/1671

    Abstract: A method and system of expediting issuance of a second request of a pair of ordered requests into a distributed coherent communication fabric. The first request of the ordered pair is issued into the coherent communication fabric and directed to a first target. Issuance of the second request into the coherent communication fabric is stalled until the first target receives and orders the first request and transmits a response acknowledging the same.

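    A minimal C sketch of the issue rule (event and field names are assumptions): the second request of the ordered pair is held at the source until the first target's acknowledgment indicates that the first request has been received and ordered.

    #include <stdio.h>
    #include <stdbool.h>

    typedef struct {
        bool first_issued;
        bool first_ordered_ack;   /* response from the first target             */
        bool second_issued;
    } ordered_pair_t;

    static void try_issue(ordered_pair_t *p)
    {
        if (!p->first_issued) {
            p->first_issued = true;
            printf("issue request 1 to target A\n");
        } else if (p->first_ordered_ack && !p->second_issued) {
            p->second_issued = true;
            printf("issue request 2 (ordering of request 1 is now guaranteed)\n");
        } else if (!p->second_issued) {
            printf("request 2 stalled: waiting for target A's acknowledgment\n");
        }
    }

    int main(void)
    {
        ordered_pair_t p = { false, false, false };
        try_issue(&p);                 /* request 1 goes out                    */
        try_issue(&p);                 /* request 2 must wait                   */
        p.first_ordered_ack = true;    /* target A acknowledges ordering        */
        try_issue(&p);                 /* request 2 can now be issued           */
        return 0;
    }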

    Computer system implementing system and method for ordering write operations and maintaining memory coherency
    39.
    Granted Patent
    Computer system implementing system and method for ordering write operations and maintaining memory coherency (In force)

    Publication No.: US06529999B1

    Publication Date: 2003-03-04

    Application No.: US09428642

    Filing Date: 1999-10-27

    IPC Class: G06F12/00

    CPC Class: G06F12/0813 G06F12/0831

    Abstract: A computer system is presented implementing a system and method for properly ordering write operations. The system and method for properly ordering write operations aids in maintaining memory coherency within the computer system. The computer system includes multiple interconnected processing nodes. One or more of the processing nodes includes a central processing unit (CPU) and/or a cache memory, and one or more of the processing nodes includes a memory controller coupled to a memory. The CPU or cache generates a write command to store data within the memory. The memory controller receives the write command and responds to the write command by issuing a target done response to the CPU or cache after the memory controller: (i) properly orders the write command within the memory controller with respect to other commands pending within the memory controller, and (ii) determines that a coherency state with respect to the write command has been established within the computer system.

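    A condensed C sketch of the target-done rule (structure and names assumed for illustration): the memory controller releases the target done response for a write only after the write has been ordered with respect to the commands already pending in the controller and its coherency state has been established.

    #include <stdio.h>
    #include <stdbool.h>

    typedef struct {
        int  id;
        bool ordered;           /* slotted into the controller's command order  */
        bool coherency_done;    /* coherency actions for this write committed   */
        bool target_done_sent;
    } write_cmd_t;

    /* Called whenever the controller's state advances; emits the target done
     * response exactly once, and only when both conditions hold.               */
    static void maybe_send_target_done(write_cmd_t *w)
    {
        if (w->ordered && w->coherency_done && !w->target_done_sent) {
            w->target_done_sent = true;
            printf("write %d: target done sent to the requesting CPU/cache\n", w->id);
        } else if (!w->target_done_sent) {
            printf("write %d: target done withheld (ordered=%d, coherent=%d)\n",
                   w->id, w->ordered, w->coherency_done);
        }
    }

    int main(void)
    {
        write_cmd_t w = { 7, false, false, false };
        maybe_send_target_done(&w);     /* neither condition met yet            */
        w.ordered = true;
        maybe_send_target_done(&w);     /* still waiting on coherency           */
        w.coherency_done = true;
        maybe_send_target_done(&w);     /* both met: response goes out          */
        return 0;
    }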

    Snoop resynchronization mechanism to preserve read ordering
    40.
    Granted Patent
    Snoop resynchronization mechanism to preserve read ordering (In force)

    Publication No.: US06473837B1

    Publication Date: 2002-10-29

    Application No.: US09314036

    Filing Date: 1999-05-18

    IPC Class: G06F12/00

    Abstract: A processor employing a post-cache (LS2) buffer. Loads are stored into the LS2 buffer after probing the data cache. The load/store unit snoops the loads in the LS2 buffer against snoop requests received from an external bus. If a snoop invalidate request hits a load within the LS2 buffer and that load hit in the data cache during its initial probe, the load/store unit scans the LS2 buffer for older loads which are misses. If older load misses are detected, a synchronization indication is set for the load misses. Subsequently, one of the load misses completes and the load/store unit transmits a synchronization signal with the status for the load miss. The processor synchronizes to the instruction corresponding to the load miss, thereby discarding the load hit which was subsequently snoop hit. The discarded instructions are refetched and reexecuted, thereby causing the load hit to reexecute subsequent to an earlier load miss. Load hits may generally proceed ahead of load misses and strong memory ordering rules may still be enforced.

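    A simplified C model of the LS2 bookkeeping described above (fields and names are assumptions): when a snoop invalidate hits a load that hit the data cache on its initial probe, older load misses still in the buffer are tagged, and when such a miss completes it reports a resynchronization so the younger load hit is refetched and re-executed.

    #include <stdio.h>
    #include <stdbool.h>
    #include <stdint.h>

    #define LS2_SIZE 4

    typedef struct {
        bool     valid;
        uint64_t addr;
        bool     was_hit;       /* hit the data cache on its initial probe      */
        bool     need_resync;   /* set on an older miss by a snoop on a hit     */
    } ls2_entry_t;

    static ls2_entry_t ls2[LS2_SIZE];   /* index 0 = oldest                     */

    /* Snoop invalidate request arriving from the external bus.                 */
    static void snoop_invalidate(uint64_t addr)
    {
        for (int i = 0; i < LS2_SIZE; i++) {
            if (!ls2[i].valid || ls2[i].addr != addr || !ls2[i].was_hit)
                continue;
            /* Snoop hit on a load hit: tag every older load miss in the buffer. */
            for (int j = 0; j < i; j++)
                if (ls2[j].valid && !ls2[j].was_hit)
                    ls2[j].need_resync = true;
        }
    }

    /* Fill data returns for an older load miss; report whether the machine must
     * resynchronize (refetch) the instructions younger than this load.          */
    static bool complete_miss(int idx)
    {
        bool resync = ls2[idx].need_resync;
        ls2[idx].valid = false;
        return resync;
    }

    int main(void)
    {
        ls2[0] = (ls2_entry_t){ true, 0x100, false, false };   /* older miss    */
        ls2[1] = (ls2_entry_t){ true, 0x200, true,  false };   /* younger hit   */
        snoop_invalidate(0x200);             /* external write to address 0x200 */
        printf("resync on miss completion: %s\n",
               complete_miss(0) ? "yes, refetch the younger load hit" : "no");
        return 0;
    }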