Fusing load and ALU operations
    2.
    Granted invention patent
    Status: In force

    Publication No.: US07398372B2

    Publication date: 2008-07-08

    Application No.: US10180391

    Filing date: 2002-06-25

    IPC classes: G06F9/30 G06F9/40 G06F15/00

    CPC classes: G06F9/3017 G06F9/3853

    Abstract: Fusing a load micro-operation (uop) together with an arithmetic uop. Intra-instruction fusing can increase cache memory storage efficiency and computer instruction processing bandwidth within a microprocessor without incurring significant computer system cost. Uops are fused, stored in a cache memory, un-fused, executed in parallel, and retired in order to optimize cost and performance.
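
    The fuse/store/un-fuse flow in the abstract can be illustrated with a minimal Python sketch. It is not the patented circuit: the `Uop` and `FusedUop` types, the register names, and the pairing rule (a load is fused with the immediately following arithmetic uop that consumes its result) are assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Uop:
    kind: str     # "load" or "alu" (hypothetical encoding)
    dest: str     # destination register name
    srcs: tuple   # source operand names


@dataclass(frozen=True)
class FusedUop:
    load: Uop
    alu: Uop


def fuse(uops):
    """Pair each load with the next ALU uop that reads the load's result (assumed rule)."""
    out, i = [], 0
    while i < len(uops):
        cur = uops[i]
        nxt = uops[i + 1] if i + 1 < len(uops) else None
        if (cur.kind == "load" and nxt is not None
                and nxt.kind == "alu" and cur.dest in nxt.srcs):
            out.append(FusedUop(cur, nxt))  # one cache entry instead of two
            i += 2
        else:
            out.append(cur)
            i += 1
    return out


def unfuse(entries):
    """Split fused entries back into their component uops before execution."""
    out = []
    for entry in entries:
        if isinstance(entry, FusedUop):
            out.extend([entry.load, entry.alu])
        else:
            out.append(entry)
    return out


program = [
    Uop("load", "tmp", ("mem[rbx]",)),
    Uop("alu", "rax", ("rax", "tmp")),
    Uop("alu", "rcx", ("rcx", "rdx")),
]
cached = fuse(program)  # what would occupy cache storage in this model
```

    In this model the three uops occupy two cache entries, and `unfuse(cached)` restores the original sequence for execution.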

    System and method for writing back multiple results over a single-result bus and processor employing the same
    3.
    Granted invention patent
    Status: In force

    Publication No.: US06275926B1

    Publication date: 2001-08-14

    Application No.: US09285609

    Filing date: 1999-04-02

    Applicant: Nicholas G. Samra

    Inventor: Nicholas G. Samra

    IPC classes: G06F1204

    Abstract: For use in a processor having a result bus of insufficient width to convey all results of a given multiple-result instruction concurrently, a system for, and method of, writing back the results of the multiple-result instruction. In one embodiment, the system includes: (1) multi-result node creation circuitry that creates a multi-result node having at least first and second results for the multiple-result instruction and (2) node transmission circuitry, coupled to the multi-result node creation circuitry, that transmits the first and second results of said multi-result node sequentially over the result bus.
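
    As a rough illustration of sequential write-back, the following Python sketch models a multi-result node whose results are sent one per cycle over a bus that carries a single result at a time. The `MultiResultNode` type, the tag field, and the divide example are hypothetical and are not taken from the patent.

```python
from dataclasses import dataclass
from typing import Iterator, List, Tuple


@dataclass
class MultiResultNode:
    tag: int                         # hypothetical instruction identifier
    results: List[Tuple[str, int]]   # (destination register, value) pairs


def create_node(tag: int, results: List[Tuple[str, int]]) -> MultiResultNode:
    """Gather all results of one multiple-result instruction into a single node."""
    if len(results) < 2:
        raise ValueError("a multi-result node needs at least two results")
    return MultiResultNode(tag, list(results))


def transmit(node: MultiResultNode) -> Iterator[Tuple[int, int, str, int]]:
    """Send the node's results over the bus, one result per cycle."""
    for cycle, (reg, value) in enumerate(node.results):
        yield cycle, node.tag, reg, value


# Example: an integer divide that produces a quotient and a remainder.
node = create_node(7, [("q", 17 // 5), ("r", 17 % 5)])
bus_traffic = list(transmit(node))
```

    Here `bus_traffic` holds two transfers in consecutive cycles, the quotient first and the remainder second.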

    Content addressable memory system
    4.
    Granted invention patent
    Status: Lapsed

    Publication No.: US5646878A

    Publication date: 1997-07-08

    Application No.: US460353

    Filing date: 1995-06-02

    Applicant: Nicholas G. Samra

    Inventor: Nicholas G. Samra

    IPC classes: G11C15/04 G11C15/00

    CPC classes: G11C15/04

    Abstract: A CAM system (2) stores a plurality of data sets in a plurality of pairs of CAM cells (4) and RAM cells (6). The portion of a particular data set stored in one of the RAM cells is accessed by inputting a tag to CAM cells that matches the portion of the data set stored in the CAM cell associated with the particular RAM cell. The CAM system incorporates a novel two-stage matchline re-coding scheme to improve performance. Each of a plurality of first stage circuits (10) receives a plurality of matchline signals from a plurality of CAM sets and a plurality of data inputs from the corresponding RAM sets. Each output of the first stage circuits is further processed by a second stage circuit (12) which generates the final data output. The CAM system avoids the use of self-timed control signals and sense amplifiers.
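
    A tag lookup with a two-stage reduction can be modelled in a few lines of Python. The abstract does not give the logic inside the first and second stage circuits, so the AND/OR reduction, the group size of four, and all names below are assumptions chosen only to show the data flow.

```python
def first_stage(matchlines, data):
    """Combine one group's matchlines with its RAM words (assumed AND/OR reduction)."""
    out = 0
    for hit, word in zip(matchlines, data):
        if hit:
            out |= word
    return out


def cam_lookup(tag, cam_tags, ram_data, group_size=4):
    """Return (hit, data) for the RAM word paired with the CAM entry matching `tag`."""
    matchlines = [stored == tag for stored in cam_tags]
    # First stage: one partial output per group of CAM/RAM pairs.
    partials = [
        first_stage(matchlines[i:i + group_size], ram_data[i:i + group_size])
        for i in range(0, len(cam_tags), group_size)
    ]
    # Second stage: merge the partial outputs into the final data output.
    result = 0
    for partial in partials:
        result |= partial
    return any(matchlines), result


cam_tags = [0x10, 0x20, 0x30, 0x40, 0x50, 0x60, 0x70, 0x80]
ram_data = [0xA0, 0xA1, 0xA2, 0xA3, 0xA4, 0xA5, 0xA6, 0xA7]
hit_result = cam_lookup(0x60, cam_tags, ram_data)
miss_result = cam_lookup(0x99, cam_tags, ram_data)
```

    The sketch assumes at most one CAM entry matches a given tag; with several matches the OR reduction would blend their data words.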

    System for forming a critical update loop to continuously reload active thread state from a register storing thread state until another active thread is detected
    5.
    Granted invention patent
    Status: In force

    Publication No.: US07653904B2

    Publication date: 2010-01-26

    Application No.: US10672150

    Filing date: 2003-09-26

    Applicant: Nicholas G. Samra

    Inventor: Nicholas G. Samra

    IPC classes: G06F9/46 G06F9/00

    CPC classes: G06F9/30101 G06F9/3851

    Abstract: A method, apparatus, and system are provided for a multi-threaded virtual state mechanism. According to one embodiment, active thread state of a first active thread is received using a virtual state mechanism, and virtual thread state is generated in accordance with the active thread state of the first active thread, and the virtual thread state corresponding to the first active thread is forwarded to state update logic.

    Intra-instruction fusion
    6.
    Granted invention patent

    Publication No.: US07051190B2

    Publication date: 2006-05-23

    Application No.: US10180387

    Filing date: 2002-06-25

    IPC classes: G06F9/12

    CPC classes: G06F9/3017 G06F9/3853

    Abstract: Fusing micro-operations (uops) together. Intra-instruction fusing can increase cache memory storage efficiency and computer instruction processing bandwidth within a microprocessor without incurring significant computer system cost. Uops are fused, stored in cache memory, un-fused, executed in parallel, and retired in order to optimize cost and performance.

    System and method for instruction cache re-ordering

    Publication No.: US06519683B2

    Publication date: 2003-02-11

    Application No.: US09752414

    Filing date: 2000-12-29

    IPC classes: G06F1200

    Abstract: The present invention is directed to a system and method for implementing a re-ordered instruction cache. In one embodiment, groups or “packets” of instructions with specific packet sizes are formed. Each of the packets includes two or more positions. The two or more positions are defined such that they support one or more different types of instructions. Each of the positions is also correlated to a subset of the specialized execution units of the processor. Given a specific packet size and definitions for each of the positions, each of the instructions is re-ordered according to instruction type and loaded into the instruction cache in the new order.
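
    The packet-forming step can be sketched in Python as follows. The three position definitions, the instruction kinds, and the greedy first-fit policy are hypothetical; the sketch also ignores data dependencies and assumes every instruction kind is accepted by at least one position.

```python
# Hypothetical packet layout: each position lists the instruction types it accepts.
POSITIONS = [{"load", "store"}, {"alu"}, {"alu", "branch"}]


def free_slot(packet, kind, positions):
    """Index of the first empty position that accepts `kind`, or None."""
    for i, accepted in enumerate(positions):
        if packet[i] is None and kind in accepted:
            return i
    return None


def pack(instructions, positions=POSITIONS):
    """Greedy first-fit packing of (name, kind) pairs into fixed-size packets."""
    packets = []
    current = [None] * len(positions)
    for name, kind in instructions:
        slot = free_slot(current, kind, positions)
        if slot is None:
            # No compatible free position: close this packet and open a new one.
            packets.append(current)
            current = [None] * len(positions)
            slot = free_slot(current, kind, positions)
        current[slot] = name
    if any(entry is not None for entry in current):
        packets.append(current)
    return packets


stream = [("add", "alu"), ("ld", "load"), ("sub", "alu"),
          ("mul", "alu"), ("br", "branch")]
packets = pack(stream)
```

    In the first packet `ld` moves ahead of `add` because position 0 only accepts memory operations; an unfilled position is left as `None`.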

Method and apparatus for processing multiple cache misses using reload folding and store merging
    8.
    Granted invention patent
    Status: Lapsed

    Publication No.: US5809530A

    Publication date: 1998-09-15

    Application No.: US558071

    Filing date: 1995-11-13

    IPC classes: G06F12/08 G06F13/00

    CPC classes: G06F12/0897 G06F12/0859

    Abstract: A data processor (40) keeps track of misses to a cache (71) so that multiple misses within the same cache line can be merged or folded at reload time. A load/store unit (60) includes a completed store queue (61) for presenting store requests to the cache (71) in order. If a store request misses in the cache (71), the completed store queue (61) requests the cache line from a lower-level memory system (90) and thereafter inactivates the store request. When a reload cache line is received, the completed store queue (61) compares the reload address to all entries. If at least one address matches the reload address, one entry's data is merged with the cache line prior to storage in the cache (71). Other matching entries become active and are allowed to reaccess the cache (71). A miss queue (80) coupled between the load/store unit (60) and the lower-level memory system (90) implements reload folding to improve efficiency.
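
    The behaviour described in the abstract can be approximated with a small Python model. The line size, the dictionary-based cache, the class names, and the choice to merge the oldest matching store are assumptions for illustration; the patent's hardware structures are not reproduced here.

```python
LINE_SIZE = 4  # words per cache line (hypothetical)


def line_addr(addr):
    return addr - addr % LINE_SIZE


class MissQueue:
    """Folds misses to the same line into a single lower-level request."""

    def __init__(self):
        self.pending = []
        self.requests_sent = 0

    def request(self, addr):
        line = line_addr(addr)
        if line not in self.pending:
            self.pending.append(line)
            self.requests_sent += 1


class StoreQueue:
    def __init__(self, cache, miss_queue):
        self.cache = cache          # dict: line address -> list of words
        self.miss_queue = miss_queue
        self.entries = []           # stores waiting on a reload

    def store(self, addr, value):
        line = line_addr(addr)
        if line in self.cache:
            self.cache[line][addr - line] = value
        else:
            # Miss: request the line and park the store as inactive.
            self.entries.append({"addr": addr, "value": value, "active": False})
            self.miss_queue.request(addr)

    def reload(self, line, data):
        data = list(data)
        matches = [e for e in self.entries if line_addr(e["addr"]) == line]
        if matches:
            oldest = matches[0]
            data[oldest["addr"] - line] = oldest["value"]  # store merging
            self.entries.remove(oldest)
            for entry in matches[1:]:
                entry["active"] = True                     # may reaccess the cache
        self.cache[line] = data
        if line in self.miss_queue.pending:
            self.miss_queue.pending.remove(line)

    def replay(self):
        for entry in [e for e in self.entries if e["active"]]:
            self.entries.remove(entry)
            self.store(entry["addr"], entry["value"])


cache = {}
mq = MissQueue()
sq = StoreQueue(cache, mq)
sq.store(5, 50)             # miss on line 4
sq.store(6, 60)             # second miss on the same line, folded
sq.reload(4, [0, 0, 0, 0])  # first store merged into the reloaded line
sq.replay()                 # second store now hits in the cache
```

    Two stores to the same missing line generate one lower-level request; the first is merged into the reloaded line and the second completes on replay.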

    Virtual multithreading translation mechanism including retrofit capability
    9.
    Granted invention patent
    Status: Lapsed

    Publication No.: US07669203B2

    Publication date: 2010-02-23

    Application No.: US10741914

    Filing date: 2003-12-19

    IPC classes: G06F9/46 G06F9/44

    CPC classes: G06F9/3885 G06F9/3851

    Abstract: Method, apparatus and system embodiments provide support for multiple SoEMT software threads on multiple SMT logical thread contexts. A thread translation table maintains physical-to-virtual thread translation information in order to provide such information to structures within a processor that utilize virtual thread information. By associating a thread translation table with such structures, a processor that supports simultaneous multithreading (SMT) may be easily retrofitted to support switch-on-event multithreading on the SMT logical processors.
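
    A thread translation table of this kind can be pictured with the Python sketch below. The class name, the round-robin switch policy, and the thread numbering are assumptions; the abstract only states that the table holds physical-to-virtual thread translation information.

```python
class ThreadTranslationTable:
    """Maps each SMT physical context to the virtual thread currently running on it."""

    def __init__(self, threads_per_context):
        # threads_per_context: {physical context id: [virtual thread ids]}
        self.owners = {ctx: list(ids) for ctx, ids in threads_per_context.items()}
        self.active = {ctx: ids[0] for ctx, ids in self.owners.items()}

    def to_virtual(self, physical_ctx):
        """Lookup used by structures that need the virtual thread id."""
        return self.active[physical_ctx]

    def switch_on_event(self, physical_ctx):
        """On a switch event, activate the next virtual thread (round-robin, assumed)."""
        ids = self.owners[physical_ctx]
        nxt = (ids.index(self.active[physical_ctx]) + 1) % len(ids)
        self.active[physical_ctx] = ids[nxt]
        return self.active[physical_ctx]


# Two SMT contexts, each time-sharing two SoEMT software threads.
table = ThreadTranslationTable({0: [0, 1], 1: [2, 3]})
before = table.to_virtual(0)
after = table.switch_on_event(0)
```

    A structure indexed by virtual thread id would call `to_virtual` with the physical context it observes, so it needs no other change when a switch event occurs.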

    Method and apparatus for fast dependency coordinate matching
    10.
    Granted invention patent
    Status: Lapsed

    Publication No.: US06889314B2

    Publication date: 2005-05-03

    Application No.: US09965211

    Filing date: 2001-09-26

    IPC classes: G06F9/38 G06F9/52

    Abstract: Disclosed herein is a method for matching dependency coordinates and an efficient apparatus for performing the dependency coordinate matching very quickly. A plurality of buffers to store instructions is set forth. Each storage location of a buffer corresponds to a particular pair of dependency coordinates. Dependency matching logic receives the dependency coordinates for a buffered instruction and scheduling information pertaining to dispatched instructions. The dependency matching logic indicates whether a dependency precludes scheduling of the corresponding buffered instruction. Dependency checking logic produces a ready signal for the buffered instruction when no such dependency is indicated by the dependency matching logic.
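
    The matching and ready-signal logic can be outlined in Python as follows. Treating a coordinate pair as (buffer index, slot index) and representing scheduling information as a set of dispatched coordinates are assumptions made for this sketch, not details given in the abstract.

```python
from typing import Dict, Iterable, Set, Tuple

Coord = Tuple[int, int]  # assumed meaning: (buffer index, slot index)


def dependency_blocks(deps: Iterable[Coord], dispatched: Set[Coord]) -> bool:
    """Matching logic: True if any producer coordinate has not been dispatched yet."""
    return any(coord not in dispatched for coord in deps)


def ready(deps: Iterable[Coord], dispatched: Set[Coord]) -> bool:
    """Checking logic: raise the ready signal when no dependency blocks scheduling."""
    return not dependency_blocks(deps, dispatched)


# Buffered instructions keyed by their own coordinates, each listing its producers.
buffered: Dict[Coord, Tuple[Coord, ...]] = {
    (0, 0): (),                 # no dependencies
    (0, 1): ((0, 0),),          # depends on the instruction at (0, 0)
    (1, 0): ((0, 0), (0, 1)),   # depends on both
}

dispatched: Set[Coord] = {(0, 0)}
ready_now = sorted(
    coord for coord, deps in buffered.items()
    if coord not in dispatched and ready(deps, dispatched)
)
```

    With only `(0, 0)` dispatched, the instruction at `(0, 1)` becomes ready while `(1, 0)` still waits on `(0, 1)`.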
