Instruction scanning unit for locating instructions via parallel
scanning of start and end byte information
    21.
    Invention grant
    Instruction scanning unit for locating instructions via parallel scanning of start and end byte information (Expired)

    Publication number: US5852727A

    Publication date: 1998-12-22

    Application number: US813568

    Application date: 1997-03-10

    CPC classification number: G06F9/382 G06F9/30018 G06F9/30152 G06F9/3816

    Abstract: An instruction scanning unit for a superscalar microprocessor is disclosed. The instruction scanning unit processes start, end, and functional byte information (or predecode data) associated with a plurality of contiguous instruction bytes. The processing of start byte information and end byte information is performed independently and in parallel, and the instruction scanning unit produces a plurality of scan values which identify valid instructions within the plurality of contiguous instruction bytes. Additionally, the instruction scanning unit is scaleable. Multiple instruction scanning units may be operated in parallel to process a larger plurality of contiguous instruction bytes. Furthermore, the instruction scanning unit detects error conditions in the predecode data in parallel with scanning to locate instructions. Moreover, in parallel with the error checking and scanning to locate instructions, MROM instructions are located for dispatch to an MROM unit.

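The start/end scanning idea can be illustrated with a small functional sketch (this is not the patented circuitry; `scan_window` and its error handling are simplifications invented for this example). The start-bit and end-bit vectors are scanned independently, as the abstract describes, then paired into instruction boundaries, with mismatches reported as predecode errors.

```python
def scan_window(start_bits, end_bits):
    """Pair start/end predecode bits into (start, end) byte ranges.

    The start-bit and end-bit vectors are scanned separately (in hardware
    the two scans would proceed in parallel), and predecode errors such as
    an unmatched or out-of-order end byte are flagged.
    """
    starts = [i for i, b in enumerate(start_bits) if b]  # scan of start info
    ends = [i for i, b in enumerate(end_bits) if b]      # scan of end info
    if len(starts) != len(ends):
        return None, True  # predecode error: unbalanced start/end marks
    instructions = []
    for s, e in zip(starts, ends):
        if e < s:
            return None, True  # predecode error: end byte precedes its start
        instructions.append((s, e))
    return instructions, False
```

A real unit would additionally locate MROM instructions for dispatch to the MROM unit in parallel with this scan; that path is omitted here.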

    Shared branch prediction structure
    22.
    Invention grant
    Shared branch prediction structure (Expired)

    Publication number: US5794028A

    Publication date: 1998-08-11

    Application number: US731765

    Application date: 1996-10-17

    Applicant: Thang M. Tran

    Inventor: Thang M. Tran

    CPC classification number: G06F9/3806 G06F9/3844

    Abstract: A shared branch prediction mechanism is provided in which a pool of branch prediction storage locations is shared among the multiple cache lines comprising a row of the instruction cache. The branch prediction storage locations within the pool are dynamically redistributed among the cache lines according to the number of branch instructions within each cache line. A cache line having a large number of branch instructions may be allocated more branch prediction storage locations than a cache line having fewer branch instructions. A prediction selector is included for each cache line in the instruction cache. The prediction selector indicates the selection of one or more branch prediction storage locations which store branch predictions corresponding to the cache line. In one embodiment, the prediction selector comprises multiple branch selectors. One branch selector is associated with each byte in the cache line, and identifies the branch prediction storage location storing the relevant branch prediction for that byte. In another embodiment, each set of two bytes within a cache line shares a portion of the pool with the corresponding set of two bytes from the other cache lines within the pool. The prediction selector for the cache line indicates which sections of the cache line have associated branch prediction storage locations allocated to them, as well as a taken/not-taken prediction associated therewith. The first taken prediction within the line subsequent to the offset indicated by the fetch address is the branch prediction selected.

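A rough functional model of the selection rule (data layout and names are invented for illustration, not taken from the patent): each byte's selector names a slot in the shared prediction pool, and the first taken prediction at or after the fetch offset is the one selected.

```python
def select_prediction(branch_selectors, predictions, fetch_offset):
    """Pick the branch prediction for a fetch beginning at fetch_offset.

    branch_selectors: one entry per cache-line byte; each names a slot in
    the shared prediction pool (None = no branch covers that byte).
    predictions: slot -> (target, taken), drawn from the shared pool.
    Returns the target of the first taken prediction at or after the
    offset, else None (fetch continues sequentially).
    """
    for byte in range(fetch_offset, len(branch_selectors)):
        slot = branch_selectors[byte]
        if slot is not None and predictions[slot][1]:  # taken prediction?
            return predictions[slot][0]                # predicted target
    return None
```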

    Superscalar microprocessor employing a way prediction unit to predict
the way of an instruction fetch address and to concurrently provide a
branch prediction address corresponding to the fetch address
    23.
    Invention grant
    Superscalar microprocessor employing a way prediction unit to predict the way of an instruction fetch address and to concurrently provide a branch prediction address corresponding to the fetch address (Expired)

    Publication number: US5764946A

    Publication date: 1998-06-09

    Application number: US826884

    Application date: 1997-04-08

    Abstract: A superscalar microprocessor is provided employing a way prediction unit which predicts the next fetch address as well as the way of the instruction cache that the current fetch address hits in while the instructions associated with the current fetch are being read from the instruction cache. The microprocessor may achieve high frequency operation while using an associative instruction cache. An instruction fetch can be made every clock cycle using the predicted fetch address from the way prediction unit until an incorrect next fetch address or an incorrect way is predicted. The instructions from the predicted way are provided to the instruction processing pipelines of the superscalar microprocessor each clock cycle.

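The way-prediction behavior can be sketched as a toy model (not the patented hardware; the train-on-mispredict policy shown is an assumption): the predicted way is read immediately, and only a tag mismatch forces a corrective full lookup.

```python
class WayPredictedCache:
    """Toy model of way prediction in an associative instruction cache."""

    def __init__(self, num_ways):
        self.ways = [dict() for _ in range(num_ways)]  # way -> {addr: insns}
        self.way_pred = {}                             # addr -> predicted way

    def fill(self, way, addr, insns):
        self.ways[way][addr] = insns

    def fetch(self, addr):
        """Return (instructions, mispredicted).

        The predicted way is read speculatively; on a tag mismatch all
        ways are checked and the predictor is retrained.
        """
        way = self.way_pred.get(addr, 0)
        if addr in self.ways[way]:
            return self.ways[way][addr], False  # prediction correct
        for w, store in enumerate(self.ways):   # slow path: check every way
            if addr in store:
                self.way_pred[addr] = w         # train the predictor
                return store[addr], True        # way mispredict penalty
        return None, True                       # cache miss
```

In the patent the unit also predicts the next fetch address each cycle; here only the way-selection half is modeled.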

    Contention handling apparatus for generating user busy signal by
logically summing wait output of next higher priority user and access
requests of higher priority users
    24.
    Invention grant
    Contention handling apparatus for generating user busy signal by logically summing wait output of next higher priority user and access requests of higher priority users (Expired)

    Publication number: US5301330A

    Publication date: 1994-04-05

    Application number: US596549

    Application date: 1990-10-12

    Applicant: Thang M. Tran

    Inventor: Thang M. Tran

    CPC classification number: G06F13/368

    Abstract: Contention handling apparatus which receives access request signals from a number of users and processes these requests to allow controlled access to a shared resource. The contention handling apparatus includes a number of access blocks, with one of the access blocks being associated with each user. A busy line of each of the access blocks is connected to receive a busy signal; the busy signal being an access request signal from a higher priority user, thereby indicating that the shared resource is unavailable. Each access block receiving a busy signal latches the corresponding access request signal until the busy signal is deasserted. If the busy signal and the access request signal occur at the same time, the corresponding access block generates a wait output signal. The logical sum of the wait output of an access block associated with a next higher priority user and the access request signals of all the higher priority users serves as the busy signal for one of the access blocks.

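The busy-signal rule in the abstract is a simple logical sum, which the sketch below computes directly (user 0 is assumed to be highest priority; the latching behavior of the access blocks is not modeled here):

```python
def busy_signals(requests, waits):
    """Compute each user's busy signal per the logical-sum rule.

    busy[i] is the OR of the wait output of the next-higher-priority
    user (i - 1) and the access requests of all higher-priority users
    0..i-1. User 0 is never busy: no one outranks it.
    """
    busy = []
    for i in range(len(requests)):
        higher_reqs = any(requests[:i])                # requests of users 0..i-1
        prev_wait = waits[i - 1] if i > 0 else False   # wait of user i-1
        busy.append(prev_wait or higher_reqs)
    return busy
```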

    Method and apparatus for reducing critical speed path delays
    25.
    Invention grant
    Method and apparatus for reducing critical speed path delays (Expired)

    Publication number: US4940908A

    Publication date: 1990-07-10

    Application number: US343623

    Application date: 1989-04-27

    Applicant: Thang M. Tran

    Inventor: Thang M. Tran

    CPC classification number: H03K19/01707

    Abstract: A method and apparatus is disclosed for reducing the propagation delay associated with the critical speed path of a binary logic circuit by using "multiplexing logic". More specifically, the inputs to the logic circuit are defined as either critical or non-critical inputs and the product terms are manipulated so that the non-critical inputs are mutually exclusive. The non-critical inputs are supplied to one or more first logic gate structures wherein the ultimate outputs of the first logic gate structures control multiplexer couplers. The critical speed inputs are supplied to one or more second logic gate structures wherein the ultimate outputs of the second logic gate structures are provided as the input to the multiplexer couplers.
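The technique can be demonstrated on a small, invented example function: when the non-critical inputs n1 and n2 are mutually exclusive, the two-level AND-OR form can be rewritten so that they merely steer a multiplexer whose data inputs are the late-arriving critical signals, keeping the critical path short.

```python
from itertools import product

def f_direct(n1, n2, c1, c2):
    """Original two-level form: the critical inputs c1/c2 pass through
    the same AND-OR network as the slow select terms."""
    return (n1 and c1) or (n2 and c2)

def f_mux(n1, n2, c1, c2):
    """Mux form: the mutually exclusive non-critical inputs n1/n2 set up
    the mux select early; the critical inputs then see only the mux's
    final selection stage."""
    if n1:
        return c1
    if n2:
        return c2
    return False

# Equivalence holds whenever n1 and n2 are mutually exclusive, which the
# patent achieves by manipulating the product terms.
for n1, n2, c1, c2 in product([False, True], repeat=4):
    if n1 and n2:
        continue  # excluded by construction
    assert bool(f_direct(n1, n2, c1, c2)) == bool(f_mux(n1, n2, c1, c2))
```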

    Techniques for utilizing translation lookaside buffer entry numbers to improve processor performance
    26.
    Invention grant
    Techniques for utilizing translation lookaside buffer entry numbers to improve processor performance (In force)

    Publication number: US08984254B2

    Publication date: 2015-03-17

    Application number: US13630346

    Application date: 2012-09-28

    CPC classification number: G06F12/1027 Y02D10/13

    Abstract: A technique for operating a processor includes translating, using an associated translation lookaside buffer, a first virtual address into a first physical address through a first entry number in the translation lookaside buffer. The technique also includes translating, using the translation lookaside buffer, a second virtual address into a second physical address through a second entry number in the translation lookaside buffer. The technique further includes, in response to the first entry number being the same as the second entry number, determining that the first and second virtual addresses point to the same physical address in memory and reference the same data.

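A toy model of the entry-number shortcut (data structures invented for illustration): because a TLB entry maps one virtual page to one physical page, two translations that come through the same entry number, at the same page offset, must reference the same data, so a narrow entry-number compare can replace a full physical-address compare.

```python
PAGE_SIZE = 4096  # illustrative page size

class TLB:
    """Toy TLB whose translate() also reports the entry number used."""

    def __init__(self):
        self.entries = []  # entry number -> (virtual page, physical page)

    def install(self, vpage, ppage):
        self.entries.append((vpage, ppage))
        return len(self.entries) - 1

    def translate(self, vaddr):
        vpage, offset = divmod(vaddr, PAGE_SIZE)
        for n, (v, p) in enumerate(self.entries):
            if v == vpage:
                return p * PAGE_SIZE + offset, n  # (physical address, entry #)
        raise KeyError("TLB miss")

def same_data(tlb, va1, va2):
    """Same TLB entry + same page offset => same physical address,
    without comparing the (wider) physical addresses themselves."""
    _, e1 = tlb.translate(va1)
    _, e2 = tlb.translate(va2)
    return e1 == e2 and va1 % PAGE_SIZE == va2 % PAGE_SIZE
```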

    Data processing system operable in single and multi-thread modes and having multiple caches and method of operation
    27.
    Invention grant
    Data processing system operable in single and multi-thread modes and having multiple caches and method of operation (In force)

    Publication number: US08966232B2

    Publication date: 2015-02-24

    Application number: US13370420

    Application date: 2012-02-10

    Applicant: Thang M. Tran

    Inventor: Thang M. Tran

    Abstract: In some embodiments, a data processing system includes a processing unit, a first load/store unit LSU and a second LSU configured to operate independently of the first LSU in single and multi-thread modes. A first store buffer is coupled to the first and second LSUs, and a second store buffer is coupled to the first and second LSUs. The first store buffer is used to execute a first thread in multi-thread mode. The second store buffer is used to execute a second thread in multi-thread mode. The first and second store buffers are used when executing a single thread in single thread mode.

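The buffer-assignment policy reduces to a few lines (buffer names `SB0`/`SB1` and the thread-to-buffer mapping are invented for this sketch): in multi-thread mode each thread owns one store buffer, while in single-thread mode the lone thread uses both.

```python
def store_buffers_for(thread_id, multi_thread, buffers=("SB0", "SB1")):
    """Return the store buffer(s) a thread may allocate into.

    Multi-thread mode: each thread gets its own buffer.
    Single-thread mode: the single thread uses all buffers.
    """
    if multi_thread:
        return (buffers[thread_id],)  # one buffer per thread
    return buffers                    # one thread, both buffers
```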

    APPARATUS AND METHOD FOR DYNAMIC ALLOCATION OF EXECUTION QUEUES
    28.
    Invention application
    APPARATUS AND METHOD FOR DYNAMIC ALLOCATION OF EXECUTION QUEUES (Pending, published)

    Publication number: US20130297912A1

    Publication date: 2013-11-07

    Application number: US13462993

    Application date: 2012-05-03

    CPC classification number: G06F9/3836 G06F9/3814 G06F9/3822 G06F9/3885

    Abstract: A processor reduces the likelihood of stalls at an instruction pipeline by dynamically extending the size of a full execution queue. To extend the full execution queue, the processor temporarily repurposes another execution queue to store instructions on behalf of the full execution queue. The execution queue to be repurposed can be selected based on a number of factors, including the type of instructions it is generally designated to store, whether it is empty of other instruction types, and the rate of cache hits at the processor. By selecting the repurposed queue based on dynamic factors such as the cache hit rate, the likelihood of stalls at the dispatch stage is reduced for different types of program flows, improving overall efficiency of the processor.

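A simplified model of the queue-borrowing decision (the policy shown — prefer the home queue, otherwise repurpose any empty queue of another type — is a reduction of the several factors the abstract lists, such as cache hit rate):

```python
def choose_queue(queues, insn_type):
    """Pick an execution queue for insn_type, extending a full queue by
    temporarily repurposing an empty queue of another type.

    queues: name -> {"type": str, "entries": list, "capacity": int}
    Returns the chosen queue name, or None (dispatch stalls).
    """
    # First try the queue(s) designated for this instruction type.
    for name, q in queues.items():
        if q["type"] == insn_type and len(q["entries"]) < q["capacity"]:
            return name
    # Home queue full: borrow a queue that is empty of other instructions.
    for name, q in queues.items():
        if q["type"] != insn_type and not q["entries"]:
            return name
    return None  # everything occupied: the pipeline stalls
```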

    System and method for power efficient memory caching
    29.
    Invention grant
    System and method for power efficient memory caching (In force)

    Publication number: US07330936B2

    Publication date: 2008-02-12

    Application number: US11109163

    Application date: 2005-04-19

    CPC classification number: G06F12/0864 G06F12/1054 G06F2212/1028 Y02D10/13

    Abstract: A system and method for power efficient memory caching. Some illustrative embodiments may include a system comprising: a hash address generator coupled to an address bus (the hash address generator converts a bus address present on the address bus into a current hashed address); a cache memory coupled to the address bus (the cache memory comprises a tag stored in one of a plurality of tag cache ways and data stored in one of a plurality of data cache ways); and a hash memory coupled to the address bus (the hash memory comprises a saved hashed address, the saved hashed address associated with the data and the tag). Fewer than all of the plurality of tag cache ways are enabled when the current hashed address matches the saved hashed address. An enabled tag cache way comprises the tag.

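The way-filtering idea can be sketched functionally. The abstract does not specify the hash generator, so the XOR-fold below, along with the field widths, is an illustrative stand-in: a small saved hash per way pre-filters which ways are powered up for the full tag compare.

```python
def hash_addr(addr, bits=4):
    """Cheap XOR-fold hash of the address (illustrative; the patent's
    hash address generator is unspecified)."""
    h = 0
    a = addr >> 6  # drop the line-offset bits (assumed 64-byte lines)
    while a:
        h ^= a & ((1 << bits) - 1)
        a >>= bits
    return h

def lookup(ways, addr):
    """ways: list of {"hash": h, "tag": t, "data": d} entries for one set.
    Only ways whose saved hash matches the current hash are enabled for
    the tag compare; returns (data or None, number of ways enabled)."""
    h = hash_addr(addr)
    tag = addr >> 12                                 # illustrative tag field
    enabled = [w for w in ways if w["hash"] == h]    # power up matching ways only
    for w in enabled:
        if w["tag"] == tag:
            return w["data"], len(enabled)
    return None, len(enabled)
```

The power saving comes from `len(enabled)` typically being much smaller than the total number of ways.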

    Cache holding register for delayed update of a cache line into an
instruction cache
    30.
    Invention grant
    Cache holding register for delayed update of a cache line into an instruction cache (Expired)

    Publication number: US6076146A

    Publication date: 2000-06-13

    Application number: US310356

    Application date: 1999-05-12

    Abstract: An instruction cache employing a cache holding register is provided. When a cache line of instruction bytes is fetched from main memory, the instruction bytes are temporarily stored into the cache holding register as they are received from main memory. The instruction bytes are predecoded as they are received from the main memory. If a predicted-taken branch instruction is encountered, the instruction fetch mechanism within the instruction cache begins fetching instructions from the target instruction path. This fetching may be initiated prior to receiving the complete cache line containing the predicted-taken branch instruction. As long as instruction fetches from the target instruction path continue to hit in the instruction cache, these instructions may be fetched and dispatched into a microprocessor employing the instruction cache. The remaining portion of the cache line of instruction bytes containing the predicted-taken branch instruction is received by the cache holding register. In order to reduce the number of ports employed upon the instruction bytes storage used to store cache lines of instructions, the cache holding register retains the cache line until an idle cycle occurs in the instruction bytes storage. The same port ordinarily used for fetching instructions is then used to store the cache line into the instruction bytes storage. In one embodiment, the instruction cache prefetches a succeeding cache line to the cache line which misses. A second cache holding register is employed for storing the prefetched cache line.

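A toy model of the holding register's port arbitration (structure invented for illustration; predecode and the second prefetch register are omitted): fills park in the holding register, fetches keep the single array port, and the held line is committed only on an idle cycle, with forwarding for fetches that hit the held line.

```python
class CacheHoldingRegister:
    """Toy model: a fill from main memory lands in a holding register and
    is written into the instruction-byte storage only when the storage's
    single port has an idle (no-fetch) cycle."""

    def __init__(self):
        self.storage = {}   # address -> cache line (the cache array)
        self.holding = None # (address, line) awaiting an idle cycle

    def fill(self, address, line):
        self.holding = (address, line)  # line parked; array port untouched

    def cycle(self, fetch_address=None):
        """One cycle: a fetch uses the port; otherwise drain the register."""
        if fetch_address is not None:
            # Port is busy with the fetch; forward from the holding
            # register if the fetch targets the line still being held.
            if self.holding and self.holding[0] == fetch_address:
                return self.holding[1]
            return self.storage.get(fetch_address)
        if self.holding:  # idle cycle: the held line takes the port
            addr, line = self.holding
            self.storage[addr] = line
            self.holding = None
        return None
```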
