Providing global translations with address space numbers
    1.
    Granted invention patent
    Status: In force

    Publication number: US06604187B1

    Publication date: 2003-08-05

    Application number: US09596636

    Filing date: 2000-06-19

    IPC classification: G06F12/08

    Abstract: A processor provides a register for storing an address space number (ASN). Operating system software may assign different ASNs to different processes. The processor may include a TLB to cache translations, and the TLB may record the ASN from the ASN register in a TLB entry being loaded. Thus, translations may be associated with processes through the ASNs. Generally, a TLB hit will be detected in an entry if the virtual address to be translated matches the virtual address tag and the ASN matches the ASN stored in the register. Additionally, the processor may use an indication from the translation table entries to indicate whether or not a translation is global. If a translation is global, then the ASN comparison is not included in detecting a hit in the TLB. Thus, translations which are used by more than one process may not occupy multiple TLB entries. Instead, a hit may be detected on the TLB entry storing the global translation even though the recorded ASN may not match the current ASN. In one embodiment, if ASNs are disabled, the TLB may be flushed on context switches. However, the indication from the translation table entries used to indicate that the translation is global may be used (when ASNs are disabled) by the TLB to selectively invalidate non-global translations on a context switch while not invalidating global translations.

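    Illustrative sketch (not part of the patent record): the hit rule described in the abstract can be modeled as a small fully associative TLB in Python. The class and field names (`Tlb`, `TlbEntry`, `is_global`) are hypothetical, and skipping the ASN comparison whenever ASNs are disabled is an assumption of this sketch rather than something the abstract states.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TlbEntry:
    vpn: int            # virtual page number tag
    pfn: int            # physical frame number
    asn: int            # ASN recorded from the ASN register at load time
    is_global: bool     # global indication from the translation table entry
    valid: bool = True


class Tlb:
    """Toy fully associative TLB; a model of the idea, not the patented circuit."""

    def __init__(self, asn_enabled: bool = True) -> None:
        self.entries: List[TlbEntry] = []
        self.asn_enabled = asn_enabled
        self.current_asn = 0  # stands in for the ASN register

    def load(self, vpn: int, pfn: int, is_global: bool) -> None:
        # Record the ASN currently held in the ASN register with the entry.
        self.entries.append(TlbEntry(vpn, pfn, self.current_asn, is_global))

    def lookup(self, vpn: int) -> Optional[int]:
        for e in self.entries:
            if not e.valid or e.vpn != vpn:
                continue
            # Global entries skip the ASN comparison. With ASNs disabled the
            # comparison is skipped as well (assumption of this sketch).
            if e.is_global or not self.asn_enabled or e.asn == self.current_asn:
                return e.pfn
        return None

    def context_switch(self, new_asn: int) -> None:
        if self.asn_enabled:
            # Entries stay resident; the new ASN decides which ones can hit.
            self.current_asn = new_asn
        else:
            # ASNs disabled: invalidate only the non-global translations.
            for e in self.entries:
                if not e.is_global:
                    e.valid = False
```

    With ASNs enabled, a non-global entry loaded under ASN 1 misses after a switch to ASN 2 and hits again after switching back, while a global entry hits under either ASN.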

    Store queue multimatch detection
    2.
    Granted invention patent
    Status: In force

    Publication number: US06523109B1

    Publication date: 2003-02-18

    Application number: US09433189

    Filing date: 1999-10-25

    Applicant: Stephan G. Meier

    Inventor: Stephan G. Meier

    IPC classification: G06F9/44

    Abstract: A processor includes a store queue configured to detect a hit on a store queue entry for a load being executed by the processor, and to forward data from the store queue entry to provide a result for the load. The store queue data is provided to the data cache, along with an indication of how much data is being provided (e.g. byte enables). The data cache may then fill in any additional data accessed by the load from cache data, and provide a load result. Additionally, the store queue is configured to detect if more than one store queue entry is hit (i.e. that more than one store within the store queue updates at least one byte accessed by the load), referred to as a multimatch. If a multimatch is detected, the store queue retries the load. Subsequently, the load may be reexecuted and may not multimatch (as entries are deleted upon completion of the corresponding stores). The load may complete when it does not multimatch. In one embodiment, the store queue independently detects hits on the upper and lower portions of each store queue entry (e.g. doubleword portions) and forwards from the upper and lower portions independently. Thus, a load may hit one store queue entry for the lower portion of the data accessed by the load and a different store queue entry for the upper portion of the data accessed by the load without multimatch detection.

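    Illustrative sketch (not part of the patent record): a byte-level Python model of forwarding with multimatch detection. `StoreEntry` and `execute_load` are hypothetical names, the cache is modeled as a plain address-to-byte dictionary, and the independent upper/lower-half embodiment is left out for brevity.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class StoreEntry:
    addr: int     # address of the first byte written by the store
    data: bytes   # bytes waiting in the store queue


def execute_load(store_queue: List[StoreEntry], cache: Dict[int, int],
                 addr: int, size: int) -> Optional[bytes]:
    """Return the load result, or None when the load must be retried."""
    load_bytes = range(addr, addr + size)
    # A store entry is hit if it updates at least one byte the load accesses.
    hits = [s for s in store_queue
            if any(s.addr <= a < s.addr + len(s.data) for a in load_bytes)]
    if len(hits) > 1:
        return None  # multimatch: retry until older stores have completed
    result = bytearray()
    for a in load_bytes:
        if hits and hits[0].addr <= a < hits[0].addr + len(hits[0].data):
            result.append(hits[0].data[a - hits[0].addr])  # forwarded byte
        else:
            result.append(cache.get(a, 0))                 # filled in from the cache
    return bytes(result)
```

    Removing an entry from `store_queue` models a store completing, after which a previously retried load may hit a single entry and finish.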

    Store queue number assignment and tracking
    3.
    Granted invention patent
    Status: In force

    Publication number: US06481251B1

    Publication date: 2002-11-19

    Application number: US09433184

    Filing date: 1999-10-25

    IPC classification: G06F3/00

    Abstract: A processor includes a store queue and a store queue number assignment circuit. The store queue number assignment circuit assigns store queue numbers to stores, and operates upon instruction operations prior to the instruction operations reaching a point in the pipeline of the processor at which out of order instruction processing begins. Thus, store queue entries may be reserved for stores according to the program order of the stores. Additionally, in one embodiment, the store queue number identifying the youngest store represented in the store queue may be assigned to loads. In this manner, loads may determine which stores in the store queue are older or younger than the load based on relative position within the store queue. Checking for store queue hits may be qualified with the entries between the head of the store queue and the entry indicated by the load's store queue number. In one particular embodiment, the store queue number may include an additional “toggle” bit which is toggled each time the assignment of store queue numbers reaches the maximum store queue entry and wraps to zero. If the toggle bit of the store in the store queue entry identified by the load's store queue number differs from the toggle bit of the load's store queue number, then the store queue entry has been reassigned to a store younger than the load.

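    Illustrative sketch (not part of the patent record): in-order store queue number assignment with a wrap-around toggle bit, modeled in Python. `SqNum`, `SqNumAssigner`, and `entry_reassigned` are hypothetical names, and the queue depth of 4 is an arbitrary choice for the example.

```python
from dataclasses import dataclass
from typing import List, Optional

SQ_SIZE = 4  # hypothetical queue depth


@dataclass(frozen=True)
class SqNum:
    index: int   # store queue entry number
    toggle: int  # flips each time assignment wraps from the last entry to zero


class SqNumAssigner:
    """Assigns store queue numbers in program order, before out-of-order issue."""

    def __init__(self, size: int = SQ_SIZE) -> None:
        self.size = size
        self.next = SqNum(0, 0)
        self.youngest: Optional[SqNum] = None

    def assign_store(self) -> SqNum:
        num = self.next
        self.youngest = num
        if num.index + 1 == self.size:
            self.next = SqNum(0, num.toggle ^ 1)   # wrap to zero, flip the toggle
        else:
            self.next = SqNum(num.index + 1, num.toggle)
        return num

    def assign_load(self) -> Optional[SqNum]:
        # A load receives the number of the youngest store ahead of it.
        return self.youngest


def entry_reassigned(entry_toggles: List[int], load_num: SqNum) -> bool:
    """True if the entry named by the load now holds a store younger than the load."""
    return entry_toggles[load_num.index] != load_num.toggle
```

    Here `entry_toggles` stands for the toggle bits stored alongside each store queue entry; a mismatch with the load's own toggle bit signals that the entry was recycled after the load was numbered.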

    Rapid execution of floating point load control word instructions
    4.
    Granted invention patent
    Status: In force

    Publication number: US06405305B1

    Publication date: 2002-06-11

    Application number: US09394024

    Filing date: 1999-09-10

    IPC classification: G06F9/302

    Abstract: A microprocessor with a floating point unit configured to rapidly execute floating point load control word (FLDCW) type instructions in an out of program order context is disclosed. The floating point unit is configured to schedule instructions older than the FLDCW-type instruction before the FLDCW-type instruction is scheduled. The FLDCW-type instruction acts as a barrier to prevent instructions occurring after the FLDCW-type instruction in program order from executing before the FLDCW-type instruction. Indicator bits may be used to simplify instruction scheduling, and copies of the floating point control word may be stored for instructions that have long execution cycles. A method and computer configured to rapidly execute FLDCW-type instructions in an out of program order context are also disclosed.


    Dynamic memory allocation suitable for stride-based prefetching
    5.
    Granted invention patent
    Status: Expired

    Publication number: US6076151A

    Publication date: 2000-06-13

    Application number: US948947

    Filing date: 1997-10-10

    Applicant: Stephan G. Meier

    Inventor: Stephan G. Meier

    IPC classification: G06F9/38 G06F12/08 G06F17/30

    Abstract: A dynamic memory allocation routine maintains an allocation size cache which records the address of a most recently allocated memory block for each different size of memory block that has been allocated. Upon receiving a dynamic memory allocation request, the dynamic memory allocation routine determines if the requested size is equal to one of the sizes recorded in the allocation size cache. If a matching size is found, the dynamic memory allocation routine attempts to allocate a memory block contiguous to the most recently allocated memory block of that matching size. If the contiguous memory block has been allocated to another memory block, the dynamic memory allocation routine attempts to reserve a reserved memory block having a size which is a predetermined multiple of the requested size. The requested memory block is then allocated at the beginning of the reserved memory block. By reserving the reserved memory block, the dynamic memory allocation routine may increase the likelihood that subsequent requests for memory blocks having the requested size can be allocated in contiguous memory locations.

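    Illustrative sketch (not part of the patent record): a toy bump allocator in Python that keeps an allocation size cache and falls back to reserving a multiple of the requested size. `SizeCachedAllocator`, the base address, and the multiple of 4 are hypothetical; freeing memory is omitted, and serving the first request of a given size from the top of the heap is an assumed policy that the abstract does not specify.

```python
from typing import Dict, Tuple


class SizeCachedAllocator:
    """Toy bump allocator with an allocation size cache (addresses are ints)."""

    def __init__(self, base: int = 0x1000, multiple: int = 4) -> None:
        self.top = base                        # first never-used address
        self.multiple = multiple               # hypothetical reservation multiple
        self.size_cache: Dict[int, int] = {}   # size -> most recent block of that size
        self.reserved: Dict[int, Tuple[int, int]] = {}  # size -> [start, end) reserved

    def malloc(self, size: int) -> int:
        if size in self.size_cache:
            candidate = self.size_cache[size] + size   # block contiguous to the last one
            start, end = self.reserved.get(size, (0, 0))
            if start <= candidate and candidate + size <= end:
                addr = candidate               # room left in this size's reservation
            elif candidate == self.top:
                addr = candidate               # contiguous space is still unused
                self.top += size
            else:
                # The contiguous block was taken: reserve multiple * size and
                # allocate at the beginning of the reservation.
                addr = self.top
                self.reserved[size] = (addr, addr + self.multiple * size)
                self.top += self.multiple * size
        else:
            addr = self.top                    # first request of this size (assumed policy)
            self.top += size
        self.size_cache[size] = addr
        return addr
```

    Once a reservation exists for a size, later requests of that size land at a constant stride even when allocations of other sizes are interleaved, which is the access pattern a stride-based prefetcher can follow.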

    Apparatus and method for superforwarding load operands in a microprocessor
    6.
    Granted invention patent
    Status: In force

    Publication number: US06442677B1

    Publication date: 2002-08-27

    Application number: US09329497

    Filing date: 1999-06-10

    IPC classification: G06F9/312

    CPC classification: G06F9/30043 G06F9/3826

    Abstract: An apparatus and method for superforwarding load operands in a microprocessor are provided. An execution unit in a microprocessor is configured to receive a load instruction and a subsequent instruction. If the load instruction corresponds to a simple load instruction, a destination operand of the load instruction can be superforwarded to a subsequent instruction if the subsequent instruction specifies a source operand that depends on the destination operand of the load instruction. The subsequent instruction is not required to wait until a load instruction executes or completes and can be scheduled and/or executed prior to or at the same time as the load instruction. Consequently, latencies associated with operand dependencies may be reduced.


    Method and apparatus for rapid execution of FCOM and FSTSW
    7.
    Granted invention patent
    Status: In force

    Publication number: US06425074B1

    Publication date: 2002-07-23

    Application number: US09393524

    Filing date: 1999-09-10

    IPC classification: G06F9/302

    Abstract: A microprocessor configured to rapidly execute floating point store status word (FSTSW) type instructions that are immediately preceded by floating point compare (FCOM) type instructions is disclosed. FCOM-type instructions are modified to store their results to an architectural floating point status word and a temporary destination register. If an FSTSW-type instruction is detected immediately following an FCOM-type instruction, then the FSTSW-type instruction is transformed into a special fast floating point store status word (FSTSWEF) instruction. Unlike the FSTSW-type instruction, which is serializing and negatively impacts performance, the FSTSWEF instruction is not serializing and allows execution to continue without undue serialization. A computer system and method for rapidly executing FSTSW instructions immediately preceded by FCOM-type instructions are also disclosed.

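    Illustrative sketch (not part of the patent record): the decode-time rewrite described in the abstract, modeled in Python over a list of mnemonics. `rewrite_fstsw` is a hypothetical name, and which x87 mnemonics count as "FCOM-type" or "FSTSW-type" is an assumption made for the example.

```python
from typing import List

# Assumed membership of each instruction class, for illustration only.
FCOM_TYPE = {"FCOM", "FCOMP", "FCOMPP"}
FSTSW_TYPE = {"FSTSW", "FNSTSW"}


def rewrite_fstsw(ops: List[str]) -> List[str]:
    """Replace an FSTSW-type op that immediately follows an FCOM-type op."""
    out: List[str] = []
    for i, op in enumerate(ops):
        if op in FSTSW_TYPE and i > 0 and ops[i - 1] in FCOM_TYPE:
            out.append("FSTSWEF")   # fast, non-serializing form named in the abstract
        else:
            out.append(op)          # everything else passes through unchanged
    return out
```

    An FSTSW-type instruction that does not directly follow a compare keeps its original, serializing form.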

    Data transaction typing for improved caching and prefetching characteristics
    8.
    Granted invention patent
    Status: Expired

    Publication number: US6151662A

    Publication date: 2000-11-21

    Application number: US982720

    Filing date: 1997-12-02

    Abstract: A microprocessor assigns a data transaction type to each instruction. The data transaction type is based upon the encoding of the instruction, and indicates an access mode for memory operations corresponding to the instruction. The access mode may, for example, specify caching and prefetching characteristics for the memory operation. The access mode for each data transaction type is selected to enhance the speed of access by the microprocessor to the data, or to enhance the overall cache and prefetching efficiency of the microprocessor by inhibiting caching and/or prefetching for those memory operations. Instead of relying on data memory access patterns and overall program behavior to determine caching and prefetching operations, these operations are determined on an instruction-by-instruction basis. Additionally, the data transaction types assigned to different instruction encodings may be revealed to program developers. Program developers may use the instruction encodings (and instruction encodings which are assigned to a nil data transaction type causing a default access mode) to optimize use of processor resources during program execution.


    LOAD-STORE DEPENDENCY PREDICTOR PC HASHING
    9.
    Invention patent application
    Status: In force

    Publication number: US20130326198A1

    Publication date: 2013-12-05

    Application number: US13483268

    Filing date: 2012-05-30

    IPC classification: G06F9/30 G06F9/312

    Abstract: Methods and processors for managing load-store dependencies in an out-of-order instruction pipeline. A load store dependency predictor includes a table for storing entries for load-store pairs that have been found to be dependent and execute out of order. Each entry in the table includes hashed values to identify load and store operations. When a load or store operation is detected, the PC and an architectural register number are used to create a hashed value that can be used to uniquely identify the operation. Then, the load store dependency predictor table is searched for any matching entries with the same hashed value.

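    Illustrative sketch (not part of the patent record): the abstract does not give the hash function, so the Python below uses an assumed XOR-fold of the PC combined with the architectural register number. `hash_op`, `PairEntry`, `matching_entries`, the 10-bit width, and the 4-byte instruction alignment are all hypothetical. A hash this narrow can alias, so in this sketch it identifies an operation only approximately.

```python
from dataclasses import dataclass
from typing import List


def hash_op(pc: int, arch_reg: int, bits: int = 10) -> int:
    """Fold the PC into `bits` bits with XOR, then mix in the register number."""
    mask = (1 << bits) - 1
    value = pc >> 2          # drop low bits; assumes 4-byte-aligned instructions
    h = 0
    while value:
        h ^= value & mask
        value >>= bits
    return (h ^ arch_reg) & mask


@dataclass
class PairEntry:
    store_hash: int   # hashed identity of the store in the dependent pair
    load_hash: int    # hashed identity of the load in the dependent pair


def matching_entries(table: List[PairEntry], op_hash: int,
                     is_store: bool) -> List[PairEntry]:
    """Search the predictor table for entries with the same hashed value."""
    if is_store:
        return [e for e in table if e.store_hash == op_hash]
    return [e for e in table if e.load_hash == op_hash]
```

    Storing short hashes instead of full PCs keeps each table entry small, at the cost of occasional false matches between unrelated operations.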

    LOAD-STORE DEPENDENCY PREDICTOR CONTENT MANAGEMENT
    10.
    Invention patent application
    Status: In force

    Publication number: US20130298127A1

    Publication date: 2013-11-07

    Application number: US13464647

    Filing date: 2012-05-04

    IPC classification: G06F9/46

    Abstract: Methods and apparatuses for managing load-store dependencies in an out-of-order processor. A load store dependency predictor may include a table for storing entries for load-store pairs that have been found to be dependent and execute out of order. Each entry in the table includes a counter to indicate a strength of the dependency prediction. If the counter is above a threshold, a dependency is enforced for the load-store pair. If the counter is below the threshold, the dependency is not enforced for the load-store pair. When a store is dispatched, the table is searched, and any matching entries in the table are armed. If a load is dispatched, matches on an armed entry, and the counter is above the threshold, then the load will wait to issue until the corresponding store issues.

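    Illustrative sketch (not part of the patent record): the arm-and-wait behavior from the abstract, modeled in Python. `LsdEntry`, `LsdPredictor`, the threshold value of 2, and matching on raw PCs are hypothetical. Disarming an entry when its store issues, and treating a counter equal to the threshold as "not enforced", are assumptions of this sketch.

```python
from dataclasses import dataclass
from typing import List, Optional

THRESHOLD = 2  # hypothetical strength threshold


@dataclass
class LsdEntry:
    store_pc: int
    load_pc: int
    counter: int                      # strength of the dependency prediction
    armed: bool = False
    store_id: Optional[int] = None    # in-flight store that armed the entry


class LsdPredictor:
    def __init__(self) -> None:
        self.table: List[LsdEntry] = []

    def dispatch_store(self, store_pc: int, store_id: int) -> None:
        # Search the table and arm every entry that matches the store.
        for e in self.table:
            if e.store_pc == store_pc:
                e.armed = True
                e.store_id = store_id

    def dispatch_load(self, load_pc: int) -> Optional[int]:
        """Return the id of the store the load must wait for, or None."""
        for e in self.table:
            if e.load_pc == load_pc and e.armed and e.counter > THRESHOLD:
                return e.store_id
        return None

    def store_issued(self, store_id: int) -> None:
        # Assumed behavior: once the store issues, waiting loads are released.
        for e in self.table:
            if e.armed and e.store_id == store_id:
                e.armed = False
                e.store_id = None
```

    A load that matches an armed entry with a weak counter is allowed to issue immediately, so a stale or mistaken prediction costs nothing until the counter has been strengthened.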