Data transaction typing for improved caching and prefetching characteristics
    11.
    Granted patent
    Legal status: no longer in force

    Publication No.: US6151662A

    Publication date: 2000-11-21

    Application No.: US982720

    Filing date: 1997-12-02

    Abstract: A microprocessor assigns a data transaction type to each instruction. The data transaction type is based upon the encoding of the instruction, and indicates an access mode for memory operations corresponding to the instruction. The access mode may, for example, specify caching and prefetching characteristics for the memory operation. The access mode for each data transaction type is selected to enhance the speed of access by the microprocessor to the data, or to enhance the overall cache and prefetching efficiency of the microprocessor by inhibiting caching and/or prefetching for those memory operations. Instead of relying on data memory access patterns and overall program behavior to determine caching and prefetching operations, these operations are determined on an instruction-by-instruction basis. Additionally, the data transaction types assigned to different instruction encodings may be revealed to program developers. Program developers may use the instruction encodings (and instruction encodings which are assigned to a nil data transaction type causing a default access mode) to optimize use of processor resources during program execution.
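
    The mapping from instruction encoding to access mode described in this abstract can be illustrated with a minimal Python sketch. The opcode values, the transaction type names other than the nil type, and the access-mode fields are all assumptions made for illustration; the abstract does not specify them.

```python
from dataclasses import dataclass
from enum import Enum


class TransactionType(Enum):
    NIL = 0        # nil type: selects the default access mode
    STREAMING = 1  # hypothetical: data expected to be touched once
    REUSED = 2     # hypothetical: data expected to be reused soon


@dataclass(frozen=True)
class AccessMode:
    cacheable: bool
    prefetch: bool


# Hypothetical access mode chosen for each transaction type.
ACCESS_MODES = {
    TransactionType.NIL: AccessMode(cacheable=True, prefetch=False),
    TransactionType.STREAMING: AccessMode(cacheable=False, prefetch=True),
    TransactionType.REUSED: AccessMode(cacheable=True, prefetch=True),
}

# Hypothetical opcodes; any encoding not listed gets the nil type.
ENCODING_TO_TYPE = {
    0x10: TransactionType.STREAMING,
    0x11: TransactionType.REUSED,
}


def access_mode_for(encoding: int) -> AccessMode:
    """Choose the access mode from the instruction encoding alone."""
    txn_type = ENCODING_TO_TYPE.get(encoding, TransactionType.NIL)
    return ACCESS_MODES[txn_type]
```

    The decision uses only the encoding, not observed access patterns, which is the contrast the abstract draws.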

    LOAD-STORE DEPENDENCY PREDICTOR PC HASHING
    12.
    Patent application
    Legal status: in force

    Publication No.: US20130326198A1

    Publication date: 2013-12-05

    Application No.: US13483268

    Filing date: 2012-05-30

    IPC classification: G06F9/30 G06F9/312

    Abstract: Methods and processors for managing load-store dependencies in an out-of-order instruction pipeline. A load-store dependency predictor includes a table for storing entries for load-store pairs that have been found to be dependent and execute out of order. Each entry in the table includes hashed values to identify load and store operations. When a load or store operation is detected, the PC and an architectural register number are used to create a hashed value that can be used to uniquely identify the operation. Then, the load-store dependency predictor table is searched for any matching entries with the same hashed value.
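
    As a rough illustration of the lookup this abstract describes, the Python sketch below hashes a PC together with an architectural register number and searches a table of (store hash, load hash) pairs. The XOR-fold hash, the 10-bit width, and the class and method names are assumptions; the abstract does not give the actual hash function or table layout.

```python
def hash_op(pc: int, arch_reg: int, bits: int = 10) -> int:
    """Fold a PC and an architectural register number into a short tag.

    Hypothetical XOR-fold; assumes 4-byte-aligned instructions.
    """
    mask = (1 << bits) - 1
    value = (pc >> 2) ^ (arch_reg << 5)
    folded = 0
    while value:
        folded ^= value & mask
        value >>= bits
    return folded


class DependencyTable:
    """Table of load-store pairs found to be dependent (sketch only)."""

    def __init__(self):
        self.entries = []  # list of (store_hash, load_hash) tuples

    def train(self, store_pc, store_reg, load_pc, load_reg):
        """Record a pair that was found dependent and executed out of order."""
        entry = (hash_op(store_pc, store_reg), hash_op(load_pc, load_reg))
        if entry not in self.entries:
            self.entries.append(entry)

    def match_store(self, pc, reg):
        """Return entries whose store hash matches this operation."""
        tag = hash_op(pc, reg)
        return [e for e in self.entries if e[0] == tag]

    def match_load(self, pc, reg):
        """Return entries whose load hash matches this operation."""
        tag = hash_op(pc, reg)
        return [e for e in self.entries if e[1] == tag]
```

    A short tag like this one can alias two different operations; how the claimed design keeps identifiers unique is not stated in the abstract.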

    LOAD-STORE DEPENDENCY PREDICTOR CONTENT MANAGEMENT
    13.
    Patent application
    Legal status: in force

    Publication No.: US20130298127A1

    Publication date: 2013-11-07

    Application No.: US13464647

    Filing date: 2012-05-04

    IPC classification: G06F9/46

    Abstract: Methods and apparatuses for managing load-store dependencies in an out-of-order processor. A load-store dependency predictor may include a table for storing entries for load-store pairs that have been found to be dependent and execute out of order. Each entry in the table includes a counter to indicate a strength of the dependency prediction. If the counter is above a threshold, a dependency is enforced for the load-store pair. If the counter is below the threshold, the dependency is not enforced for the load-store pair. When a store is dispatched, the table is searched, and any matching entries in the table are armed. If a load is dispatched, matches on an armed entry, and the counter is above the threshold, then the load will wait to issue until the corresponding store issues.
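
    The arm-and-wait behavior in this abstract can be sketched in Python as follows. The training rule, the initial counter value, the saturation limit, and disarming an entry when the store issues are assumptions added so the example is complete; the abstract only states the threshold test and the arming on store dispatch.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    store_id: int
    load_id: int
    counter: int = 0
    armed: bool = False


class Predictor:
    def __init__(self, threshold=1, max_count=3):
        self.threshold = threshold
        self.max_count = max_count  # hypothetical saturation limit
        self.table = []

    def record_violation(self, store_id, load_id):
        """Hypothetical training: allocate a new entry or strengthen an old one."""
        for e in self.table:
            if e.store_id == store_id and e.load_id == load_id:
                e.counter = min(e.counter + 1, self.max_count)
                return
        self.table.append(Entry(store_id, load_id, counter=self.threshold + 1))

    def dispatch_store(self, store_id):
        """Arm every entry that matches the dispatched store."""
        for e in self.table:
            if e.store_id == store_id:
                e.armed = True

    def load_must_wait(self, load_id):
        """A load waits only on an armed entry whose counter exceeds the threshold."""
        return any(
            e.load_id == load_id and e.armed and e.counter > self.threshold
            for e in self.table
        )

    def issue_store(self, store_id):
        """Assumed: once the store issues, its entries are disarmed."""
        for e in self.table:
            if e.store_id == store_id:
                e.armed = False
```

    With this sketch, a load matching an entry waits only between `dispatch_store` and `issue_store` for the paired store, and only while the counter is above the threshold.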

    Determination of execution resource allocation based on concurrently executable misaligned memory operations
    14.
    Granted patent
    Legal status: no longer in force

    Publication No.: US06704854B1

    Publication date: 2004-03-09

    Application No.: US09433185

    Filing date: 1999-10-25

    IPC classification: G06F9/30

    Abstract: A processor includes execution resources for handling a first memory operation and a concurrent second memory operation. If one of the memory operations is misaligned, the processor may allocate the execution resources for the other memory operation to that memory operation. In one embodiment, the older memory operation proceeds if misalignment is detected. The younger memory operation is retried and may be reexecuted at a later time. If the older memory operation is misaligned, the execution resources provided for the younger operation may be allocated to the older memory operation. If only the younger memory operation is misaligned, the younger memory operation may be the older memory operation during a subsequent reexecution and may thus be allocated the execution resources to allow the memory operation to complete.
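
    The allocation policy of the embodiment in this abstract reduces to a small decision function. The Python sketch below is a simplification: it models the execution resources as two interchangeable "ports", a name and a granularity that are assumptions and do not come from the abstract.

```python
def allocate(older_misaligned: bool, younger_misaligned: bool):
    """Return (ports given to older op, ports given to younger op, younger retried).

    Assumes two ports, one normally used by each concurrent memory operation.
    """
    if older_misaligned:
        # The older op takes the younger op's port as well; the younger op retries.
        return 2, 0, True
    if younger_misaligned:
        # The older op proceeds alone; the younger op retries and may be
        # the older op when it is re-executed.
        return 1, 0, True
    # Neither op is misaligned: both proceed concurrently.
    return 1, 1, False
```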

    Providing global translations with address space numbers
    15.
    Granted patent
    Legal status: in force

    Publication No.: US06604187B1

    Publication date: 2003-08-05

    Application No.: US09596636

    Filing date: 2000-06-19

    IPC classification: G06F12/08

    Abstract: A processor provides a register for storing an address space number (ASN). Operating system software may assign different ASNs to different processes. The processor may include a TLB to cache translations, and the TLB may record the ASN from the ASN register in a TLB entry being loaded. Thus, translations may be associated with processes through the ASNs. Generally, a TLB hit will be detected in an entry if the virtual address to be translated matches the virtual address tag and the ASN matches the ASN stored in the register. Additionally, the processor may use an indication from the translation table entries to indicate whether or not a translation is global. If a translation is global, then the ASN comparison is not included in detecting a hit in the TLB. Thus, translations which are used by more than one process may not occupy multiple TLB entries. Instead, a hit may be detected on the TLB entry storing the global translation even though the recorded ASN may not match the current ASN. In one embodiment, if ASNs are disabled, the TLB may be flushed on context switches. However, the indication from the translation table entries used to indicate that the translation is global may be used (when ASNs are disabled) by the TLB to selectively invalidate non-global translations on a context switch while not invalidating global translations.
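
    The hit condition and the selective invalidation described in this abstract can be sketched in Python. The class and field names are illustrative, and the TLB is modeled as a plain list with no associativity or replacement policy, which is an assumption for brevity.

```python
from dataclasses import dataclass


@dataclass
class TlbEntry:
    vpn: int          # virtual address tag
    asn: int          # ASN recorded when the entry was loaded
    ppn: int          # translated physical page number
    is_global: bool   # global indication from the translation table entry
    valid: bool = True


class Tlb:
    def __init__(self):
        self.entries = []
        self.current_asn = 0  # models the ASN register

    def load(self, vpn, ppn, is_global):
        """Record the current ASN in the entry being loaded."""
        self.entries.append(TlbEntry(vpn, self.current_asn, ppn, is_global))

    def lookup(self, vpn):
        """Hit if the tag matches and the entry is global or its ASN matches."""
        for e in self.entries:
            if e.valid and e.vpn == vpn and (e.is_global or e.asn == self.current_asn):
                return e.ppn
        return None

    def context_switch_asns_disabled(self):
        """With ASNs disabled, invalidate only the non-global translations."""
        for e in self.entries:
            if not e.is_global:
                e.valid = False
```

    A global entry loaded under one ASN still hits after `current_asn` changes, so a translation shared by several processes needs only one entry.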

    Optimized allocation of multi-pipeline executable and specific pipeline executable instructions to execution pipelines based on criteria
    16.
    Granted patent
    Legal status: in force

    Publication No.: US06370637B1

    Publication date: 2002-04-09

    Application No.: US09370789

    Filing date: 1999-08-05

    IPC classification: G06F9/38

    Abstract: A microprocessor with a floating point unit configured to efficiently allocate multi-pipeline executable instructions is disclosed. Multi-pipeline executable instructions are instructions that are not forced to execute in a particular type of execution pipe. For example, junk ops are multi-pipeline executable. A junk op is an instruction that is executed at an early stage of the floating point unit's pipeline (e.g., during register rename), but still passes through an execution pipeline for exception checking. Junk ops are not limited to a particular execution pipeline, but instead may pass through any of the microprocessor's execution pipelines in the floating point unit. Multi-pipeline executable instructions are allocated on a per-clock cycle basis using a number of different criteria. For example, the allocation may vary depending upon the number of multi-pipeline executable instructions received by the floating point unit in a single clock cycle.

    Method and system for architectural power estimation
    18.
    Granted patent
    Legal status: in force

    Publication No.: US07051300B1

    Publication date: 2006-05-23

    Application No.: US10655390

    Filing date: 2003-09-04

    IPC classification: G06F17/50

    CPC classification: G06F17/5022 G06F2217/78

    Abstract: A method is provided for architectural integrated circuit power estimation. The method may include receiving a plurality of respective energy events, receiving a plurality of base-level energy models, and generating a plurality of power models. Each power model may hierarchically instantiate one or more of the base-level energy models. The method may further include mapping each respective energy event to one or more of the plurality of power models. The method may further include hierarchically evaluating a particular base-level energy model corresponding to a given respective energy event, estimating an energy associated with evaluation of the particular base-level energy model, and accumulating the energy in a power estimate corresponding to the given respective energy event.
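
    One way to picture the hierarchy in this abstract is the Python sketch below: base-level energy models are plain functions, power models instantiate them and nest other power models, and each energy event is mapped to the power models it exercises. The model names, energy coefficients, and units are invented for illustration, and the sketch evaluates every base model under a mapped power model, which is a simplification of the claimed method.

```python
from collections import defaultdict


# Hypothetical base-level energy models: energy per evaluation, arbitrary units.
def sram_read_energy(width_bits):
    return 0.5 * width_bits


def flop_energy(width_bits):
    return 0.1 * width_bits


class PowerModel:
    """A power model that instantiates base-level models and child power models."""

    def __init__(self, name, base_models=(), children=()):
        self.name = name
        self.base_models = list(base_models)  # (energy_function, width_bits) pairs
        self.children = list(children)

    def evaluate(self):
        """Evaluate this model's base-level models, then recurse into children."""
        energy = sum(fn(width) for fn, width in self.base_models)
        return energy + sum(child.evaluate() for child in self.children)


def estimate(events, event_map):
    """Accumulate energy per event type over a stream of energy events."""
    totals = defaultdict(float)
    for event in events:
        for model in event_map.get(event, ()):
            totals[event] += model.evaluate()
    return dict(totals)
```

    For example, a hypothetical "cache" power model can nest "tag" and "data" models, so that one mapped "cache_read" event charges energy for all three levels.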

    Method and apparatus for denormal load handling
    19.
    Granted patent
    Legal status: in force

    Publication No.: US06487653B1

    Publication date: 2002-11-26

    Application No.: US09383138

    Filing date: 1999-08-25

    IPC classification: G06F7/38

    Abstract: A microprocessor configured to dynamically switch its floating point load pipeline length from one stage in length to more than one stage in length is disclosed. The microprocessor may perform normal loads and detect denormal loads in a single clock cycle. The microprocessor temporarily stores each scheduled floating point instruction in a reissue buffer for at least one clock cycle. When a denormal load instruction is detected, the microprocessor is configured to add one or more stages to the floating point load pipeline to allow the denormal value to complete the conversion to an internal format. The longer pipeline is then used for all loads that follow the denormal load until there is an idle clock cycle or an abort occurs. At that point, the pipeline reverts to its original shorter state. In addition, the microprocessor may be configured to cancel instructions scheduled assuming the denormal load would take only one clock cycle to complete. The canceled instruction is then "replayed" during a later clock cycle from the reissue buffer. A method for performing denormal loads and a computer system are also disclosed.
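
    The pipeline-length switching in this abstract behaves like a two-state machine, sketched below in Python. The single extra stage and the string labels for load kinds are assumptions, and the reissue buffer and instruction replay are left out.

```python
class LoadPipeline:
    """Tracks whether the load pipeline is in its short or extended state."""

    def __init__(self, extra_stages=1):
        self.extra_stages = extra_stages  # hypothetical number of added stages
        self.extended = False

    def cycle(self, load=None, abort=False):
        """Advance one clock cycle.

        `load` is None for an idle cycle, or "normal" / "denormal".
        Returns the load latency in cycles, or None if no load completes.
        """
        if abort or load is None:
            # An idle cycle or an abort reverts the pipeline to its short state.
            self.extended = False
            return None
        if load == "denormal":
            # A denormal load lengthens the pipeline for itself and later loads.
            self.extended = True
        return 1 + self.extra_stages if self.extended else 1
```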

    Converting register data from a first format type to a second format type if a second type instruction consumes data produced by a first type instruction
    20.
    Granted patent
    Legal status: no longer in force

    Publication No.: US6105129A

    Publication date: 2000-08-15

    Application No.: US25233

    Filing date: 1998-02-18

    Abstract: A microprocessor includes one or more registers which are architecturally defined to be used for at least two data formats. In one embodiment, the registers are the floating point registers defined in the x86 architecture, and the data formats are the floating point data format and the multimedia data format. The registers actually implemented by the microprocessor for the floating point registers use an internal format for floating point data. Part of the internal format is a classification field which classifies the floating point data in the extended precision defined by the x86 microprocessor architecture. Additionally, a classification field encoding is reserved for multimedia data. As the microprocessor begins execution of each multimedia instruction, the classification information of the source operands is examined to determine if the data is either in the multimedia class, or in a floating point class in which the significand portion of the register is the same as the corresponding significand in extended precision. If so, the multimedia instruction executes normally. If not, the multimedia instruction is faulted. Similarly, as the microprocessor begins execution of each floating point instruction, the classification information of the source operands is examined. If the data is classified as multimedia, the floating point instruction is faulted. A microcode routine is used to reformat the data stored in at least the source registers of the faulting instruction into a format useable by the faulting instruction. Subsequently, the faulting instruction is re-executed.
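
    The check, fault, reformat, and re-execute sequence in this abstract can be sketched in Python. The three classification names, the instruction-kind strings, and the `reformat` stand-in for the microcode routine are assumptions; a real routine would rewrite the register bits, not only the classification field.

```python
from dataclasses import dataclass
from enum import Enum


class RegClass(Enum):
    MULTIMEDIA = "multimedia"
    FP_COMPATIBLE = "fp_compatible"  # significand matches extended precision
    FP_INTERNAL = "fp_internal"      # internal significand differs


@dataclass
class Register:
    value: int
    reg_class: RegClass


class FormatFault(Exception):
    """Raised when a source operand is in the wrong format for the instruction."""


def check_sources(kind, sources):
    """Examine the classification of each source operand before execution."""
    for reg in sources:
        if kind == "multimedia" and reg.reg_class is RegClass.FP_INTERNAL:
            raise FormatFault(kind)
        if kind == "fp" and reg.reg_class is RegClass.MULTIMEDIA:
            raise FormatFault(kind)


def reformat(kind, sources):
    """Hypothetical stand-in for the microcode routine: reclassify the sources."""
    target = RegClass.MULTIMEDIA if kind == "multimedia" else RegClass.FP_COMPATIBLE
    for reg in sources:
        reg.reg_class = target


def execute(kind, sources):
    """Execute normally, or fault, reformat the sources, and re-execute."""
    try:
        check_sources(kind, sources)
    except FormatFault:
        reformat(kind, sources)
        check_sources(kind, sources)
        return "re-executed"
    return "executed"
```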
