Method for reconfiguring cache memory based on at least analysis of heat generated during runtime, at least by associating an access bit with a cache line and associating a granularity bit with a cache line in level-2 cache
    72.
    Granted patent
    Method for reconfiguring cache memory based on at least analysis of heat generated during runtime, at least by associating an access bit with a cache line and associating a granularity bit with a cache line in level-2 cache (Expired)

    Publication No.: US07467280B2

    Publication date: 2008-12-16

    Application No.: US11481020

    Filing date: 2006-07-05

    IPC class: G06F12/00

    Abstract: A method for reconfiguring a cache memory is provided. The method in one aspect may include analyzing one or more characteristics of an execution entity accessing a cache memory and reconfiguring the cache based on the one or more characteristics analyzed. Examples of analyzed characteristics may include but are not limited to the data structure used by the execution entity, the expected reference pattern of the execution entity, the type of the execution entity, the heat and power consumption of the execution entity, etc. Examples of cache attributes that may be reconfigured include but are not limited to the associativity of the cache memory, the amount of cache memory available to store data, the coherence granularity of the cache memory, the line size of the cache memory, etc.
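A minimal sketch of the reconfiguration idea described above, assuming hypothetical profile keys (`stride`, `power_hot`) as stand-ins for the analyzed characteristics; this is illustrative, not the patented implementation:

```python
from dataclasses import dataclass

@dataclass
class CacheConfig:
    associativity: int
    line_size: int   # bytes
    capacity: int    # bytes available for data

def reconfigure(cfg: CacheConfig, profile: dict) -> CacheConfig:
    """Adjust cache attributes from an execution entity's observed profile."""
    line = cfg.line_size
    if profile.get("stride", 0) > cfg.line_size:
        line = min(256, cfg.line_size * 2)       # long strides: larger lines
    assoc = cfg.associativity
    if profile.get("power_hot", False):
        assoc = max(1, cfg.associativity // 2)   # shrink ways to cut heat
    return CacheConfig(assoc, line, cfg.capacity)
```

The profile-driven rules here are placeholders; the abstract leaves the exact reconfiguration policy open.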


    SYSTEM AND STRUCTURE FOR SYNCHRONIZED THREAD PRIORITY SELECTION IN A DEEPLY PIPELINED MULTITHREADED MICROPROCESSOR
    73.
    Patent application
    SYSTEM AND STRUCTURE FOR SYNCHRONIZED THREAD PRIORITY SELECTION IN A DEEPLY PIPELINED MULTITHREADED MICROPROCESSOR (Pending, published)

    Publication No.: US20080263325A1

    Publication date: 2008-10-23

    Application No.: US11737491

    Filing date: 2007-04-19

    IPC class: G06F9/30

    CPC class: G06F9/3851

    Abstract: A microprocessor and system with improved performance and power in a simultaneous multithreading (SMT) microprocessor architecture. The microprocessor and system include a process wherein the processor has the ability to select instructions from one thread or another in any given processor clock cycle. Instructions from each thread may be assigned selection priorities dynamically at multiple decision points in the processor in a given cycle. The thread priority is based on monitoring performance behavior and activities in the processor. In the exemplary embodiment, the present invention discloses a microprocessor and system for synchronizing thread priorities among multiple decision points throughout the micro-architecture of the microprocessor. This system and method for synchronizing thread priorities allows each thread priority to be in sync and aware of the status of other thread priorities at various decision points within the microprocessor.
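The synchronization idea — every decision point reading one shared, current priority record — can be sketched as follows; class and method names are hypothetical:

```python
class ThreadPriorityHub:
    """Central record of per-thread priorities, consulted by every
    decision point so all points see the same, current values."""
    def __init__(self, n_threads: int):
        self.priority = [1] * n_threads

    def update(self, tid: int, value: int) -> None:
        self.priority[tid] = value   # one write, visible to all points

    def select(self) -> int:
        # each decision point picks the highest-priority thread this cycle
        return max(range(len(self.priority)), key=lambda t: self.priority[t])

hub = ThreadPriorityHub(2)
hub.update(1, 5)   # e.g. thread 1 is favored after a monitored event
# fetch, decode, and issue all agree because they share the hub's state
decision_points = [hub.select() for _ in range(3)]
```

In hardware the "hub" would be distributed priority signals, not a shared object; the sketch only shows the agreement property.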


    APPARATUS FOR ADJUSTING INSTRUCTION THREAD PRIORITY IN A MULTI-THREAD PROCESSOR
    74.
    Patent application
    APPARATUS FOR ADJUSTING INSTRUCTION THREAD PRIORITY IN A MULTI-THREAD PROCESSOR (In force)

    Publication No.: US20080155233A1

    Publication date: 2008-06-26

    Application No.: US12044846

    Filing date: 2008-03-07

    IPC class: G06F9/30

    CPC class: G06F9/4818 G06F9/3851

    Abstract: Each instruction thread in an SMT processor is associated with a software-assigned base input processing priority. Unless some predefined event or circumstance occurs with an instruction being processed or to be processed, the base input processing priorities of the respective threads are used to determine the interleave frequency between the threads according to some instruction interleave rule. However, upon the occurrence of some predefined event or circumstance in the processor related to a particular instruction thread, the base input processing priority of one or more instruction threads is adjusted to produce one or more adjusted priority values. The instruction interleave rule is then enforced according to the adjusted priority value or values together with any base input processing priority values that have not been subject to adjustment.
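One plausible interleave rule consistent with the abstract — slots granted in proportion to priority, with per-thread overrides standing in for the adjusted values — could look like this (the credit scheme is an assumption, not the patented rule):

```python
def interleave(base, adjust, cycles):
    """Pick a thread each cycle in proportion to its priority.
    `base` lists base input priorities; `adjust` maps thread id ->
    replacement priority after a predefined event (illustrative)."""
    prio = [adjust.get(t, p) for t, p in enumerate(base)]
    total = sum(prio)
    schedule, credit = [], [0.0] * len(prio)
    for _ in range(cycles):
        for t in range(len(prio)):
            credit[t] += prio[t] / total     # accrue share per cycle
        t = max(range(len(prio)), key=lambda i: credit[i])
        credit[t] -= 1.0                     # spend one issue slot
        schedule.append(t)
    return schedule
```

With base priorities 3:1, thread 0 gets three of every four slots; adjusting thread 0 down to 1 makes the interleave alternate.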


    Cache residence prediction
    75.
    Granted patent
    Cache residence prediction (Expired)

    Publication No.: US07266642B2

    Publication date: 2007-09-04

    Application No.: US10779999

    Filing date: 2004-02-17

    IPC class: G06F12/00

    Abstract: The present invention proposes a novel cache residence prediction mechanism that predicts whether the requested data of a cache miss can be found in another cache. The memory controller can use the prediction result to determine whether it should immediately initiate a memory access, or initiate no memory access until a cache snoop response shows that the requested data cannot be supplied by a cache. The cache residence prediction mechanism can be implemented at the cache side, the memory side, or both. A cache-side prediction mechanism can predict that data requested by a cache miss can be found in another cache if the cache miss address matches an address tag of a cache line in the requesting cache and the cache line is in an invalid state. A memory-side prediction mechanism can make effective predictions based on observed memory and cache operations that are recorded in a prediction table.
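The cache-side heuristic in the abstract is simple enough to sketch directly: a tag match against a line in the Invalid state suggests the data was taken by, and still resides in, another cache (names here are illustrative):

```python
class Line:
    def __init__(self, tag, state):
        self.tag, self.state = tag, state   # state: MESI letter, 'I' = Invalid

def cache_side_predict(set_lines, miss_tag):
    """Predict the missed data resides in another cache: the tag is still
    present locally, but the line was invalidated by a remote writer."""
    return any(l.tag == miss_tag and l.state == "I" for l in set_lines)
```

A memory controller using this prediction would skip the speculative DRAM access and wait for the snoop response instead.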


    Method and apparatus for sharing storage and execution resources between architectural units in a microprocessor using a polymorphic function unit
    76.
    Patent application
    Method and apparatus for sharing storage and execution resources between architectural units in a microprocessor using a polymorphic function unit (Expired)

    Publication No.: US20070162726A1

    Publication date: 2007-07-12

    Application No.: US11329320

    Filing date: 2006-01-10

    IPC class: G06F9/40

    Abstract: Methods and apparatus are provided for sharing storage and execution resources between architectural units in a microprocessor using a polymorphic function unit. A method for executing instructions in a processor having a polymorphic execution unit includes the steps of reloading a state associated with a first instruction class and reconfiguring the polymorphic execution unit to operate in accordance with the first instruction class, when an instruction of the first instruction class is encountered and the polymorphic execution unit is configured to operate in accordance with a second instruction class. The method also includes the steps of reloading a state associated with a second instruction class and reconfiguring the polymorphic execution unit to operate in accordance with the second instruction class, when an instruction of the second instruction class is encountered and the polymorphic execution unit is configured to operate in accordance with the first instruction class.
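The mode-switch protocol described above — spill the current class's state, reload the other class's state, reconfigure — can be sketched as a toy model (class names, and the use of a counter as stand-in "architectural state", are assumptions):

```python
class PolymorphicUnit:
    """Toy model: one execution unit serving two instruction classes,
    swapping per-class state on each class transition."""
    def __init__(self):
        self.mode = None
        self.saved = {}      # per-class saved architectural state
        self.switches = 0

    def execute(self, instr_class, state_default=0):
        if instr_class != self.mode:
            if self.mode is not None:
                self.saved[self.mode] = self.state   # spill old class state
            self.state = self.saved.get(instr_class, state_default)  # reload
            self.mode = instr_class                  # reconfigure datapath
            self.switches += 1
        self.state += 1      # stand-in for actually executing an instruction
        return self.state
```

Note the state survives across a switch away and back, which is the point of the reload step.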


    Propagating data using mirrored lock caches
    77.
    Patent application
    Propagating data using mirrored lock caches (Expired)

    Publication No.: US20070150665A1

    Publication date: 2007-06-28

    Application No.: US11315465

    Filing date: 2005-12-22

    IPC class: G06F12/14 G06F12/16

    Abstract: A method, processing node, and computer readable medium for propagating data using mirrored lock caches are disclosed. The method includes coupling a first mirrored lock cache associated with a first processing node to a bus that is communicatively coupled to at least a second mirrored lock cache associated with a second processing node in a multi-processing system. The method further includes receiving, by the first mirrored lock cache, data from a processing node. The data is then mirrored automatically so that the same data is available locally at the second mirrored lock cache for use by the second processing node.
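A minimal sketch of the mirroring behavior, modeling the bus as a list of coupled caches (names and the broadcast-on-store mechanism are illustrative assumptions):

```python
class MirroredLockCache:
    """Each node's lock cache; a store is broadcast on the shared bus so
    every peer ends up holding an identical local copy."""
    def __init__(self, bus):
        self.data = {}
        self.bus = bus
        bus.append(self)              # couple this cache to the bus

    def store(self, addr, value):
        for cache in self.bus:        # mirror to every cache on the bus
            cache.data[addr] = value

    def load(self, addr):
        return self.data.get(addr)    # purely local read, no bus traffic

bus = []
a, b = MirroredLockCache(bus), MirroredLockCache(bus)
a.store(0x40, "LOCKED")
```

The payoff is that the second node's lock check is a local read rather than a coherence transaction.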


    Methods and arrangements to manage on-chip memory to reduce memory latency
    78.
    Patent application
    Methods and arrangements to manage on-chip memory to reduce memory latency (In force)

    Publication No.: US20060155886A1

    Publication date: 2006-07-13

    Application No.: US11032876

    Filing date: 2005-01-11

    IPC class: G06F3/00

    Abstract: Methods, systems, and media are contemplated for reducing the memory latency seen by processors by giving software applications a measure of control over on-chip memory (OCM) management, implicitly and/or explicitly, via an operating system. Many embodiments allow part of the OCM to be managed by software applications via an application program interface (API), with the rest managed by hardware. Thus, the software applications can provide guidance about which address ranges to keep close to the processor, reducing the unnecessary latencies typically encountered when depending on cache controller policies. Several embodiments use memory internal to the processor or on a processor node, so the memory block used for this technique is referred to as OCM.
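The split-management API might look roughly like the following sketch, where an application pins address ranges into the software-managed portion of the OCM (all method names are hypothetical; the abstract does not define the API surface):

```python
class OCM:
    """Software-managed slice of on-chip memory: applications pin address
    ranges they want kept close to the processor; unpinned addresses fall
    back to ordinary hardware cache management."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.pinned = []              # (start, length) ranges kept on chip

    def pin(self, start, length):
        used = sum(l for _, l in self.pinned)
        if used + length > self.capacity:
            return False              # no room: rely on cache policies
        self.pinned.append((start, length))
        return True

    def is_on_chip(self, addr):
        return any(s <= addr < s + l for s, l in self.pinned)
```

The operating system would mediate calls like `pin()` so applications cannot oversubscribe the on-chip block.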


    Circuits, systems and methods for performing branch predictions by selectively accessing bimodal and fetch-based history tables
    79.
    Granted patent
    Circuits, systems and methods for performing branch predictions by selectively accessing bimodal and fetch-based history tables (Expired)

    Publication No.: US06976157B1

    Publication date: 2005-12-13

    Application No.: US09435070

    Filing date: 1999-11-04

    Applicant: Balaram Sinharoy

    Inventor: Balaram Sinharoy

    IPC class: G06F9/00

    CPC class: G06F9/3806 G06F9/3848

    Abstract: Branch prediction circuitry including a bimodal branch history table, a fetch-based branch history table and a selector table is provided. The bimodal branch history table includes a plurality of entries, each for storing a prediction value, accessed by selected bits of a branch address. The fetch-based branch history table includes a plurality of entries for storing a prediction value, accessed by a pointer generated from selected bits of the branch address and bits from a history register. The selector table includes a plurality of entries, each for storing a selection bit, accessed by a pointer generated from selected bits of the branch address and bits from the history register; each selector bit is used for selecting between a prediction value accessed from the bimodal history table and a prediction value accessed from the fetch-based history table.
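A simplified model of this selector arrangement follows; the XOR index hash, 1-bit counters, and table sizes are assumptions for illustration (real designs typically use saturating 2-bit counters):

```python
class TournamentPredictor:
    """Bimodal table indexed by address bits; fetch-based table and
    selector table indexed by address bits XORed with global history."""
    def __init__(self, bits=10):
        self.mask = (1 << bits) - 1
        self.bimodal = {}    # addr index -> 1-bit prediction
        self.fetch = {}      # hashed index -> 1-bit prediction
        self.selector = {}   # hashed index -> 0 = bimodal, 1 = fetch-based
        self.ghr = 0         # global history register

    def _hashed(self, addr):
        return (addr ^ self.ghr) & self.mask

    def predict(self, addr):
        i, j = addr & self.mask, self._hashed(addr)
        return self.fetch.get(j, 0) if self.selector.get(j, 0) \
            else self.bimodal.get(i, 0)

    def update(self, addr, taken):
        i, j = addr & self.mask, self._hashed(addr)
        b, f = self.bimodal.get(i, 0), self.fetch.get(j, 0)
        if b != f:                           # train selector toward winner
            self.selector[j] = 1 if f == taken else 0
        self.bimodal[i] = self.fetch[j] = int(taken)
        self.ghr = ((self.ghr << 1) | int(taken)) & self.mask
```

The selector lets history-correlated branches use the fetch-based table while stable branches keep the cheaper bimodal prediction.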


    Method and apparatus for capturing event traces for debug and analysis
    80.
    Granted patent
    Method and apparatus for capturing event traces for debug and analysis (Expired)

    Publication No.: US06961875B2

    Publication date: 2005-11-01

    Application No.: US09815548

    Filing date: 2001-03-22

    IPC class: G06F9/44 G06F11/00 G06F11/36

    CPC class: G06F11/3636

    Abstract: A trace array having M entries with corresponding M addresses is used to store the states of input signals. The M addresses of the trace array are sequenced with a counter that counts a clock, beginning at a starting count and counting to an ending count. If the ending count is exceeded, the counter starts over at the starting count. The counter outputs are decoded into addresses of the trace array. An event signal is generated on the occurrence of an operation of interest, and the counter is started and stopped in response to sequences of the event signals, thus starting and stopping the recording of the states of the input signals in the trace array. When an error or a particular condition signal occurs, traces corresponding to the input signals are saved in the trace array. A start signal enables tracing, and event logic generates event-sequence signals that alternately start and stop the recording of traces. The event sequences are programmed by inputs to guarantee statistical chances of capturing the states of the input signals corresponding to a particular event signal occurring before an error or another event signal.
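The wrapping counter and event-gated recording can be sketched as a small circular buffer model (the class interface is illustrative; the hardware uses decoded counter outputs rather than Python indexing):

```python
class TraceArray:
    """M-entry circular trace buffer addressed by a wrapping counter;
    recording runs only between start and stop events."""
    def __init__(self, m):
        self.entries = [None] * m
        self.count = 0            # the starting count
        self.running = False

    def event(self, start):
        self.running = start      # event sequences start/stop recording

    def clock(self, signals):
        if not self.running:
            return
        self.entries[self.count] = signals
        self.count = (self.count + 1) % len(self.entries)  # wrap at end
```

After a stop event, the array holds the last M recorded signal states, which is what gets examined on an error or condition of interest.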
