Latency-aware thread scheduling in non-uniform cache architecture systems
    71.
    Granted patent (legal status: in force)

    Publication No.: US07574562B2

    Publication date: 2009-08-11

    Application No.: US11491413

    Filing date: 2006-07-21

    IPC classification: G06F12/02

    CPC classification: G06F12/0842 G06F2212/271

    Abstract: A system and method for latency-aware thread scheduling in a non-uniform cache architecture are provided. Instructions may be provided to the hardware specifying in which banks to store data. Information as to which banks store which data may also be provided, for example, by the hardware. This information may be used to schedule threads on one or more cores. A selected bank in cache memory may be reserved strictly for selected data.
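
    Illustrative sketch (not taken from the patent): the abstract says that knowledge of which banks hold which data may be used to schedule threads on cores. One simple way to picture this, assuming a per-core, per-bank latency table is available, is to pick the core with the lowest summed latency to the banks that hold a thread's data. The table `LATENCY`, its cycle counts, and the function `pick_core` are hypothetical names and values introduced only for this example.

```python
from typing import Dict, List

# Hypothetical latency table: LATENCY[core][bank] = access cost in cycles.
# The numbers are invented for illustration; they do not come from the patent.
LATENCY: Dict[int, Dict[int, int]] = {
    0: {0: 10, 1: 20, 2: 40, 3: 60},
    1: {0: 60, 1: 40, 2: 20, 3: 10},
}


def pick_core(thread_banks: List[int], latency: Dict[int, Dict[int, int]]) -> int:
    """Return the core with the lowest total latency to the banks holding the thread's data."""

    def cost(core: int) -> int:
        return sum(latency[core][bank] for bank in thread_banks)

    # On a tie, min() keeps the first core in the table's iteration order.
    return min(latency, key=cost)


if __name__ == "__main__":
    print(pick_core([0, 1], LATENCY))  # data in banks 0 and 1
    print(pick_core([2, 3], LATENCY))  # data in banks 2 and 3
```

    A real scheduler would also weigh load balance and access frequency; summing latencies is only the simplest cost model that fits the description.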

    TERMINATION OF IN-FLIGHT ASYNCHRONOUS MEMORY MOVE
    72.
    Patent application (legal status: in force)

    Publication No.: US20090198975A1

    Publication date: 2009-08-06

    Application No.: US12024546

    Filing date: 2008-02-01

    IPC classification: G06F9/315

    Abstract: A data processing system has a processor, a memory, and an instruction set architecture (ISA) that includes: (1) an asynchronous memory mover (AMM) store (ST) instruction that initiates an asynchronous memory move operation that moves data from a first memory location having a first real address to a second memory location having a second real address by: (a) first performing a move of the data in virtual address space utilizing a source effective address and a destination effective address; and (b) when the move is completed, completing a physical move of the data to the second memory location, independent of the processor. The ISA further provides (2) an AMM terminate ST instruction for stopping an ongoing AMM operation before completion of the AMM operation, and (3) a LD CMP instruction for checking a status of an AMM operation.

    LAUNCHING MULTIPLE CONCURRENT MEMORY MOVES VIA A FULLY ASYNCHRONOOUS MEMORY MOVER
    73.
    Patent application (legal status: lapsed)

    Publication No.: US20090198939A1

    Publication date: 2009-08-06

    Application No.: US12024690

    Filing date: 2008-02-01

    IPC classification: G06F12/02

    Abstract: A data processing system has an asynchronous memory mover, which includes multiple sets of registers for storing addressing and control parameters utilized to generate one or more asynchronous memory move (AMM) operations. The memory mover detects receipt of a first set of parameters in a first set of registers from the processor. The processor forwards the parameters after the processor initiates a data move in virtual address space, utilizing a source effective address and a destination effective address. The memory mover responds to receiving the first set of parameters by generating and launching a first AMM operation. When the memory mover receives a second set of parameters in a second set of registers before the first AMM operation completes, the memory mover generates and launches a second AMM operation concurrently with the first AMM operation if no address conflicts exist.
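
    Illustrative sketch (not taken from the patent): the abstract states that a second move is launched concurrently only if no address conflicts exist, but it does not define a conflict. The code below assumes one plausible rule: two moves conflict when one writes a range that the other reads or writes. The `Move` record, the flat integer addresses, and the functions `conflicts` and `try_launch` are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Move:
    src: int     # source start address (hypothetical flat address)
    dst: int     # destination start address
    length: int  # number of bytes to move


def overlaps(a_start: int, a_len: int, b_start: int, b_len: int) -> bool:
    """True if the half-open ranges [a_start, a_start+a_len) and [b_start, b_start+b_len) intersect."""
    return a_start < b_start + b_len and b_start < a_start + a_len


def conflicts(new: Move, active: Move) -> bool:
    """Assumed rule: a conflict exists when either move writes what the other reads or writes."""
    return (
        overlaps(new.dst, new.length, active.dst, active.length)
        or overlaps(new.dst, new.length, active.src, active.length)
        or overlaps(new.src, new.length, active.dst, active.length)
    )


def try_launch(new: Move, in_flight: List[Move]) -> bool:
    """Launch `new` alongside the in-flight moves only if it conflicts with none of them."""
    if any(conflicts(new, m) for m in in_flight):
        return False
    in_flight.append(new)
    return True
```

    Two reads of the same source range are allowed to proceed together under this rule, since neither move modifies the shared range.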

    METHOD FOR ENABLING DIRECT PREFETCHING OF DATA DURING ASYCHRONOUS MEMORY MOVE OPERATION
    74.
    Patent application (legal status: lapsed)

    Publication No.: US20090198908A1

    Publication date: 2009-08-06

    Application No.: US12024598

    Filing date: 2008-02-01

    IPC classification: G06F12/00

    Abstract: While an AMM operation is ongoing, a prefetch request for data from the source effective address or the destination effective address triggers a cache injection by the AMM mover (or memory controller) of relevant data from the stream of data being moved in the physical memory. The memory controller forwards the first prefetched line to the prefetch engine and L1 cache. The memory controller also forwards the next cache lines in the sequence of data to the L2 cache and a subsequent set of cache lines to the L3 cache. The memory controller then forwards the remaining data to the destination memory location. Quick access to prefetch data is enabled by buffering the stream of data in the upper caches rather than placing all the moved data within the memory. Also, the memory controller does not overrun the upper caches, because it places moved data into only a subset of the available cache lines of the upper-level cache.
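
    Illustrative sketch (not taken from the patent): the abstract describes a tiered placement of the moved data stream, with the first line going to L1, the next lines to L2, a subsequent set to L3, and the remainder to the destination memory. The function below partitions a sequence of cache-line identifiers that way. The function name `distribute` and the per-level quotas `l2_count` and `l3_count` are assumptions; the abstract only says that a subset of the available lines in each cache is used.

```python
from typing import Dict, List


def distribute(lines: List[int], l2_count: int = 2, l3_count: int = 4) -> Dict[str, List[int]]:
    """Split a stream of cache lines across L1, L2, L3 and destination memory, in order."""
    l2_end = 1 + l2_count
    l3_end = l2_end + l3_count
    return {
        "L1": lines[:1],            # first prefetched line
        "L2": lines[1:l2_end],      # next lines in the sequence (assumed quota)
        "L3": lines[l2_end:l3_end], # subsequent set of lines (assumed quota)
        "memory": lines[l3_end:],   # remaining data goes to the destination
    }
```

    Keeping the quotas small mirrors the stated goal of not overrunning the upper caches.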

    Method for reconfiguring cache memory based on at least analysis of heat generated during runtime, at least by associating an access bit with a cache line and associating a granularity bit with a cache line in level-2 cache
    76.
    Granted patent (legal status: lapsed)

    Publication No.: US07467280B2

    Publication date: 2008-12-16

    Application No.: US11481020

    Filing date: 2006-07-05

    IPC classification: G06F12/00

    Abstract: A method for reconfiguring a cache memory is provided. The method in one aspect may include analyzing one or more characteristics of an execution entity accessing a cache memory and reconfiguring the cache based on the one or more characteristics analyzed. Examples of analyzed characteristics may include but are not limited to data structure used by the execution entity, expected reference pattern of the execution entity, type of an execution entity, heat and power consumption of an execution entity, etc. Examples of cache attributes that may be reconfigured may include but are not limited to associativity of the cache memory, amount of the cache memory available to store data, coherence granularity of the cache memory, line size of the cache memory, etc.

    SYSTEM AND STRUCTURE FOR SYNCHRONIZED THREAD PRIORITY SELECTION IN A DEEPLY PIPELINED MULTITHREADED MICROPROCESSOR
    77.
    Patent application (legal status: under examination, published)

    Publication No.: US20080263325A1

    Publication date: 2008-10-23

    Application No.: US11737491

    Filing date: 2007-04-19

    IPC classification: G06F9/30

    CPC classification: G06F9/3851

    Abstract: A microprocessor and system with improved performance and power in a simultaneous multithreading (SMT) microprocessor architecture. The microprocessor and system include a process wherein the processor has the ability to select instructions from one thread or another in any given processor clock cycle. Instructions from each thread may be dynamically assigned selection priorities at multiple decision points in a processor in a given cycle. The thread priority is based on monitoring performance behavior and activities in the processor. In the exemplary embodiment, the present invention discloses a microprocessor and system for synchronizing thread priorities among multiple decision points throughout the micro-architecture of the microprocessor. This system and method for synchronizing thread priorities allows each thread priority to be in sync and aware of the status of other thread priorities at various decision points within the microprocessor.

    APPARATUS FOR ADJUSTING INSTRUCTION THREAD PRIORITY IN A MULTI-THREAD PROCESSOR
    78.
    Patent application (legal status: in force)

    Publication No.: US20080155233A1

    Publication date: 2008-06-26

    Application No.: US12044846

    Filing date: 2008-03-07

    IPC classification: G06F9/30

    CPC classification: G06F9/4818 G06F9/3851

    Abstract: Each instruction thread in an SMT processor is associated with a software-assigned base input processing priority. Unless some predefined event or circumstance occurs with an instruction being processed or to be processed, the base input processing priorities of the respective threads are used to determine the interleave frequency between the threads according to some instruction interleave rule. However, upon the occurrence of some predefined event or circumstance in the processor related to a particular instruction thread, the base input processing priority of one or more instruction threads is adjusted to produce one or more adjusted priority values. The instruction interleave rule is then enforced according to the adjusted priority value or values together with any base input processing priority values that have not been subject to adjustment.
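
    Illustrative sketch (not taken from the patent): the abstract leaves the instruction interleave rule unspecified ("some instruction interleave rule"). The code below substitutes a smooth weighted round-robin as a stand-in rule, so that each thread receives issue slots in proportion to its effective priority, and shows how an event-driven adjustment overrides a base priority. The functions `effective_priorities` and `interleave`, the thread names, and the priority values are all hypothetical.

```python
from typing import Dict, List


def effective_priorities(base: Dict[str, int], adjusted: Dict[str, int]) -> Dict[str, int]:
    """Use the adjusted value where an event produced one, else the software-assigned base value."""
    return {thread: adjusted.get(thread, prio) for thread, prio in base.items()}


def interleave(priorities: Dict[str, int], slots: int) -> List[str]:
    """Stand-in interleave rule: smooth weighted round-robin over `slots` issue slots."""
    credit = {thread: 0 for thread in priorities}
    total = sum(priorities.values())
    order: List[str] = []
    for _ in range(slots):
        # Every thread earns credit equal to its priority each slot.
        for thread, prio in priorities.items():
            credit[thread] += prio
        # The thread with the most credit issues and pays back the total.
        chosen = max(credit, key=lambda t: credit[t])
        credit[chosen] -= total
        order.append(chosen)
    return order


if __name__ == "__main__":
    base = {"T0": 3, "T1": 1}
    print(interleave(effective_priorities(base, {}), 4))
    # Suppose a hypothetical event lowers T0's priority to 1.
    print(interleave(effective_priorities(base, {"T0": 1}), 4))
```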

    Cache residence prediction
    79.
    Granted patent (legal status: lapsed)

    Publication No.: US07266642B2

    Publication date: 2007-09-04

    Application No.: US10779999

    Filing date: 2004-02-17

    IPC classification: G06F12/00

    Abstract: The present invention proposes a novel cache residence prediction mechanism that predicts whether requested data of a cache miss can be found in another cache. The memory controller can use the prediction result to determine if it should immediately initiate a memory access, or initiate no memory access until a cache snoop response shows that the requested data cannot be supplied by a cache. The cache residence prediction mechanism can be implemented at the cache side, the memory side, or both. A cache-side prediction mechanism can predict that data requested by a cache miss can be found in another cache if the cache miss address matches an address tag of a cache line in the requesting cache and the cache line is in an invalid state. A memory-side prediction mechanism can make effective prediction based on observed memory and cache operations that are recorded in a prediction table.
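
    Illustrative sketch (not taken from the patent): the cache-side rule in the abstract can be written as a single check, namely that the miss address matches a tag in the requesting cache and that line is in the invalid state. The sketch models the cache as a tag-to-state dictionary and maps the prediction to the two memory-controller choices the abstract mentions. The `State` enum, the function names, and the action strings are hypothetical.

```python
from enum import Enum
from typing import Dict


class State(Enum):
    INVALID = "I"
    SHARED = "S"
    MODIFIED = "M"


def predict_in_other_cache(tags: Dict[int, State], miss_tag: int) -> bool:
    """Cache-side prediction: a matching tag in the invalid state suggests another cache holds the line."""
    return tags.get(miss_tag) is State.INVALID


def memory_action(predicted_in_cache: bool) -> str:
    """Defer the memory access when a cache is predicted to supply the data, else start it at once."""
    return "wait_for_snoop_response" if predicted_in_cache else "start_memory_access"
```

    The intuition is that an invalidated line was probably invalidated because another cache took ownership of it, so a snoop is likely to find the data there.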

    Method and apparatus for sharing storage and execution resources between architectural units in a microprocessor using a polymorphic function unit
    80.
    Patent application (legal status: lapsed)

    Publication No.: US20070162726A1

    Publication date: 2007-07-12

    Application No.: US11329320

    Filing date: 2006-01-10

    IPC classification: G06F9/40

    Abstract: Methods and apparatus are provided for sharing storage and execution resources between architectural units in a microprocessor using a polymorphic function unit. A method for executing instructions in a processor having a polymorphic execution unit includes the steps of reloading a state associated with a first instruction class and reconfiguring the polymorphic execution unit to operate in accordance with the first instruction class, when an instruction of the first instruction class is encountered and the polymorphic execution unit is configured to operate in accordance with a second instruction class. The method also includes the steps of reloading a state associated with a second instruction class and reconfiguring the polymorphic execution unit to operate in accordance with the second instruction class, when an instruction of the second instruction class is encountered and the polymorphic execution unit is configured to operate in accordance with the first instruction class.
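
    Illustrative sketch (not taken from the patent): the method in the abstract amounts to a mode switch. When an instruction arrives whose class differs from the unit's current configuration, the state for that class is reloaded and the unit is reconfigured before the instruction executes. The class `PolymorphicUnit`, the per-class state modeled as a list of executed operations, and the class names "fp" and "vec" are hypothetical.

```python
from typing import Dict, List


class PolymorphicUnit:
    """Toy model of one execution unit shared by several instruction classes."""

    def __init__(self, initial_class: str) -> None:
        self.configured_for = initial_class
        # Per-class architectural state, modeled here as a log of executed ops.
        self.saved: Dict[str, List[str]] = {initial_class: []}
        self.state: List[str] = self.saved[initial_class]
        self.reconfigurations = 0

    def execute(self, instr_class: str, op: str) -> None:
        if instr_class != self.configured_for:
            # Reload the state associated with the incoming class, then reconfigure.
            self.state = self.saved.setdefault(instr_class, [])
            self.configured_for = instr_class
            self.reconfigurations += 1
        self.state.append(op)


if __name__ == "__main__":
    unit = PolymorphicUnit("fp")
    for cls, op in [("fp", "fadd"), ("vec", "vadd"), ("vec", "vmul"), ("fp", "fmul")]:
        unit.execute(cls, op)
    print(unit.reconfigurations, unit.saved)
```

    Counting reconfigurations makes the cost of alternating instruction classes visible: back-to-back instructions of the same class incur no switch.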
