-
71.
Publication No.: US20110078697A1
Publication date: 2011-03-31
Application No.: US12571200
Filing date: 2009-09-30
IPC classification: G06F9/46
CPC classification: G06F9/3836, G06F9/3838, G06F9/384, G06F9/3842, G06F9/3851, G06F9/3857
Abstract: Systems and methods for efficient out-of-order dynamic deallocation of entries within a shared storage resource in a processor. A processor comprises a unified pick queue that includes an array configured to dynamically allocate any entry of a plurality of entries for a decoded and renamed instruction. This instruction may correspond to any available active threads supported by the processor. The processor includes circuitry configured to determine whether an instruction corresponding to an allocated entry of the plurality of entries is dependent on a speculative instruction and whether the instruction has a fixed instruction execution latency. In response to determining the instruction is not dependent on a speculative instruction, the instruction has a fixed instruction execution latency, and said latency has transpired, the circuitry may deallocate the instruction from the allocated entry.
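The deallocation condition in this abstract can be illustrated with a minimal sketch. The class and field names (`PickQueueEntry`, `UnifiedPickQueue`, `issue_cycle`, `sweep`) are hypothetical and not taken from the patent, and the cycle counter is an assumed software stand-in for the hardware's latency tracking.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PickQueueEntry:
    thread_id: int
    depends_on_speculative: bool       # still dependent on a speculative instruction?
    fixed_latency: Optional[int]       # latency in cycles, None if the latency is variable
    issue_cycle: Optional[int] = None  # cycle the instruction was picked, None if not yet picked


def can_deallocate(entry: PickQueueEntry, now: int) -> bool:
    """All three conditions from the abstract must hold."""
    if entry.depends_on_speculative:
        return False
    if entry.fixed_latency is None or entry.issue_cycle is None:
        return False
    return now - entry.issue_cycle >= entry.fixed_latency


class UnifiedPickQueue:
    """Shared array: any free slot may be given to any thread."""

    def __init__(self, size: int) -> None:
        self.entries: List[Optional[PickQueueEntry]] = [None] * size

    def allocate(self, entry: PickQueueEntry) -> Optional[int]:
        for i, slot in enumerate(self.entries):
            if slot is None:
                self.entries[i] = entry
                return i
        return None  # queue full

    def sweep(self, now: int) -> List[int]:
        """Free every eligible entry, regardless of its position or age."""
        freed = []
        for i, entry in enumerate(self.entries):
            if entry is not None and can_deallocate(entry, now):
                self.entries[i] = None
                freed.append(i)
        return freed
```

Because `sweep` checks each slot independently, a younger instruction can leave the queue before an older one that is still speculative or has variable latency, which is the out-of-order behaviour the abstract describes.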
-
72.
Publication No.: US20100332787A1
Publication date: 2010-12-30
Application No.: US12493941
Filing date: 2009-06-29
Applicant: Gregory F. Grohoski, Paul J. Jordan, Mark A. Luttrell, Zeid Hartuon Samoail, Robert T. Golla
Inventor: Gregory F. Grohoski, Paul J. Jordan, Mark A. Luttrell, Zeid Hartuon Samoail, Robert T. Golla
CPC classification: G06F12/1027, G06F2212/684
Abstract: A system and method for servicing translation lookaside buffer (TLB) misses may manage separate input and output pipelines within a memory management unit. A pending request queue (PRQ) in the input pipeline may include an instruction-related portion storing entries for instruction TLB (ITLB) misses and a data-related portion storing entries for potential or actual data TLB (DTLB) misses. A DTLB PRQ entry may be allocated to each load/store instruction selected from the pick queue. The system may select an ITLB- or DTLB-related entry for servicing dependent on prior PRQ entry selection(s). A corresponding entry may be held in a translation table entry return queue (TTERQ) in the output pipeline until a matching address translation is received from system memory. PRQ and/or TTERQ entries may be deallocated when a corresponding TLB miss is serviced. PRQ and/or TTERQ entries associated with a thread may be deallocated in response to a thread flush.
-
73.
Publication No.: US20100299499A1
Publication date: 2010-11-25
Application No.: US12570642
Filing date: 2009-09-30
IPC classification: G06F9/38
CPC classification: G06F9/3851, G06F9/322, G06F9/3806, G06F9/3824, G06F9/3838, G06F9/384, G06F9/3855, G06F9/3885
Abstract: Systems and methods for efficient dynamic utilization of shared resources in a processor. A processor comprises a front end pipeline, an execution pipeline, and a commit pipeline, wherein each pipeline comprises a shared resource with entries configured to be allocated for use in each clock cycle by each of a plurality of threads supported by the processor. To avoid starvation of any active thread, the processor further comprises circuitry configured to ensure each active thread is able to allocate at least a predetermined quota of entries of each shared resource. Each pipe stage of a total pipeline for the processor may include at least one dynamically allocated shared resource configured not to starve any active thread. Dynamic allocation of shared resources between a plurality of threads may yield higher performance over static allocation. In addition, dynamic allocation may require relatively little overhead for activation/deactivation of threads.
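One way to guarantee every active thread a minimum quota while still sharing the remaining entries dynamically is sketched below. This is an illustrative policy, not the circuitry claimed in the patent: the `SharedResource` class, its method names, and the "reserve the unmet quota of other threads" rule are all assumptions.

```python
class SharedResource:
    """Pool of entries shared by threads, with a guaranteed per-thread quota."""

    def __init__(self, num_entries: int, active_threads, quota: int) -> None:
        threads = list(active_threads)
        if quota * len(threads) > num_entries:
            raise ValueError("quotas exceed the number of entries")
        self.free = num_entries
        self.quota = quota
        self.used = {t: 0 for t in threads}

    def _reserved_for_others(self, thread) -> int:
        # Entries that must stay free so every other thread can still reach its quota.
        return sum(max(0, self.quota - n)
                   for t, n in self.used.items() if t != thread)

    def try_allocate(self, thread) -> bool:
        # Grant the entry only if the other threads' unmet quota remains covered.
        if self.free > self._reserved_for_others(thread):
            self.free -= 1
            self.used[thread] += 1
            return True
        return False

    def release(self, thread) -> None:
        if self.used[thread] == 0:
            raise ValueError("thread holds no entries")
        self.used[thread] -= 1
        self.free += 1
```

With 8 entries, two threads, and a quota of 2, a greedy thread can take at most 6 entries; the last 2 stay available to the other thread, so neither is starved, yet a lone busy thread is not limited to a static half of the pool.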
-
74.
Publication No.: US20100274961A1
Publication date: 2010-10-28
Application No.: US12428457
Filing date: 2009-04-22
Applicant: Robert T. Golla, Jama I. Barreh, Howard L. Levy
Inventor: Robert T. Golla, Jama I. Barreh, Howard L. Levy
CPC classification: G06F9/3851, G06F9/30112, G06F9/3838, G06F9/384, G06F9/3855, G06F9/3857, G06F9/3877
Abstract: Techniques and systems are described herein to maintain a mapping of logical to physical registers, for example, in the context of a multithreaded processor that supports renaming. A mapping unit may have a plurality of entries, each of which stores rename information for a dedicated one of a set of physical registers available to the processor for renaming. This physically-indexed mapping unit may support multiple threads, and may comprise a content-addressable memory (CAM) in certain embodiments. The mapping unit may support various combinations of read operations (to determine if a logical register is mapped to a physical register), write operations (to create or modify one or more entries containing mapping information), thread flush operations, and commit operations. More than one of such operations may be performed substantially simultaneously in certain embodiments.
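A physically-indexed mapping unit keeps one entry per physical register and finds a logical register by searching the entries, as a CAM would. The sketch below is a simplified software model under stated assumptions: the names (`MapEntry`, `PhysicallyIndexedMap`) are hypothetical, the old mapping is freed immediately on a write rather than at commit, commit is not modelled, and a flush drops every mapping of the thread without separating committed from speculative state.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class MapEntry:
    valid: bool = False
    thread: int = -1
    logical: int = -1


class PhysicallyIndexedMap:
    """Entry i holds the rename information for physical register i."""

    def __init__(self, num_physical: int) -> None:
        self.entries: List[MapEntry] = [MapEntry() for _ in range(num_physical)]

    def read(self, thread: int, logical: int) -> Optional[int]:
        # CAM-style lookup: compare the (thread, logical) key against every entry.
        for preg, e in enumerate(self.entries):
            if e.valid and e.thread == thread and e.logical == logical:
                return preg
        return None

    def write(self, thread: int, logical: int) -> Optional[int]:
        # Rename: bind the logical register to the first free physical register.
        old = self.read(thread, logical)
        for preg, e in enumerate(self.entries):
            if not e.valid:
                self.entries[preg] = MapEntry(True, thread, logical)
                if old is not None:
                    self.entries[old].valid = False  # simplification: no wait for commit
                return preg
        return None  # no free physical register

    def flush_thread(self, thread: int) -> int:
        # Invalidate all mappings of one thread; other threads are untouched.
        flushed = 0
        for e in self.entries:
            if e.valid and e.thread == thread:
                e.valid = False
                flushed += 1
        return flushed
```

Indexing by physical register means the table size follows the physical register file rather than threads times logical registers, at the cost of an associative search on every read.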
-
75.
Publication No.: US07778105B2
Publication date: 2010-08-17
Application No.: US12049798
Filing date: 2008-03-17
Applicant: Robert T. Golla, Xiang Shan Li
Inventor: Robert T. Golla, Xiang Shan Li
CPC classification: G11C7/1075, G11C7/1078, G11C7/1084, G11C7/22
Abstract: A memory with a write port configured for double-pump writes. The memory includes a first and second memory locations each having one or more bit cells, and one or more bit lines each coupled to corresponding ones of the bit cells. A write port is coupled to each of the bit lines. Selection circuitry, responsive to a first clock edge, latches first data from a first data path through the write port, and responsive to a second clock edge, latches second data from a second data path through the write port. A first pulse is generated during a first phase of the clock signal to cause writing of the first data into the first memory location. A second pulse is generated during a second phase of the clock signal to cause writing of the second data into the second memory location.
-
76.
Publication No.: US07330988B2
Publication date: 2008-02-12
Application No.: US10881092
Filing date: 2004-06-30
CPC classification: G06F9/3869, G06F1/32, G06F1/3203, G06F9/3012, G06F9/30123, G06F9/3013, G06F9/3836, G06F9/3851, G06F9/3857, G06F9/3867, G06F9/3885, G06F9/3891
Abstract: A method and apparatus for controlling power consumption in a processor. In one embodiment, a processor includes a pipeline. The pipeline includes logic for fetching instructions, issuing instructions, and executing instructions. The processor also includes a power management unit. The power management unit is configured to input M stalls into the pipeline every N instruction cycles (where M and N are integer values and wherein M is less than N).
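The throttling rule in this abstract, M stalls in every window of N instruction cycles, can be sketched as follows. The function names are hypothetical, and placing the stalls at the start of each window is an assumption; the abstract does not say where within the window they fall.

```python
from typing import List


def stall_schedule(m: int, n: int, cycles: int) -> List[bool]:
    """Return one flag per cycle: True means the pipeline is stalled."""
    if not 0 <= m < n:
        raise ValueError("require 0 <= M < N")
    # Assumption: the first M cycles of every N-cycle window are the stalled ones.
    return [(cycle % n) < m for cycle in range(cycles)]


def issue_fraction(m: int, n: int) -> float:
    """Fraction of cycles in which instructions may still issue."""
    if not 0 <= m < n:
        raise ValueError("require 0 <= M < N")
    return (n - m) / n
```

For example, `stall_schedule(1, 4, 8)` stalls cycles 0 and 4, and `issue_fraction(1, 4)` is 0.75, so raising M relative to N trades throughput for lower power.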
-