Power-aware line intervention for a multiprocessor snoop coherency protocol
    11.
    Granted patent
    Legal status: In force

    Publication No.: US07870337B2

    Publication date: 2011-01-11

    Application No.: US11946249

    Filing date: 2007-11-28

    Abstract: A snoop coherency method, system and program are provided for intervening a requested cache line from a plurality of candidate memory sources in a multiprocessor system on the basis of the sensed temperature or power dissipation value at each memory source. By providing temperature or power dissipation sensors in each of the candidate memory sources (e.g., at cores, cache memories, memory controller, etc.) that share a requested cache line, control logic may be used to determine which memory source should source the cache line by using the power sensor signals to signal only the memory source with acceptable power dissipation to provide the cache line to the requester.
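
The selection step described in this abstract can be illustrated with a minimal Python sketch. It is not the patented control logic: the `MemorySource` type, the `power_limit` parameter, the source labels, and the "coolest eligible source wins" tie-break are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class MemorySource:
    """Hypothetical view of one candidate source as seen by the control logic."""
    name: str           # illustrative label, e.g. "core0.L2" or "memctrl"
    has_line: bool      # True if this source holds a valid copy of the line
    power_watts: float  # latest sensor reading (a temperature would work the same way)


def pick_intervention_source(
    sources: Sequence[MemorySource], power_limit: float
) -> Optional[MemorySource]:
    """Return the source that should supply the line, or None if none qualifies."""
    # Only sources that share the line and report acceptable power are eligible.
    eligible = [s for s in sources if s.has_line and s.power_watts <= power_limit]
    # Assumed tie-break: among eligible sources, prefer the lowest reading.
    return min(eligible, key=lambda s: s.power_watts, default=None)


candidates = [
    MemorySource("core0.L2", has_line=True, power_watts=9.5),
    MemorySource("core1.L2", has_line=True, power_watts=4.0),
    MemorySource("core2.L2", has_line=False, power_watts=1.0),
]
chosen = pick_intervention_source(candidates, power_limit=6.0)
```

Here `core0.L2` is excluded because its reading exceeds the limit, and `core2.L2` is excluded because it does not hold the line, so only `core1.L2` is signaled to respond.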

    Cache member protection with partial make MRU allocation
    12.
    Granted patent
    Legal status: Lapsed

    Publication No.: US07689777B2

    Publication date: 2010-03-30

    Application No.: US11951770

    Filing date: 2007-12-06

    IPC classification: G06F12/00

    Abstract: A method and apparatus for enabling protection of a particular member of a cache during LRU victim selection. The LRU state array includes additional “protection” bits in addition to the state bits. The protection bits serve as a pointer to identify the location of the member of the congruence class that is to be protected. A protected member is not removed from the cache during standard LRU victim selection, unless that member is invalid. The protection bits are pipelined to MRU update logic, where they are used to generate an MRU vector. The particular member identified by the MRU vector (and pointer) is protected from selection as the next LRU victim, unless the member is Invalid. The make MRU operation affects only the lower level LRU state bits arranged in a tree-based structure and thus only negates the selection of the protected member, without affecting LRU victim selection of the other members.
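
A partial make-MRU on a tree-based LRU can be sketched as follows. The 4-way congruence class with three state bits (one root bit, two leaf bits) is an assumed geometry chosen for brevity, and `select_victim` is a hypothetical helper, not the patent's logic. A real design would typically also prefer invalid members as victims, which this sketch omits.

```python
from typing import Optional, Sequence


def select_victim(
    bits: Sequence[int],
    protected: Optional[int] = None,
    valid: Sequence[bool] = (True, True, True, True),
) -> int:
    """Pick a victim way (0..3) from an assumed 3-bit tree-LRU state.

    bits = [root, left_leaf, right_leaf]; each bit points toward the LRU side
    (0 = left child, 1 = right child). Ways 0 and 1 sit under the left leaf,
    ways 2 and 3 under the right leaf.
    """
    root, left, right = bits
    if protected is not None and valid[protected]:
        # Partial make-MRU: change only the leaf bit of the protected way's
        # pair so that it points at the sibling. The root bit is untouched,
        # so the choice between pairs is unaffected.
        pair, position = divmod(protected, 2)
        if pair == 0:
            left = 1 - position
        else:
            right = 1 - position
    return left if root == 0 else 2 + right
```

With state `[0, 1, 0]`, way 1 is the normal victim. Protecting way 1 redirects the choice to its sibling, way 0, while protecting a way in the other pair leaves the result unchanged. An invalid protected member loses its protection.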

    Pipelining D states for MRU steerage during MRU/LRU member allocation
    13.
    Granted patent
    Legal status: Lapsed

    Publication No.: US07401189B2

    Publication date: 2008-07-15

    Application No.: US11054067

    Filing date: 2005-02-09

    IPC classification: G06F12/00 G06F13/00 G06F13/28

    Abstract: A method and apparatus for preventing selection of Deleted (D) members as an LRU victim during LRU victim selection. During each cache access targeting the particular congruence class, the deleted cache line is identified from information in the cache directory. A location of a deleted cache line is pipelined through the cache architecture during LRU victim selection. The information is latched and then passed to MRU vector generation logic. An MRU vector is generated and passed to the MRU update logic, which selects/tags the deleted member as an MRU member. The make MRU operation affects only the lower level LRU state bits arranged in a tree-based structure, so that the make MRU operation only negates selection of the specific member in the D state, without affecting LRU victim selection of the other members.

    Performance of emerging applications in a virtualized environment using transient instruction streams
    16.
    Granted patent
    Legal status: In force

    Publication No.: US09323527B2

    Publication date: 2016-04-26

    Application No.: US12905208

    Filing date: 2010-10-15

    IPC classification: G06F9/30 G06F9/38

    Abstract: A method, system and computer-usable medium are disclosed for managing transient instruction streams. Transient flags are defined in Branch-and-Link (BRL) instructions that are known to be infrequently executed. A bit is likewise set in a Special Purpose Register (SPR) of the hardware (e.g., a core) that is executing an instruction request thread. Subsequent fetches or prefetches in the request thread are treated as transient and are not written to lower-level caches. If an instruction is non-transient, and if a lower-level cache is non-inclusive of the L1 instruction cache, a fetch or prefetch miss that is obtained from memory may be written in both the L1 and the lower-level cache. If it is not inclusive, a cast-out from the L1 instruction cache may be written in the lower-level cache.

    Memory databus utilization management system and computer program product
    17.
    Granted patent
    Legal status: In force

    Publication No.: US08898674B2

    Publication date: 2014-11-25

    Application No.: US12645768

    Filing date: 2009-12-23

    IPC classification: G06F9/46 G06F13/16 G06F9/50

    Abstract: According to one aspect of the present disclosure, a system and computer program product for managing memory access is disclosed. The system includes a plurality of memory controllers each configured to maintain memory databus utilization by a corresponding processor at or below a threshold to maintain memory databus utilization of the system at or below a system threshold. The system also includes a service processor configured to receive memory databus utilization data from the memory controllers and programmed to, in response to determining that memory databus utilization for at least one of the processors is below its threshold, reallocate at least a portion of unused databus utilization from the at least one processor to at least one of the other processors.
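
The service processor's reallocation step might look like the following Python sketch. The function name, the `share` fraction of unused budget that is moved, and the equal split among processors running at their threshold are assumptions. The abstract only states that at least a portion of the unused utilization is reallocated.

```python
from typing import Dict


def reallocate_databus_budget(
    thresholds: Dict[str, float],
    utilization: Dict[str, float],
    share: float = 0.5,
) -> Dict[str, float]:
    """Return new per-processor thresholds; their sum equals the old sum.

    thresholds and utilization are in the same unit (for example, percent of
    system databus bandwidth) and use the same processor keys.
    """
    # Unused budget per processor; 0.0 means the processor is at its threshold.
    slack = {p: max(0.0, thresholds[p] - utilization[p]) for p in thresholds}
    donations = {p: s * share for p, s in slack.items() if s > 0}
    receivers = [p for p in thresholds if slack[p] == 0]

    new_thresholds = dict(thresholds)
    if not donations or not receivers:
        return new_thresholds  # nothing to move, or nobody needs more

    pool = sum(donations.values())
    for p, amount in donations.items():
        new_thresholds[p] -= amount
    for p in receivers:
        # Assumed policy: split the pooled budget equally among receivers.
        new_thresholds[p] += pool / len(receivers)
    return new_thresholds
```

Because budget is only moved between processors, the sum of the thresholds is unchanged, so the system-level threshold still holds after reallocation.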

    Two-level representative workload phase detection
    18.
    Granted patent
    Legal status: In force

    Publication No.: US08245084B2

    Publication date: 2012-08-14

    Application No.: US11972678

    Filing date: 2008-01-11

    IPC classification: G06F11/00

    Abstract: A subset of a workload, which includes a total set of dynamic instructions, is identified to use as a trace. Processor unit hardware executes the entire workload in real-time using a particular dataset. The processor unit hardware includes at least one microprocessor and at least one cache. The real-time execution of the workload is monitored to obtain information about how the processor unit hardware executes the workload when the workload is executed using the particular dataset to form actual performance information. Multiple different subsets of the workload are generated. The execution of each one of the subsets by the processor unit hardware is compared with the actual performance information. A result of the comparison is used to select one of the plurality of different subsets that most closely represents the execution of the entire workload using the particular dataset to use as a trace.
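
The comparison and selection step can be sketched in Python as below. The metric names (`cpi`, `l1_miss_rate`), the candidate labels, and the use of summed relative error as the distance measure are assumptions for illustration. The abstract does not say how closeness is measured.

```python
from typing import Dict, Mapping


def pick_representative_subset(
    actual: Mapping[str, float],
    candidates: Mapping[str, Mapping[str, float]],
) -> str:
    """Return the name of the candidate whose metrics are closest to `actual`.

    Assumes every metric in `actual` is nonzero and present in each candidate.
    """
    def relative_error(metrics: Mapping[str, float]) -> float:
        # Sum of per-metric relative errors against the full-workload run.
        return sum(abs(metrics[k] - actual[k]) / actual[k] for k in actual)

    return min(candidates, key=lambda name: relative_error(candidates[name]))


# Hypothetical measurements: the full run versus three candidate subsets.
full_run = {"cpi": 1.2, "l1_miss_rate": 0.05}
subsets: Dict[str, Dict[str, float]] = {
    "subset_a": {"cpi": 1.5, "l1_miss_rate": 0.05},
    "subset_b": {"cpi": 1.25, "l1_miss_rate": 0.052},
    "subset_c": {"cpi": 0.9, "l1_miss_rate": 0.02},
}
best = pick_representative_subset(full_run, subsets)
```

In this example `subset_b` deviates least from the full run on both metrics, so it would be kept as the trace.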

    Augmenting of automated clustering-based trace sampling methods by user-directed phase detection
    19.
    Granted patent
    Legal status: In force

    Publication No.: US08000953B2

    Publication date: 2011-08-16

    Application No.: US11842337

    Filing date: 2007-08-21

    IPC classification: G06F9/45

    Abstract: Computer implemented method, system, and computer usable program code for simulating processor operation in a data processing system. An instruction trace is generated, wherein the instruction trace includes markers specified by a user for identifying interval boundaries for at least one interval of the instruction trace. The instruction trace is divided into a plurality of intervals in consideration of the markers, and the plurality of intervals are formed into a plurality of interval clusters, wherein each interval cluster represents one phase of execution of the instruction trace. At least one interval from each of the plurality of interval clusters is selected as a trace sample to provide a plurality of trace samples, wherein each selected interval is of at least a minimum size. A simulation is performed using the plurality of trace samples, and a result of the simulation is provided to the user.

    Apparatus and method to manage power in a computing device
    20.
    Granted patent
    Legal status: Lapsed

    Publication No.: US07900071B2

    Publication date: 2011-03-01

    Application No.: US12031543

    Filing date: 2008-02-14

    IPC classification: G06F1/32

    CPC classification: G06F1/206 Y02D10/16

    Abstract: A method to manage power in a computing device comprising a controller assembly and a storage assembly comprising a plurality of data storage devices, by selecting a processor parameter, establishing a threshold processor parameter value, establishing a threshold over-parameter time interval, selecting a data storage device parameter, and establishing a nominal data storage device parameter value. The method determines an actual processor parameter value. If the actual processor parameter value is less than or equal to the threshold processor parameter value, the method operates each of the plurality of data storage devices using the nominal data storage device parameter value. If the actual processor parameter value is greater than the threshold processor parameter value, then the method determines an actual over-parameter time interval. If the actual processor parameter value is greater than the threshold processor parameter value, and if the actual over-parameter time interval is greater than the threshold over-parameter time interval, then the method operates each of the plurality of data storage devices using a data storage device parameter value less than the nominal data storage device parameter value.
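
The decision flow in this abstract maps onto a small state machine, sketched here in Python. The class name, the choice of processor temperature and drive spindle speed as the two parameters, and the concrete numbers are assumptions. The abstract also does not say what happens while the threshold is exceeded but the time limit has not yet elapsed, so this sketch assumes the nominal value is kept in that window.

```python
from typing import Optional


class StoragePowerManager:
    """Hypothetical controller that throttles storage devices under processor load."""

    def __init__(self, threshold: float, max_over_seconds: float,
                 nominal: float, reduced: float) -> None:
        if reduced >= nominal:
            raise ValueError("reduced value must be below the nominal value")
        self.threshold = threshold                # threshold processor parameter value
        self.max_over_seconds = max_over_seconds  # threshold over-parameter interval
        self.nominal = nominal                    # nominal storage device parameter
        self.reduced = reduced                    # value used while throttling
        self._over_since: Optional[float] = None  # time the threshold was first exceeded

    def update(self, processor_value: float, now: float) -> float:
        """Return the storage device parameter value to apply at time `now`."""
        if processor_value <= self.threshold:
            self._over_since = None               # back under the threshold: reset timer
            return self.nominal
        if self._over_since is None:
            self._over_since = now                # start timing the excursion
        if now - self._over_since > self.max_over_seconds:
            return self.reduced                   # over threshold for too long
        return self.nominal                       # assumed: stay nominal until then


# Illustrative values: 85 degrees C limit, 10 s grace period, 7200 vs 5400 rpm.
manager = StoragePowerManager(threshold=85.0, max_over_seconds=10.0,
                              nominal=7200.0, reduced=5400.0)
readings = [manager.update(80.0, now=0.0), manager.update(90.0, now=1.0),
            manager.update(90.0, now=12.0), manager.update(80.0, now=13.0)]
```

The third reading is throttled because the processor value has been above the threshold for 11 seconds, and the fourth returns to nominal once the value drops back.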
