L2 CACHE ARRAY TOPOLOGY FOR LARGE CACHE WITH DIFFERENT LATENCY DOMAINS
    61.
    发明申请
    L2 CACHE ARRAY TOPOLOGY FOR LARGE CACHE WITH DIFFERENT LATENCY DOMAINS 有权
    L2缓存高速缓存的区别于不同的域名

    公开(公告)号:US20080077740A1

    公开(公告)日:2008-03-27

    申请号:US11947742

    申请日:2007-11-29

    IPC分类号: G06F12/08

    摘要: A cache memory logically associates a cache line with at least two cache sectors of a cache array wherein different sectors have different output latencies and, for a load hit, selectively enables the cache sectors based on their latency to output the cache line over successive clock cycles. Larger wires having a higher transmission speed are preferably used to output the cache line corresponding to the requested memory block. In the illustrative embodiment the cache is arranged with rows and columns of the cache sectors, and a given cache line is spread across sectors in different columns, with at least one portion of the given cache line being located in a first column having a first latency, and another portion of the given cache line being located in a second column having a second latency greater than the first latency. One set of wires oriented along a horizontal direction may be used to output the cache line, while another set of wires oriented along a vertical direction may be used for maintenance of the cache sectors. A given cache line is further preferably spread across sectors in different rows or cache ways. For example, a cache line can be 128 bytes and spread across four sectors in four different columns, each sector containing 32 bytes of the cache line, and the cache line is output over four successive clock cycles with one sector being transmitted during each of the four cycles.

    摘要翻译: 缓存存储器逻辑地将高速缓存行与高速缓存阵列的至少两个缓存扇区相关联,其中不同扇区具有不同的输出延迟,并且对于负载命中,基于它们的等待时间来选择性地启用高速缓存扇区以在连续的时钟周期上输出高速缓存行 。 优选使用具有较高传输速度的较大导线来输出与所请求的存储块相对应的高速缓存行。 在说明性实施例中,高速缓存器配置有高速缓存扇区的行和列,并且给定的高速缓存行分布在不同列中的扇区之间,其中给定高速缓存行的至少一部分位于具有第一等待时间的第一列中 并且所述给定高速缓存行的另一部分位于具有大于所述第一等待时间的第二等待时间的第二列中。 可以使用沿水平方向定向的一组线来输出高速缓存线,而沿着垂直方向定向的另一组线可以用于高速缓存扇区的维护。 给定的高速缓存行进一步优选地分布在不同行或高速缓存方式的扇区之间。 例如,高速缓存行可以是128字节并且分布在四个不同列中的四个扇区上,每个扇区包含32个字节的高速缓存行,并且高速缓存行在四个连续的时钟周期内被输出,在每个 四个周期。

    DATA PROCESSING SYSTEM, PROCESSOR AND METHOD OF DATA PROCESSING THAT SUPPORT MEMORY ACCESS ACCORDING TO DIVERSE MEMORY MODELS
    62.
    发明申请
    DATA PROCESSING SYSTEM, PROCESSOR AND METHOD OF DATA PROCESSING THAT SUPPORT MEMORY ACCESS ACCORDING TO DIVERSE MEMORY MODELS 失效
    数据处理系统,处理器和数据处理方法,支持根据多个存储器模型的存储器访问

    公开(公告)号:US20070250668A1

    公开(公告)日:2007-10-25

    申请号:US11380018

    申请日:2006-04-25

    IPC分类号: G06F13/00

    摘要: A data processing system includes a memory subsystem and an execution unit, coupled to the memory subsystem, which executes store instructions to determine target memory addresses of store operations to be performed by the memory subsystem. The data processing system further includes a mode field having a first setting indicating strong ordering between store operations and a second setting indicating weak ordering between store operations. Store operations accessing the memory subsystem are associated with either the first setting or the second setting. The data processing system also includes logic that, based upon settings of the mode field, inserts a synchronizing operation between a store operation associated with the first setting and a store operation associated with the second setting, such that all store operations preceding the synchronizing operation complete before store operations subsequent to the synchronizing operation.

    摘要翻译: 数据处理系统包括存储器子系统和执行单元,其耦合到存储器子系统,其执行存储指令以确定要由存储器子系统执行的存储操作的目标存储器地址。 数据处理系统还包括具有指示存储操作之间的强顺序的第一设置的模式字段和指示存储操作之间的弱顺序的第二设置。 访问内存子系统的存储操作与第一个设置或第二个设置相关联。 数据处理系统还包括基于模式字段的设置的逻辑,在与第一设置相关联的存储操作与与第二设置相关联的存储操作之间插入同步操作,使得同步操作之前的所有存储操作完成 在同步操作之后的存储操作之前。

    Data processing system, method and interconnect fabric having a flow governor
    63.
    发明申请
    Data processing system, method and interconnect fabric having a flow governor 有权
    具有流量调节器的数据处理系统,方法和互连结构

    公开(公告)号:US20060187958A1

    公开(公告)日:2006-08-24

    申请号:US11055399

    申请日:2005-02-10

    IPC分类号: H04J3/22 H04J3/16

    摘要: A data processing system includes a plurality of local hubs each coupled to a remote hub by a respective one a plurality of point-to-point communication links. Each of the plurality of local hubs queues requests for access to memory blocks for transmission on a respective one of the point-to-point communication links to a shared resource in the remote hub. Each of the plurality of local hubs transmits requests to the remote hub utilizing only a fractional portion of a bandwidth of its respective point-to-point communication link. The fractional portion that is utilized is determined by an allocation policy based at least in part upon a number of the plurality of local hubs and a number of processing units represented by each of the plurality of local hubs. The allocation policy prevents overruns of the shared resource.

    摘要翻译: 数据处理系统包括多个本地集线器,每个集线器通过相应的一个多个点对点通信链路耦合到远程集线器。 多个本地集线器中的每一个排队对存储器块进行访问的请求,用于在到远程集线器中的共享资源的点对点通信链路中的相应一个上传输。 多个本地集线器中的每一个仅利用其相应点对点通信链路的带宽的小数部分向远程集线器发送请求。 所使用的分数部分由至少部分地基于多个本地集线器的数量和由多个本地集线器中的每一个表示的多个处理单元的分配策略确定。 分配策略可以防止超出共享资源。

    L2 cache array topology for large cache with different latency domains
    64.
    发明申请
    L2 cache array topology for large cache with different latency domains 有权
    具有不同延迟域的大型缓存的L2缓存阵列拓扑

    公开(公告)号:US20060179223A1

    公开(公告)日:2006-08-10

    申请号:US11054930

    申请日:2005-02-10

    IPC分类号: G06F12/00

    摘要: A cache memory logically associates a cache line with at least two cache sectors of a cache array wherein different sectors have different output latencies and, for a load hit, selectively enables the cache sectors based on their latency to output the cache line over successive clock cycles. Larger wires having a higher transmission speed are preferably used to output the cache line corresponding to the requested memory block. In the illustrative embodiment the cache is arranged with rows and columns of the cache sectors, and a given cache line is spread across sectors in different columns, with at least one portion of the given cache line being located in a first column having a first latency, and another portion of the given cache line being located in a second column having a second latency greater than the first latency. One set of wires oriented along a horizontal direction may be used to output the cache line, while another set of wires oriented along a vertical direction may be used for maintenance of the cache sectors. A given cache line is further preferably spread across sectors in different rows or cache ways. For example, a cache line can be 128 bytes and spread across four sectors in four different columns, each sector containing 32 bytes of the cache line, and the cache line is output over four successive clock cycles with one sector being transmitted during each of the four cycles.

    摘要翻译: 缓存存储器逻辑地将高速缓存行与高速缓存阵列的至少两个缓存扇区相关联,其中不同扇区具有不同的输出延迟,并且对于负载命中,基于它们的等待时间来选择性地启用高速缓存扇区以在连续的时钟周期上输出高速缓存行 。 优选使用具有较高传输速度的较大导线来输出与所请求的存储块相对应的高速缓存行。 在说明性实施例中,高速缓存器配置有高速缓存扇区的行和列,并且给定的高速缓存行分布在不同列中的扇区之间,其中给定高速缓存行的至少一部分位于具有第一等待时间的第一列中 并且所述给定高速缓存行的另一部分位于具有大于所述第一等待时间的第二等待时间的第二列中。 可以使用沿水平方向定向的一组线来输出高速缓存线,而沿着垂直方向定向的另一组线可以用于高速缓存扇区的维护。 给定的高速缓存行进一步优选地分布在不同行或高速缓存方式的扇区之间。 例如,高速缓存行可以是128字节并且分布在四个不同列中的四个扇区上,每个扇区包含32个字节的高速缓存行,并且高速缓存行在四个连续的时钟周期内被输出,在每个 四个周期。

    Data processing system and method for selecting a scope of broadcast of an operation by reference to a translation table
    65.
    发明申请
    Data processing system and method for selecting a scope of broadcast of an operation by reference to a translation table 审中-公开
    参考翻译表选择操作的广播范围的数据处理系统和方法

    公开(公告)号:US20070168639A1

    公开(公告)日:2007-07-19

    申请号:US11333607

    申请日:2006-01-17

    IPC分类号: G06F12/00

    摘要: A data processing system includes at least first and second coherency domains coupled by an interconnect fabric. A memory coupled to the interconnect fabric includes an address translation table having a translation table entry utilized to translate virtual memory addresses to real memory addresses. The translation table entry also includes scope information for broadcast operations targeting addresses within a memory region associated with the translation table entry. Scope prediction logic within the first coherency domain predictively selects a scope of broadcast of an operation on an interconnect fabric of the data processing system by reference to the scope information within the address translation table entry.

    摘要翻译: 数据处理系统至少包括由互连结构耦合的第一和第二相干域。 耦合到互连结构的存储器包括具有用于将虚拟存储器地址转换为实际存储器地址的转换表项的地址转换表。 转换表条目还包括针对与转换表条目相关联的存储器区域内的地址的广播操作的范围信息。 第一相干域内的范围预测逻辑通过参考地址转换表条目中的范围信息预测性地选择数据处理系统的互连结构上的操作的广播范围。

    Data processing system and method for efficient communication utilizing an In coherency state
    66.
    发明申请
    Data processing system and method for efficient communication utilizing an In coherency state 有权
    数据处理系统和利用一致性状态的高效通信方法

    公开(公告)号:US20060179252A1

    公开(公告)日:2006-08-10

    申请号:US11055305

    申请日:2005-02-10

    IPC分类号: G06F13/28

    摘要: A cache coherent data processing system includes at least first and second coherency domains each including at least one processing unit. The first coherency domain includes a first cache memory, and the second coherency domain includes a coherent second cache memory. The first cache memory within the first coherency domain of the data processing system holds a memory block in a storage location associated with an address tag and a coherency state field. The coherency state field is set to a state that indicates that the address tag is valid, that the storage location does not contain valid data, and that the memory block is likely cached only within the first coherency domain.

    摘要翻译: 高速缓存一致数据处理系统至少包括第一和第二相关域,每个域包括至少一个处理单元。 第一相关域包括第一高速缓冲存储器,并且第二相干域包括相干第二高速缓冲存储器。 数据处理系统的第一相干域内的第一高速缓冲存储器在与地址标签和一致性状态字段相关联的存储位置中保存存储器块。 相关性状态字段被设置为指示地址标签有效的状态,存储位置不包含有效数据,并且该存储器块可能仅在第一相干域内被缓存。

    Method, apparatus, and computer program product in a processor for dynamically during runtime allocating memory for in-memory hardware tracing
    67.
    发明申请
    Method, apparatus, and computer program product in a processor for dynamically during runtime allocating memory for in-memory hardware tracing 失效
    处理器中的方法,装置和计算机程序产品在运行时动态地分配用于存储器内硬件跟踪的存储器

    公开(公告)号:US20060184836A1

    公开(公告)日:2006-08-17

    申请号:US11055977

    申请日:2005-02-11

    IPC分类号: G06F11/00

    摘要: A method, apparatus, and computer program product are disclosed in a processor for dynamically, during runtime, allocating memory for in-memory hardware tracing. The processor is included within a data processing system. The processor includes multiple processing units that are coupled together utilizing a system bus. The processing units include a memory controller that controls a system memory. A particular size of the system memory is determined that is needed for storing trace data. A hardware trace facility requests, dynamically after the data processing system has completed booting, the particular size of the system memory to be allocated to the hardware trace facility for storing trace data that is captured by the hardware trace facility. The firmware selects particular locations within the system memory. All of the particular locations together are the particular size. The firmware allocates the particular locations for use exclusively by the hardware trace facility.

    摘要翻译: 在处理器中公开了一种方法,装置和计算机程序产品,用于在运行时期间动态地为存储器内硬件跟踪分配存储器。 处理器包含在数据处理系统中。 处理器包括使用系统总线耦合在一起的多个处理单元。 处理单元包括控制系统存储器的存储器控​​制器。 确定存储跟踪数据所需的特定大小的系统存储器。 硬件跟踪设施在数据处理系统完成启动之后动态地请求要分配给硬件跟踪设备的系统内存的特定大小,用于存储由硬件跟踪设备捕获的跟踪数据。 固件选择系统内存中的特定位置。 所有特定位置在一起是特定的尺寸。 固件分配由硬件跟踪设备专门使用的特定位置。

    Store stream prefetching in a microprocessor
    68.
    发明申请
    Store stream prefetching in a microprocessor 失效
    在微处理器中存储流预取

    公开(公告)号:US20060179238A1

    公开(公告)日:2006-08-10

    申请号:US11054871

    申请日:2005-02-10

    IPC分类号: G06F13/28

    摘要: In a microprocessor having a load/store unit and prefetch hardware, the prefetch hardware includes a prefetch queue containing entries indicative of allocated data streams. A prefetch engine receives an address associated with a store instruction executed by the load/store unit. The prefetch engine determines whether to allocate an entry in the prefetch queue corresponding to the store instruction by comparing entries in the queue to a window of addresses encompassing multiple cache blocks, where the window of addresses is derived from the received address. The prefetch engine compares entries in the prefetch queue to a window of 2M contiguous cache blocks. The prefetch engine suppresses allocation of a new entry when any entry in the prefetch queue is within the address window. The prefetch engine further suppresses allocation of a new entry when the data address of the store instruction is equal to an address in a border area of the address window.

    摘要翻译: 在具有加载/存储单元和预取硬件的微处理器中,预取硬件包括预取队列,其包含指示分配的数据流的条目。 预取引擎接收与由加载/存储单元执行的存储指令相关联的地址。 预取引擎通过将队列中的条目与包含多个高速缓存块的地址的窗口进行比较来确定是否对与存储指令相对应的预取队列中的条目进行分配,其中地址窗口从接收到的地址导出。 预取引擎将预取队列中的条目与两个连续高速缓存块的窗口进行比较。 当预取队列中的任何条目都在地址窗口内时,预取引擎抑制新条目的分配。 当存储指令的数据地址等于地址窗口的边界区域中的地址时,预取引擎进一步抑制新条目的分配。

    Method, apparatus, and computer program product in a processor for concurrently sharing a memory controller among a tracing process and non-tracing processes using a programmable variable number of shared memory write buffers
    69.
    发明申请
    Method, apparatus, and computer program product in a processor for concurrently sharing a memory controller among a tracing process and non-tracing processes using a programmable variable number of shared memory write buffers 失效
    处理器中的方法,装置和计算机程序产品,用于在跟踪过程和使用可编程可变数量的共享存储器写入缓冲器的非跟踪处理之间并发共享存储器控制器

    公开(公告)号:US20060184834A1

    公开(公告)日:2006-08-17

    申请号:US11055845

    申请日:2005-02-11

    IPC分类号: G06F11/00

    CPC分类号: G06F11/2268 G06F11/348

    摘要: A method, apparatus, and computer program product are disclosed for, in a processor, concurrently sharing a memory controller among a tracing process and non-tracing processes using a programmable variable number of shared memory write buffers. A hardware trace facility captures hardware trace data in a processor. The hardware trace facility is included within the processor. The hardware trace data is transmitted to a system memory utilizing a system bus. The system memory is included within the system. The system bus is capable of being utilized by processing units included in the processing node while the hardware trace data is being transmitted to the system bus. Part of system memory is utilized to store the trace data. The system memory is capable of being accessed by processing units in the processing node other than the hardware trace facility while part of the system memory is being utilized to store the trace data.

    摘要翻译: 公开了一种方法,装置和计算机程序产品,用于在处理器中使用可编程可变数量的共享存储器写缓冲器在跟踪处理和非跟踪处理之间共享存储器控制器。 硬件跟踪设备捕获处理器中的硬件跟踪数据。 硬件跟踪工具包含在处理器内。 使用系统总线将硬件跟踪数据传输到系统存储器。 系统内存包含在系统中。 当将硬件跟踪数据发送到系统总线时,系统总线能够被包括在处理节点中的处理单元利用。 系统内存的一部分用于存储跟踪数据。 系统存储器能够被处理节点除硬件跟踪设备之外的处理单元访问,同时系统存储器的一部分用于存储跟踪数据。

    Method, apparatus, and computer program product in a processor for performing in-memory tracing using existing communication paths
    70.
    发明申请
    Method, apparatus, and computer program product in a processor for performing in-memory tracing using existing communication paths 失效
    处理器中的方法,装置和计算机程序产品,用于使用现有通信路径执行内存中跟踪

    公开(公告)号:US20060184833A1

    公开(公告)日:2006-08-17

    申请号:US11055821

    申请日:2005-02-11

    IPC分类号: G06F11/00

    CPC分类号: G06F11/2268 G06F11/348

    摘要: A method, apparatus, and computer program product are disclosed for performing in-memory hardware tracing in a processor using an existing system bus. The processor includes multiple processing units that are coupled together utilizing the system bus. The processing units include a memory controller that controls a system memory. Information is transmitted among the processing units utilizing the system bus. The information is formatted according to a standard system bus protocol. Hardware trace data is captured utilizing a hardware trace facility that is coupled directly to the system bus. The system bus is utilized for transmitting the hardware trace data to the memory controller for storage in the system memory. The memory controller is coupled directly to the system bus. The hardware trace data is formatted according to the standard system bus protocol for transmission via the system bus.

    摘要翻译: 公开了一种用于使用现有系统总线在处理器中执行存储器内硬件跟踪的方法,装置和计算机程序产品。 处理器包括利用系统总线耦合在一起的多个处理单元。 处理单元包括控制系统存储器的存储器控​​制器。 利用系统总线在处理单元之间传送信息。 信息根据标准系统总线协议进行格式化。 使用直接耦合到系统总线的硬件跟踪功能来捕获硬件跟踪数据。 系统总线用于将硬件跟踪数据发送到存储器控制器以存储在系统存储器中。 存储器控制器直接耦合到系统总线。 硬件跟踪数据根据标准系统总线协议进行格式化,以便通过系统总线进行传输。