DATA PROCESSING SYSTEM, PROCESSOR AND METHOD OF DATA PROCESSING THAT REDUCE STORE QUEUE ENTRY UTILIZATION FOR SYNCHRONIZING OPERATIONS
    1.
    Patent application
    Status: lapsed

    Publication No.: US20070250669A1

    Publication date: 2007-10-25

    Application No.: US11380020

    Filing date: 2006-04-25

    IPC classification: G06F13/00

    Abstract: A data processing system includes a processor core and a memory subsystem. The memory subsystem includes a store queue having a plurality of entries, where each entry includes an address field for holding the target address of a store operation, a data field for holding data for the store operation, and a virtual sync field indicating a presence or absence of a synchronizing operation associated with the entry. The memory subsystem further includes a store queue controller that, responsive to receipt at the memory subsystem of a sequence of operations including a synchronizing operation and a particular store operation, places a target address and data of the particular store operation within the address field and data field, respectively, of an entry in the store queue and sets the virtual sync field of the entry to represent the synchronizing operation, such that a number of store queue entries utilized is reduced.
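
The entry layout in this abstract can be sketched roughly as follows. This is a minimal illustration, not the patented design: the class and method names are hypothetical, and it assumes the synchronizing operation arrives immediately before the store it is folded into, which the abstract does not spell out.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class StoreQueueEntry:
    address: int                 # target address of the store operation
    data: bytes                  # data to be stored
    virtual_sync: bool = False   # True if a synchronizing operation is folded into this entry


class StoreQueue:
    """Hypothetical controller that avoids spending an entry on a sync."""

    def __init__(self) -> None:
        self.entries: List[StoreQueueEntry] = []
        self._pending_sync = False

    def receive_sync(self) -> None:
        # No entry is allocated; the sync is remembered for the next store (assumption).
        self._pending_sync = True

    def receive_store(self, address: int, data: bytes) -> None:
        # The store's entry carries the sync in its virtual sync field.
        self.entries.append(StoreQueueEntry(address, data, self._pending_sync))
        self._pending_sync = False


sq = StoreQueue()
sq.receive_store(0x100, b"\x01")
sq.receive_sync()
sq.receive_store(0x200, b"\x02")
```

Three operations arrive here, but only two entries are allocated, because the sync is recorded as a flag on the second store.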

    DATA PROCESSING SYSTEM, PROCESSOR AND METHOD OF DATA PROCESSING THAT SUPPORT MEMORY ACCESS ACCORDING TO DIVERSE MEMORY MODELS
    2.
    Patent application
    Status: lapsed

    Publication No.: US20070250668A1

    Publication date: 2007-10-25

    Application No.: US11380018

    Filing date: 2006-04-25

    IPC classification: G06F13/00

    Abstract: A data processing system includes a memory subsystem and an execution unit, coupled to the memory subsystem, which executes store instructions to determine target memory addresses of store operations to be performed by the memory subsystem. The data processing system further includes a mode field having a first setting indicating strong ordering between store operations and a second setting indicating weak ordering between store operations. Store operations accessing the memory subsystem are associated with either the first setting or the second setting. The data processing system also includes logic that, based upon settings of the mode field, inserts a synchronizing operation between a store operation associated with the first setting and a store operation associated with the second setting, such that all store operations preceding the synchronizing operation complete before store operations subsequent to the synchronizing operation.
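
One way to picture the inserted synchronizing operation is as a pass over the store stream. The sketch below is an assumption-laden illustration: `Mode`, `SYNC`, and `insert_syncs` are hypothetical names, and inserting a sync on every change of mode in either direction is a simplification of what the abstract describes.

```python
from enum import Enum
from typing import List, Tuple, Union


class Mode(Enum):
    STRONG = 1   # first setting: strong ordering between stores
    WEAK = 2     # second setting: weak ordering between stores


SYNC = "SYNC"    # placeholder for the synchronizing operation
Store = Tuple[Mode, int]  # (mode field setting, target address)


def insert_syncs(stores: List[Store]) -> List[Union[Store, str]]:
    """Insert a sync wherever adjacent stores carry different mode settings."""
    out: List[Union[Store, str]] = []
    prev_mode = None
    for mode, address in stores:
        if prev_mode is not None and mode != prev_mode:
            out.append(SYNC)
        out.append((mode, address))
        prev_mode = mode
    return out


stream = [(Mode.STRONG, 0x10), (Mode.STRONG, 0x20), (Mode.WEAK, 0x30)]
ordered = insert_syncs(stream)
```

In `ordered`, the sync sits between the last strongly ordered store and the first weakly ordered one, so the earlier stores must complete before the later one.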

    Method for completing full cacheline stores with address-only bus operations
    3.
    Patent application
    Status: in force

    Publication No.: US20050251623A1

    Publication date: 2005-11-10

    Application No.: US10825189

    Filing date: 2004-04-15

    IPC classification: G06F12/00 G06F12/08

    CPC classification: G06F12/0897 G06F12/0804

    Abstract: A method and processor system that substantially eliminate data bus operations when completing updates of an entire cache line with a full store queue entry. The store queue within a processor chip is designed with a series of AND gates connecting individual bits of the byte enable bits of a corresponding entry. The AND output is fed to the STQ controller and signals when the entry is full. When full entries are selected for dispatch to the RC machines, the RC machine is signaled that the entry updates the entire cache line. The RC machine obtains write permission to the line, and then the RC machine overwrites the entire cache line. Because the entire cache line is overwritten, the data of the cache line is not retrieved when the request for the cache line misses at the cache or when data goes stale before write permission is obtained by the RC machine.
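
The full-entry detection amounts to an AND reduction over the byte enable bits. The following sketch shows only that idea in software; the 128-byte line size and the function names are assumptions, since the abstract gives neither.

```python
from typing import List

CACHE_LINE_BYTES = 128  # assumed line size; the abstract does not state one


def entry_is_full(byte_enables: List[bool]) -> bool:
    """AND together all byte enable bits, as the chain of AND gates would."""
    if len(byte_enables) != CACHE_LINE_BYTES:
        raise ValueError("expected one enable bit per byte of the cache line")
    return all(byte_enables)


def needs_data_fetch(byte_enables: List[bool]) -> bool:
    """A full entry overwrites the whole line, so the old data is not needed."""
    return not entry_is_full(byte_enables)


full_entry = [True] * CACHE_LINE_BYTES
partial_entry = [True] * (CACHE_LINE_BYTES - 1) + [False]
```

A store queue entry such as `full_entry` would need only write permission for the line, while `partial_entry` would still require the existing line data.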

    Processor, data processing system, and method for initializing a memory block in a data processing system having multiple coherency domains
    4.
    Patent application
    Status: in force

    Publication No.: US20070226423A1

    Publication date: 2007-09-27

    Application No.: US11388001

    Filing date: 2006-03-23

    IPC classification: G06F13/28

    CPC classification: G06F12/0822 G06F12/084

    Abstract: A data processing system includes at least first and second coherency domains, each including at least one processor core and a memory. In response to an initialization operation by a processor core that indicates a target memory block to be initialized, a cache memory in the first coherency domain determines a coherency state of the target memory block with respect to the cache memory. In response to the determination, the cache memory selects a scope of broadcast of an initialization request identifying the target memory block. A narrower scope including the first coherency domain and excluding the second coherency domain is selected in response to a determination of a first coherency state, and a broader scope including the first coherency domain and the second coherency domain is selected in response to a determination of a second coherency state. The cache memory then broadcasts an initialization request with the selected scope. In response to the initialization request, the target memory block is initialized within a memory of the data processing system to an initialization value.
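
The scope selection step reduces to a lookup on the coherency state. The sketch below is illustrative only: the abstract does not name the coherency states, so `FIRST_STATE` and `SECOND_STATE` are placeholders, and `Scope` and `select_scope` are hypothetical names.

```python
from enum import Enum


class Scope(Enum):
    NARROW = "first coherency domain only"
    BROAD = "first and second coherency domains"


# Placeholder state names; the abstract only speaks of a "first" and a "second" state.
FIRST_STATE = "FIRST_STATE"
SECOND_STATE = "SECOND_STATE"


def select_scope(coherency_state: str) -> Scope:
    """Choose the broadcast scope of the initialization request from the cache's state."""
    if coherency_state == FIRST_STATE:
        return Scope.NARROW
    if coherency_state == SECOND_STATE:
        return Scope.BROAD
    raise ValueError(f"unknown coherency state: {coherency_state}")


chosen = select_scope(FIRST_STATE)
```

Choosing the narrow scope when the state permits it keeps the initialization request off the interconnect between domains.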

    Data processing system, method and interconnect fabric supporting multiple planes of processing nodes
    5.
    Patent application
    Status: in force

    Publication No.: US20070081516A1

    Publication date: 2007-04-12

    Application No.: US11245887

    Filing date: 2005-10-07

    IPC classification: H04L12/28

    CPC classification: G06F15/16

    Abstract: A data processing system includes a first plane including a first plurality of processing nodes, each including multiple processing units, and a second plane including a second plurality of processing nodes, each including multiple processing units. The data processing system also includes a plurality of point-to-point first tier links. Each of the first plurality and second plurality of processing nodes includes one or more first tier links among the plurality of first tier links, where the first tier link(s) within each processing node connect a pair of processing units in the same processing node for communication. The data processing system further includes a plurality of point-to-point second tier links. At least a first of the plurality of second tier links connects processing units in different ones of the first plurality of processing nodes, at least a second of the plurality of second tier links connects processing units in different ones of the second plurality of processing nodes, and at least a third of the plurality of second tier links connects a processing unit in the first plane to a processing unit in the second plane.

    Method and system for speculatively sending processor-issued store operations to a store queue with full signal asserted
    6.
    Patent application
    Status: lapsed

    Publication No.: US20050251660A1

    Publication date: 2005-11-10

    Application No.: US10840560

    Filing date: 2004-05-06

    IPC classification: G06F9/30

    Abstract: A method and processor chip design for enabling a processor core to continue sending store operations speculatively to the store queue after the core receives an indication that the store queue is full. The processor core is configured with speculative store logic that enables the processor core to continue issuing store operations while the store queue full signal is asserted. A copy of the speculatively issued store operation is placed within a speculative store buffer. The core waits for a signal from the store queue indicating the store operation was accepted into the store queue. When the speculatively issued store operation is accepted within the store queue, the copy is discarded from the buffer. However, when the store operation is rejected, the speculative store logic re-issues the store operation ahead of normal store operations.
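
The accept/reject handling can be sketched as a small buffer plus a pending queue. This is a simplified software model under stated assumptions: the class and method names are hypothetical, stores are represented by plain strings, and "ahead of normal store operations" is modeled as pushing to the front of the pending queue.

```python
from collections import deque
from typing import Deque, List


class SpeculativeStoreLogic:
    """Hypothetical model of the core-side logic described in the abstract."""

    def __init__(self) -> None:
        self.pending: Deque[str] = deque()   # normal stores waiting to be issued
        self.spec_buffer: List[str] = []     # copies of speculatively issued stores

    def issue_speculatively(self, store: str) -> None:
        # Issued while the store queue full signal is asserted; keep a copy.
        self.spec_buffer.append(store)

    def on_store_queue_response(self, store: str, accepted: bool) -> None:
        self.spec_buffer.remove(store)
        if not accepted:
            # Rejected: re-issue ahead of the normal stores.
            self.pending.appendleft(store)


logic = SpeculativeStoreLogic()
logic.pending.append("C")
logic.issue_speculatively("A")
logic.issue_speculatively("B")
logic.on_store_queue_response("A", accepted=True)
logic.on_store_queue_response("B", accepted=False)
```

After these calls the copy of `A` has been discarded, and the rejected store `B` sits in front of the normal store `C`.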

    Reducing number of rejected snoop requests by extending time to respond to snoop request
    7.

    Publication No.: US20060184749A1

    Publication date: 2006-08-17

    Application No.: US11056764

    Filing date: 2005-02-11

    IPC classification: G06F13/28

    CPC classification: G06F12/0831

    Abstract: A cache, system and method for reducing the number of rejected snoop requests. An incoming snoop request is entered in the first available latch in a pipeline of latches in a stall/reorder unit if the stall/reorder unit is not full. The entered snoop request is dispatched to a selector upon entering a bottom latch in the pipeline. The stall/reorder unit is not informed as to whether the dispatched snoop request is accepted by an arbitration mechanism for several clock cycles after the dispatch occurred. A copy of the dispatched snoop request is stored in a top latch in an overrun pipeline of latches in the first unit upon dispatching the snoop request. By maintaining information about the snoop request, the snoop request may be dispatched again to the selector in case the dispatched snoop request was rejected, thereby increasing the chance that the snoop request will ultimately be accepted.

    System and method of re-ordering store operations within a processor
    8.
    Patent application
    Status: lapsed

    Publication No.: US20060179226A1

    Publication date: 2006-08-10

    Application No.: US11054450

    Filing date: 2005-02-09

    IPC classification: G06F12/00

    Abstract: A system and method for re-ordering store operations from a processor core to a store queue. When a store queue receives a new processor-issued store operation from the processor core, a store queue controller allocates a new entry in the store queue. In response to allocating the new entry in the store queue, the store queue controller determines whether or not the new entry is dependent on at least one other valid entry in the store queue. In response to determining that the new entry is dependent on at least one other valid entry in the store queue, the store queue controller inhibits requesting of the new entry to the RC dispatch logic until each valid entry on which the new entry is dependent has been successfully dispatched to an RC machine by the RC dispatch logic.
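
The dependency tracking can be sketched as a set of blocking entries per new entry. The abstract does not say what makes one entry dependent on another, so this sketch assumes, purely for illustration, that two entries are dependent when they target the same address; all names are hypothetical.

```python
from typing import Dict, Set


class StoreQueueController:
    """Hypothetical model: an entry may be requested only when its dependencies have dispatched."""

    def __init__(self) -> None:
        self.address_of: Dict[int, int] = {}   # valid entry id -> target address
        self.waits_on: Dict[int, Set[int]] = {}  # entry id -> ids it depends on
        self._next_id = 0

    def allocate(self, address: int) -> int:
        entry_id = self._next_id
        self._next_id += 1
        # Assumed dependency rule: same target address as an older valid entry.
        self.waits_on[entry_id] = {
            other for other, addr in self.address_of.items() if addr == address
        }
        self.address_of[entry_id] = address
        return entry_id

    def can_request(self, entry_id: int) -> bool:
        return not self.waits_on[entry_id]

    def dispatched(self, entry_id: int) -> None:
        # The entry went to an RC machine; release anything waiting on it.
        del self.address_of[entry_id]
        del self.waits_on[entry_id]
        for blockers in self.waits_on.values():
            blockers.discard(entry_id)


ctrl = StoreQueueController()
a = ctrl.allocate(0x80)
b = ctrl.allocate(0x80)   # depends on a under the assumed rule
c = ctrl.allocate(0x100)  # independent, so it may be requested before b
```

Here `c` can be requested ahead of `b`, which is the re-ordering the title refers to, while `b` stays inhibited until `a` has been dispatched.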

    Reducing number of rejected snoop requests by extending time to respond to snoop request
    9.
    Patent application
    Status: lapsed

    Publication No.: US20060184746A1

    Publication date: 2006-08-17

    Application No.: US11056679

    Filing date: 2005-02-11

    IPC classification: G06F13/28

    CPC classification: G06F13/1605 G06F12/0831

    Abstract: A cache, system and method for reducing the number of rejected snoop requests. A “stall/reorder unit” in a cache receives a snoop request from an interconnect. The snoop request is entered in the first available latch of the stall/reorder unit unless the stall/reorder unit is full, in which case the new snoop request is transmitted to a second unit configured to transmit a request to retry resending the new snoop request. Snoop requests have a higher priority than requests from processors, and snoop requests are selected by the arbitration mechanism over processor requests unless the arbitration mechanism requests otherwise (“stall request”) to the stall/reorder unit. Because snoop requests have a higher priority than processor requests, the number of rejected snoop requests is reduced. Because the arbitration mechanism can issue a stall request, the processor is not starved.
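
The priority rule can be written as a single selection function. This is a minimal sketch with hypothetical names; in particular, returning nothing when a stall is requested and no processor request is pending is an assumption, since the abstract does not cover that case.

```python
from typing import Optional


def arbitrate(snoop_pending: bool, processor_pending: bool, stall_request: bool) -> Optional[str]:
    """Pick the request source for this cycle.

    Snoop requests win by default; a stall request from the arbitration
    mechanism holds snoops back so that processor requests are not starved.
    """
    if snoop_pending and not stall_request:
        return "snoop"
    if processor_pending:
        return "processor"
    return None  # assumption: nothing is selected otherwise


normal_cycle = arbitrate(snoop_pending=True, processor_pending=True, stall_request=False)
stalled_cycle = arbitrate(snoop_pending=True, processor_pending=True, stall_request=True)
```

With both kinds of request pending, the snoop is selected in the normal cycle and the processor request is selected in the stalled cycle.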

    System bus read data transfers with data ordering control bits
    10.
    Patent application
    Status: lapsed

    Publication No.: US20050193174A1

    Publication date: 2005-09-01

    Application No.: US11041711

    Filing date: 2005-01-22

    IPC classification: G06F12/08 G06F12/00

    CPC classification: G06F12/0831

    Abstract: A method for informing a processor of a selected order of transmission of data to the processor. The method comprises the steps of coupling system components via a data bus to the processor to effectuate data transfer, determining at the system component logic the order in which to transmit data to the processor, and issuing to the data bus a selected order bit concurrent with the data, wherein the selected order bit alerts the processor to the order and the data is transmitted in that order. In a preferred embodiment, the system component is the cache and the method may involve receiving at the cache a preference of ordering for a read address/request from the processor. The preference order logic of the cache controller or a preference order logic component evaluates the preference of ordering desired by comparing the processor preference with other preferences, including cache order preference. One preference order is selected and the data is then retrieved from a cache line of the cache in the order selected.
