DATA PROCESSING SYSTEM, PROCESSOR AND METHOD OF DATA PROCESSING THAT REDUCE STORE QUEUE ENTRY UTILIZATION FOR SYNCHRONIZING OPERATIONS
    1.
    Invention application (status: lapsed)

    Publication No.: US20070250669A1

    Publication date: 2007-10-25

    Application No.: US11380020

    Filing date: 2006-04-25

    IPC classification: G06F13/00

    Abstract: A data processing system includes a processor core and a memory subsystem. The memory subsystem includes a store queue having a plurality of entries, where each entry includes an address field for holding the target address of a store operation, a data field for holding data for the store operation, and a virtual sync field indicating a presence or absence of a synchronizing operation associated with the entry. The memory subsystem further includes a store queue controller that, responsive to receipt at the memory subsystem of a sequence of operations including a synchronizing operation and a particular store operation, places a target address and data of the particular store operation within the address field and data field, respectively, of an entry in the store queue and sets the virtual sync field of the entry to represent the synchronizing operation, such that a number of store queue entries utilized is reduced.
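The virtual sync idea in this abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the class names, the `receive_sync`/`receive_store` methods, and the assumption that the synchronizing operation arrives immediately before the store it is folded into are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class StoreQueueEntry:
    address: int
    data: bytes
    # True when a synchronizing operation is represented by this entry
    # instead of occupying a store queue entry of its own.
    virtual_sync: bool = False


class StoreQueue:
    """Hypothetical store queue that folds a sync into the next store."""

    def __init__(self) -> None:
        self.entries: List[StoreQueueEntry] = []
        self._pending_sync = False

    def receive_sync(self) -> None:
        # Assumption: the sync is remembered rather than enqueued.
        # A sync with no following store is not handled in this sketch.
        self._pending_sync = True

    def receive_store(self, address: int, data: bytes) -> StoreQueueEntry:
        entry = StoreQueueEntry(address, data, virtual_sync=self._pending_sync)
        self._pending_sync = False
        self.entries.append(entry)
        return entry


def demo_entries_used():
    queue = StoreQueue()
    queue.receive_store(0x100, b"\x01")
    queue.receive_sync()
    queue.receive_store(0x200, b"\x02")
    return len(queue.entries), [e.virtual_sync for e in queue.entries]
```

In this sketch the sequence store, sync, store occupies two entries rather than three, which is the reduction in entry utilization the abstract describes.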


    DATA PROCESSING SYSTEM, PROCESSOR AND METHOD OF DATA PROCESSING THAT SUPPORT MEMORY ACCESS ACCORDING TO DIVERSE MEMORY MODELS
    2.
    Invention application (status: lapsed)

    Publication No.: US20070250668A1

    Publication date: 2007-10-25

    Application No.: US11380018

    Filing date: 2006-04-25

    IPC classification: G06F13/00

    Abstract: A data processing system includes a memory subsystem and an execution unit, coupled to the memory subsystem, which executes store instructions to determine target memory addresses of store operations to be performed by the memory subsystem. The data processing system further includes a mode field having a first setting indicating strong ordering between store operations and a second setting indicating weak ordering between store operations. Store operations accessing the memory subsystem are associated with either the first setting or the second setting. The data processing system also includes logic that, based upon settings of the mode field, inserts a synchronizing operation between a store operation associated with the first setting and a store operation associated with the second setting, such that all store operations preceding the synchronizing operation complete before store operations subsequent to the synchronizing operation.
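The insertion of a synchronizing operation at a change of ordering mode can be sketched as follows. This is an illustrative model only: the `Mode` enum, the `insert_syncs` function, and the choice to insert a sync at every transition in either direction are assumptions, since the abstract does not specify how the logic is realized.

```python
from enum import Enum
from typing import List, Optional, Tuple


class Mode(Enum):
    STRONG = "strong"  # first setting of the mode field
    WEAK = "weak"      # second setting of the mode field


Op = Tuple[str, Optional[int]]


def insert_syncs(stores: List[Tuple[Mode, int]]) -> List[Op]:
    """Return the operation stream with a SYNC at each mode transition.

    Assumption: a sync is inserted whenever two adjacent stores carry
    different mode settings, regardless of direction.
    """
    ops: List[Op] = []
    previous_mode: Optional[Mode] = None
    for mode, address in stores:
        if previous_mode is not None and mode is not previous_mode:
            ops.append(("SYNC", None))
        ops.append(("STORE", address))
        previous_mode = mode
    return ops
```

For example, two strongly ordered stores followed by a weakly ordered store yield STORE, STORE, SYNC, STORE, so both earlier stores complete before the weakly ordered one.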


    Method for completing full cacheline stores with address-only bus operations
    3.
    Invention application (status: in force)

    Publication No.: US20050251623A1

    Publication date: 2005-11-10

    Application No.: US10825189

    Filing date: 2004-04-15

    IPC classification: G06F12/00 G06F12/08

    CPC classification: G06F12/0897 G06F12/0804

    Abstract: A method and processor system that substantially eliminates data bus operations when completing updates of an entire cache line with a full store queue entry. The store queue within a processor chip is designed with a series of AND gates connecting individual bits of the byte enable bits of a corresponding entry. The AND output is fed to the STQ controller and signals when the entry is full. When full entries are selected for dispatch to the RC machines, the RC machine is signaled that the entry updates the entire cache line. The RC machine obtains write permission to the line, and then the RC machine overwrites the entire cache line. Because the entire cache line is overwritten, the data of the cache line is not retrieved when the request for the cache line misses at the cache or when data goes stale before write permission is obtained by the RC machine.
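A software analogue of the AND-gate full-entry detection is sketched below. The 128-byte line size and the function names `entry_is_full` and `needs_data_fetch` are assumptions for illustration; the abstract describes hardware gates, not code.

```python
from typing import Sequence

CACHE_LINE_BYTES = 128  # assumed line size; not stated in the abstract


def entry_is_full(byte_enables: Sequence[bool]) -> bool:
    """Logical AND across all byte-enable bits of one store queue entry."""
    if len(byte_enables) != CACHE_LINE_BYTES:
        raise ValueError("expected one byte-enable bit per cache line byte")
    return all(byte_enables)


def needs_data_fetch(byte_enables: Sequence[bool], cache_hit: bool) -> bool:
    """Decide whether the old line contents must be retrieved.

    A full entry overwrites the whole line, so only write permission is
    required and no data transfer is needed, even on a miss.
    """
    if entry_is_full(byte_enables):
        return False
    return not cache_hit
```

A partially filled entry that misses in the cache still needs the existing line data so that the untouched bytes can be merged, which is why only full entries avoid the data bus operation.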


    Method to stall store operations to increase chances of gathering full entries for updating cachelines
    4.
    Invention application (status: lapsed)

    Publication No.: US20050251622A1

    Publication date: 2005-11-10

    Application No.: US10825188

    Filing date: 2004-04-15

    Abstract: A method and processor system that substantially enhances the store gathering capabilities of a store queue entry to enable gathering of a maximum number of proximate-in-time store operations before the entry is selected for dispatch. A counter is provided for each entry to track a time since a last gather to the entry. When a new gather does not occur before the counter reaches a threshold saturation point, the entry is signaled ready for dispatch. By defining an optimum threshold saturation point before the counter expires, sufficient time is provided for the entry to gather a proximate-in-time store operation. The entry may be deemed eligible for selection when certain conditions occur, including the entry becoming full, issuance of a barrier operation, and saturation of the counter. The use of the counter increases the ability of a store queue entry to complete gathering of enough store operations to update an entire cache line before that entry is dispatched to an RC machine.
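The per-entry gather counter and the three eligibility conditions named in the abstract can be modeled roughly as below. The class name, the cycle-based `tick` method, and the default line size and threshold are hypothetical; the abstract does not give concrete values.

```python
class GatheringEntry:
    """Hypothetical store queue entry with a saturating idle counter."""

    def __init__(self, line_bytes: int = 8, threshold: int = 4) -> None:
        self.byte_enables = [False] * line_bytes
        self.threshold = threshold   # assumed saturation point, in cycles
        self.idle_cycles = 0         # time since the last gather
        self.barrier_seen = False

    def gather(self, offset: int, length: int) -> None:
        # Merge a store into the entry and restart the idle counter.
        for i in range(offset, offset + length):
            self.byte_enables[i] = True
        self.idle_cycles = 0

    def tick(self) -> None:
        # Advance one cycle; the counter saturates at the threshold.
        if self.idle_cycles < self.threshold:
            self.idle_cycles += 1

    def barrier(self) -> None:
        self.barrier_seen = True

    def eligible(self) -> bool:
        # Eligible when full, when a barrier was issued, or when saturated.
        return (
            all(self.byte_enables)
            or self.barrier_seen
            or self.idle_cycles >= self.threshold
        )
```

Each gather resets the counter, so an entry that keeps receiving nearby stores stays in the queue until it fills, while an entry that sits idle for the threshold number of cycles is released for dispatch.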


    Processor, data processing system, and method for initializing a memory block in a data processing system having multiple coherency domains
    5.
    Invention application (status: in force)

    Publication No.: US20070226423A1

    Publication date: 2007-09-27

    Application No.: US11388001

    Filing date: 2006-03-23

    IPC classification: G06F13/28

    CPC classification: G06F12/0822 G06F12/084

    Abstract: A data processing system includes at least first and second coherency domains, each including at least one processor core and a memory. In response to an initialization operation by a processor core that indicates a target memory block to be initialized, a cache memory in the first coherency domain determines a coherency state of the target memory block with respect to the cache memory. In response to the determination, the cache memory selects a scope of broadcast of an initialization request identifying the target memory block. A narrower scope including the first coherency domain and excluding the second coherency domain is selected in response to a determination of a first coherency state, and a broader scope including the first coherency domain and the second coherency domain is selected in response to a determination of a second coherency state. The cache memory then broadcasts an initialization request with the selected scope. In response to the initialization request, the target memory block is initialized within a memory of the data processing system to an initialization value.


    Data processing system, method and interconnect fabric supporting multiple planes of processing nodes
    6.
    Invention application (status: in force)

    Publication No.: US20070081516A1

    Publication date: 2007-04-12

    Application No.: US11245887

    Filing date: 2005-10-07

    IPC classification: H04L12/28

    CPC classification: G06F15/16

    Abstract: A data processing system includes a first plane including a first plurality of processing nodes, each including multiple processing units, and a second plane including a second plurality of processing nodes, each including multiple processing units. The data processing system also includes a plurality of point-to-point first tier links. Each of the first plurality and second plurality of processing nodes includes one or more first tier links among the plurality of first tier links, where the first tier link(s) within each processing node connect a pair of processing units in the same processing node for communication. The data processing system further includes a plurality of point-to-point second tier links. At least a first of the plurality of second tier links connects processing units in different ones of the first plurality of processing nodes, at least a second of the plurality of second tier links connects processing units in different ones of the second plurality of processing nodes, and at least a third of the plurality of second tier links connects a processing unit in the first plane to a processing unit in the second plane.


    METHOD AND DATA PROCESSING SYSTEM FOR MICROPROCESSOR COMMUNICATION IN A CLUSTER-BASED MULTI-PROCESSOR SYSTEM
    7.
    Invention application (status: lapsed)

    Publication No.: US20080091918A1

    Publication date: 2008-04-17

    Application No.: US11952479

    Filing date: 2007-12-07

    IPC classification: G06F15/76 G06F9/02

    Abstract: A processor communication register (PCR) contained within a multiprocessor cluster system provides enhanced processor communication. The PCR stores information that is useful in pipelined or parallel multi-processing. Each processor cluster has exclusive rights to store to a sector within the PCR and has continuous access to read its contents. Each processor cluster updates its exclusive sector within the PCR, instantly allowing all of the other processors within the cluster network to see the change within the PCR data, and bypassing the cache subsystem. Efficiency is enhanced within the processor cluster network by allowing processor communications to be immediately networked and transferred to all processors without momentarily restricting access to the information or forcing all the processors to continually contend for the same cache line, thereby overwhelming the interconnect and memory system with an endless stream of load, store and invalidate commands.
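The access rule described for the PCR (one exclusive writer per sector, unrestricted reads of the whole register) can be sketched as follows. The class, its method names, and the use of `PermissionError` to model a rejected store are hypothetical; the real mechanism is a hardware register, not a Python object.

```python
from typing import List, Tuple


class ProcessorCommunicationRegister:
    """Hypothetical model: one sector per processor cluster."""

    def __init__(self, num_clusters: int) -> None:
        self._sectors: List[int] = [0] * num_clusters

    def store(self, cluster_id: int, sector: int, value: int) -> None:
        # Assumption: sector i belongs exclusively to cluster i.
        if cluster_id != sector:
            raise PermissionError(
                f"cluster {cluster_id} may not store to sector {sector}"
            )
        self._sectors[sector] = value

    def load(self) -> Tuple[int, ...]:
        # Any cluster may read the full register contents at any time.
        return tuple(self._sectors)
```

Because every sector has a single writer, readers never need to acquire ownership of a shared cache line, which is the contention the abstract says the PCR avoids.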


    System bus read data transfers with data ordering control bits
    8.
    Invention application (status: lapsed)

    Publication No.: US20050193174A1

    Publication date: 2005-09-01

    Application No.: US11041711

    Filing date: 2005-01-22

    IPC classification: G06F12/08 G06F12/00

    CPC classification: G06F12/0831

    Abstract: A method for informing a processor of a selected order of transmission of data to the processor. The method comprises the steps of coupling system components via a data bus to the processor to effectuate data transfer, determining at the system component logic the order in which to transmit data to the processor, and issuing to the data bus a selected order bit concurrent with the data, wherein the selected order bit alerts the processor to the order and the data is transmitted in that order. In a preferred embodiment, the system component is the cache and the method may involve receiving at the cache a preference of ordering for a read address/request from the processor. The preference order logic of the cache controller or a preference order logic component evaluates the preference of ordering desired by comparing the processor preference with other preferences, including cache order preference. One preference order is selected and the data is then retrieved from a cache line of the cache in the order selected.


    Efficient and flexible memory copy operation
    9.
    Invention application (status: lapsed)

    Publication No.: US20070150676A1

    Publication date: 2007-06-28

    Application No.: US11316663

    Filing date: 2005-12-22

    IPC classification: G06F12/16

    Abstract: A system, method, and computer program product for semi-synchronously copying data from a first portion of memory to a second portion of memory are disclosed. The method comprises receiving, in a processor, a call for a semi-synchronous memory copy operation. The semi-synchronous memory copy operation preserves temporal persistence of validity for a virtual source address corresponding to a source location in a memory and a virtual target address corresponding to a target location in the memory by setting a flag bit. The call includes at least the virtual source address, the virtual target address, and an indicator identifying a number of bytes to be copied. The memory copy operation is placed in a queue for execution by a memory controller. The queue is coupled to the memory controller. Execution of at least one subsequent instruction continues as the subsequent instruction becomes available from an instruction pipeline.


    Validity of address ranges used in semi-synchronous memory copy operations
    10.
    Invention application (status: in force)

    Publication No.: US20070150675A1

    Publication date: 2007-06-28

    Application No.: US11315757

    Filing date: 2005-12-22

    IPC classification: G06F12/16

    Abstract: A system, method, and a computer readable medium for protecting content of a memory page are disclosed. The method includes determining a start of a semi-synchronous memory copy operation. A range of addresses is determined where the semi-synchronous memory copy operation is being performed. An issued instruction that removes a page table entry is detected. The method further includes determining whether the issued instruction is destined to remove a page table entry associated with at least one address in the range of addresses. In response to the issued instruction being destined to remove the page table entry, the execution of the issued instruction is stalled until the semi-synchronous memory copy operation is completed.
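The stall decision in this abstract amounts to an address range overlap check, sketched below. The tracker class, its method names, and the choice to protect both the source and the target range are assumptions for illustration; the abstract only refers to "a range of addresses" where the copy is being performed.

```python
from typing import Dict, List, Tuple


class SemiSyncCopyTracker:
    """Hypothetical tracker of in-flight semi-synchronous copies."""

    def __init__(self) -> None:
        # copy id -> list of half-open [start, end) address ranges
        self._active: Dict[int, List[Tuple[int, int]]] = {}
        self._next_id = 0

    def start_copy(self, source: int, target: int, nbytes: int) -> int:
        copy_id = self._next_id
        self._next_id += 1
        # Assumption: both source and target ranges are protected.
        self._active[copy_id] = [
            (source, source + nbytes),
            (target, target + nbytes),
        ]
        return copy_id

    def complete_copy(self, copy_id: int) -> None:
        self._active.pop(copy_id, None)

    def must_stall_pte_removal(self, page_start: int, page_size: int) -> bool:
        """True if removing this page's table entry must wait for a copy."""
        page_end = page_start + page_size
        return any(
            start < page_end and page_start < end
            for ranges in self._active.values()
            for start, end in ranges
        )
```

An instruction that would remove the page table entry for an overlapping page is held until `complete_copy` is called for the copy in question; pages outside every active range are unaffected.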
