Cache Management for Memory Operations
    1.
    发明申请
    Cache Management for Memory Operations 有权
    内存操作缓存管理

    公开(公告)号:US20130262775A1

    公开(公告)日:2013-10-03

    申请号:US13436767

    申请日:2012-03-30

    IPC分类号: G06F12/08

    摘要: Embodiments of the present invention provides for the execution of threads and/or workitems on multiple processors of a heterogeneous computing system in a manner that they can share data correctly and efficiently. Disclosed method, system, and article of manufacture embodiments include, responsive to an instruction from a sequence of instructions of a work-item, determining an ordering of visibility to other work-items of one or more other data items in relation to a particular data item, and performing at least one cache operation upon at least one of the particular data item or the other data items present in any one or more cache memories in accordance with the determined ordering. The semantics of the instruction includes a memory operation upon the particular data item.

    摘要翻译: 本发明的实施例提供在异构计算系统的多个处理器上执行线程和/或工作项,以使得它们可以正确且有效地共享数据。 公开的方法,系统和制品实施例包括响应于来自工作项目的指令序列的指令,确定与特定数据相关的一个或多个其他数据项的其他工作项的可见性的排序 并且根据所确定的顺序对存在于任何一个或多个高速缓存存储器中的特定数据项或其他数据项中的至少一个执行至少一个高速缓存操作。 指令的语义包括对特定数据项的存储器操作。

    Cache management for memory operations
    2.
    发明授权
    Cache management for memory operations 有权
    内存操作缓存管理

    公开(公告)号:US08935475B2

    公开(公告)日:2015-01-13

    申请号:US13436767

    申请日:2012-03-30

    IPC分类号: G06F12/02 G06F12/08 G06F12/12

    摘要: Embodiments of the present invention provides for the execution of threads and/or workitems on multiple processors of a heterogeneous computing system in a manner that they can share data correctly and efficiently. Disclosed method, system, and article of manufacture embodiments include, responsive to an instruction from a sequence of instructions of a work-item, determining an ordering of visibility to other work-items of one or more other data items in relation to a particular data item, and performing at least one cache operation upon at least one of the particular data item or the other data items present in any one or more cache memories in accordance with the determined ordering. The semantics of the instruction includes a memory operation upon the particular data item.

    摘要翻译: 本发明的实施例提供在异构计算系统的多个处理器上执行线程和/或工作项,以使得它们可以正确且有效地共享数据。 公开的方法,系统和制品实施例包括响应于来自工作项目的指令序列的指令,确定与特定数据相关的一个或多个其他数据项的其他工作项的可见性的排序 并且根据所确定的顺序对存在于任何一个或多个高速缓存存储器中的特定数据项或其他数据项中的至少一个执行至少一个高速缓存操作。 指令的语义包括对特定数据项的存储器操作。

    Visibility Ordering in a Memory Model for a Unified Computing System
    3.
    发明申请
    Visibility Ordering in a Memory Model for a Unified Computing System 有权
    在统一计算系统的内存模型中的可见性排序

    公开(公告)号:US20130263141A1

    公开(公告)日:2013-10-03

    申请号:US13588310

    申请日:2012-08-17

    IPC分类号: G06F9/46

    摘要: Provided is a method of permitting the reordering of a visibility order of operations in a computer arrangement configured for permitting a first processor and a second processor threads to access a shared memory. The method includes receiving in a program order, a first and a second operation in a first thread and permitting the reordering of the visibility order for the operations in the shared memory based on the class of each operation. The visibility order determines the visibility in the shared memory, by a second thread, of stored results from the execution of the first and second operations.

    摘要翻译: 提供了一种允许重新排序配置为允许第一处理器和第二处理器线程访问共享存储器的计算机配置中的操作的可见性顺序的方法。 该方法包括以程序顺序接收第一线程中的第一和第二操作,并且基于每个操作的类别允许对共享存储器中的操作的可见性顺序的重新排序。 可见性顺序确定共享存储器(第二个线程)中可执行第一和第二操作的存储结果的可见性。

    Visibility ordering in a memory model for a unified computing system
    4.
    发明授权
    Visibility ordering in a memory model for a unified computing system 有权
    在统一计算系统的内存模型中的可见性排序

    公开(公告)号:US08984511B2

    公开(公告)日:2015-03-17

    申请号:US13588310

    申请日:2012-08-17

    摘要: Provided is a method of permitting the reordering of a visibility order of operations in a computer arrangement configured for permitting a first processor and a second processor threads to access a shared memory. The method includes receiving in a program order, a first and a second operation in a first thread and permitting the reordering of the visibility order for the operations in the shared memory based on the class of each operation. The visibility order determines the visibility in the shared memory, by a second thread, of stored results from the execution of the first and second operations.

    摘要翻译: 提供了一种允许重新排序配置为允许第一处理器和第二处理器线程访问共享存储器的计算机配置中的操作的可见性顺序的方法。 该方法包括以程序顺序接收第一线程中的第一和第二操作,并且基于每个操作的类别允许对共享存储器中的操作的可见性顺序的重新排序。 可见性顺序确定共享存储器(第二个线程)中可执行第一和第二操作的存储结果的可见性。

    Managing coherent memory between an accelerated processing device and a central processing unit
    5.
    发明授权
    Managing coherent memory between an accelerated processing device and a central processing unit 有权
    管理加速处理设备和中央处理单元之间的连贯内存

    公开(公告)号:US09430391B2

    公开(公告)日:2016-08-30

    申请号:US13601126

    申请日:2012-08-31

    摘要: Existing multiprocessor computing systems often have insufficient memory coherency and, consequently, are unable to efficiently utilize separate memory systems. Specifically, a CPU cannot effectively write to a block of memory and then have a GPU access that memory unless there is explicit synchronization. In addition, because the GPU is forced to statically split memory locations between itself and the CPU, existing multiprocessor computing systems are unable to efficiently utilize the separate memory systems. Embodiments described herein overcome these deficiencies by receiving a notification within the GPU that the CPU has finished processing data that is stored in coherent memory, and invalidating data in the CPU caches that the GPU has finished processing from the coherent memory. Embodiments described herein also include dynamically partitioning a GPU memory into coherent memory and local memory through use of a probe filter.

    摘要翻译: 现有的多处理器计算系统通常具有不足的存储器一致性,因此不能有效地利用单独的存储器系统。 具体来说,CPU无法有效地写入内存块,然后除了有明确的同步之外,还可以对存储器进行GPU访问。 另外,由于GPU被迫静态分割其本身与CPU之间的存储器位置,所以现有的多处理器计算系统不能有效地利用单独的存储器系统。 本文所描述的实施例通过在GPU内接收到通知,CPU已经完成处理存储在相干存储器中的数据,并使CPU缓冲器中的数据无效,GPU已经从相干存储器完成处理来克服这些缺陷。 本文描述的实施例还包括通过使用探针滤波器来将GPU存储器动态地划分为相干存储器和本地存储器。

    Mapping Memory Instructions into a Shared Memory Address Place
    6.
    发明申请
    Mapping Memory Instructions into a Shared Memory Address Place 审中-公开
    将内存指令映射到共享内存地址

    公开(公告)号:US20130262814A1

    公开(公告)日:2013-10-03

    申请号:US13588790

    申请日:2012-08-17

    IPC分类号: G06F12/10

    摘要: Embodiments of the present invention provide a method of a first processor using a memory resource associated with a second processor. The method includes receiving a memory instruction from a first processor process, wherein the memory instruction refers to a shared memory address (SMA) that maps to a second processor memory. The method also includes mapping the SMA to the second processor memory, wherein the mapping produces a mapping result and providing the mapping result to the first processor.

    摘要翻译: 本发明的实施例提供了使用与第二处理器相关联的存储器资源的第一处理器的方法。 该方法包括从第一处理器处理接收存储器指令,其中存储器指令是指映射到第二处理器存储器的共享存储器地址(SMA)。 该方法还包括将SMA映射到第二处理器存储器,其中映射产生映射结果并将映射结果提供给第一处理器。

    Shared memory space in a unified memory model
    7.
    发明授权
    Shared memory space in a unified memory model 有权
    共享内存空间在统一的内存模型中

    公开(公告)号:US09009419B2

    公开(公告)日:2015-04-14

    申请号:US13562985

    申请日:2012-07-31

    IPC分类号: G06F12/02 G06F12/06

    摘要: Methods and systems are provided for mapping a memory instruction to a shared memory address space in a computer arrangement having a CPU and an APD. A method includes receiving a memory instruction that refers to an address in the shared memory address space, mapping the memory instruction based on the address to a memory resource associated with either the CPU or the APD, and performing the memory instruction based on the mapping.

    摘要翻译: 提供了用于将存储器指令映射到具有CPU和APD的计算机装置中的共享存储器地址空间的方法和系统。 一种方法包括接收参考共享存储器地址空间中的地址的存储器指令,将基于地址的存储器指令映射到与CPU或APD相关联的存储器资源,以及基于映射执行存储器指令。

    Writeback cancellation processing system for use in a packet switched
cache coherent multiprocessor system
    8.
    发明授权
    Writeback cancellation processing system for use in a packet switched cache coherent multiprocessor system 失效
    回写取消处理系统,用于分组交换高速缓存一致多处理器系统

    公开(公告)号:US5684977A

    公开(公告)日:1997-11-04

    申请号:US415040

    申请日:1995-03-31

    IPC分类号: G06F12/08

    CPC分类号: G06F12/0828 G06F12/0822

    摘要: A multiprocessor computer system is provided having a multiplicity of sub-systems and a main memory coupled to a system controller. An interconnect module, interconnects the main memory and sub-systems in accordance with interconnect control signals received from the system controller. At least two of the sub-systems are data processors, each having a respective cache memory that stores multiple blocks of data and a set of master cache tags (Etags), including one cache tag for each data block stored by the cache memory. Each data processor includes a master interface for sending memory transaction requests to the system controller. The system controller processes each memory transaction and maintains a set of duplicate cache tags (Dtags) for each data processor. Finally, the system controller contains transaction execution circuitry for activating a transaction for servicing by the interconnect. The transaction execution circuitry pipelines memory access requests from the data processors, and includes invalidation circuitry for processing each writeback request from a given data processor prior to activation to determine if the Dtag index corresponding to the victimized cache line is invalid. Thereafter, the invalidation circuitry activates writeback requests only if the Dtag index is not invalid and cancels the writeback request if the Dtag index is invalid.

    摘要翻译: 提供了具有多个子系统和耦合到系统控制器的主存储器的多处理器计算机系统。 互连模块根据从系统控制器接收的互连控制信号,互连主存储器和子系统。 至少两个子系统是数据处理器,每个数据处理器具有存储多个数据块的相应缓存存储器和一组主缓存标签(Etags),包括由高速缓存存储器存储的每个数据块的一个高速缓存标签。 每个数据处理器包括用于向系统控制器发送存储器事务请求的主接口。 系统控制器处理每个存储器事务,并为每个数据处理器维护一组重复的缓存标签(Dtags)。 最后,系统控制器包含用于激活交易以进行互连维修的事务执行电路。 交易执行电路管理来自数据处理器的存储器访问请求,并且包括用于在激活之前处理来自给定数据处理器的每个回写请求的无效电路,以确定与受害高速缓存行对应的Dtag索引是否无效。 此后,无效电路仅在Dtag索引无效时才激活写回请求,如果Dtag索引无效则取消写回请求。

    Fast, dual ported cache controller for data processors in a packet
switched cache coherent multiprocessor system
    9.
    发明授权
    Fast, dual ported cache controller for data processors in a packet switched cache coherent multiprocessor system 失效
    快速,双端口缓存控制器,用于数据包交换缓存一致多处理器系统中的数据处理器

    公开(公告)号:US5644753A

    公开(公告)日:1997-07-01

    申请号:US714965

    申请日:1996-09-17

    IPC分类号: G11C11/41 G06F12/08 G06F13/00

    摘要: A multiprocessor computer system has data processors and a main memory coupled to a system controller. Each data processor has a cache memory. Each cache memory has a cache controller with two ports for receiving access requests. A first port receives access requests from the associated data processor and a second port receives access requests from the system controller. All cache memory access requests include an address value; access requests from the system controller also include a mode flag. A comparator in the cache controller processes the address value in each access request and generates a hit/miss signal indicating whether the data block corresponding to the address value is stored in the cache memory. The cache controller has two modes of operation, including a first standard mode of operation in which read/write access to the cache memory is preceded by generation of the hit/miss signal by the comparator, and a second accelerated mode of operation in which read/write access to the cache memory is initiated without waiting for the comparator to process the access request's address value. The first mode of operation is used for all access requests by the data processor and for system controller access requests when the mode flag has a first value. The second mode of operation is used for the system controller access requests when the mode flag has a second value distinct from the first value.

    摘要翻译: 多处理器计算机系统具有耦合到系统控制器的数据处理器和主存储器。 每个数据处理器都有一个缓存存储器。 每个高速缓冲存储器具有一个具有两个用于接收访问请求的端口的缓存控制器。 第一端口从相关联的数据处理器接收访问请求,第二端口从系统控制器接收访问请求。 所有高速缓存存储器访问请求都包含一个地址值; 来自系统控制器的访问请求还包括模式标志。 高速缓存控制器中的比较器处理每个访问请求中的地址值,并产生指示与地址值相对应的数据块是否存储在高速缓冲存储器中的命中/未命中信号。 高速缓存控制器具有两种操作模式,包括第一标准操作模式,其中先前通过比较器生成命中/未命中信号,其中对高速缓冲存储器的读/写访问以及其中读取的第二加速操作模式 启动对高速缓冲存储器的写入访问,而不必等待比较器处理访问请求的地址值。 当模式标志具有第一个值时,第一种操作模式用于数据处理器和系统控制器访问请求的所有访问请求。 当模式标志具有与第一值不同的第二值时,第二操作模式用于系统控制器访问请求。