SYSTEM PERFORMANCE MANAGEMENT USING PRIORITIZED COMPUTE UNITS
    64.
    Invention application
    Status: under examination (published)

    Publication number: US20170004080A1

    Publication date: 2017-01-05

    Application number: US14755401

    Filing date: 2015-06-30

    CPC classification number: G06F12/084 G06F2212/1021

    Abstract: Methods, devices, and systems for managing performance of a processor having multiple compute units. An effective number of the multiple compute units may be determined to designate as having priority. On a condition that the effective number is nonzero, the effective number of the multiple compute units may each be designated as a priority compute unit. Priority compute units may have access to a shared cache whereas non-priority compute units may not. Workgroups may be preferentially dispatched to priority compute units. Memory access requests from priority compute units may be served ahead of requests from non-priority compute units.
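    The dispatch and memory-ordering behaviour described in this abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the `ComputeUnit` class, the per-unit `capacity`, the function names, and the rule of marking the first N units as priority are all assumptions made for illustration, and the abstract does not say how the effective number is chosen, so it is passed in as a parameter.

```python
from dataclasses import dataclass, field


@dataclass
class ComputeUnit:
    cu_id: int
    priority: bool = False
    capacity: int = 2  # hypothetical: max resident workgroups per compute unit
    workgroups: list = field(default_factory=list)

    def has_room(self) -> bool:
        return len(self.workgroups) < self.capacity


def designate_priority(cus, effective_number):
    """Mark the first `effective_number` units as priority; none if it is zero."""
    for index, cu in enumerate(cus):
        cu.priority = index < effective_number


def dispatch(cus, workgroup):
    """Place a workgroup on a priority unit with room, else on any unit with room."""
    # sorted() is stable, so priority units come first and ID order is kept.
    for cu in sorted(cus, key=lambda c: not c.priority):
        if cu.has_room():
            cu.workgroups.append(workgroup)
            return cu.cu_id
    return None  # every unit is full


def order_memory_requests(requests, cus):
    """Order (cu_id, payload) requests so priority units are served first."""
    is_priority = {cu.cu_id: cu.priority for cu in cus}
    # Stable sort: arrival order is preserved within each class.
    return sorted(requests, key=lambda request: not is_priority[request[0]])
```

    With four units and an effective number of 2, workgroups fill units 0 and 1 before spilling to unit 2, and queued requests from units 0 and 1 move ahead of the others. Shared-cache access control is left out of the sketch.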

    CONDITIONAL ATOMIC OPERATIONS AT A PROCESSOR
    65.
    Invention application
    Status: under examination (published)

    Publication number: US20160357551A1

    Publication date: 2016-12-08

    Application number: US14728643

    Filing date: 2015-06-02

    Abstract: A conditional fetch-and-phi operation tests a memory location to determine if the memory location stores a specified value and, if so, modifies the value at the memory location. The conditional fetch-and-phi operation can be implemented so that it can be concurrently executed by a plurality of concurrently executing threads, such as the threads of a wavefront at a GPU. To execute the conditional fetch-and-phi operation, one of the concurrently executing threads is selected to execute a compare-and-swap (CAS) operation at the memory location, while the other threads await the results. The CAS operation tests the value at the memory location and, if the CAS operation is successful, the value is passed to each of the concurrently executing threads.
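    The leader-plus-broadcast pattern in this abstract can be sketched in Python as a sequential simulation. This is an illustrative model only: `compare_and_swap` here is a plain function, not a hardware atomic, and the lowest-lane election rule, the function names, and the use of `None` to signal failure are assumptions not taken from the source.

```python
def compare_and_swap(memory, addr, expected, new):
    """Sequential stand-in for an atomic CAS: returns (succeeded, value_seen)."""
    seen = memory[addr]
    if seen == expected:
        memory[addr] = new
        return True, seen
    return False, seen


def conditional_fetch_and_phi(memory, addr, expected, phi, active_lanes):
    """One elected lane performs the CAS; every active lane receives the result.

    Returns (leader, results) where results maps lane id -> fetched value,
    or None for every lane if the location did not hold `expected`.
    """
    if not active_lanes:
        return None, {}
    leader = min(active_lanes)  # hypothetical election rule: lowest lane id
    # Only the leader touches memory; phi computes the replacement value.
    succeeded, seen = compare_and_swap(memory, addr, expected, phi(expected))
    # Broadcast step: the waiting lanes simply take the leader's result.
    result = seen if succeeded else None
    return leader, {lane: result for lane in active_lanes}
```

    Calling it twice with the same expected value shows the conditional part: the first call succeeds and updates memory, the second finds a different value and leaves memory untouched.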

    Creating SIMD efficient code by transferring register state through common memory
    66.
    Granted invention patent
    Status: in force

    Publication number: US09354892B2

    Publication date: 2016-05-31

    Application number: US13689421

    Filing date: 2012-11-29

    CPC classification number: G06F9/3887 G06F9/3851

    Abstract: Methods, media, and computing systems are provided. The method includes, the media are configured for, and the computing system includes a processor with control logic for allocating memory for storing a plurality of local register states for work items to be executed in single instruction multiple data hardware and for repacking wavefronts that include work items associated with a program instruction responsive to a conditional statement. The repacking is configured to create repacked wavefronts that include at least one of a wavefront containing work items that all pass the conditional statement and a wavefront containing work items that all fail the conditional statement.
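    The repacking idea in this abstract can be sketched in Python under simplifying assumptions. Work items are represented by integer IDs, a dictionary stands in for the common memory that holds each item's register state, and the function names and fixed wavefront `width` are hypothetical choices for illustration rather than details from the patent.

```python
def save_register_state(common_memory, work_item_id, registers):
    """Copy a work item's registers to common memory so any lane can resume it."""
    common_memory[work_item_id] = dict(registers)


def repack_wavefronts(wavefronts, condition, width):
    """Regroup work items so every new wavefront is all-pass or all-fail.

    Returns (passing_wavefronts, failing_wavefronts), each a list of
    wavefronts holding at most `width` work-item IDs.
    """
    passing, failing = [], []
    for wavefront in wavefronts:
        for item in wavefront:
            (passing if condition(item) else failing).append(item)

    def chunk(items):
        return [items[i:i + width] for i in range(0, len(items), width)]

    return chunk(passing), chunk(failing)
```

    Because register state lives in common memory keyed by work-item ID, an item can land in a different lane after repacking and still find its state. Two mixed wavefronts of four items each, half of which pass the condition, repack into one all-pass and one all-fail wavefront.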

    High level software execution mask override
    67.
    Granted invention patent
    Status: in force

    Publication number: US09317296B2

    Publication date: 2016-04-19

    Application number: US13725063

    Filing date: 2012-12-21

    CPC classification number: G06F9/3887 G06F9/30036

    Abstract: Methods, media, and computer systems are provided. The method includes, the media includes control logic for, and the computer system includes a processor with control logic for overriding an execution mask of SIMD hardware to enable at least one of a plurality of lanes of the SIMD hardware. Overriding the execution mask is responsive to a data parallel computation and a diverged control flow of a workgroup.

    MOVING DATA BETWEEN CACHES IN A HETEROGENEOUS PROCESSOR SYSTEM
    68.
    Invention application
    Status: in force

    Publication number: US20160041909A1

    Publication date: 2016-02-11

    Application number: US14452058

    Filing date: 2014-08-05

    Abstract: Apparatus, computer readable medium, integrated circuit, and method of moving a plurality of data items to a first cache or a second cache are presented. The method includes receiving an indication that the first cache requested the plurality of data items. The method includes storing information indicating that the first cache requested the plurality of data items. The information may include an address for each of the plurality of data items. The method includes determining based at least on the stored information to move the plurality of data items to the second cache. The method includes moving the plurality of data items to the second cache. The method may include determining a time interval between receiving the indication that the first cache requested the plurality of data items and moving the plurality of data items to the second cache. A scratch pad memory is disclosed.

    DATA PROCESSOR AND METHOD OF LANE REALIGNMENT
    69.
    Invention application
    Status: under examination (published)

    Publication number: US20150100758A1

    Publication date: 2015-04-09

    Application number: US14045114

    Filing date: 2013-10-03

    Abstract: A data processor includes a register file divided into at least a first portion and a second portion for storing data. A single instruction, multiple data (SIMD) unit is also divided into at least a first lane and a second lane. The first and second lanes of the SIMD unit correspond respectively to the first and second portions of the register file. Furthermore, each lane of the SIMD unit is capable of data processing. The data processor also includes a realignment element in communication with the register file and the SIMD unit. The realignment element is configured to selectively realign conveyance of data between the first portion of the register file and the first lane of the SIMD unit to the second lane of the SIMD unit.
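    The routing role of the realignment element can be sketched in Python as follows. This is a toy model, not the disclosed hardware: register-file portions are lists, every lane applies the same operation `op`, and the `realign` mapping from portion index to lane index is a hypothetical way to express the selective rerouting the abstract describes.

```python
def convey(register_portions, op, realign=None):
    """Feed each register-file portion to a SIMD lane and collect the results.

    By default portion i goes to lane i. `realign` maps a portion index to a
    different lane index, modelling the realignment element.
    Returns a list of (lane_used, result) pairs, one per portion.
    """
    realign = realign or {}
    results = []
    for portion_index, data in enumerate(register_portions):
        lane_index = realign.get(portion_index, portion_index)
        results.append((lane_index, op(data)))
    return results
```

    Passing `realign={0: 1}` routes the first portion's data to the second lane, while the computed values stay the same.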

    METHOD FOR MEMORY CONSISTENCY AMONG HETEROGENEOUS COMPUTER COMPONENTS
    70.
    Invention application
    Status: in force

    Publication number: US20140337587A1

    Publication date: 2014-11-13

    Application number: US14275271

    Filing date: 2014-05-12

    Abstract: A method, computer program product, and system are described that determine the correctness of using memory operations in a computing device with heterogeneous computer components. Embodiments include an optimizer based on the characteristics of a Sequential Consistency for Heterogeneous-Race-Free (SC for HRF) model that analyzes a program and determines the correctness of the ordering of events in the program. HRF models include combinations of the properties: scope order, scope inclusion, and scope transitivity. The optimizer can determine when a program is heterogeneous-race-free in accordance with an SC for HRF memory consistency model. For example, the optimizer can analyze a portion of program code, respect the properties of the SC for HRF model, and determine whether a value produced by a store memory event will be a candidate for a value observed by a load memory event. In addition, the optimizer can determine whether reordering of events is possible.
