PERFORMANCE OF ACCESSES FROM MULTIPLE PROCESSORS TO A SAME MEMORY LOCATION
    1.
    Patent application
    Status: In force

    Publication No.: US20150032970A1

    Publication date: 2015-01-29

    Application No.: US13949434

    Filing date: 2013-07-24

    Applicant: Arm Limited

    Abstract: A processing apparatus comprising: several processors for processing data; a hierarchical memory system comprising a memory accessible to all the processors, and several caches corresponding to each of the processors, each of the caches being accessible to the corresponding processor and comprising storage locations and corresponding indicators. There is also cache coherency control circuitry for maintaining coherency of data stored in the hierarchical memory system. The processors are configured to respond to receipt of a predefined request to perform an operation on a data item to determine if the cache corresponding to the processor receiving the request has a storage location allocated to the data item. If not, the processing apparatus is configured to: allocate a storage location within the cache to the data item, set the indicator corresponding to the storage location to indicate that the storage location is storing a delta value, set data in the allocated storage location to an initial value. The processor is configured in response to the predefined request to perform the operation on data within the storage location allocated to the data item.
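
    The allocate-on-miss behaviour in this abstract can be sketched roughly as follows. This is a minimal illustration, not the patented design: it assumes the operation is an integer addition, that the initial value is 0 (the identity for addition), and that per-cache deltas are later folded back into shared memory. The names `CacheLine`, `Cache`, and `merge_deltas` are hypothetical, and the merge step is not described in the abstract.

```python
from dataclasses import dataclass, field


@dataclass
class CacheLine:
    value: int
    is_delta: bool  # the per-location "indicator" from the abstract


@dataclass
class Cache:
    lines: dict = field(default_factory=dict)  # address -> CacheLine

    def add(self, addr: int, amount: int) -> None:
        """Handle the predefined request (assumed here to be an add)."""
        line = self.lines.get(addr)
        if line is None:
            # Miss: allocate a location, mark it as holding a delta,
            # and set it to the initial value (assumed to be 0).
            line = CacheLine(value=0, is_delta=True)
            self.lines[addr] = line
        # Perform the operation on the locally held data only.
        line.value += amount


def merge_deltas(memory: dict, caches: list, addr: int) -> int:
    """Assumed reconciliation step: fold every cache's delta into memory."""
    for cache in caches:
        line = cache.lines.pop(addr, None)
        if line is not None and line.is_delta:
            memory[addr] = memory.get(addr, 0) + line.value
    return memory.get(addr, 0)
```

    In this sketch two processors can each accumulate into their own cache without exchanging ownership of the location; the shared value is only touched when the deltas are merged.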

    DATA PROCESSING SYSTEMS
    2.
    Patent application

    Publication No.: US20170206698A1

    Publication date: 2017-07-20

    Application No.: US15401639

    Filing date: 2017-01-09

    Applicant: ARM Limited

    CPC classification number: G06T15/005 G06F8/41 G06F8/454 G06F8/458 G06T15/80

    Abstract: A graphics processing unit comprises a programmable execution unit executing graphics processing programs for execution threads to perform graphics processing operations, a local register memory comprising one or more registers, where registers of the register memory are assignable to store data associated with an individual execution thread that is being executed by the execution unit, and where the register(s) assigned to an individual execution thread are accessible only to that associated individual execution thread, and a further local memory that is operable to store data for use in common by plural execution threads, where the data stored in the further local memory is accessible to plural execution threads as they execute. The programmable execution unit is operable to selectively store output data for an execution thread in a register(s) of the local register memory assigned to the execution thread, and the further local memory.
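
    The two storage targets named in this abstract can be modelled in a few lines. This is a rough software analogy under stated assumptions, not the hardware described in the application: `ExecutionUnit`, `store`, `load`, and the `common` flag are hypothetical names, and registers are modelled as a per-thread dictionary.

```python
class ExecutionUnit:
    def __init__(self):
        self.registers = {}  # thread_id -> {name: value}, private per thread
        self.shared = {}     # further local memory, common to all threads

    def store(self, thread_id, name, value, common=False):
        """Selectively store a thread's output in its registers or in shared memory."""
        if common:
            self.shared[name] = value
        else:
            self.registers.setdefault(thread_id, {})[name] = value

    def load(self, thread_id, name):
        """A thread sees only its own registers, plus the shared local memory."""
        own = self.registers.get(thread_id, {})
        if name in own:
            return own[name]
        return self.shared[name]  # raises KeyError if not stored anywhere visible
```

    A value stored with `common=True` by one thread is readable by every other thread, while a register value stays invisible outside the thread that owns it.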

    Data processing systems
    3.
    Granted patent

    Publication No.: US10725784B2

    Publication date: 2020-07-28

    Application No.: US15197666

    Filing date: 2016-06-29

    Applicant: ARM Limited

    Abstract: A data processing system has an execution pipeline with programmable execution stages which execute instructions to perform data processing operations provided by a host processor and in which execution threads are grouped together into groups in which the threads are executed in lockstep. The system also includes a compiler that compiles programs to generate instructions for the execution stages. The compiler is configured to, for an operation that comprises a memory transaction: issue to the execution stage instructions for executing the operation for the thread group to: perform the operation for the thread group as a whole; and provide the result of the operation to all the active threads of the group. At least one execution stage is configured to, in response to the instructions: perform the operation for the thread group as a whole; and provide the result of the operation to all the active threads of the group.
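
    The execution-side behaviour in this abstract (one operation for the whole thread group, result handed to every active thread) can be illustrated as below. This is a sketch only: it assumes the memory transaction is a load from a single address, and `CountingMemory`, `group_load`, and `active_mask` are hypothetical names. The compiler side of the claim is not modelled.

```python
class CountingMemory:
    """Toy memory that counts how many transactions it serves."""

    def __init__(self, data):
        self.data = dict(data)
        self.transactions = 0

    def load(self, addr):
        self.transactions += 1
        return self.data[addr]


def group_load(memory, addr, active_mask):
    """Perform one load for the thread group and broadcast it to active threads."""
    value = memory.load(addr)  # a single transaction for the group as a whole
    # Inactive threads receive nothing (modelled here as None).
    return [value if active else None for active in active_mask]
```

    With four threads in lockstep, of which three are active, the memory in this sketch serves one transaction instead of three.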

    DATA PROCESSING SYSTEMS
    4.
    Patent application
    Status: Under examination (published)

    Publication No.: US20170003972A1

    Publication date: 2017-01-05

    Application No.: US15197666

    Filing date: 2016-06-29

    Applicant: ARM Limited

    Abstract: A data processing system has an execution pipeline with programmable execution stages which execute instructions to perform data processing operations provided by a host processor and in which execution threads are grouped together into groups in which the threads are executed in lockstep. The system also includes a compiler that compiles programs to generate instructions for the execution stages. The compiler is configured to, for an operation that comprises a memory transaction: issue to the execution stage instructions for executing the operation for the thread group to: perform the operation for the thread group as a whole; and provide the result of the operation to all the active threads of the group. At least one execution stage is configured to, in response to the instructions: perform the operation for the thread group as a whole; and provide the result of the operation to all the active threads of the group.

    Data processing systems
    5.
    Granted patent

    Publication No.: US10115222B2

    Publication date: 2018-10-30

    Application No.: US15401639

    Filing date: 2017-01-09

    Applicant: ARM Limited

    Abstract: A graphics processing unit comprises a programmable execution unit executing graphics processing programs for execution threads to perform graphics processing operations, a local register memory comprising one or more registers, where registers of the register memory are assignable to store data associated with an individual execution thread that is being executed by the execution unit, and where the register(s) assigned to an individual execution thread are accessible only to that associated individual execution thread, and a further local memory that is operable to store data for use in common by plural execution threads, where the data stored in the further local memory is accessible to plural execution threads as they execute. The programmable execution unit is operable to selectively store output data for an execution thread in a register(s) of the local register memory assigned to the execution thread, and the further local memory.

    Performance of accesses from multiple processors to a same memory location
    6.
    Granted patent
    Status: In force

    Publication No.: US09146870B2

    Publication date: 2015-09-29

    Application No.: US13949434

    Filing date: 2013-07-24

    Applicant: ARM Limited

    Abstract: A processing apparatus comprising: several processors for processing data; a hierarchical memory system comprising a memory accessible to all the processors, and several caches corresponding to each of the processors, each of the caches being accessible to the corresponding processor and comprising storage locations and corresponding indicators. There is also cache coherency control circuitry for maintaining coherency of data stored in the hierarchical memory system. The processors are configured to respond to receipt of a predefined request to perform an operation on a data item to determine if the cache corresponding to the processor receiving the request has a storage location allocated to the data item. If not, the processing apparatus is configured to: allocate a storage location within the cache to the data item, set the indicator corresponding to the storage location to indicate that the storage location is storing a delta value, set data in the allocated storage location to an initial value. The processor is configured in response to the predefined request to perform the operation on data within the storage location allocated to the data item.
