A DATA PROCESSING APPARATUS AND METHOD FOR PERFORMING LOCK-PROTECTED PROCESSING OPERATIONS FOR MULTIPLE THREADS

    Publication No.: US20170139757A1

    Publication date: 2017-05-18

    Application No.: US15322882

    Filing date: 2015-05-19

    Applicant: ARM LIMITED

    Abstract: A data processing apparatus and method are provided for executing a plurality of threads. Processing circuitry performs processing operations required by the plurality of threads, the processing operations including a lock-protected processing operation with which a lock is associated, where the lock needs to be acquired before the processing circuitry performs the lock-protected processing operation. Baton maintenance circuitry is used to maintain a baton in association with the plurality of threads, the baton forming a proxy for the lock, and the baton maintenance circuitry being configured to allocate the baton between the threads. Via communication between the processing circuitry and the baton maintenance circuitry, once the lock has been acquired for one of the threads, the processing circuitry performs the lock-protected processing operation for multiple threads before the lock is released, with the baton maintenance circuitry identifying a current thread amongst the multiple threads for which the lock-protected processing operation is to be performed by allocating the baton to that current thread. The baton can hence be passed from one thread to the next, without needing to release and re-acquire the lock. This provides a significant performance improvement when performing lock-protected processing operations across multiple threads.
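
The baton mechanism in this abstract can be illustrated with a small software analogy. The sketch below is not the patented circuitry: `BatonTable`, `request`, and `run_all` are hypothetical names, and the queue-based hand-off is an assumption made for illustration. It shows the lock being acquired once while the baton is passed from thread to thread.

```python
import threading
from collections import deque


class BatonTable:
    """Hypothetical software stand-in for the baton maintenance circuitry."""

    def __init__(self):
        self.lock = threading.Lock()  # the real lock guarding the operation
        self.waiting = deque()        # thread IDs queued for the baton
        self.holder = None            # thread ID currently holding the baton
        self.acquisitions = 0         # how many times the real lock was taken

    def request(self, tid):
        """Thread `tid` asks to run the lock-protected operation."""
        self.waiting.append(tid)

    def run_all(self, operation):
        """Acquire the lock once, then pass the baton from thread to thread."""
        if not self.waiting:
            return
        self.lock.acquire()
        self.acquisitions += 1
        try:
            while self.waiting:
                self.holder = self.waiting.popleft()  # allocate the baton
                operation(self.holder)                # lock-protected work
        finally:
            self.holder = None
            self.lock.release()                       # released only once


shared = {"counter": 0}
order = []


def increment(tid):
    # The lock-protected operation: update shared state on behalf of `tid`.
    shared["counter"] += 1
    order.append(tid)


table = BatonTable()
for tid in range(4):
    table.request(tid)
table.run_all(increment)
```

After the run, `table.acquisitions` is 1 even though four threads performed the protected operation, which is the saving the abstract describes.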

    2. CONFIGURABLE THREAD ORDERING FOR THROUGHPUT COMPUTING DEVICES

    Type: Invention application

    Legal status: Granted

    Publication No.: US20150160982A1

    Publication date: 2015-06-11

    Application No.: US14557935

    Filing date: 2014-12-02

    Applicant: ARM Limited

    CPC classification number: G06F9/5016 G06F9/3009 G06F9/4881 G06F9/5027

    Abstract: A data processing apparatus and a method of processing data are disclosed. Execution circuitry is configured to execute multiple threads to perform data processing on input data by reference to at least one coordinate value of points in a reference domain. Thread allocation circuitry is configured to specify a selected point in the reference domain for each of the multiple threads, allocating the data processing by giving each thread the at least one coordinate value of its specified point. Each thread accesses the input data with reference to its selected point in the reference domain, and the order in which points in the reference domain are allocated to threads for data processing is configurable in the thread allocation circuitry.
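
To make the idea of a configurable allocation order concrete, the sketch below maps thread indices to points of a two-dimensional reference domain under three orderings. The function name `allocate_points`, the ordering names, and the `tile` parameter are assumptions chosen for illustration; the abstract does not say which orderings the circuitry supports.

```python
def allocate_points(width, height, order="row_major", tile=2):
    """Return a list whose entry t is the (x, y) point given to thread t."""
    if order == "row_major":
        return [(x, y) for y in range(height) for x in range(width)]
    if order == "column_major":
        return [(x, y) for x in range(width) for y in range(height)]
    if order == "tiled":
        # Visit the domain tile by tile, row-major inside each tile.
        points = []
        for ty in range(0, height, tile):
            for tx in range(0, width, tile):
                for y in range(ty, min(ty + tile, height)):
                    for x in range(tx, min(tx + tile, width)):
                        points.append((x, y))
        return points
    raise ValueError(f"unknown order: {order}")


row_major = allocate_points(4, 2, "row_major")
tiled = allocate_points(4, 2, "tiled")
```

Every ordering covers the same set of points; only the thread-to-point assignment changes, which can affect how neighbouring threads share cached input data.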

    3. APPARATUS AND METHOD FOR EXECUTING A PLURALITY OF THREADS

    Type: Invention application

    Legal status: Under examination (published)

    Publication No.: US20160259668A1

    Publication date: 2016-09-08

    Application No.: US15058389

    Filing date: 2016-03-02

    Applicant: ARM LIMITED

    CPC classification number: G06F9/3851 G06F9/46 G06F15/16 G06T1/20

    Abstract: An apparatus and method are provided for executing a plurality of threads. The apparatus has processing circuitry arranged to execute the plurality of threads, with each thread executing a program to perform processing operations on thread data. Each thread has a thread identifier, and the thread data includes a value which is dependent on the thread identifier. Value generator circuitry is provided to perform a computation using the thread identifier of a chosen thread in order to generate the above-mentioned value for that chosen thread, and to make that value available to the processing circuitry for use by the processing circuitry when executing the chosen thread. Such an arrangement can give rise to significant performance benefits when executing the plurality of threads on the apparatus.
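
As a rough illustration of a value derived from the thread identifier, the sketch below computes a per-thread value outside the thread's own program and hands it in ready-made. The affine formula `base + thread_id * stride`, the default constants, and the names `generate_value` and `run_threads` are assumptions; the abstract does not specify the computation the value generator performs.

```python
def generate_value(thread_id, base=0x1000, stride=16):
    """Hypothetical value generator: an address derived from the thread ID."""
    return base + thread_id * stride


def run_threads(num_threads, program):
    """Run `program` once per thread, supplying the pre-computed value."""
    results = []
    for tid in range(num_threads):
        value = generate_value(tid)  # computed for the thread, not by it
        results.append(program(tid, value))
    return results


# Each "thread" simply reports the value it was given.
values = run_threads(3, lambda tid, value: value)
```

Because the computation sits outside `program`, the per-thread code needs no instructions of its own to derive the value, which is one plausible reading of the performance benefit the abstract mentions.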

    4. DATA PROCESSING APPARATUS FOR EXECUTING AN ACCESS INSTRUCTION FOR N THREADS

    Type: Invention application

    Legal status: Under examination (published)

    Publication No.: US20150261538A1

    Publication date: 2015-09-17

    Application No.: US14643018

    Filing date: 2015-03-10

    Applicant: ARM LIMITED

    CPC classification number: G06F9/3012 G06F9/30123 G06F9/3824 G06F9/3851

    Abstract: A data processing apparatus 10 for executing an access instruction for n threads in order to access data values for the n threads includes storage circuitry 100 that stores data values associated with the n threads in groups defined by storage boundaries. The data processing apparatus also includes processing circuitry 80 that processes the access instruction for a set of threads at a time (where each set of threads comprises fewer than n threads) and splitting circuitry 110, responsive to the access instruction, to divide the n threads into multiple sets of threads, and to generate at least one control signal identifying the multiple sets. For each of the sets, the processing circuitry responds to the at least one control signal by issuing at least one access request to the storage circuitry in order to access the data values for that set. The splitting circuitry determines into which set each of the n threads is allocated having regard to the storage boundaries.
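
One way to picture boundary-aware splitting is to group threads by the storage line their address falls in before cutting them into sets. The sketch below does this under stated assumptions: `split_threads`, `line_size`, and `set_size` are hypothetical names, and grouping by `address // line_size` is an illustrative policy, not the splitting rule claimed in the patent.

```python
from collections import defaultdict


def split_threads(addresses, line_size, set_size):
    """Split thread indices into sets of at most `set_size` threads,
    keeping threads whose addresses share a storage line together."""
    by_line = defaultdict(list)
    for tid, addr in enumerate(addresses):
        by_line[addr // line_size].append(tid)  # group by storage boundary

    sets = []
    for line in sorted(by_line):
        tids = by_line[line]
        # A line with more than `set_size` threads spills into extra sets.
        for i in range(0, len(tids), set_size):
            sets.append(tids[i:i + set_size])
    return sets


# Eight threads; threads 0, 1, 4, 5 hit line 0 and threads 2, 3, 6, 7 hit line 1.
addresses = [0, 4, 64, 68, 8, 12, 72, 76]
sets = split_threads(addresses, line_size=64, set_size=4)
```

With a naive split into threads 0 to 3 and 4 to 7, each set would touch both lines; here each set touches exactly one, so one access request per set suffices in this example.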
