Mechanism to increase thread parallelism in a graphics processor

Publication Number: US10552211B2

Publication Date: 2020-02-04

Application Number: US15255553

Filing Date: 2016-09-02

Abstract: A processing apparatus is described. The apparatus includes a plurality of execution threads having a first thread space configuration including a first plurality of rows of execution threads to process data in parallel, wherein each thread in a row is dependent on a top neighbor thread in a preceding row; partition logic to partition the plurality of execution threads into a plurality of banks, wherein each bank includes one or more of the first plurality of rows of execution threads; and transform logic to transform the first thread space configuration to a second thread space configuration including a second plurality of rows of execution threads to enable the plurality of execution threads in each of the plurality of banks to operate in parallel.
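The dependency structure the abstract describes — each thread depends only on its top neighbor, so each column of the thread space is an independent serial chain — can be illustrated with a small sketch. This is not the patented logic; it is a hypothetical remapping (with made-up function names) showing why reassigning whole dependency chains to banks, rather than banks of consecutive rows, removes cross-bank waits and lets banks run in parallel:

```python
# Hedged sketch, not the patented implementation: in a thread space where
# thread (r, c) depends only on (r-1, c), each column is an independent
# serial chain. Assigning whole chains to a bank yields zero cross-bank
# dependencies, so banks can execute concurrently.

def remap_to_bank_parallel(rows, cols, num_banks):
    """Return bank -> ordered list of (r, c) threads, one whole column
    chain per assignment, round-robin over banks."""
    banks = {b: [] for b in range(num_banks)}
    for c in range(cols):
        bank = c % num_banks          # whole chain lands in one bank
        for r in range(rows):         # chain order preserved inside the bank
            banks[bank].append((r, c))
    return banks

def cross_bank_dependencies(banks):
    """Count dependencies (r, c) -> (r-1, c) that cross a bank boundary."""
    owner = {pos: b for b, positions in banks.items() for pos in positions}
    return sum(1 for (r, c), b in owner.items()
               if r > 0 and owner[(r - 1, c)] != b)
```

Under this remapping every bank holds only complete chains, so no bank ever stalls waiting on a row owned by another bank — the property the abstract's second thread space configuration is meant to achieve.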

    EVENT-DRIVEN FRAMEWORK FOR GPU PROGRAMMING
Invention Application

Publication Number: US20180329762A1

Publication Date: 2018-11-15

Application Number: US15778109

Filing Date: 2015-12-25

    CPC classification number: G06F9/542 G06F9/44 G06F9/4411

    Abstract: Methods and apparatus relating to event-driven framework for GPU (Graphics Processing Unit) programming are described. In an embodiment, event-driven logic receives a signal that indicates detection of an event by a device. Memory stores information corresponding to a kernel that is to be associated with the event. The event-driven logic causes a Graphics Processing Unit (GPU) to execute the kernel to process one or more operations in response to the event. Other embodiments are also disclosed and claimed.
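The event-driven pattern in the abstract — a kernel is stored against an event, and the device's signal triggers its execution on the GPU — can be sketched as follows. The class and method names are invented for illustration, and the kernel call here merely stands in for a real GPU enqueue:

```python
# Hypothetical sketch of the event-driven dispatch pattern described in the
# abstract: register a kernel for an event, then dispatch it when a device
# signals that event. The plain function call simulates a GPU launch.

class EventDrivenDispatcher:
    def __init__(self):
        self._kernels = {}   # event name -> kernel callable (the "memory")
        self.log = []        # record of dispatched (event, result) pairs

    def register(self, event, kernel):
        """Associate `kernel` with `event` so a later signal can launch it."""
        self._kernels[event] = kernel

    def signal(self, event, payload):
        """Device reports `event`; run the associated kernel, if any."""
        kernel = self._kernels.get(event)
        if kernel is None:
            return None                 # no kernel registered for this event
        result = kernel(payload)        # stands in for enqueueing on the GPU
        self.log.append((event, result))
        return result
```

For example, registering a per-element kernel for a hypothetical "frame_ready" event and then signaling it dispatches the kernel with the event's payload, while signaling an unregistered event is a no-op.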

    Mechanism to Increase Thread Parallelism in a Graphics Processor

Publication Number: US20180067763A1

Publication Date: 2018-03-08

Application Number: US15255553

Filing Date: 2016-09-02

    CPC classification number: G06F9/4881 G06F9/52 G06T1/20 G06T2200/28

Abstract: A processing apparatus is described. The apparatus includes a plurality of execution threads having a first thread space configuration including a first plurality of rows of execution threads to process data in parallel, wherein each thread in a row is dependent on a top neighbor thread in a preceding row; partition logic to partition the plurality of execution threads into a plurality of banks, wherein each bank includes one or more of the first plurality of rows of execution threads; and transform logic to transform the first thread space configuration to a second thread space configuration including a second plurality of rows of execution threads to enable the plurality of execution threads in each of the plurality of banks to operate in parallel.
