Credit-based streaming multiprocessor warp scheduling
    1.
    Granted invention patent
    Legal status: In force

    Publication No.: US09189242B2

    Publication date: 2015-11-17

    Application No.: US12885299

    Filing date: 2010-09-17

    IPC classification: G06F9/50 G06F9/38

    Abstract: One embodiment of the present invention sets forth a technique for ensuring cache access instructions are scheduled for execution in a multi-threaded system to improve cache locality and system performance. A credit-based technique may be used to control instruction-by-instruction scheduling for each warp in a group so that the group of warps is processed uniformly. A credit is computed for each warp and the credit contributes to a weight for each warp. The weight is used to select instructions for the warps that are issued for execution.
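
    The credit-and-weight selection described in this abstract can be illustrated with a minimal Python sketch. The abstract does not say how a credit is computed or how it enters the weight, so everything below is an assumption made for illustration: the names Warp, compute_credit, compute_weight, select_warp and issue are hypothetical, the credit is taken to be how far a warp lags behind the most advanced warp in its group, and ties go to the lowest warp id. It is not the patented method.

```python
from dataclasses import dataclass


@dataclass
class Warp:
    warp_id: int
    issued: int = 0     # instructions issued so far
    ready: bool = True  # whether the next instruction can issue this cycle


def compute_credit(warp, warps):
    """Assumed credit: how far this warp lags the most advanced warp."""
    return max(w.issued for w in warps) - warp.issued


def compute_weight(warp, warps, base_priority=0):
    """The credit contributes to the weight; base_priority stands in
    for any other (unspecified) terms of the weight."""
    return base_priority + compute_credit(warp, warps)


def select_warp(warps):
    """Pick the ready warp with the highest weight (ties: lowest id)."""
    ready = [w for w in warps if w.ready]
    if not ready:
        return None
    return max(ready, key=lambda w: (compute_weight(w, warps), -w.warp_id))


def issue(warps, cycles):
    """Issue one instruction per cycle; return the order of warp ids."""
    order = []
    for _ in range(cycles):
        warp = select_warp(warps)
        if warp is None:
            break
        warp.issued += 1
        order.append(warp.warp_id)
    return order
```

    Under these assumptions a lagging warp accumulates credit and is favored as soon as it becomes ready, which keeps the warps of a group progressing at a similar rate.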

    Credit-Based Streaming Multiprocessor Warp Scheduling
    2.
    Invention application
    Legal status: In force

    Publication No.: US20110072244A1

    Publication date: 2011-03-24

    Application No.: US12885299

    Filing date: 2010-09-17

    IPC classification: G06F9/38 G06F9/312

    Abstract: One embodiment of the present invention sets forth a technique for ensuring cache access instructions are scheduled for execution in a multi-threaded system to improve cache locality and system performance. A credit-based technique may be used to control instruction-by-instruction scheduling for each warp in a group so that the group of warps is processed uniformly. A credit is computed for each warp and the credit contributes to a weight for each warp. The weight is used to select instructions for the warps that are issued for execution.

    Operand collector architecture
    4.
    Granted invention patent
    Legal status: In force

    Publication No.: US07834881B2

    Publication date: 2010-11-16

    Application No.: US11555649

    Filing date: 2006-11-01

    IPC classification: G09G5/36 G09G5/39 G06F15/80

    Abstract: An apparatus and method for simulating a multi-ported memory using lower port count memories as banks. Collector units gather source operands from the banks as needed to process program instructions. The collector units also gather constants that are used as operands. When all of the source operands needed to process a program instruction have been gathered, a collector unit outputs the source operands to an execution unit while avoiding writeback conflicts to registers specified by the program instruction that may be accessed by other execution units.
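
    A small Python simulation may help picture how collector units can emulate a multi-ported register file on top of single-ported banks. It is only a sketch under stated assumptions: CollectorUnit and simulate are hypothetical names, registers are mapped to banks by reg % num_banks, each bank serves one read per cycle, and collectors are served in list order. Constants and the writeback-conflict avoidance mentioned in the abstract are not modeled.

```python
class CollectorUnit:
    """Gathers the source operands of one instruction (illustrative)."""

    def __init__(self, name, src_regs):
        self.name = name
        self.pending = list(src_regs)  # source registers still to be read
        self.operands = {}             # register index -> value read

    def is_complete(self):
        return not self.pending


def simulate(collectors, regfile, num_banks):
    """Run until every collector has all operands.

    Returns a list of (cycle, collector name) dispatch events.
    """
    dispatched = []
    active = list(collectors)
    cycle = 0
    while active:
        cycle += 1
        busy_banks = set()  # banks already read during this cycle
        for cu in active:
            for reg in list(cu.pending):
                bank = reg % num_banks  # assumed register-to-bank mapping
                if bank not in busy_banks:
                    busy_banks.add(bank)
                    cu.operands[reg] = regfile[reg]
                    cu.pending.remove(reg)
        remaining = []
        for cu in active:
            if cu.is_complete():
                # all source operands gathered: hand off to an execution unit
                dispatched.append((cycle, cu.name))
            else:
                remaining.append(cu)
        active = remaining
    return dispatched
```

    In this model an instruction whose operands sit in different banks is dispatched after one cycle, while operands that collide on a bank are read over successive cycles.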

    Programmable graphics processor for multithreaded execution of programs
    5.
    Granted invention patent
    Legal status: In force

    Publication No.: US08405665B2

    Publication date: 2013-03-26

    Application No.: US13466043

    Filing date: 2012-05-07

    CPC classification: G06T15/005

    Abstract: A processing unit includes multiple execution pipelines, each of which is coupled to a first input section for receiving input data for pixel processing and a second input section for receiving input data for vertex processing and to a first output section for storing processed pixel data and a second output section for storing processed vertex data. The processed vertex data is rasterized and scan converted into pixel data that is used as the input data for pixel processing. The processed pixel data is output to a raster analyzer.

    PROGRAMMABLE GRAPHICS PROCESSOR FOR MULTITHREADED EXECUTION OF PROGRAMS
    6.
    Invention application
    Legal status: In force

    Publication No.: US20120218267A1

    Publication date: 2012-08-30

    Application No.: US13466043

    Filing date: 2012-05-07

    IPC classification: G06T17/20 G06T1/20 G06T1/00

    CPC classification: G06T15/005

    Abstract: A processing unit includes multiple execution pipelines, each of which is coupled to a first input section for receiving input data for pixel processing and a second input section for receiving input data for vertex processing and to a first output section for storing processed pixel data and a second output section for storing processed vertex data. The processed vertex data is rasterized and scan converted into pixel data that is used as the input data for pixel processing. The processed pixel data is output to a raster analyzer.

    OPERAND COLLECTOR ARCHITECTURE
    7.
    Invention application
    Legal status: In force

    Publication No.: US20080109611A1

    Publication date: 2008-05-08

    Application No.: US11555649

    Filing date: 2006-11-01

    IPC classification: G06F13/00

    Abstract: An apparatus and method for simulating a multi-ported memory using lower port count memories as banks. Collector units gather source operands from the banks as needed to process program instructions. The collector units also gather constants that are used as operands. When all of the source operands needed to process a program instruction have been gathered, a collector unit outputs the source operands to an execution unit while avoiding writeback conflicts to registers specified by the program instruction that may be accessed by other execution units.

    A Programmable Graphics Processor For Multithreaded Execution of Programs
    8.
    Invention application
    Legal status: In force

    Publication No.: US20080024506A1

    Publication date: 2008-01-31

    Application No.: US11458633

    Filing date: 2006-07-19

    IPC classification: G06T1/20

    Abstract: A processing unit includes multiple execution pipelines, each of which is coupled to a first input section for receiving input data for pixel processing and a second input section for receiving input data for vertex processing and to a first output section for storing processed pixel data and a second output section for storing processed vertex data. The processed vertex data is rasterized and scan converted into pixel data that is used as the input data for pixel processing. The processed pixel data is output to a raster analyzer.

    Programmable graphics processor for multithreaded execution of programs
    9.
    Granted invention patent
    Legal status: In force

    Publication No.: US08174531B1

    Publication date: 2012-05-08

    Application No.: US12649201

    Filing date: 2009-12-29

    CPC classification: G06T15/005

    Abstract: A processing unit includes multiple execution pipelines, each of which is coupled to a first input section for receiving input data for pixel processing and a second input section for receiving input data for vertex processing and to a first output section for storing processed pixel data and a second output section for storing processed vertex data. The processed vertex data is rasterized and scan converted into pixel data that is used as the input data for pixel processing. The processed pixel data is output to a raster analyzer.

    Shared single-access memory with management of multiple parallel requests
    10.
    Granted invention patent
    Legal status: In force

    Publication No.: US08645638B2

    Publication date: 2014-02-04

    Application No.: US13466057

    Filing date: 2012-05-07

    IPC classification: G06F12/00 G06F13/00

    CPC classification: G06F12/084 Y02D10/13

    Abstract: A memory is used by concurrent threads in a multithreaded processor. Any addressable storage location is accessible by any of the concurrent threads, but only one location at a time is accessible. The memory is coupled to parallel processing engines that generate a group of parallel memory access requests, each specifying a target address that might be the same or different for different requests. Serialization logic selects one of the target addresses and determines which of the requests specify the selected target address. All such requests are allowed to proceed in parallel, while other requests are deferred. Deferred requests may be regenerated and processed through the serialization logic so that a group of requests can be satisfied by accessing each different target address in the group exactly once.
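
    The serialization behavior described in this abstract can be sketched in a few lines of Python for the read case. The function name serialize_reads and the selection policy (take the address of the first pending request) are assumptions for illustration; the abstract does not state how the target address is chosen, and the dictionary used as memory is only a stand-in for the single-access memory.

```python
def serialize_reads(memory, requests):
    """Serve a group of parallel read requests on a single-access memory.

    requests: list of (thread_id, address) pairs.
    Returns (results, passes): results maps thread_id to the value read,
    and passes lists the address accessed on each pass.
    """
    results = {}
    passes = []
    pending = list(requests)
    while pending:
        # Assumed policy: select the address of the first pending request.
        selected = pending[0][1]
        value = memory[selected]  # one memory access per pass
        deferred = []
        for thread_id, address in pending:
            if address == selected:
                results[thread_id] = value  # proceeds in parallel
            else:
                deferred.append((thread_id, address))  # retried next pass
        passes.append(selected)
        pending = deferred
    return results, passes
```

    Because every request matching the selected address is satisfied in the same pass, the number of passes equals the number of distinct addresses in the group.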
