RECONFIGURABLE VIRTUAL GRAPHICS AND COMPUTE PROCESSOR PIPELINE

    公开(公告)号:US20180114290A1

    公开(公告)日:2018-04-26

    申请号:US15331278

    申请日:2016-10-21

    Abstract: A graphics processing unit (GPU) includes a plurality of programmable processing cores configured to process graphics primitives and corresponding data and a plurality of fixed-function hardware units. The plurality of processing cores and the plurality of fixed-function hardware units are configured to implement a configurable number of virtual pipelines to concurrently process different command flows. Each virtual pipeline includes a configurable number of fragments and an operational state of each virtual pipeline is specified by a different context. The configurable number of virtual pipelines can be modified from a first number to a second number that is different than the first number. An emulation of a fixed-function hardware unit can be instantiated on one or more of the graphics processing cores in response to detection of a bottleneck in a fixed-function hardware unit. One or more of the virtual pipelines can then be reconfigured to utilize the emulation instead of the fixed-function hardware unit.

    Method and system for synchronization of workitems with divergent control flow
    42.
    发明授权
    Method and system for synchronization of workitems with divergent control flow 有权
    工作单元与发散控制流同步的方法和系统

    公开(公告)号:US09424099B2

    公开(公告)日:2016-08-23

    申请号:US13672291

    申请日:2012-11-08

    CPC classification number: G06F9/52 G06F9/522

    Abstract: Disclosed methods, systems, and computer program products embodiments include synchronizing a group of workitems on a processor by storing a respective program counter associated with each of the workitems, selecting at least one first workitem from the group for execution, and executing the selected at least one first workitem on the processor. The selecting is based upon the respective stored program counter associated with the at least one first workitem.

    Abstract translation: 公开的方法,系统和计算机程序产品实施例包括通过存储与每个工作项相关联的相应程序计数器来同步处理器上的一组工作项,从组中选择至少一个第一工作以供执行,以及执行所选择的至少 处理器上的第一个工作。 所述选择基于与所述至少一个第一工作项相关联的相应存储的程序计数器。

    ACCELERATION UNIT WITH MODULAR ARCHITECTURE

    公开(公告)号:US20250117352A1

    公开(公告)日:2025-04-10

    申请号:US18910202

    申请日:2024-10-09

    Abstract: A processing system includes one or more accelerator units (AUs) each having a modular architecture. To this end, each AU includes a connection circuitry and one or more memory stacks disposed on the connection circuitry. Further, each AU includes one or more interposer dies each disposed on the connection circuitry such that each interposer die of the one or more interposer dies is communicatively coupled to a corresponding memory stack of the memory stacks via the connection circuitry. Further, each interposer die of each AU includes circuitry configured to concurrently support two or more types of compute dies.

    Spatial partitioning in a multi-tenancy graphics processing unit

    公开(公告)号:US12205218B2

    公开(公告)日:2025-01-21

    申请号:US17706811

    申请日:2022-03-29

    Abstract: A graphics processing unit (GPU) or other apparatus includes a plurality of shader engines. The apparatus also includes a first front end (FE) circuit and one or more second FE circuits. The first FE circuit is configured to schedule geometry workloads for the plurality of shader engines in a first mode. The first FE circuit is configured to schedule geometry workloads for a first subset of the plurality of shader engines and the one or more second FE circuits are configured to schedule geometry workloads for a second subset of the plurality of shader engines in a second mode. In some cases, a partition switch is configured to selectively connect the first FE circuit or the one or more second FE circuits to the second subset of the plurality of shader engines depending on whether the apparatus is in the first mode or the second mode.

    FUSED MULTIMODAL FRAMEWORK FOR NON-PLAYER CHARACTER GENERATION AND CONFIGURATION

    公开(公告)号:US20240424398A1

    公开(公告)日:2024-12-26

    申请号:US18748920

    申请日:2024-06-20

    Abstract: Systems and techniques for generating and animating non-player characters (NPCs) within virtual digital environments are provided. Multimodal input data is received that comprises a plurality of input modalities for interaction with an NPC having a set of body features and a set of facial features. The multimodal input data is processed through one or more neural networks to generate animation sequences for both the body features and facial features of the NPC. Generating such animation sequences includes disentangling the multimodal input data to generate substantially disentangled latent representations, combining these representations with the multimodal input data, and using a large-language model (LLM) to generate speech data for the NPC. Further processing using reverse diffusion generates face vertex displacement data and joint trajectory data based on the combined representation and generated speech data. The face vertex displacement data, joint trajectory data, and speech data are used to produce an animated representation of the NPC, which is then provided to environment-specific adapters to animate the NPC within a virtual digital environment.

    Sparse matrix-vector multiplication

    公开(公告)号:US11995149B2

    公开(公告)日:2024-05-28

    申请号:US17125457

    申请日:2020-12-17

    CPC classification number: G06F17/16 G06F9/30098 G06F15/80

    Abstract: A processing system includes a first set and a second set of general-purpose registers (GPRs) and memory access circuitry that fetches nonzero values of a sparse matrix into consecutive slots in the first set. The memory access circuitry also fetches values of an expanded matrix into consecutive slots in the second set of GPRs. The expanded matrix is formed based on values of a vector and locations of the nonzero values in the sparse matrix. The processing system also includes a set of multipliers that concurrently perform multiplication of the nonzero values in slots of the first set of GPRs with the values of the vector in corresponding slots of the second set. Reduced sum circuitry accumulates results from the set of multipliers for rows of the sparse matrix.

    Dynamically adaptable arrays for vector and matrix operations

    公开(公告)号:US11409840B2

    公开(公告)日:2022-08-09

    申请号:US17032314

    申请日:2020-09-25

    Abstract: An array processor includes processor element arrays distributed in rows and columns. The processor element arrays perform operations on parameter values. The array processor also includes memory interfaces that are dynamically mapped to mutually exclusive subsets of the rows and columns of the processor element arrays based on dimensions of matrices that provide the parameter values to the processor element arrays. In some cases, the processor element arrays are vector arithmetic logic unit (ALU) processors and the memory interfaces are direct memory access (DMA) engines. The rows of the processor element arrays in the subsets are mutually exclusive to the rows in the other subsets and the columns of the processor element arrays in the subsets are mutually exclusive to the columns in the other subsets. The matrices can be symmetric or asymmetric, e.g., one of the matrices can be a vector having a single column.

Patent Agency Ranking