THREAD MODIFICATION TO REDUCE COMMAND CONVERSION LATENCY

    Publication No.: US20210382765A1

    Publication date: 2021-12-09

    Application No.: US16896031

    Filing date: 2020-06-08

    Abstract: Examples described herein relate to a graphics processing apparatus that includes a memory device; and a central processing unit (CPU). In some examples, the CPU is configured to: execute a producer to issue graphics command application program interfaces (APIs); execute a driver to translate graphics command APIs into executable instructions; and based on an idle state of the producer, execute a command translation code segment of the producer to translate graphics command APIs into executable instructions. In some examples, the execution unit is coupled to the memory device, the execution unit to execute one or more of the executable instructions. In some examples, the producer includes multiple portions such as application code, graphics pipeline runtime code, and command translation code segment.
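    Illustrative sketch: the idle-producer idea in the abstract above can be modeled, under simplifying assumptions, as two actors draining one queue of untranslated API calls. The names `Pipeline`, `translate`, `driver_step`, and `producer_idle_step` are hypothetical and do not come from the patent; real threads and scheduling are replaced by explicit, deterministic steps.

```python
from collections import deque


def translate(api_call):
    # Hypothetical translation: an API call name becomes an "executable instruction" string.
    return f"EXEC[{api_call}]"


class Pipeline:
    def __init__(self):
        self.pending = deque()  # API calls issued but not yet translated
        self.ready = []         # (instruction, who translated it)

    def producer_issue(self, api_call):
        self.pending.append(api_call)

    def driver_step(self):
        # Normal path: the driver translates the oldest pending call.
        if self.pending:
            self.ready.append((translate(self.pending.popleft()), "driver"))

    def producer_idle_step(self):
        # The producer has nothing to issue, so it runs its own
        # command translation code segment instead of waiting.
        if self.pending:
            self.ready.append((translate(self.pending.popleft()), "producer"))


p = Pipeline()
for call in ["Draw", "SetState", "Clear"]:
    p.producer_issue(call)
p.driver_step()          # driver translates "Draw"
p.producer_idle_step()   # idle producer translates "SetState"
p.driver_step()          # driver translates "Clear"
print(p.ready)
```

    In this model the translated instructions stay in issue order because both actors pop from the same queue; a threaded version would need a lock or a thread-safe queue around `pending`.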

    APPARATUS AND METHOD FOR ASYNCHRONOUS TILE-BASED RENDERING CONTROL
    Type: Invention application; Legal status: In force

    Publication No.: US20160188491A1

    Publication date: 2016-06-30

    Application No.: US14582790

    Filing date: 2014-12-24

    Inventor: Michael APODACA

    Abstract: An apparatus and method are described for asynchronous tile-based rendering control. In one embodiment of the invention, there is a delay between when the graphics driver queues the GPU commands for rendering and when the GPU begins executing. During this delay, the graphics driver receives additional information or data about whether cache evictions may be inhibited. As such, it allows the graphics driver to defer the cache eviction control of its render cache until it has this extra information. By doing so, it reduces the memory bandwidth required for rendering 3D graphics applications and in turn reduces the power consumption of the GPU.
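    Illustrative sketch: one way to picture the deferred eviction control described above is a queued command whose eviction flag the driver can still patch before the GPU consumes it. `RenderPass`, `CommandQueue`, and the `evict_render_cache` flag are hypothetical names chosen for this example, and the "later information" is simply assumed to arrive between submission and execution.

```python
from dataclasses import dataclass


@dataclass
class RenderPass:
    name: str
    evict_render_cache: bool = True  # conservative default: write the cache out to memory


class CommandQueue:
    def __init__(self):
        self.queued = []
        self.memory_writes = 0  # counts render cache evictions that reach memory

    def submit(self, render_pass):
        self.queued.append(render_pass)
        return render_pass  # the driver keeps a handle so it can patch the pass later

    def execute_all(self):
        # Stand-in for the GPU starting work some time after submission.
        for render_pass in self.queued:
            if render_pass.evict_render_cache:
                self.memory_writes += 1
        self.queued.clear()


q = CommandQueue()
first = q.submit(RenderPass("pass 0"))
q.submit(RenderPass("pass 1"))
# Assumed later information: pass 1 reuses the contents left by pass 0,
# so the eviction after pass 0 can be inhibited before the GPU runs it.
first.evict_render_cache = False
q.execute_all()
print(q.memory_writes)
```

    Without the patch both passes would evict, giving two memory writes; with it only the final pass does, which is the bandwidth saving the abstract refers to.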


    METHOD AND APPARATUS FOR EFFICIENT LOOP PROCESSING IN A GRAPHICS HARDWARE FRONT END

    Publication No.: US20220284539A1

    Publication date: 2022-09-08

    Application No.: US17578125

    Filing date: 2022-01-18

    Abstract: Various embodiments enable loop processing in a command processing block of the graphics hardware. Such hardware may include a processor including a command buffer, and a graphics command parser. The graphics command parser is to load graphics commands from the command buffer, parse a first graphics command, store a loop count value associated with the first graphics command, parse a second graphics command, and store a loop wrap address based on the second graphics command. The graphics command parser may execute a command sequence identified by the second graphics command, parse a third graphics command, the third graphics command identifying an end of the command sequence, set a new loop count value, and iteratively execute the command sequence using the loop wrap address based on the new loop count value.
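    Illustrative sketch: the three-command loop scheme in the abstract above can be approximated by a small command-buffer interpreter. The opcodes `LOOP_COUNT`, `LOOP_BEGIN`, and `LOOP_END` are hypothetical stand-ins for the first, second, and third graphics commands; the actual command encoding is not given in the abstract.

```python
def run(command_buffer):
    executed = []        # trace of non-loop commands, in execution order
    loop_count = 0
    wrap_address = None
    pc = 0               # index of the command being parsed
    while pc < len(command_buffer):
        op, *args = command_buffer[pc]
        if op == "LOOP_COUNT":
            # First command: store the loop count value.
            loop_count = args[0]
        elif op == "LOOP_BEGIN":
            # Second command: store the wrap address (start of the loop body).
            wrap_address = pc + 1
        elif op == "LOOP_END":
            # Third command: set the new count and wrap around if iterations remain.
            loop_count -= 1
            if loop_count > 0:
                pc = wrap_address
                continue
        else:
            executed.append((op, *args))
        pc += 1
    return executed


buf = [
    ("LOOP_COUNT", 3),
    ("LOOP_BEGIN",),
    ("DRAW", "tri"),
    ("LOOP_END",),
    ("FLUSH",),
]
trace = run(buf)
print(trace)
```

    In this sketch the body always runs at least once, because the count is only checked at `LOOP_END`; whether the hardware behaves the same way for a zero count is not stated in the abstract.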

    CLOUD-BASED REALTIME RAYTRACING
    Type: Invention application

    Publication No.: US20200211265A1

    Publication date: 2020-07-02

    Application No.: US16236218

    Filing date: 2018-12-28

    Abstract: Cloud-based real time rendering. For example, one embodiment of a system comprises: a first graphics processing node to perform a first set of graphics processing operations to render a graphics scene, the first set of graphics processing operations comprising ray-tracing independent operations; an interconnect or network interface coupling the first graphics processing node to a second graphics processing node; the second graphics processing node to receive an indication of a current view of a user of the first graphics processing node and to receive or construct a view-independent surface generated by view-independent ray traversal and intersection operations; the second graphics processing node to responsively perform a view-dependent translation of the view-independent surface based on the current view of the user to generate a view-dependent surface and to provide the view-dependent surface to the first graphics processing node; and the first graphics processing node to perform a second set of graphics processing operations to complete rendering of the graphics scene using the view-dependent surface.

    VIRTUAL REALITY/AUGMENTED REALITY APPARATUS AND METHOD

    Publication No.: US20190317599A1

    Publication date: 2019-10-17

    Application No.: US16453189

    Filing date: 2019-06-26

    Abstract: A virtual reality apparatus and method are described. For example, one embodiment of an apparatus comprises: a compute cluster comprising global illumination circuitry and/or logic to perform global illumination operations on graphics data in response to execution of a virtual reality application and to responsively generate a stream of samples; a filtering/compression module to perform filtering and/or compression operations on the stream of samples to generate filtered/compressed samples; a network interface to communicatively couple the compute cluster to a network, the filtered/compressed samples to be streamed over the network; a render node to receive the filtered/compressed samples streamed over the network, the render node comprising: decompression circuitry/logic to decompress the filtered/compressed samples to generate decompressed samples; a sample buffer to store the decompressed samples; and sample insertion circuitry/logic to asynchronously insert samples into a light field rendered by a light field rendering circuit/logic.

    APPARATUS AND METHOD FOR RAY TRACING INSTRUCTION PROCESSING AND EXECUTION

    Publication No.: US20230137438A1

    Publication date: 2023-05-04

    Application No.: US18090810

    Filing date: 2022-12-29

    Abstract: An apparatus and method to execute ray tracing instructions. For example, one embodiment of an apparatus comprises execution circuitry to execute a dequantize instruction to convert a plurality of quantized data values to a plurality of dequantized data values, the dequantize instruction including a first source operand to identify a plurality of packed quantized data values in a source register and a destination operand to identify a destination register in which to store a plurality of packed dequantized data values, wherein the execution circuitry is to convert each packed quantized data value in the source register to a floating point value, to multiply the floating point value by a first value to generate a first product and to add the first product to a second value to generate a dequantized data value, and to store the dequantized data value in a packed data element location in the destination register.
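    Illustrative sketch: the arithmetic performed by the dequantize instruction described above (convert to floating point, multiply by a first value, add a second value) can be written per packed element as follows. The function name `dequantize` and the parameter names `scale` and `offset` are assumptions for this example; the abstract only calls them a "first value" and a "second value".

```python
def dequantize(packed, scale, offset):
    # For each packed quantized element: convert to float,
    # multiply by the first value, then add the second value.
    return [float(q) * scale + offset for q in packed]


src = [0, 64, 128, 255]            # stand-in for the packed source register
dst = dequantize(src, 0.5, -1.0)   # stand-in for the packed destination register
print(dst)                         # [-1.0, 31.0, 63.0, 126.5]
```

    A list comprehension is used here only to mirror the element-wise nature of a packed operation; the hardware would process the elements of the source register in parallel.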

    APPARATUS AND METHOD FOR ACCELERATION DATA STRUCTURE REFIT

    Publication No.: US20210012553A1

    Publication date: 2021-01-14

    Application No.: US17032964

    Filing date: 2020-09-25

    Abstract: Apparatus and method for acceleration data structure refit. For example, one embodiment of an apparatus comprises: a ray generator to generate a plurality of rays in a first graphics scene; a hierarchical acceleration data structure generator to construct an acceleration data structure comprising a plurality of hierarchically arranged nodes including inner nodes and leaf nodes stored in a memory in a depth-first search (DFS) order; traversal hardware logic to traverse one or more of the rays through the acceleration data structure; intersection hardware logic to determine intersections between the one or more rays and one or more primitives within the hierarchical acceleration data structure; a node refit unit comprising circuitry and/or logic to read consecutively through at least the inner nodes in the memory in reverse DFS order to perform a bottom-up refit operation on the hierarchical acceleration data structure.
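    Illustrative sketch: when nodes are stored in DFS (pre-order) order, every child sits at a higher index than its parent, so a single backward pass over the array visits children before parents. The bottom-up refit described above can then be sketched as below. The node layout (a dict with `children` indices and a `box` of min/max corners) is a hypothetical simplification, not the patent's memory format.

```python
def union(a, b):
    # Smallest axis-aligned box containing boxes a and b.
    (a_lo, a_hi), (b_lo, b_hi) = a, b
    return (tuple(map(min, a_lo, b_lo)), tuple(map(max, a_hi, b_hi)))


def refit(nodes):
    # Walk the array in reverse DFS order: children are refit before their parent.
    for i in range(len(nodes) - 1, -1, -1):
        kids = nodes[i]["children"]
        if kids:  # inner node: recompute its bounds from its children
            box = nodes[kids[0]]["box"]
            for k in kids[1:]:
                box = union(box, nodes[k]["box"])
            nodes[i]["box"] = box


nodes = [
    {"children": [1, 4], "box": None},                      # 0: root
    {"children": [2, 3], "box": None},                      # 1: inner node
    {"children": [], "box": ((0, 0, 0), (1, 1, 1))},        # 2: leaf
    {"children": [], "box": ((2, 0, 0), (3, 1, 2))},        # 3: leaf
    {"children": [], "box": ((-1, 5, 0), (0, 6, 1))},       # 4: leaf
]
refit(nodes)
print(nodes[0]["box"])  # ((-1, 0, 0), (3, 6, 2))
```

    Because the pass is a plain linear scan, it reads memory consecutively, with no recursion or stack, which matches the "read consecutively ... in reverse DFS order" wording of the abstract.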

    APPARATUS AND METHOD FOR ACCELERATION DATA STRUCTURE REFIT

    Publication No.: US20230162428A1

    Publication date: 2023-05-25

    Application No.: US17982766

    Filing date: 2022-11-08

    CPC classification number: G06T15/06 G06F16/9027 G06F7/14 G06F9/3877 G06N3/02

    Abstract: Apparatus and method for acceleration data structure refit. For example, one embodiment of an apparatus comprises: a ray generator to generate a plurality of rays in a first graphics scene; a hierarchical acceleration data structure generator to construct an acceleration data structure comprising a plurality of hierarchically arranged nodes including inner nodes and leaf nodes stored in a memory in a depth-first search (DFS) order; traversal hardware logic to traverse one or more of the rays through the acceleration data structure; intersection hardware logic to determine intersections between the one or more rays and one or more primitives within the hierarchical acceleration data structure; a node refit unit comprising circuitry and/or logic to read consecutively through at least the inner nodes in the memory in reverse DFS order to perform a bottom-up refit operation on the hierarchical acceleration data structure.

    APPARATUS AND METHOD FOR RAY TRACING INSTRUCTION PROCESSING AND EXECUTION

    Publication No.: US20210035349A1

    Publication date: 2021-02-04

    Application No.: US16996208

    Filing date: 2020-08-18

    Abstract: An apparatus and method to execute ray tracing instructions. For example, one embodiment of an apparatus comprises execution circuitry to execute a dequantize instruction to convert a plurality of quantized data values to a plurality of dequantized data values, the dequantize instruction including a first source operand to identify a plurality of packed quantized data values in a source register and a destination operand to identify a destination register in which to store a plurality of packed dequantized data values, wherein the execution circuitry is to convert each packed quantized data value in the source register to a floating point value, to multiply the floating point value by a first value to generate a first product and to add the first product to a second value to generate a dequantized data value, and to store the dequantized data value in a packed data element location in the destination register.
