FULLY-FUSED NEURAL NETWORK EXECUTION

    Publication No.: US20220284658A1

    Publication date: 2022-09-08

    Application No.: US17340283

    Filing date: 2021-06-07

    IPC classification: G06T15/06 G06N3/10 G06T15/00

    Abstract: A fully-connected neural network may be configured for execution by a processor as a fully-fused neural network by limiting slow global memory accesses to reading and writing inputs to and outputs from the fully-connected neural network. The computational cost of a fully-connected neural network scales quadratically with its width, whereas its memory traffic scales linearly. Modern graphics processing units typically have much greater computational throughput compared with memory bandwidth, so that for narrow, fully-connected neural networks, the linear memory traffic is the bottleneck. The key to improving performance of the fully-connected neural network is to minimize traffic to slow “global” memory (off-chip memory and high-level caches) and to fully utilize fast on-chip memory (low-level caches, “shared” memory, and registers), which is achieved by the fully-fused approach. A real-time neural radiance caching technique for path-traced global illumination is implemented using the fully-fused neural network for caching scattered radiance components of global illumination.
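
The scaling argument in this abstract can be made concrete with a rough cost model. The sketch below is illustrative only and is not taken from the patent: the function names, the 2-byte value size, and the simplification that every layer (including input and output) has the same width and that weight traffic is ignored are all assumptions.

```python
def layer_flops(width, batch):
    """FLOPs for one width x width layer; each multiply-add counts as 2."""
    return 2 * batch * width * width


def global_traffic_bytes(width, batch, depth, fused, bytes_per_value=2):
    """Bytes of activations moved through global memory (weights ignored).

    Unfused: every layer reads its inputs from and writes its outputs to
    global memory. Fused: only the network inputs and outputs touch it.
    """
    per_pass = 2 * batch * width * bytes_per_value  # one read + one write
    return per_pass if fused else per_pass * depth


def arithmetic_intensity(width, batch, depth, fused):
    """FLOPs performed per byte of global memory traffic."""
    flops = depth * layer_flops(width, batch)
    return flops / global_traffic_bytes(width, batch, depth, fused)


if __name__ == "__main__":
    for fused in (False, True):
        print(fused, arithmetic_intensity(width=64, batch=1024, depth=5, fused=fused))
```

Under these assumptions a width-64, depth-5 network performs 32 FLOPs per byte of global traffic when each layer round-trips through global memory and 160 when fused, which is the direction of improvement the abstract describes.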

    TABLE DICTIONARIES FOR COMPRESSING NEURAL GRAPHICS PRIMITIVES

    Publication No.: US20230360278A1

    Publication date: 2023-11-09

    Application No.: US18298852

    Filing date: 2023-04-11

    IPC classification: G06T9/00 G06T3/40

    Abstract: Neural network performance is improved in terms of training speed, memory footprint, and/or accuracy by learning a compressed neural graphics primitive representation. A neural graphics primitive is a mathematical function involving at least one neural network, used to represent a computer graphic, where the graphic can be an image, a 3D shape, a light field, a signed distance function, a radiance field, 2D video, volumetric video, etc. Instead of being input directly to a neural network, inputs are effectively mapped (encoded) into a higher dimensional space via a function. The input comprises coordinates used to identify a point within a d-dimensional space. The point is quantized and a set of vertex coordinates corresponding to the point is used to access an indexing codebook and a features codebook that store learned index offsets and learned feature vectors, respectively. The learned feature vectors are then provided as inputs to the neural network.
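
The lookup path described in this abstract (quantize the point, visit the surrounding vertices, read an index offset and then a feature vector) might be sketched as follows. This is a minimal illustration under stated assumptions, not the patented method: the hash function, the way the offset is combined with the hashed slot, and the multilinear blending of corner features are all hypothetical choices, and the codebooks here are plain lists rather than learned parameters.

```python
import math

# Hypothetical hashing constants; any spatial hash would do for the sketch.
_PRIMES = (1, 2654435761, 805459861)


def vertex_hash(vertex, table_size):
    """Map integer vertex coordinates to a slot in a table."""
    h = 0
    for coord, prime in zip(vertex, _PRIMES):
        h ^= coord * prime
    return h % table_size


def feature_slot(vertex, index_codebook, feature_count):
    """Assumed combination: hashed slot plus the stored index offset."""
    offset = index_codebook[vertex_hash(vertex, len(index_codebook))]
    return (vertex_hash(vertex, feature_count) + offset) % feature_count


def encode(point, resolution, index_codebook, feature_codebook):
    """Blend the feature vectors of the grid cell corners around `point`."""
    scaled = [x * resolution for x in point]
    base = [math.floor(s) for s in scaled]          # quantized point
    frac = [s - b for s, b in zip(scaled, base)]    # position inside the cell
    dim = len(point)
    out = [0.0] * len(feature_codebook[0])
    for corner in range(2 ** dim):
        bits = [(corner >> d) & 1 for d in range(dim)]
        vertex = tuple(b + o for b, o in zip(base, bits))
        weight = 1.0
        for f, o in zip(frac, bits):
            weight *= f if o else 1.0 - f
        slot = feature_slot(vertex, index_codebook, len(feature_codebook))
        for k, value in enumerate(feature_codebook[slot]):
            out[k] += weight * value
    return out  # would be passed on as input to the neural network
```

A point that lands exactly on a grid vertex receives that vertex's feature vector unchanged, and the corner weights always sum to one, so the encoding varies continuously inside each cell.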

    REAL-TIME NEURAL NETWORK RADIANCE CACHING FOR PATH TRACING

    Publication No.: US20220284657A1

    Publication date: 2022-09-08

    Application No.: US17340222

    Filing date: 2021-06-07

    Abstract: A real-time neural radiance caching technique for path-traced global illumination is implemented using a neural network for caching scattered radiance components of global illumination. The neural (network) radiance cache handles fully dynamic scenes, and makes no assumptions about the camera, lighting, geometry, and materials. In contrast with conventional caching, the data-driven approach sidesteps many difficulties of caching algorithms, such as locating, interpolating, and updating cache points. The neural radiance cache is trained via online learning during rendering. Advantages of the neural radiance cache are noise reduction and real-time performance. Importantly, the runtime overhead and memory footprint of the neural radiance cache are stable and independent of scene complexity.

    Fully-fused neural network execution

    Publication No.: US11631210B2

    Publication date: 2023-04-18

    Application No.: US17340283

    Filing date: 2021-06-07

    IPC classification: G06T15/06 G06N3/10 G06T15/00

    Abstract: A fully-connected neural network may be configured for execution by a processor as a fully-fused neural network by limiting slow global memory accesses to reading and writing inputs to and outputs from the fully-connected neural network. The computational cost of a fully-connected neural network scales quadratically with its width, whereas its memory traffic scales linearly. Modern graphics processing units typically have much greater computational throughput compared with memory bandwidth, so that for narrow, fully-connected neural networks, the linear memory traffic is the bottleneck. The key to improving performance of the fully-connected neural network is to minimize traffic to slow “global” memory (off-chip memory and high-level caches) and to fully utilize fast on-chip memory (low-level caches, “shared” memory, and registers), which is achieved by the fully-fused approach. A real-time neural radiance caching technique for path-traced global illumination is implemented using the fully-fused neural network for caching scattered radiance components of global illumination.

    Real-time neural network radiance caching for path tracing

    Publication No.: US11610360B2

    Publication date: 2023-03-21

    Application No.: US17340222

    Filing date: 2021-06-07

    IPC classification: G06T15/06 G06T15/50

    Abstract: A real-time neural radiance caching technique for path-traced global illumination is implemented using a neural network for caching scattered radiance components of global illumination. The neural (network) radiance cache handles fully dynamic scenes, and makes no assumptions about the camera, lighting, geometry, and materials. In contrast with conventional caching, the data-driven approach sidesteps many difficulties of caching algorithms, such as locating, interpolating, and updating cache points. The neural radiance cache is trained via online learning during rendering. Advantages of the neural radiance cache are noise reduction and real-time performance. Importantly, the runtime overhead and memory footprint of the neural radiance cache are stable and independent of scene complexity.

    Fully-fused neural network execution

    Publication No.: US11935179B2

    Publication date: 2024-03-19

    Application No.: US18184519

    Filing date: 2023-03-15

    Abstract: A fully-connected neural network may be configured for execution by a processor as a fully-fused neural network by limiting slow global memory accesses to reading and writing inputs to and outputs from the fully-connected neural network. The computational cost of a fully-connected neural network scales quadratically with its width, whereas its memory traffic scales linearly. Modern graphics processing units typically have much greater computational throughput compared with memory bandwidth, so that for narrow, fully-connected neural networks, the linear memory traffic is the bottleneck. The key to improving performance of the fully-connected neural network is to minimize traffic to slow “global” memory (off-chip memory and high-level caches) and to fully utilize fast on-chip memory (low-level caches, “shared” memory, and registers), which is achieved by the fully-fused approach. A real-time neural radiance caching technique for path-traced global illumination is implemented using the fully-fused neural network for caching scattered radiance components of global illumination.

    NEURAL NETWORK CONTROL VARIATES
    Published patent application

    Publication No.: US20240020443A1

    Publication date: 2024-01-18

    Application No.: US18478025

    Filing date: 2023-09-29

    Abstract: Monte Carlo and quasi-Monte Carlo integration are simple numerical recipes for solving complicated integration problems, such as valuating financial derivatives or synthesizing photorealistic images by light transport simulation. A drawback of a straightforward application of (quasi-)Monte Carlo integration is the relatively slow convergence rate that manifests as high error of Monte Carlo estimators. Neural control variates may be used to reduce error in parametric (quasi-)Monte Carlo integration, providing more accurate solutions in less time. A neural network system has sufficient approximation power for estimating integrals and is efficient to evaluate. The efficiency results from using a first neural network to infer the integral of the control variate and normalizing flows to model the shape of the control variate.
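
The control variate idea behind this abstract can be shown with a classical, non-neural example: integrate a function g that resembles the integrand f and has a known integral G, then estimate only the residual f - g by Monte Carlo. In the sketch below the hand-picked g(x) = 1 + x stands in for the learned control variate, and its analytic integral 1.5 stands in for the value the first neural network would infer; the integrand, function names, and sample count are all assumptions for illustration.

```python
import math
import random
import statistics


def f(x):
    """Integrand; its exact integral over [0, 1] is e - 1."""
    return math.exp(x)


def g(x):
    """Hand-picked control variate (stand-in for a learned one)."""
    return 1.0 + x


G_INTEGRAL = 1.5  # exact integral of g over [0, 1]


def estimate(n, seed=0):
    """Return plain and control-variate estimates with per-sample variances."""
    rng = random.Random(seed)
    xs = [rng.random() for _ in range(n)]
    plain = [f(x) for x in xs]
    residual = [f(x) - g(x) for x in xs]
    return {
        "plain": statistics.fmean(plain),
        "control_variate": G_INTEGRAL + statistics.fmean(residual),
        "plain_variance": statistics.pvariance(plain),
        "residual_variance": statistics.pvariance(residual),
    }


if __name__ == "__main__":
    print(estimate(10_000))
```

Because the residual f - g varies much less than f itself, the same number of samples yields a lower-variance estimate; the closer the control variate matches the integrand, the larger the reduction.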

    RENDERING ALONG A SPACE-FILLING CURVE
    Published patent application

    Publication No.: US20230419450A1

    Publication date: 2023-12-28

    Application No.: US18076954

    Filing date: 2022-12-07

    Abstract: In photorealistic image synthesis by light transport simulation, the color of each pixel is an integral of a high-dimensional function. However, the functions to integrate contain discontinuities that cannot be predicted efficiently. In practice, the pixel colors are estimated by using Monte Carlo and quasi-Monte Carlo methods to sample light transport paths that connect light sources and cameras and summing up the contributions to evaluate an integral. Because of the sampling, images appear noisy when the number of samples is insufficient. A low discrepancy sequence provides sample locations and these sample locations can be enumerated (assigned or distributed to pixels) according to a space-filling curve superimposed on a pixel grid. Correlations of such combinations of space-filling curves and low discrepancy sequences are analyzed, and the presented algorithms reduce correlations, are deterministic, and may be executed for each pixel in parallel.
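
Enumerating a low discrepancy sequence along a space-filling curve can be sketched as below. The choices here are assumptions made for illustration and are not the algorithms claimed in the application: a Morton (Z-order) curve is used as the space-filling curve, the Halton sequence (radical inverses in bases 2 and 3) as the low discrepancy sequence, and each pixel simply takes a contiguous block of sequence indices. No decorrelation step is included.

```python
def morton_index(x, y, bits=16):
    """Position of pixel (x, y) along a Z-order curve (bit interleaving)."""
    index = 0
    for b in range(bits):
        index |= ((x >> b) & 1) << (2 * b)
        index |= ((y >> b) & 1) << (2 * b + 1)
    return index


def radical_inverse(i, base):
    """Mirror the base-`base` digits of i around the radix point."""
    inverse, scale = 0.0, 1.0 / base
    while i > 0:
        inverse += (i % base) * scale
        i //= base
        scale /= base
    return inverse


def sample_index(x, y, s, samples_per_pixel):
    """Sequence index of the s-th sample of pixel (x, y)."""
    return morton_index(x, y) * samples_per_pixel + s


def pixel_samples(x, y, samples_per_pixel):
    """2D Halton points assigned to one pixel; depends only on (x, y)."""
    indices = (sample_index(x, y, s, samples_per_pixel)
               for s in range(samples_per_pixel))
    return [(radical_inverse(i, 2), radical_inverse(i, 3)) for i in indices]
```

Each pixel's samples depend only on its own coordinates, so the assignment is deterministic and can be evaluated for all pixels in parallel, matching the properties the abstract highlights.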