-
Publication No.: US20250053452A1
Publication date: 2025-02-13
Application No.: US18774583
Application date: 2024-07-16
Applicant: Intel Corporation
Inventor: Pawel MAJEWSKI, Prasoonkumar SURTI, Karthik VAIDYANATHAN, Joshua BARCZAK, Vasanth RANGANATHAN, Vikranth VEMULAPALLI
Abstract: Apparatus and method for stack access throttling for synchronous ray tracing. For example, one embodiment of an apparatus comprises: ray tracing acceleration hardware to manage active ray tracing stack allocations to ensure that a size of the active ray tracing stack allocations remains within a threshold; and an execution unit to execute a thread to explicitly request a new ray tracing stack allocation from the ray tracing acceleration hardware, the ray tracing acceleration hardware to permit the new ray tracing stack allocation if the size of the active ray tracing stack allocations will remain within the threshold after permitting the new ray tracing stack allocation.
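The admission rule in this abstract (permit a new stack allocation only if the active total stays within the threshold) can be sketched in software as follows. This is a minimal illustration, not the patented hardware: the class `StackAllocationManager`, its method names, and the byte-based accounting are all assumptions introduced here.

```python
class StackAllocationManager:
    """Hypothetical software model of the throttling rule (names are illustrative)."""

    def __init__(self, threshold_bytes):
        self.threshold_bytes = threshold_bytes  # upper bound on active allocations
        self.active_bytes = 0                   # current size of active allocations

    def request(self, size_bytes):
        """Permit the allocation only if the active total stays within the threshold."""
        if self.active_bytes + size_bytes <= self.threshold_bytes:
            self.active_bytes += size_bytes
            return True
        return False  # the requesting thread would have to wait or retry

    def release(self, size_bytes):
        """Return a previously granted allocation to the pool."""
        self.active_bytes = max(0, self.active_bytes - size_bytes)


manager = StackAllocationManager(threshold_bytes=1024)
granted = [manager.request(512), manager.request(512), manager.request(1)]
```

In this sketch the third request is refused because granting it would push the active total past the threshold; it could succeed after a `release`.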
-
Publication No.: US20230297513A1
Publication date: 2023-09-21
Application No.: US17699062
Application date: 2022-03-18
Applicant: Intel Corporation
Inventor: Prasoonkumar SURTI, Tobias ZIRR, Abhishek R. APPU, Anton KAPLANYAN, Pawel MAJEWSKI, Joshua BARCZAK
IPC: G06F12/0897, G06N20/00
CPC classification number: G06F12/0897, G06N20/00, G06F2212/60
Abstract: A cache streaming apparatus and method for machine learning. For example, one embodiment of an apparatus comprises: a plurality of compute units to perform machine learning operations; a cache subsystem comprising a hierarchy of cache levels, at least some of the cache levels shared by two or more of the plurality of compute units; and data streaming hardware logic to stream machine learning data in and out of the cache subsystem based on the machine learning operations, the data streaming hardware logic to load data into the cache subsystem from memory before the data is needed by a first portion of the machine learning operations and to ensure that results produced by the first portion of machine learning operations are maintained in the cache subsystem until used by a second portion of the machine learning operations.
-
Publication No.: US20200211265A1
Publication date: 2020-07-02
Application No.: US16236218
Application date: 2018-12-28
Applicant: Intel Corporation
Inventor: Carson BROWNLEE, Joshua BARCZAK, Kai XIAO, Michael APODACA, Philip LAWS, Thomas RAOUX, Travis SCHLUESSLER
Abstract: Cloud-based real time rendering. For example, one embodiment of a system comprises: a first graphics processing node to perform a first set of graphics processing operations to render a graphics scene, the first set of graphics processing operations comprising ray-tracing independent operations; an interconnect or network interface coupling the first graphics processing node to a second graphics processing node; the second graphics processing node to receive an indication of a current view of a user of the first graphics processing node and to receive or construct a view-independent surface generated by view-independent ray traversal and intersection operations; the second graphics processing node to responsively perform a view-dependent translation of the view-independent surface based on the current view of the user to generate a view-dependent surface and to provide the view-dependent surface to the first graphics processing node; and the first graphics processing node to perform a second set of graphics processing operations to complete rendering of the graphics scene using the view-dependent surface.
-
Publication No.: US20230137438A1
Publication date: 2023-05-04
Application No.: US18090810
Application date: 2022-12-29
Applicant: INTEL CORPORATION
Inventor: Karthik VAIDYANATHAN, Michael APODACA, Thomas RAOUX, Carsten BENTHIN, Kai XIAO, Carson BROWNLEE, Joshua BARCZAK
Abstract: An apparatus and method to execute ray tracing instructions. For example, one embodiment of an apparatus comprises execution circuitry to execute a dequantize instruction to convert a plurality of quantized data values to a plurality of dequantized data values, the dequantize instruction including a first source operand to identify a plurality of packed quantized data values in a source register and a destination operand to identify a destination register in which to store a plurality of packed dequantized data values, wherein the execution circuitry is to convert each packed quantized data value in the source register to a floating point value, to multiply the floating point value by a first value to generate a first product and to add the first product to a second value to generate a dequantized data value, and to store the dequantized data value in a packed data element location in the destination register.
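The per-element arithmetic described in this abstract (convert to floating point, multiply by a first value, add a second value) can be written as a short sketch. The function name `dequantize` and the parameter names `scale` and `offset` are assumptions chosen for illustration; the abstract only calls them a "first value" and a "second value", and a Python list stands in for the packed register.

```python
def dequantize(packed_values, scale, offset):
    """Map each quantized integer q to float(q) * scale + offset.

    packed_values stands in for the packed source register; the returned
    list stands in for the packed destination register.
    """
    return [float(q) * scale + offset for q in packed_values]


# Three 8-bit quantized values mapped back to a floating point range.
result = dequantize([0, 128, 255], scale=0.5, offset=-1.0)
# result == [-1.0, 63.0, 126.5]
```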
-
Publication No.: US20210012553A1
Publication date: 2021-01-14
Application No.: US17032964
Application date: 2020-09-25
Applicant: INTEL CORPORATION
Inventor: Michael APODACA, Carsten BENTHIN, Kai XIAO, Carson BROWNLEE, Timothy ROWLEY, Joshua BARCZAK, Travis SCHLUESSLER
IPC: G06T15/06, G06F16/901, G06F7/14, G06F9/38
Abstract: Apparatus and method for acceleration data structure refit. For example, one embodiment of an apparatus comprises: a ray generator to generate a plurality of rays in a first graphics scene; a hierarchical acceleration data structure generator to construct an acceleration data structure comprising a plurality of hierarchically arranged nodes including inner nodes and leaf nodes stored in a memory in a depth-first search (DFS) order; traversal hardware logic to traverse one or more of the rays through the acceleration data structure; intersection hardware logic to determine intersections between the one or more rays and one or more primitives within the hierarchical acceleration data structure; a node refit unit comprising circuitry and/or logic to read consecutively through at least the inner nodes in the memory in reverse DFS order to perform a bottom-up refit operation on the hierarchical acceleration data structure.
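The reason a reverse-DFS read yields a bottom-up refit is that, in a DFS (pre-order) layout, every child is stored after its parent, so a single backward pass visits all children before the node that contains them. The sketch below illustrates that idea only; the `Node` layout, the child-index list, and the axis-aligned box representation are assumptions, not details taken from the patent.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    lo: tuple                                     # minimum corner of the bounding box
    hi: tuple                                     # maximum corner of the bounding box
    children: list = field(default_factory=list)  # indices into the node array; empty for leaves


def refit(nodes):
    """Recompute inner-node bounds by reading the DFS-ordered array backwards."""
    for i in range(len(nodes) - 1, -1, -1):
        node = nodes[i]
        if not node.children:
            continue  # leaf bounds are assumed to have been updated already
        kids = [nodes[c] for c in node.children]
        node.lo = tuple(min(k.lo[axis] for k in kids) for axis in range(3))
        node.hi = tuple(max(k.hi[axis] for k in kids) for axis in range(3))


# DFS order: root(0) -> inner(1) -> leaf(2), leaf(3); then leaf(4).
stale = ((0, 0, 0), (0, 0, 0))
nodes = [
    Node(*stale, children=[1, 4]),
    Node(*stale, children=[2, 3]),
    Node((0, 0, 0), (1, 1, 1)),
    Node((2, 0, 0), (3, 1, 1)),
    Node((-1, -1, -1), (0, 0, 5)),
]
refit(nodes)
```

Because the pass is a single linear sweep over contiguous memory, no recursion or explicit stack is needed, which is presumably what makes the consecutive read attractive for hardware.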
-
Publication No.: US20230162428A1
Publication date: 2023-05-25
Application No.: US17982766
Application date: 2022-11-08
Applicant: INTEL CORPORATION
Inventor: Michael APODACA, Carsten BENTHIN, Kai XIAO, Carson BROWNLEE, Timothy ROWLEY, Joshua BARCZAK, Travis SCHLUESSLER
IPC: G06T15/06, G06F16/901, G06F7/14, G06F9/38
CPC classification number: G06T15/06, G06F16/9027, G06F7/14, G06F9/3877, G06N3/02
Abstract: Apparatus and method for acceleration data structure refit. For example, one embodiment of an apparatus comprises: a ray generator to generate a plurality of rays in a first graphics scene; a hierarchical acceleration data structure generator to construct an acceleration data structure comprising a plurality of hierarchically arranged nodes including inner nodes and leaf nodes stored in a memory in a depth-first search (DFS) order; traversal hardware logic to traverse one or more of the rays through the acceleration data structure; intersection hardware logic to determine intersections between the one or more rays and one or more primitives within the hierarchical acceleration data structure; a node refit unit comprising circuitry and/or logic to read consecutively through at least the inner nodes in the memory in reverse DFS order to perform a bottom-up refit operation on the hierarchical acceleration data structure.
-
Publication No.: US20210035349A1
Publication date: 2021-02-04
Application No.: US16996208
Application date: 2020-08-18
Applicant: INTEL CORPORATION
Inventor: Karthik VAIDYANATHAN, Michael APODACA, Thomas RAOUX, Carsten BENTHIN, Kai XIAO, Carson BROWNLEE, Joshua BARCZAK
Abstract: An apparatus and method to execute ray tracing instructions. For example, one embodiment of an apparatus comprises execution circuitry to execute a dequantize instruction to convert a plurality of quantized data values to a plurality of dequantized data values, the dequantize instruction including a first source operand to identify a plurality of packed quantized data values in a source register and a destination operand to identify a destination register in which to store a plurality of packed dequantized data values, wherein the execution circuitry is to convert each packed quantized data value in the source register to a floating point value, to multiply the floating point value by a first value to generate a first product and to add the first product to a second value to generate a dequantized data value, and to store the dequantized data value in a packed data element location in the destination register.
-
Publication No.: US20230298255A1
Publication date: 2023-09-21
Application No.: US17699064
Application date: 2022-03-18
Applicant: Intel Corporation
Inventor: Carsten BENTHIN, Radoslaw DRABINSKI, Joshua BARCZAK, Sven WOOP, Holger H. GRUEN, Pawel MAJEWSKI
CPC classification number: G06T15/06, G06T15/005, G06T17/005, G06T7/70
Abstract: Apparatus and method for camera-aware BVH re-braiding. For example, one embodiment of an apparatus comprises: ray tracing acceleration hardware to be used to determine ray traversal results when traversing a ray through a bounding volume hierarchy (BVH); and BVH processing hardware logic to modify the BVH to reduce spatial overlap between one or more BVH subtrees based on a detected camera position to produce a modified BVH.
-
Publication No.: US20220343554A1
Publication date: 2022-10-27
Application No.: US17740754
Application date: 2022-05-10
Applicant: INTEL CORPORATION
Inventor: Carson BROWNLEE, Carsten BENTHIN, Joshua BARCZAK, Kai XIAO, Michael APODACA, Prasoonkumar SURTI, Thomas RAOUX
Abstract: Apparatus and method for context-aware compression. For example, one embodiment of an apparatus comprises: ray traversal/intersection circuitry to traverse rays through a hierarchical acceleration data structure to identify intersections between rays and primitives of a graphics scene; matrix compression circuitry/logic to compress hierarchical transformation matrices to generate compressed hierarchical transformation matrices by quantizing N-bit floating point data elements associated with child transforms of the hierarchical transformation matrices to variable-bit floating point numbers or integers comprising offsets from a parent transform of the child transform; and an instance processor to generate a plurality of instances of one or more base geometric objects in accordance with the compressed hierarchical transformation matrices.
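The idea of storing a child transform as quantized offsets from its parent can be illustrated with a small sketch. Everything specific here is an assumption made for illustration: the abstract does not say how the offsets are quantized, so a uniform step size (`step`) and signed integer offsets are used, and the transforms are reduced to flat lists of elements.

```python
def compress_child(child, parent, step):
    """Store each child element as an integer count of `step`-sized offsets from the parent."""
    return [round((c - p) / step) for c, p in zip(child, parent)]


def decompress_child(offsets, parent, step):
    """Rebuild an approximate child transform from the parent and the integer offsets."""
    return [p + q * step for q, p in zip(offsets, parent)]


parent = [1.0, 2.0, 3.0]   # e.g. the translation part of a parent transform
child = [1.5, 2.25, 2.0]   # a child whose values lie close to the parent's
offsets = compress_child(child, parent, step=0.25)
restored = decompress_child(offsets, parent, step=0.25)
```

The offsets are small integers when a child lies near its parent, which is the property that lets them fit in fewer bits than the original floating point elements. In general the reconstruction is lossy, with the error per element bounded by half the step size.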
-
Publication No.: US20210082154A1
Publication date: 2021-03-18
Application No.: US17003040
Application date: 2020-08-26
Applicant: INTEL CORPORATION
Inventor: Carson BROWNLEE, Carsten BENTHIN, Joshua BARCZAK, Kai XIAO, Michael APODACA, Prasoonkumar SURTI, Thomas RAOUX
Abstract: Apparatus and method for context-aware compression. For example, one embodiment of an apparatus comprises: ray traversal/intersection circuitry to traverse rays through a hierarchical acceleration data structure to identify intersections between rays and primitives of a graphics scene; matrix compression circuitry/logic to compress hierarchical transformation matrices to generate compressed hierarchical transformation matrices by quantizing N-bit floating point data elements associated with child transforms of the hierarchical transformation matrices to variable-bit floating point numbers or integers comprising offsets from a parent transform of the child transform; and an instance processor to generate a plurality of instances of one or more base geometric objects in accordance with the compressed hierarchical transformation matrices.