-
Publication No.: US20200349091A1
Publication Date: 2020-11-05
Application No.: US16881271
Filing Date: 2020-05-22
Applicant: Intel Corporation
Inventor: Abhishek R. Appu , Joydeep Ray , James A. Valerio , Altug Koker , Prasoonkumar Surti , Balaji Vembu , Wenyin Fu , Bhushan M. Borole , Kamal Sinha
IPC: G06F12/128 , G06F12/0811 , G06F13/40 , G06F12/12 , G06T1/60 , G06F12/0897 , G06F12/084
Abstract: A hybrid hierarchical cache is implemented at the same level in the access pipeline, to get the faster access behavior of a smaller cache and, at the same time, a higher hit rate at lower power for a larger cache, in some embodiments. A split cache at the same level in the access pipeline includes two caches that work together. In the hybrid, split, low level cache (e.g., L1) evictions are coordinated locally between the two L1 portions, and on a miss to both L1 portions, a line is allocated from a larger L2 cache to the smallest L1 cache.
-
62.
Publication No.: US20200334896A1
Publication Date: 2020-10-22
Application No.: US16865587
Filing Date: 2020-05-04
Applicant: Intel Corporation
Inventor: Prasoonkumar Surti , Karthik Vaidyanathan , Atsuo Kuwahara , Hugues Labbe , Sameer Kp , Jonathan Kennedy , Abhishek R. Appu , Jeffery S. Boles , Balaji Vembu , Michael Apodaca , Slawomir Grajewski , Gabor Liktor , David M. Cimini , Andrew T. Lauritzen , Travis T. Schluessler , Murali Ramadoss , Abhishek Venkatesh , Joydeep Ray , Kai Xiao , Ankur N. Shah , Altug Koker
Abstract: The systems, apparatuses and methods may provide a way to adaptively process and aggressively cull geometry data. Systems, apparatuses and methods may provide for processing, by a positional only shading pipeline (POSH), geometry data including surface triangles for a digital representation of a scene. More particularly, systems, apparatuses and methods may provide a way to identify surface triangles in one or more exclusion zones and non-exclusion zones, and cull surface triangles in one or more exclusion zones.
-
Publication No.: US20200293450A1
Publication Date: 2020-09-17
Application No.: US16355015
Filing Date: 2019-03-15
Applicant: Intel Corporation
Inventor: Vikranth Vemulapalli , Lakshminarayanan Striramassarma , Mike MacPherson , Aravindh Anantaraman , Ben Ashbaugh , Murali Ramadoss , William B. Sadler , Jonathan Pearce , Scott Janus , Brent Insko , Vasanth Ranganathan , Kamal Sinha , Arthur Hunter, JR. , Prasoonkumar Surti , Nicolas Galoppo von Borries , Joydeep Ray , Abhishek R. Appu , ElMoustapha Ould-Ahmed-Vall , Altug Koker , Sungye Kim , Subramaniam Maiyuran , Valentin Andrei
IPC: G06F12/0862 , G06T1/60 , G06T1/20
Abstract: Embodiments are generally directed to data prefetching for graphics data processing. An embodiment of an apparatus includes one or more processors including one or more graphics processing units (GPUs); and a plurality of caches to provide storage for the one or more GPUs, the plurality of caches including at least an L1 cache and an L3 cache, wherein the apparatus to provide intelligent prefetching of data by a prefetcher of a first GPU of the one or more GPUs including measuring a hit rate for the L1 cache; upon determining that the hit rate for the L1 cache is equal to or greater than a threshold value, limiting a prefetch of data to storage in the L3 cache, and upon determining that the hit rate for the L1 cache is less than a threshold value, allowing the prefetch of data to the L1 cache.
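The threshold rule in this abstract can be illustrated with a minimal sketch. It is not the patented implementation: the `L1HitCounter` class, the `prefetch_target` function, and the example threshold of 0.8 are hypothetical names and values chosen for illustration only.

```python
from dataclasses import dataclass


@dataclass
class L1HitCounter:
    """Hypothetical counter that measures the L1 hit rate."""
    hits: int = 0
    accesses: int = 0

    def record(self, hit: bool) -> None:
        self.accesses += 1
        if hit:
            self.hits += 1

    def hit_rate(self) -> float:
        # With no accesses yet, report 0.0 (an assumption of this sketch).
        return self.hits / self.accesses if self.accesses else 0.0


def prefetch_target(counter: L1HitCounter, threshold: float) -> str:
    """Return the cache level that a prefetch is allowed to fill."""
    if counter.hit_rate() >= threshold:
        # L1 is already hitting well: limit the prefetch to the L3 cache.
        return "L3"
    # L1 hit rate is below the threshold: allow the prefetch into L1.
    return "L1"


counter = L1HitCounter()
for hit in [True] * 9 + [False]:
    counter.record(hit)
print(prefetch_target(counter, threshold=0.8))
```

Here nine hits in ten accesses give a hit rate of 0.9, which meets the assumed threshold, so the prefetch is limited to L3.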
-
Publication No.: US10769818B2
Publication Date: 2020-09-08
Application No.: US15482803
Filing Date: 2017-04-09
Applicant: Intel Corporation
Inventor: Abhishek R. Appu , Kiran C. Veernapu , Prasoonkumar Surti , Joydeep Ray , Altug Koker , Eric G. Liskay
IPC: G06T9/00
Abstract: A mechanism is described for facilitating smart compression/decompression schemes at computing devices. A method of embodiments, as described herein, includes unifying a first compression scheme relating to three-dimensional (3D) content and a second compression scheme relating to media content into a unified compression scheme to perform compression of one or more of the 3D content and the media content relating to a processor including a graphics processor.
-
Publication No.: US10769072B2
Publication Date: 2020-09-08
Application No.: US16277114
Filing Date: 2019-02-15
Applicant: Intel Corporation
Inventor: Abhishek R. Appu , Altug Koker , Joydeep Ray , Prasoonkumar Surti , Kamal Sinha , Kiran C. Veernapu , Balaji Vembu
IPC: G06F12/08 , G06F12/0888 , G06F13/42 , G06F13/40 , G06T1/60 , G06F12/0895 , G06T1/20
Abstract: Methods and apparatus relating to techniques for avoiding cache lookup for cold cache. In an example, an apparatus comprises logic, at least partially comprising hardware logic, to receive, in a read/modify/write (RMW) pipeline, a cache access request from a requestor, wherein the cache request comprises a cache set identifier associated with requested data in the cache set, determine whether the cache set associated with the cache set identifier is in an inaccessible state or an invalid state, and in response to a determination that the cache set is in an inaccessible state or an invalid state, to terminate the cache access request. Other embodiments are also disclosed and claimed.
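The early-termination check described in this abstract might look roughly like the following sketch. The `SetState` enum, the `Cache` class, and the `lookups` counter are hypothetical; the abstract does not say what a requestor receives when a request is terminated, so returning `None` is an assumption.

```python
from enum import Enum, auto


class SetState(Enum):
    VALID = auto()
    INVALID = auto()
    INACCESSIBLE = auto()


class Cache:
    """Hypothetical set-indexed cache with a per-set state field."""

    def __init__(self, num_sets: int) -> None:
        # A cold cache starts with every set invalid.
        self.state = [SetState.INVALID] * num_sets
        self.data = [dict() for _ in range(num_sets)]
        self.lookups = 0  # counts tag lookups actually performed

    def fill(self, set_id: int, tag: int, value: str) -> None:
        self.data[set_id][tag] = value
        self.state[set_id] = SetState.VALID

    def access(self, set_id: int, tag: int):
        # Check the set state first; terminate without a tag lookup
        # when the set is invalid or inaccessible.
        if self.state[set_id] is not SetState.VALID:
            return None  # assumption: a terminated request yields None
        self.lookups += 1
        return self.data[set_id].get(tag)


cache = Cache(num_sets=4)
print(cache.access(0, 0x10), cache.lookups)  # request terminated early
cache.fill(0, 0x10, "payload")
print(cache.access(0, 0x10), cache.lookups)  # normal lookup
```

The point of the sketch is only that the state check happens before any tag comparison, so a cold set costs no lookup.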
-
Publication No.: US10762686B2
Publication Date: 2020-09-01
Application No.: US16235906
Filing Date: 2018-12-28
Applicant: Intel Corporation
Inventor: Scott Janus , Prasoonkumar Surti , Karthik Vaidyanathan , Alexey Supikov , Gabor Liktor , Carsten Benthin , Philip Laws , Michael Doyle
Abstract: Apparatus and method for a hierarchical beam tracer. For example, one embodiment of an apparatus comprises: a beam generator to generate beam data associated with a beam projected into a graphics scene; a bounding volume hierarchy (BVH) generator to generate BVH data comprising a plurality of hierarchically arranged BVH nodes; a hierarchical beam-based traversal unit to determine whether the beam intersects a current BVH node and, if so, to responsively subdivide the beam into N child beams to test against the current BVH node and/or to traverse further down the BVH hierarchy to select a new BVH node, wherein the hierarchical beam-based traversal unit is to iteratively subdivide successive intersecting child beams and/or to continue to traverse down the BVH hierarchy until a leaf node is reached with which at least one final child beam is determined to intersect; the hierarchical beam-based traversal unit to generate a plurality of rays within the final child beam; and intersection hardware logic to perform intersection testing for any rays intersecting the leaf node, the intersection testing to determine intersections between the rays intersecting the leaf node and primitives bounded by the leaf node.
-
Publication No.: US10748238B2
Publication Date: 2020-08-18
Application No.: US16279270
Filing Date: 2019-02-19
Applicant: Intel Corporation
Inventor: Saurabh Sharma , Abhishek Venkatesh , Travis T. Schluessler , Prasoonkumar Surti , Altug Koker , Aravindh V. Anantaraman , Pattabhiraman P. K. , Abhishek R. Appu , Joydeep Ray , Kamal Sinha , Vasanth Ranganathan , Bhushan M. Borole , Wenyin Fu , Eric J. Hoekstra , Linda L. Hurd
Abstract: A control surface tracks whether each individual cacheline in the original surface holds a frequent data value. If it does, the control surface bits for that cacheline are set. When reading a cacheline from memory, the control surface bits are read first. If they are set, the original memory read is skipped altogether and the bits from the control surface instead provide the value for the entire cacheline.
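The read path in this abstract can be sketched as follows. Everything concrete here is an assumption made for illustration: the 64-byte line size, the `FREQUENT_BYTES` table, the encoding of control codes, and the `Surface` class are hypothetical and are not taken from the patent.

```python
LINE_BYTES = 64                 # assumed cacheline size
FREQUENT_BYTES = (0x00, 0xFF)   # hypothetical table of frequent fill values


class Surface:
    """Hypothetical surface with a per-cacheline control code."""

    def __init__(self, num_lines: int) -> None:
        self.memory = [bytes(LINE_BYTES)] * num_lines
        self.control = [0] * num_lines  # 0 means "no frequent value"
        self.memory_reads = 0

    def write_line(self, index: int, line: bytes) -> None:
        assert len(line) == LINE_BYTES
        self.memory[index] = line
        self.control[index] = 0
        # Set the control code when the whole line is one frequent value.
        for code, value in enumerate(FREQUENT_BYTES, start=1):
            if line == bytes([value]) * LINE_BYTES:
                self.control[index] = code
                break

    def read_line(self, index: int) -> bytes:
        code = self.control[index]  # control bits are read first
        if code:
            # Bits are set: skip the memory read and rebuild the line.
            return bytes([FREQUENT_BYTES[code - 1]]) * LINE_BYTES
        self.memory_reads += 1
        return self.memory[index]


surface = Surface(num_lines=2)
surface.write_line(0, bytes([0xFF]) * LINE_BYTES)
surface.write_line(1, bytes(range(LINE_BYTES)))
surface.read_line(0)
surface.read_line(1)
print(surface.memory_reads)
```

Only the second read touches `memory`, because the first line is fully described by its control code.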
-
Publication No.: US20200258263A1
Publication Date: 2020-08-13
Application No.: US16750819
Filing Date: 2020-01-23
Applicant: Intel Corporation
Inventor: Joydeep Ray , Ben Ashbaugh , Prasoonkumar Surti , Pradeep Ramani , Rama Harihara , Jerin C. Justin , Jing Huang , Xiaoming Cui , Timothy B. Costa , Ting Gong , ElMoustapha Ould-Ahmed-Vall , Kumar Balasubramanian , Anil Thomas , Oguz H. Elibol , Jayaram Bobba , Guozhong Zhuang , Bhavani Subramanian , Gokce Keskin , Chandrasekaran Sakthivel , Rajesh Poornachandran
Abstract: Embodiments are generally directed to compression in machine learning and deep learning processing. An embodiment of an apparatus for compression of untyped data includes a graphical processing unit (GPU) including a data compression pipeline, the data compression pipeline including a data port coupled with one or more shader cores, wherein the data port is to allow transfer of untyped data without format conversion, and a 3D compression/decompression unit to provide for compression of untyped data to be stored to a memory subsystem and decompression of untyped data from the memory subsystem.
-
Publication No.: US10726517B2
Publication Date: 2020-07-28
Application No.: US16293044
Filing Date: 2019-03-05
Applicant: Intel Corporation
Inventor: Altug Koker , Ingo Wald , David Puffer , Subramaniam M. Maiyuran , Prasoonkumar Surti , Balaji Vembu , Guei-Yuan Lueh , Murali Ramadoss , Abhishek R. Appu , Joydeep Ray
Abstract: One embodiment provides for a parallel processor comprising a processing array within the parallel processor, the processing array including multiple compute blocks, each compute block including multiple processing clusters configured for parallel operation, wherein each of the multiple compute blocks is independently preemptable. In one embodiment a preemption hint can be generated for source code during compilation to enable a compute unit to determine an efficient point for preemption.
-
Publication No.: US20200210238A1
Publication Date: 2020-07-02
Application No.: US16726341
Filing Date: 2019-12-24
Applicant: Intel Corporation
Inventor: Abhishek R. Appu , Altug Koker , Balaji Vembu , Joydeep Ray , Kamal Sinha , Prasoonkumar Surti , Kiran C. Veernapu , Subramaniam Maiyuran , Sanjeev S. Jahagirdar , Eric J. Asperheim , Guei-Yuan Lueh , David Puffer , Wenyin Fu , Nikos Kaburlasos , Bhushan M. Borole , Josh B. Mastronarde , Linda L. Hurd , Travis T. Schluessler , Tomasz Janczak , Abhishek Venkatesh , Kai Xiao , Slawomir Grajewski
Abstract: In an example, an apparatus comprises a plurality of execution units comprising at least a first type of execution unit and a second type of execution unit and logic, at least partially including hardware logic, to analyze a workload and assign the workload to one of the first type of execution unit or the second type of execution unit. Other embodiments are also disclosed and claimed.
-