-
Publication No.: US20240345990A1
Publication Date: 2024-10-17
Application No.: US18626775
Filing Date: 2024-04-04
Applicant: Intel Corporation
Inventor: Lakshminarayanan Striramassarma , Prasoonkumar Surti , Varghese George , Ben Ashbaugh , Aravindh Anantaraman , Valentin Andrei , Abhishek Appu , Nicolas Galoppo Von Borries , Altug Koker , Mike Macpherson , Subramaniam Maiyuran , Nilay Mistry , Elmoustapha Ould-Ahmed-Vall , Selvakumar Panneer , Vasanth Ranganathan , Joydeep Ray , Ankur Shah , Saurabh Tangri
IPC: G06F15/78 , G06F7/544 , G06F7/575 , G06F7/58 , G06F9/30 , G06F9/38 , G06F9/50 , G06F12/02 , G06F12/06 , G06F12/0802 , G06F12/0804 , G06F12/0811 , G06F12/0862 , G06F12/0866 , G06F12/0871 , G06F12/0875 , G06F12/0882 , G06F12/0888 , G06F12/0891 , G06F12/0893 , G06F12/0895 , G06F12/0897 , G06F12/1009 , G06F12/128 , G06F15/80 , G06F17/16 , G06F17/18 , G06N3/08 , G06T1/20 , G06T1/60 , G06T15/06 , H03M7/46
CPC classification number: G06F15/7839 , G06F7/5443 , G06F7/575 , G06F7/588 , G06F9/3001 , G06F9/30014 , G06F9/30036 , G06F9/3004 , G06F9/30043 , G06F9/30047 , G06F9/30065 , G06F9/30079 , G06F9/3887 , G06F9/5011 , G06F9/5077 , G06F12/0215 , G06F12/0238 , G06F12/0246 , G06F12/0607 , G06F12/0802 , G06F12/0804 , G06F12/0811 , G06F12/0862 , G06F12/0866 , G06F12/0871 , G06F12/0875 , G06F12/0882 , G06F12/0888 , G06F12/0891 , G06F12/0893 , G06F12/0895 , G06F12/0897 , G06F12/1009 , G06F12/128 , G06F15/8046 , G06F17/16 , G06F17/18 , G06T1/20 , G06T1/60 , H03M7/46 , G06F9/3802 , G06F9/3818 , G06F9/3867 , G06F2212/1008 , G06F2212/1021 , G06F2212/1044 , G06F2212/302 , G06F2212/401 , G06F2212/455 , G06F2212/60 , G06N3/08 , G06T15/06
Abstract: Multi-tile Memory Management for Detecting Cross Tile Access, Providing Multi-Tile Inference Scaling with multicasting of data via copy operation, and Providing Page Migration are disclosed herein. In one embodiment, a graphics processor for a multi-tile architecture includes a first graphics processing unit (GPU) having a memory and a memory controller, a second graphics processing unit (GPU) having a memory and a cross-GPU fabric to communicatively couple the first and second GPUs. The memory controller is configured to determine whether frequent cross tile memory accesses occur from the first GPU to the memory of the second GPU in the multi-GPU configuration and to send a message to initiate a data transfer mechanism when frequent cross tile memory accesses occur from the first GPU to the memory of the second GPU.
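The detection step in this abstract can be illustrated with a minimal sketch. The abstract does not say how "frequent" is determined, so the per-page counter, the fixed threshold, and the names `CrossTileMonitor`, `record_access`, and `messages` are all assumptions made for illustration, not the patented mechanism.

```python
from collections import Counter


class CrossTileMonitor:
    """Hypothetical model of a memory controller that counts remote accesses."""

    def __init__(self, local_tile: int, threshold: int):
        self.local_tile = local_tile
        self.threshold = threshold      # assumed definition of "frequent"
        self.remote_hits = Counter()    # (target tile, page) -> access count
        self.messages = []              # stand-in for messages sent over the fabric

    def record_access(self, target_tile: int, page: int) -> bool:
        """Record one access; return True if a data-transfer message was sent."""
        if target_tile == self.local_tile:
            return False                # local accesses are never cross-tile
        key = (target_tile, page)
        self.remote_hits[key] += 1
        if self.remote_hits[key] == self.threshold:
            # Assumed policy: send one message when the threshold is first reached.
            self.messages.append(("initiate_transfer", target_tile, page))
            return True
        return False
```

With a threshold of 3, the third access from tile 0 to the same page on tile 1 would queue a single `initiate_transfer` message.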
-
Publication No.: US12079155B2
Publication Date: 2024-09-03
Application No.: US17428216
Filing Date: 2020-03-14
Applicant: Intel Corporation
Inventor: Joydeep Ray , Selvakumar Panneer , Saurabh Tangri , Ben Ashbaugh , Scott Janus , Abhishek Appu , Varghese George , Ravishankar Iyer , Nilesh Jain , Pattabhiraman K , Altug Koker , Mike MacPherson , Josh Mastronarde , Elmoustapha Ould-Ahmed-Vall , Jayakrishna P. S , Eric Samson
IPC: G06F15/78 , G06F7/544 , G06F7/575 , G06F7/58 , G06F9/30 , G06F9/38 , G06F9/50 , G06F12/02 , G06F12/06 , G06F12/0802 , G06F12/0804 , G06F12/0811 , G06F12/0862 , G06F12/0866 , G06F12/0871 , G06F12/0875 , G06F12/0882 , G06F12/0888 , G06F12/0891 , G06F12/0893 , G06F12/0895 , G06F12/0897 , G06F12/1009 , G06F12/128 , G06F15/80 , G06F17/16 , G06F17/18 , G06T1/20 , G06T1/60 , H03M7/46 , G06N3/08 , G06T15/06
CPC classification number: G06F15/7839 , G06F7/5443 , G06F7/575 , G06F7/588 , G06F9/3001 , G06F9/30014 , G06F9/30036 , G06F9/3004 , G06F9/30043 , G06F9/30047 , G06F9/30065 , G06F9/30079 , G06F9/3887 , G06F9/5011 , G06F9/5077 , G06F12/0215 , G06F12/0238 , G06F12/0246 , G06F12/0607 , G06F12/0802 , G06F12/0804 , G06F12/0811 , G06F12/0862 , G06F12/0866 , G06F12/0871 , G06F12/0875 , G06F12/0882 , G06F12/0888 , G06F12/0891 , G06F12/0893 , G06F12/0895 , G06F12/0897 , G06F12/1009 , G06F12/128 , G06F15/8046 , G06F17/16 , G06F17/18 , G06T1/20 , G06T1/60 , H03M7/46 , G06F9/3802 , G06F9/3818 , G06F9/3867 , G06F2212/1008 , G06F2212/1021 , G06F2212/1044 , G06F2212/302 , G06F2212/401 , G06F2212/455 , G06F2212/60 , G06N3/08 , G06T15/06
Abstract: Embodiments described herein include software, firmware, and hardware that provide techniques to enable deterministic scheduling across multiple general-purpose graphics processing units. One embodiment provides a multi-GPU architecture with uniform latency. One embodiment provides techniques to distribute memory output based on memory chip thermals. One embodiment provides techniques to enable thermally aware workload scheduling. One embodiment provides techniques to enable end-to-end contracts for workload scheduling on multiple GPUs.
-
Publication No.: US20240273029A1
Publication Date: 2024-08-15
Application No.: US18570314
Filing Date: 2023-01-09
Applicant: HONOR DEVICE CO., LTD.
Inventor: Jirun XU
IPC: G06F12/0871 , G06F12/02 , H04N23/63
CPC classification number: G06F12/0871 , G06F12/0253 , H04N23/632 , G06F2212/302 , G06F2212/455 , G06F2212/7205
Abstract: Embodiments of this application provide a photographing method and a related apparatus, which are applied to terminal technologies. The method includes: when a terminal device displays a photo preview interface, caching preview frames in a cache queue; in response to receiving a photo-taking operation in the preview interface, managing the selected image in the cache queue in an undeletable state; after algorithm processing based on the selected image is complete, deleting the selected image; and generating, by the terminal device, a photo based on the processed image. In this way, the selected image in the cache queue is managed as undeletable, so that it is not cleared while the terminal device generates the photo. The cache queue can therefore retain the selected image for a long time, and the terminal device does not need to copy and store the selected image. This reduces the large memory occupation caused by copying and saves power.
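The queue behavior described in this abstract can be sketched as a bounded queue whose eviction skips pinned frames. This is only an illustrative model: the class `PreviewCacheQueue`, its methods, the fixed capacity, and the oldest-first eviction policy are assumptions, not details taken from the application.

```python
from collections import OrderedDict


class PreviewCacheQueue:
    """Hypothetical bounded queue of preview frames with an undeletable state."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.frames = OrderedDict()   # frame id -> image data, oldest first
        self.pinned = set()           # ids currently managed as undeletable

    def push(self, frame_id, image) -> None:
        self.frames[frame_id] = image
        # Evict the oldest deletable frames until the queue fits again.
        while len(self.frames) > self.capacity:
            victim = next((f for f in self.frames if f not in self.pinned), None)
            if victim is None:
                break                 # everything is pinned; nothing to evict
            del self.frames[victim]

    def pin(self, frame_id) -> None:
        """Mark the selected image undeletable when the photo is taken."""
        self.pinned.add(frame_id)

    def release(self, frame_id) -> None:
        """Delete the selected image once algorithm processing has finished."""
        self.pinned.discard(frame_id)
        self.frames.pop(frame_id, None)
```

Because the pinned frame stays in the queue, the processing stage can read it in place rather than from a separate copy, which is the memory saving the abstract claims.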
-
Publication No.: US11709793B2
Publication Date: 2023-07-25
Application No.: US17827067
Filing Date: 2022-05-27
Applicant: Intel Corporation
Inventor: Subramaniam Maiyuran , Shubra Marwaha , Ashutosh Garg , Supratim Pal , Jorge Parra , Chandra Gurram , Varghese George , Darin Starkey , Guei-Yuan Lueh
IPC: G06T15/06 , G06F9/30 , G06F15/78 , G06F9/38 , G06F17/18 , G06F12/0802 , G06F7/544 , G06F7/575 , G06F12/02 , G06F12/0866 , G06F12/0875 , G06F12/0895 , G06F12/128 , G06F12/06 , G06F12/1009 , G06T1/20 , G06T1/60 , H03M7/46 , G06F12/0811 , G06F15/80 , G06F17/16 , G06F7/58 , G06F12/0871 , G06F12/0862 , G06F12/0897 , G06F9/50 , G06F12/0804 , G06F12/0882 , G06F12/0891 , G06F12/0893 , G06F12/0888 , G06N3/08
CPC classification number: G06F15/7839 , G06F7/5443 , G06F7/575 , G06F7/588 , G06F9/3001 , G06F9/3004 , G06F9/30014 , G06F9/30036 , G06F9/30043 , G06F9/30047 , G06F9/30065 , G06F9/30079 , G06F9/3887 , G06F9/5011 , G06F9/5077 , G06F12/0215 , G06F12/0238 , G06F12/0246 , G06F12/0607 , G06F12/0802 , G06F12/0804 , G06F12/0811 , G06F12/0862 , G06F12/0866 , G06F12/0871 , G06F12/0875 , G06F12/0882 , G06F12/0888 , G06F12/0891 , G06F12/0893 , G06F12/0895 , G06F12/0897 , G06F12/1009 , G06F12/128 , G06F15/8046 , G06F17/16 , G06F17/18 , G06T1/20 , G06T1/60 , H03M7/46 , G06F9/3802 , G06F9/3818 , G06F9/3867 , G06F2212/1008 , G06F2212/1021 , G06F2212/1044 , G06F2212/302 , G06F2212/401 , G06F2212/455 , G06F2212/60 , G06N3/08 , G06T15/06
Abstract: Described herein is a graphics processing unit (GPU) comprising a first processing cluster to perform parallel processing operations, the parallel processing operations including a ray tracing operation and a matrix multiply operation; and a second processing cluster coupled to the first processing cluster, wherein the first processing cluster includes a floating-point unit to perform floating point operations, the floating-point unit is configured to process an instruction using a bfloat16 (BF16) format with a multiplier to multiply second and third source operands while an accumulator adds a first source operand with output from the multiplier.
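The operand roles in this abstract (the multiplier takes the second and third source operands, the accumulator adds the first) can be shown with a small software emulation. The sketch below assumes round-to-nearest-even conversion to BF16 and leaves the sum at Python float precision; the abstract specifies neither, and the names `to_bf16` and `bf16_fma` are illustrative.

```python
import struct


def to_bf16(x: float) -> float:
    """Round x to the nearest bfloat16 value (ties to even), returned as a float.

    Assumes x fits in the float32 range; NaN is passed through unchanged.
    """
    if x != x:
        return x
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    bits += 0x7FFF + ((bits >> 16) & 1)   # rounding bias for the dropped 16 bits
    bits &= 0xFFFF0000                    # keep sign, exponent, 7 mantissa bits
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def bf16_fma(src0: float, src1: float, src2: float) -> float:
    """Compute src0 + src1 * src2 with all inputs quantized to BF16."""
    product = to_bf16(src1) * to_bf16(src2)   # multiplier: second and third operands
    return to_bf16(src0) + product            # accumulator: adds the first operand
```

For example, `to_bf16(3.14159)` yields `3.140625`, which shows the precision lost by keeping only 7 explicit mantissa bits.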
-
Publication No.: US20230206384A1
Publication Date: 2023-06-29
Application No.: US17563950
Filing Date: 2021-12-28
Applicant: Advanced Micro Devices, Inc.
Inventor: Priyadarshi Sharma , Anshuman Mittal , Saurabh Sharma
IPC: G06T1/60 , G06F12/0891 , G06T1/20
CPC classification number: G06T1/60 , G06F12/0891 , G06T1/20 , G06F2212/455
Abstract: Systems, apparatuses, and methods for performing dead surface invalidation are disclosed. An application sends draw call commands to a graphics processing unit (GPU) via a driver, with the draw call commands rendering to surfaces. After it is determined that a given surface will no longer be accessed by subsequent draw calls, the application sends a surface invalidation command for the given surface to a command processor of the GPU. After the command processor receives the surface invalidation command, the command processor waits for a shader engine to send a draw call completion message for a last draw call to access the given surface. Once the command processor receives the draw call completion message, the command processor sends a surface invalidation command to a cache to invalidate cache lines for the given surface to free up space in the cache for other data.
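The ordering constraint in this abstract (invalidate only after the last draw call that accesses the surface has completed) can be modeled as a small event-driven sketch. The class and method names are hypothetical, and the cache is reduced to a dictionary from surface id to cache lines.

```python
class CommandProcessor:
    """Hypothetical model of deferred surface invalidation."""

    def __init__(self, cache: dict):
        self.cache = cache       # surface id -> cached lines (stand-in for the cache)
        self.last_draw = {}      # surface id -> id of the last draw call accessing it
        self.completed = set()   # draw calls reported complete by the shader engine
        self.pending = set()     # surfaces with an invalidation waiting to be sent

    def submit_draw(self, draw_id: int, surfaces) -> None:
        for surface in surfaces:
            self.last_draw[surface] = draw_id

    def invalidate_surface(self, surface) -> None:
        """The application signals that the surface will no longer be accessed."""
        self.pending.add(surface)
        self._try_invalidate(surface)

    def draw_completed(self, draw_id: int) -> None:
        """Draw call completion message from the shader engine."""
        self.completed.add(draw_id)
        for surface in list(self.pending):
            self._try_invalidate(surface)

    def _try_invalidate(self, surface) -> None:
        last = self.last_draw.get(surface)
        if last is None or last in self.completed:
            self.cache.pop(surface, None)   # free the surface's cache lines
            self.pending.discard(surface)
```

In this model, an invalidation issued while draw call 2 is still in flight is held until `draw_completed(2)` arrives, so no in-flight draw loses its cached data.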
-
Publication No.: US20190251032A1
Publication Date: 2019-08-15
Application No.: US16266997
Filing Date: 2019-02-04
Applicant: Linear Algebra Technologies Limited
Inventor: Richard Richmond
IPC: G06F12/0884 , G06F12/0895 , G06F12/0875 , G06F12/0842 , G06F12/0804 , G06F12/0811
CPC classification number: G06F12/0884 , G06F12/0804 , G06F12/0811 , G06F12/0842 , G06F12/0875 , G06F12/0895 , G06F2212/283 , G06F2212/455
Abstract: Cache memory mapping techniques are presented. A cache may contain an index configuration register. The register may configure the locations of an upper index portion and a lower index portion of a memory address. The portions may be combined to create a combined index. The configurable split-index addressing structure may be used, among other applications, to reduce the rate of cache conflicts occurring between multiple processors decoding a video frame in parallel.
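Split-index addressing as summarized here can be illustrated with plain bit-field extraction. The field layout, the `IndexConfig` structure standing in for the index configuration register, and the choice to concatenate the upper portion above the lower portion are assumptions; the abstract only says the two portions are combined.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexConfig:
    """Hypothetical contents of the index configuration register."""
    lower_shift: int   # bit position where the lower index portion starts
    lower_bits: int    # width of the lower index portion
    upper_shift: int   # bit position where the upper index portion starts
    upper_bits: int    # width of the upper index portion


def bit_field(value: int, shift: int, width: int) -> int:
    """Extract `width` bits of `value` starting at bit `shift`."""
    return (value >> shift) & ((1 << width) - 1)


def combined_index(addr: int, cfg: IndexConfig) -> int:
    """Concatenate the upper and lower index portions into one cache index."""
    lower = bit_field(addr, cfg.lower_shift, cfg.lower_bits)
    upper = bit_field(addr, cfg.upper_shift, cfg.upper_bits)
    return (upper << cfg.lower_bits) | lower
```

With `IndexConfig(lower_shift=6, lower_bits=4, upper_shift=16, upper_bits=2)`, address `0x30040` has lower portion 1 and upper portion 3, giving combined index 49. Addresses that differ only in the upper portion then land in different sets, which is how such a scheme could separate processors working on different regions of a frame.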
-
Publication No.: US20180285117A1
Publication Date: 2018-10-04
Application No.: US15477020
Filing Date: 2017-04-01
Applicant: Intel Corporation
Inventor: Prasoonkumar Surti , Abhishek R. Appu , Joydeep Ray , Subramaniam M. Maiyuran , Altug Koker
IPC: G06F9/38 , G09G5/393 , G06F9/50 , G06T1/20 , G06F9/30 , G06F12/0875 , H04N19/436 , H04N19/423
CPC classification number: G06F9/3851 , G06F8/453 , G06F9/3004 , G06F9/5027 , G06F12/0607 , G06F12/0875 , G06F2212/1016 , G06F2212/302 , G06F2212/455 , G06T1/20 , G06T1/60 , G06T15/00 , G09G5/393 , G09G2360/122 , H04N19/423 , H04N19/436
Abstract: An apparatus to facilitate memory tiling is disclosed. The apparatus includes a memory, one or more execution units (EUs) to execute a plurality of processing threads via access to the memory and tiling logic to apply a tiling pattern to memory addresses for data stored in the memory.
-
Publication No.: US10055878B2
Publication Date: 2018-08-21
Application No.: US15252436
Filing Date: 2016-08-31
Applicant: Siemens Healthcare GmbH
Inventor: Klaus Engel , Jana Martschinke
IPC: G06T15/06 , G06T15/08 , G06F12/0897 , G06F12/0875
CPC classification number: G06T15/06 , G06F12/0875 , G06F12/0897 , G06F2212/455 , G06T15/08 , G06T2207/10081 , G06T2207/10088 , G06T2207/10104 , G06T2207/10108 , G06T2207/10132 , G06T2207/20024
Abstract: A method of visualizing a three-dimensional object from a data volume is disclosed. In an embodiment, the method includes computing an irradiance cache for the data volume; and applying the irradiance cache during rendering of a three-dimensional image from the data volume. In an embodiment, entries of the irradiance cache are organized in a uniform grid.
-
Publication No.: US10019349B2
Publication Date: 2018-07-10
Application No.: US14715683
Filing Date: 2015-05-19
Applicant: Samsung Electronics Co., Ltd.
Inventor: Seong Hoon Jeong , Woong Seo , Sang Heon Lee , Sun Min Kwon , Ho Young Kim , Hee Jun Shim
IPC: G06F12/00 , G06F12/02 , G06F12/06 , G06F12/0846
CPC classification number: G06F12/0207 , G06F12/0607 , G06F12/0846 , G06F2212/1016 , G06F2212/1056 , G06F2212/455
Abstract: A cache memory and a method of managing the same are provided. The method of managing a cache memory includes determining whether a number of bits of a data bandwidth stored in a bank is an integer multiple of a number of bits of unit data in data to be stored, storing first unit data, among the data to be stored, in a first region of a first address in the bank in response to the number of bits of the data bandwidth not being the integer multiple of the number of bits of the unit data, and storing part of second unit data, among the data to be stored, in a second region of the first address.
-
Publication No.: US10007543B2
Publication Date: 2018-06-26
Application No.: US14309794
Filing Date: 2014-06-19
Applicant: VMware, Inc.
Inventor: Rishi Bidarkar , Hari Sivaraman , Banit Agrawal
IPC: G06F9/455 , G06F12/0802
CPC classification number: G06F9/45558 , G06F12/0802 , G06F12/0875 , G06F2009/45583 , G06F2212/401 , G06F2212/455 , G06T2200/28
Abstract: Exemplary methods, apparatuses, and systems receive a first instruction set from a first virtual machine (VM), the first instruction set including a request to perform an operation on an input. A first identifier is generated based upon the operation and the input. The first identifier is mapped to a stored copy of the input, the operation, and an output resulting from a processor performing the operation. In response to receiving a second instruction set from a second VM, a second identifier is generated based upon the input and operation received within the second instruction set. In response to determining that the second identifier matches the stored first identifier, it is further determined that the input and operation of the first instruction set matches the input and operation of the second instruction set. A copy of the stored output is returned to the second VM.
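The flow in this abstract resembles memoization keyed by an identifier derived from the operation and its input. The sketch below assumes SHA-256 as the identifier function and models operations as named Python callables on bytes; `ResultCache` and its members are hypothetical names, and the abstract does not specify the hash.

```python
import hashlib


class ResultCache:
    """Hypothetical cache of operation results shared across virtual machines."""

    def __init__(self):
        self._entries = {}    # identifier -> (operation name, input, output)
        self.executions = 0   # how many times an operation actually ran

    @staticmethod
    def identifier(operation_name: str, data: bytes) -> str:
        digest = hashlib.sha256()
        digest.update(operation_name.encode("utf-8"))
        digest.update(b"\x00")   # separator between operation and input
        digest.update(data)
        return digest.hexdigest()

    def execute(self, operation_name: str, operation, data: bytes) -> bytes:
        key = self.identifier(operation_name, data)
        entry = self._entries.get(key)
        # A matching identifier is confirmed against the stored operation and
        # input before reuse, which guards against identifier collisions.
        if entry is not None and entry[0] == operation_name and entry[1] == data:
            return bytes(entry[2])   # copy of the stored output
        output = operation(data)
        self.executions += 1
        self._entries[key] = (operation_name, data, output)
        return bytes(output)
```

If two VMs submit the same operation on the same input, the second call returns the stored output and `executions` stays at 1.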
-