-
公开(公告)号:US12112397B2
公开(公告)日:2024-10-08
申请号:US18334733
申请日:2023-06-14
申请人: Intel Corporation
发明人: Eriko Nurvitadhi , Balaji Vembu , Nicolas C. Galoppo Von Borries , Rajkishore Barik , Tsung-Han Lin , Kamal Sinha , Nadathur Rajagopalan Satish , Jeremy Bottleson , Farshad Akhbari , Altug Koker , Narayan Srinivasa , Dukhwan Kim , Sara S. Baghsorkhi , Justin E. Gottschlich , Feng Chen , Elmoustapha Ould-Ahmed-Vall , Kevin Nealis , Xiaoming Chen , Anbang Yao
IPC分类号: G06T1/20 , G06F9/30 , G06F9/38 , G06N3/04 , G06N3/044 , G06N3/045 , G06N3/063 , G06N3/08 , G06N3/084
CPC分类号: G06T1/20 , G06F9/3001 , G06F9/3017 , G06F9/3851 , G06F9/3887 , G06F9/3895 , G06N3/04 , G06N3/044 , G06N3/045 , G06N3/063 , G06N3/08 , G06N3/084
摘要: One embodiment provides a parallel processor comprising a hardware scheduler to schedule pipeline commands for compute operations to one or more of multiple types of compute units, a plurality of processing resources including a first sparse compute unit configured for input at a first level of sparsity and hybrid memory circuitry including a memory controller, a memory interface, and a second sparse compute unit configured for input at a second level of sparsity that is greater than the first level of sparsity.
-
公开(公告)号:US20240127408A1
公开(公告)日:2024-04-18
申请号:US18514252
申请日:2023-11-20
申请人: Intel Corporation
发明人: Anbang Yao , Ming Lu , Yikai Wang , Xiaoming Chen , Junjie Huang , Tao Lv , Yuanke Luo , Yi Yang , Feng Chen , Zhiming Wang , Zhiqiao Zheng , Shandong Wang
CPC分类号: G06T5/002 , G06N3/04 , G06T2207/20081 , G06T2207/20084
摘要: Embodiments are generally directed to an adaptive deformable kernel prediction network for image de-noising. An embodiment of a method for de-noising an image by a convolutional neural network implemented on a compute engine, the image including a plurality of pixels, the method comprising: for each of the plurality of pixels of the image, generating a convolutional kernel having a plurality of kernel values for the pixel; generating a plurality of offsets for the pixel respectively corresponding to the plurality of kernel values, each of the plurality of offsets to indicate a deviation from a pixel position of the pixel; determining a plurality of deviated pixel positions based on the pixel position of the pixel and the plurality of offsets; and filtering the pixel with the convolutional kernel and pixel values of the plurality of deviated pixel positions to obtain a de-noised pixel.
-
公开(公告)号:US11727527B2
公开(公告)日:2023-08-15
申请号:US17541413
申请日:2021-12-03
申请人: Intel Corporation
发明人: Eriko Nurvitadhi , Balaji Vembu , Nicolas C. Galoppo Von Borries , Rajkishore Barik , Tsung-Han Lin , Kamal Sinha , Nadathur Rajagopalan Satish , Jeremy Bottleson , Farshad Akhbari , Altug Koker , Narayan Srinivasa , Dukhwan Kim , Sara S. Baghsorkhi , Justin E. Gottschlich , Feng Chen , Elmoustapha Ould-Ahmed-Vall , Kevin Nealis , Xiaoming Chen , Anbang Yao
IPC分类号: G06T1/20 , G06N3/063 , G06F9/38 , G06F9/30 , G06N3/084 , G06N3/044 , G06N3/045 , G06N3/04 , G06N3/08
CPC分类号: G06T1/20 , G06F9/3001 , G06F9/3017 , G06F9/3851 , G06F9/3887 , G06F9/3895 , G06N3/04 , G06N3/044 , G06N3/045 , G06N3/063 , G06N3/08 , G06N3/084
摘要: One embodiment provides for a compute apparatus to perform machine learning operations, the compute apparatus comprising a decode unit to decode a single instruction into a decoded instruction, the decoded instruction to cause the compute apparatus to perform a complex compute operation.
-
公开(公告)号:US11693658B2
公开(公告)日:2023-07-04
申请号:US17443376
申请日:2021-07-26
申请人: Intel Corporation
发明人: Kevin Nealis , Anbang Yao , Xiaoming Chen , Elmoustapha Ould-Ahmed-Vall , Sara S. Baghsorkhi , Eriko Nurvitadhi , Balaji Vembu , Nicolas C. Galoppo Von Borries , Rajkishore Barik , Tsung-Han Lin , Kamal Sinha
CPC分类号: G06F9/3001 , G06F9/3851 , G06F9/3887 , G06F9/3893 , G06N3/044 , G06N3/045 , G06N3/063 , G06N3/084 , G06T1/20 , G06F2207/4824
摘要: One embodiment provides for a compute apparatus comprising a decode unit to decode a single instruction into a decoded instruction that specifies multiple operands including a multi-bit input value and a ternary weight associated with a neural network and an arithmetic logic unit including a multiplier, an adder, and an accumulator register. To execute the decoded instruction, the multiplier is to perform a multiplication operation on the multi-bit input based on the ternary weight to generate an intermediate product and the adder is to add the intermediate product to a value stored in the accumulator register and update the value stored in the accumulator register.
-
公开(公告)号:US20230061670A1
公开(公告)日:2023-03-02
申请号:US17978573
申请日:2022-11-01
申请人: Intel Corporation
发明人: Elmoustapha Ould-Ahmed-Vall , Sara S. Baghsorkhi , Anbang Yao , Kevin Nealis , Xiaoming Chen , Altug Koker , Abhishek R. Appu , John C. Weast , Mike B. Macpherson , Dukhwan Kim , Linda L. Hurd , Ben J. Ashbaugh , Barath Lakshmanan , Liwei Ma , Joydeep Ray , Ping T. Tang , Michael S. Strickland
IPC分类号: G06T1/20 , G06N3/08 , G06F9/38 , G06N3/063 , G06F9/30 , G06N20/00 , G06N3/04 , G06F9/50 , G06F7/483
摘要: One embodiment provides an apparatus comprising a memory stack including multiple memory dies and a parallel processor including a plurality of multiprocessors. Each multiprocessor has a single instruction, multiple thread (SIMT) architecture, the parallel processor coupled to the memory stack via one or more memory interfaces. At least one multiprocessor comprises a multiply-accumulate circuit to perform multiply-accumulate operations on matrix data in a stage of a neural network implementation to produce a result matrix comprising a plurality of matrix data elements at a first precision, precision tracking logic to evaluate metrics associated with the matrix data elements and indicate if an optimization is to be performed for representing data at a second stage of the neural network implementation, and a numerical transform unit to dynamically perform a numerical transform operation on the matrix data elements based on the indication to produce transformed matrix data elements at a second precision.
-
公开(公告)号:US20230039729A1
公开(公告)日:2023-02-09
申请号:US17963539
申请日:2022-10-11
申请人: Intel Corporation
发明人: Abhishek R. Appu , Altug Koker , Linda L. Hurd , Dukhwan Kim , Mike B. MacPherson , John C. Weast , Justin E. Gottschlich , Jingyi Jin , Barath Lakshmanan , Chandrasekaran Sakthivel , Michael S. Strickland , Joydeep Ray , Kamal Sinha , Prasoonkumar Surti , Balaji Vembu , Ping T. Tang , Anbang Yao , Tatiana Shpeisman , Xiaoming Chen
摘要: Methods and apparatus relating to autonomous vehicle neural network optimization techniques are described. In an embodiment, the difference between a first training dataset to be used for a neural network and a second training dataset to be used for the neural network is detected. The second training dataset is authenticated in response to the detection of the difference. The neural network is used to assist in an autonomous vehicle/driving. Other embodiments are also disclosed and claimed.
-
公开(公告)号:US20230027203A1
公开(公告)日:2023-01-26
申请号:US17826674
申请日:2022-05-27
申请人: Intel Corporation
发明人: Altug Koker , Farshad Akhbari , Feng Chen , Dukhwan Kim , Narayan Srinivasa , Nadathur Rajagopalan Satish , Liwei Ma , Jeremy Bottleson , Eriko Nurvitadhi , Joydeep Ray , Ping T. Tang , Michael S. Strickland , Xiaoming Chen , Tatiana Shpeisman , Abhishek R. Appu
摘要: An integrated circuit (IC) package apparatus is disclosed. The IC package includes one or more processing units and a bridge, mounted below the one or more processing unit, including one or more arithmetic logic units (ALUs) to perform atomic operations.
-
公开(公告)号:US20230017304A1
公开(公告)日:2023-01-19
申请号:US17874876
申请日:2022-07-27
申请人: Intel Corporation
发明人: Rajkishore Barik , Brian T. Lewis , Murali Sundaresan , Jeffrey Jackson , Feng Chen , Xiaoming Chen , Mike Macpherson
摘要: A mechanism is described for facilitating smart distribution of resources for deep learning autonomous machines. A method of embodiments, as described herein, includes detecting one or more sets of data from one or more sources over one or more networks, and introducing a library to a neural network application to determine optimal point at which to apply frequency scaling without degrading performance of the neural network application at a computing device.
-
公开(公告)号:US11488005B2
公开(公告)日:2022-11-01
申请号:US16518828
申请日:2019-07-22
申请人: Intel Corporation
发明人: Brian T. Lewis , Feng Chen , Jeffrey R. Jackson , Justin E. Gottschlich , Rajkishore Barik , Xiaoming Chen , Prasoonkumar Surti , Mike B. Macpherson , Murali Sundaresan
摘要: A mechanism is described for facilitating smart collection of data and smart management of autonomous machines. A method of embodiments, as described herein, includes detecting one or more sets of data from one or more sources over one or more networks, and combining a first computation directed to be performed locally at a local computing device with a second computation directed to be performed remotely at a remote computing device in communication with the local computing device over the one or more networks, where the first computation consumes low power, wherein the second computation consumes high power.
-
公开(公告)号:US11468541B2
公开(公告)日:2022-10-11
申请号:US17720804
申请日:2022-04-14
申请人: Intel Corporation
发明人: Elmoustapha Ould-Ahmed-Vall , Sara S. Baghsorkhi , Anhang Yao , Kevin Nealis , Xiaoming Chen , Altug Koker , Abhishek R. Appu , John C. Weast , Mike B. Macpherson , Dukhwan Kim , Linda L. Hurd , Ben J. Ashbaugh , Barath Lakshmanan , Liwei Ma , Joydeep Ray , Ping T. Tang , Michael S. Strickland
IPC分类号: G06T1/20 , G06F7/483 , G06N20/00 , G06F3/14 , G06T1/60 , G06N3/08 , G06F9/30 , G06N3/04 , G06N3/063 , G06F9/50 , G06F9/38 , G06T15/00
摘要: Embodiments described herein provide a graphics processor that can perform a variety of mixed and multiple precision instructions and operations. One embodiment provides a streaming multiprocessor that can concurrently execute multiple thread groups, wherein the streaming multiprocessor includes a single instruction, multiple thread (SIMT) architecture and the streaming multiprocessor is to execute multiple threads for each of multiple instructions. The streaming multiprocessor can perform concurrent integer and floating-point operations and includes a mixed precision core to perform operations at multiple or mixed precisions and dynamic ranges.
-
-
-
-
-
-
-
-
-