-
Publication No.: US20250045582A1
Publication Date: 2025-02-06
Application No.: US18804720
Filing Date: 2024-08-14
Applicant: Intel Corporation
Inventor: Anbang Yao , Yiwen Guo , Yan Li , Yurong Chen
Abstract: Techniques related to compressing a pre-trained dense deep neural network to a sparsely connected deep neural network for efficient implementation are discussed. Such techniques may include iteratively pruning and splicing available connections between adjacent layers of the deep neural network and updating weights corresponding to both currently disconnected and currently connected connections between the adjacent layers.
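The iterative prune-and-splice idea can be illustrated with a minimal NumPy sketch. The thresholds `t_lo`/`t_hi` and the toy update loop are assumptions for illustration, not the patent's actual scheme; the key point it demonstrates is that weights of both connected and disconnected connections keep receiving updates, so a pruned connection can be spliced back if its weight regrows.

```python
import numpy as np

def prune_and_splice(weights, masks, t_lo, t_hi):
    """One pruning/splicing step: connections whose magnitude falls below
    t_lo are disconnected (mask=0); previously pruned connections whose
    weights have regrown above t_hi are spliced back (mask=1)."""
    new_masks = masks.copy()
    new_masks[np.abs(weights) < t_lo] = 0.0   # prune weak connections
    new_masks[np.abs(weights) > t_hi] = 1.0   # splice strong ones back
    return new_masks

# Toy loop: all weights, masked or not, are updated each step; the mask
# only gates which connections are used at inference time.
rng = np.random.default_rng(0)
W = rng.normal(size=(4, 4))
M = np.ones_like(W)
for _ in range(10):
    grad = rng.normal(scale=0.1, size=W.shape)  # placeholder gradient
    W -= 0.05 * grad
    M = prune_and_splice(W, M, t_lo=0.2, t_hi=0.25)

effective_W = W * M   # sparse weights actually used between adjacent layers
```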
-
Publication No.: US12112397B2
Publication Date: 2024-10-08
Application No.: US18334733
Filing Date: 2023-06-14
Applicant: Intel Corporation
Inventor: Eriko Nurvitadhi , Balaji Vembu , Nicolas C. Galoppo Von Borries , Rajkishore Barik , Tsung-Han Lin , Kamal Sinha , Nadathur Rajagopalan Satish , Jeremy Bottleson , Farshad Akhbari , Altug Koker , Narayan Srinivasa , Dukhwan Kim , Sara S. Baghsorkhi , Justin E. Gottschlich , Feng Chen , Elmoustapha Ould-Ahmed-Vall , Kevin Nealis , Xiaoming Chen , Anbang Yao
IPC: G06T1/20 , G06F9/30 , G06F9/38 , G06N3/04 , G06N3/044 , G06N3/045 , G06N3/063 , G06N3/08 , G06N3/084
CPC classification number: G06T1/20 , G06F9/3001 , G06F9/3017 , G06F9/3851 , G06F9/3887 , G06F9/3895 , G06N3/04 , G06N3/044 , G06N3/045 , G06N3/063 , G06N3/08 , G06N3/084
Abstract: One embodiment provides a parallel processor comprising a hardware scheduler to schedule pipeline commands for compute operations to one or more of multiple types of compute units, a plurality of processing resources including a first sparse compute unit configured for input at a first level of sparsity and hybrid memory circuitry including a memory controller, a memory interface, and a second sparse compute unit configured for input at a second level of sparsity that is greater than the first level of sparsity.
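A scheduler that routes work between compute units configured for different sparsity levels might follow a policy like the sketch below. The unit names, the zero-fraction sparsity measure, and the threshold are illustrative assumptions, not the hardware scheduler's actual logic.

```python
import numpy as np

def sparsity_level(x):
    """Fraction of zero elements in the input (one possible measure)."""
    return float((x == 0).mean())

def dispatch(x, threshold=0.5):
    """Route the input to the sparse compute unit configured for the
    higher sparsity level when the input is sparse enough."""
    return "sparse_unit_2" if sparsity_level(x) >= threshold else "sparse_unit_1"

dense_in = np.array([1.0, 2.0, 0.0, 3.0])    # 25% zeros
sparse_in = np.array([0.0, 0.0, 0.0, 4.0])   # 75% zeros
```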
-
Publication No.: US12112256B2
Publication Date: 2024-10-08
Application No.: US16982441
Filing Date: 2018-07-26
Applicant: Intel Corporation , Anbang Yao , Aojun Zhou , Kuan Wang , Hao Zhao , Yurong Chen
Inventor: Anbang Yao , Aojun Zhou , Kuan Wang , Hao Zhao , Yurong Chen
IPC: G06N3/063 , G06F18/21 , G06F18/214 , G06N3/047 , G06N3/084
CPC classification number: G06N3/063 , G06F18/2148 , G06F18/217 , G06N3/047 , G06N3/084
Abstract: Methods, apparatus, systems and articles of manufacture for loss-error-aware quantization of a low-bit neural network are disclosed. An example apparatus includes a network weight partitioner to partition unquantized network weights of a first network model into a first group to be quantized and a second group to be retrained. The example apparatus includes a loss calculator to process network weights to calculate a first loss. The example apparatus includes a weight quantizer to quantize the first group of network weights to generate low-bit second network weights. In the example apparatus, the loss calculator is to determine a difference between the first loss and a second loss. The example apparatus includes a weight updater to update the second group of network weights based on the difference. The example apparatus includes a network model deployer to deploy a low-bit network model including the low-bit second network weights.
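The partition/quantize/retrain flow in the abstract can be sketched on a toy least-squares model. The uniform symmetric quantizer, the magnitude-based partition, and the learning rate are illustrative assumptions; the patent's loss-error-aware scheme may differ in each of these choices.

```python
import numpy as np

def quantize_lowbit(w, n_bits=2):
    """Uniform symmetric low-bit quantization (illustrative)."""
    levels = max(2 ** (n_bits - 1) - 1, 1)
    scale = np.max(np.abs(w)) / levels
    if scale == 0:
        return w.copy()
    return np.round(w / scale) * scale

def partition_weights(w, frac_quantize=0.5):
    """Split weights into a group to quantize now (largest magnitudes)
    and a group kept full precision for retraining."""
    k = int(len(w) * frac_quantize)
    order = np.argsort(-np.abs(w))
    return order[:k], order[k:]

rng = np.random.default_rng(1)
X = rng.normal(size=(32, 8))
w_true = rng.normal(size=8)
y = X @ w_true
mse = lambda w: float(np.mean((X @ w - y) ** 2))

w = w_true + 0.05 * rng.normal(size=8)     # "pretrained" weights
q_idx, r_idx = partition_weights(w)

w_q = w.copy()
w_q[q_idx] = quantize_lowbit(w[q_idx])     # quantize the first group
loss_after_quant = mse(w_q)

# Update only the second (full-precision) group, driven by the loss
# difference the quantization introduced.
for _ in range(200):
    grad = 2.0 * X.T @ (X @ w_q - y) / len(y)
    w_q[r_idx] -= 0.05 * grad[r_idx]
loss_after_retrain = mse(w_q)
```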
-
Publication No.: US20240296668A1
Publication Date: 2024-09-05
Application No.: US18572510
Filing Date: 2021-09-10
Applicant: Intel Corporation
Inventor: Dongqi Cai , Yurong Chen , Anbang Yao
CPC classification number: G06V10/82 , G06V10/955
Abstract: Technology to conduct image sequence/video analysis can include a processor, and a memory coupled to the processor, the memory storing a neural network, the neural network comprising a plurality of convolution layers, and a plurality of normalization layers arranged as a relay structure, wherein each normalization layer is coupled to and following a respective one of the plurality of convolution layers. The plurality of normalization layers can be arranged as a relay structure where a normalization layer for a layer (k) is coupled to and following a normalization layer for a preceding layer (k−1). The normalization layer for the layer (k) is coupled to the normalization layer for the preceding layer (k−1) via a hidden state signal and a cell state signal, each signal generated by the normalization layer for the preceding layer (k−1). Each normalization layer (k) can include a meta-gating unit (MGU) structure.
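A minimal sketch of the relay structure: each normalization layer receives the hidden and cell state produced by the normalization layer of the preceding layer (k−1) and emits its own for layer (k+1). The gate equations below are an assumed MGU-like form for illustration only; the patent's actual MGU structure is not specified here.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class RelayNorm:
    """Illustrative normalization layer that mixes its own statistics
    with hidden/cell state relayed from the preceding layer."""
    def __init__(self, dim, rng):
        self.Wf = rng.normal(scale=0.1, size=(dim, dim))

    def __call__(self, x, h_prev, c_prev):
        x_norm = (x - x.mean()) / np.sqrt(x.var() + 1e-5)
        f = sigmoid(x_norm @ self.Wf + h_prev)  # forget-style gate (MGU-like)
        c = f * c_prev + (1.0 - f) * x_norm     # cell state, relayed onward
        h = np.tanh(c)                          # hidden state, relayed onward
        return h, c

rng = np.random.default_rng(0)
dim = 8
layers = [RelayNorm(dim, rng) for _ in range(3)]  # layers k-1, k, k+1
x = rng.normal(size=dim)
h, c = np.zeros(dim), np.zeros(dim)
for layer in layers:
    # In the full network each normalization layer follows its own
    # convolution layer; we reuse x here for brevity.
    h, c = layer(x, h, c)
```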
-
Publication No.: US20240127408A1
Publication Date: 2024-04-18
Application No.: US18514252
Filing Date: 2023-11-20
Applicant: Intel Corporation
Inventor: Anbang Yao , Ming Lu , Yikai Wang , Xiaoming Chen , Junjie Huang , Tao Lv , Yuanke Luo , Yi Yang , Feng Chen , Zhiming Wang , Zhiqiao Zheng , Shandong Wang
CPC classification number: G06T5/002 , G06N3/04 , G06T2207/20081 , G06T2207/20084
Abstract: Embodiments are generally directed to an adaptive deformable kernel prediction network for image de-noising. An embodiment of a method for de-noising an image by a convolutional neural network implemented on a compute engine, the image including a plurality of pixels, the method comprising: for each of the plurality of pixels of the image, generating a convolutional kernel having a plurality of kernel values for the pixel; generating a plurality of offsets for the pixel respectively corresponding to the plurality of kernel values, each of the plurality of offsets to indicate a deviation from a pixel position of the pixel; determining a plurality of deviated pixel positions based on the pixel position of the pixel and the plurality of offsets; and filtering the pixel with the convolutional kernel and pixel values of the plurality of deviated pixel positions to obtain a de-noised pixel.
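The per-pixel filtering step can be sketched as below. In the claimed method a network predicts the kernel values and offsets for each pixel; here they are passed in as inputs, and the nearest-neighbor sampling and kernel-sum normalization are assumptions made for brevity.

```python
import numpy as np

def filter_pixel(image, y, x, kernel, offsets):
    """Filter one pixel with a kernel whose taps read from deviated
    positions (pixel position + predicted offset) instead of a fixed
    neighborhood."""
    H, W = image.shape
    acc = 0.0
    for k, (dy, dx) in zip(kernel, offsets):
        yy = int(np.clip(round(y + dy), 0, H - 1))  # nearest-neighbor sample
        xx = int(np.clip(round(x + dx), 0, W - 1))
        acc += k * image[yy, xx]
    return acc / (np.sum(kernel) + 1e-8)            # normalize by kernel mass

img = np.ones((5, 5))                               # flat image: no noise
kernel = np.array([0.25, 0.25, 0.25, 0.25])         # illustrative kernel values
offsets = [(-1.2, 0.3), (0.0, -0.7), (0.9, 0.0), (0.1, 1.1)]
out = filter_pixel(img, 2, 2, kernel, offsets)
```

On a constant image the deviated taps all read the same value, so the de-noised pixel equals the input pixel, which is the sanity check below.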
-
Publication No.: US11907843B2
Publication Date: 2024-02-20
Application No.: US16305626
Filing Date: 2016-06-30
Applicant: Intel Corporation
Inventor: Anbang Yao , Yiwen Guo , Yurong Chen
IPC: G06N3/082 , G06F18/241 , G06V10/764 , G06V10/82
CPC classification number: G06N3/082 , G06F18/241 , G06V10/764 , G06V10/82
Abstract: Systems, apparatuses and methods may provide for conducting an importance measurement of a plurality of parameters in a trained neural network and setting a subset of the plurality of parameters to zero based on the importance measurement. Additionally, the pruned neural network may be re-trained. In one example, conducting the importance measurement includes comparing two or more parameter values that contain covariance matrix information.
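The prune-by-importance step can be sketched as follows. The particular score, magnitude weighted by the covariance-matrix diagonal, is a stand-in assumption; the abstract says only that the measurement compares parameter values containing covariance matrix information.

```python
import numpy as np

def importance_scores(weights, cov):
    """Illustrative importance measure using covariance-matrix
    information (diagonal entries) alongside weight magnitude."""
    return np.abs(weights) * np.sqrt(np.diag(cov))

def prune(weights, cov, keep_frac=0.5):
    """Set the least important subset of parameters to zero."""
    scores = importance_scores(weights, cov)
    thresh = np.quantile(scores, 1.0 - keep_frac)
    pruned = weights.copy()
    pruned[scores < thresh] = 0.0
    return pruned

rng = np.random.default_rng(0)
w = rng.normal(size=100)
A = rng.normal(size=(100, 100))
cov = A @ A.T / 100.0        # stand-in positive semi-definite covariance
pruned = prune(w, cov, keep_frac=0.5)
# The pruned network would then be re-trained, per the abstract.
```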
-
Publication No.: US11887001B2
Publication Date: 2024-01-30
Application No.: US16328182
Filing Date: 2016-09-26
Applicant: Intel Corporation
Inventor: Anbang Yao , Yiwen Guo , Lin Xu , Yan Lin , Yurong Chen
CPC classification number: G06N3/082 , G06F17/16 , G06N3/02 , G06N3/04 , G06N3/045 , G06N3/084 , G06N3/044
Abstract: An apparatus and method are described for reducing the parameter density of a deep neural network (DNN). The apparatus includes a layer-wise pruning module to prune a specified set of parameters from each layer of a reference dense neural network model to generate a second neural network model having a relatively higher sparsity rate than the reference neural network model, and a retraining module to retrain the second neural network model in accordance with a set of training data to generate a retrained second neural network model. The retraining module outputs the retrained second neural network model as a final neural network model if a target sparsity rate has been reached, or provides it back to the layer-wise pruning module for additional pruning if the target sparsity rate has not been reached.
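The prune/retrain loop driven by a target sparsity rate can be sketched as control flow. The per-layer magnitude pruning fraction and the identity stand-in for retraining are assumptions; a real retraining step would update the surviving weights on training data.

```python
import numpy as np

def sparsity(model):
    """Fraction of zero weights across all layers."""
    total = sum(w.size for w in model)
    zeros = sum(int((w == 0).sum()) for w in model)
    return zeros / total

def prune_step(w, frac=0.3):
    """Zero the smallest-magnitude `frac` of currently nonzero weights
    in one layer (illustrative layer-wise pruning rule)."""
    out = w.copy()
    nz = np.flatnonzero(out)
    k = max(1, int(len(nz) * frac))
    drop = nz[np.argsort(np.abs(out[nz]))[:k]]
    out[drop] = 0.0
    return out

def compress(model, retrain, target_sparsity):
    """Alternate layer-wise pruning and retraining until the target
    sparsity rate is reached."""
    while sparsity(model) < target_sparsity:
        model = [prune_step(w) for w in model]  # layer-wise pruning module
        model = retrain(model)                  # retraining module
    return model

rng = np.random.default_rng(0)
model = [rng.normal(size=64), rng.normal(size=128)]  # two toy layers
identity_retrain = lambda m: m                       # stand-in for retraining
final = compress(model, identity_retrain, target_sparsity=0.7)
```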
-
Publication No.: US11790631B2
Publication Date: 2023-10-17
Application No.: US17408094
Filing Date: 2021-08-20
Applicant: Intel Corporation
Inventor: Anbang Yao , Yun Ren , Hao Zhao , Tao Kong , Yurong Chen
IPC: G06V10/00 , G06V10/44 , G06N3/04 , G06N3/08 , G06V30/24 , G06F18/243 , G06V30/19 , G06V10/82 , G06V20/70 , G06V20/10
CPC classification number: G06V10/454 , G06F18/24317 , G06N3/04 , G06N3/08 , G06V10/82 , G06V20/10 , G06V20/70 , G06V30/19173 , G06V30/2504
Abstract: An example apparatus for mining multi-scale hard examples includes a convolutional neural network to receive a mini-batch of sample candidates and generate basic feature maps. The apparatus also includes a feature extractor and combiner to generate concatenated feature maps based on the basic feature maps and extract the concatenated feature maps for each of a plurality of received candidate boxes. The apparatus further includes a sample scorer and miner to score the candidate samples with multi-task loss scores and select candidate samples with multi-task loss scores exceeding a threshold score.
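The scoring-and-selection step of the sample scorer and miner reduces to the sketch below. Summing a classification loss and a box-regression loss as the multi-task score is an assumption for illustration; the abstract does not specify how the task losses are combined.

```python
import numpy as np

def mine_hard_examples(cls_losses, box_losses, threshold):
    """Score each candidate sample with a multi-task loss and keep the
    indices of those whose score exceeds the threshold."""
    scores = np.asarray(cls_losses) + np.asarray(box_losses)
    return np.flatnonzero(scores > threshold), scores

# Three candidate boxes from a mini-batch; only the second is "hard".
idx, scores = mine_hard_examples(
    cls_losses=[0.1, 2.0, 0.5],
    box_losses=[0.2, 1.0, 0.1],
    threshold=1.0,
)
```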
-
Publication No.: US11727527B2
Publication Date: 2023-08-15
Application No.: US17541413
Filing Date: 2021-12-03
Applicant: Intel Corporation
Inventor: Eriko Nurvitadhi , Balaji Vembu , Nicolas C. Galoppo Von Borries , Rajkishore Barik , Tsung-Han Lin , Kamal Sinha , Nadathur Rajagopalan Satish , Jeremy Bottleson , Farshad Akhbari , Altug Koker , Narayan Srinivasa , Dukhwan Kim , Sara S. Baghsorkhi , Justin E. Gottschlich , Feng Chen , Elmoustapha Ould-Ahmed-Vall , Kevin Nealis , Xiaoming Chen , Anbang Yao
IPC: G06T1/20 , G06N3/063 , G06F9/38 , G06F9/30 , G06N3/084 , G06N3/044 , G06N3/045 , G06N3/04 , G06N3/08
CPC classification number: G06T1/20 , G06F9/3001 , G06F9/3017 , G06F9/3851 , G06F9/3887 , G06F9/3895 , G06N3/04 , G06N3/044 , G06N3/045 , G06N3/063 , G06N3/08 , G06N3/084
Abstract: One embodiment provides for a compute apparatus to perform machine learning operations, the compute apparatus comprising a decode unit to decode a single instruction into a decoded instruction, the decoded instruction to cause the compute apparatus to perform a complex compute operation.
-
Publication No.: US11693658B2
Publication Date: 2023-07-04
Application No.: US17443376
Filing Date: 2021-07-26
Applicant: Intel Corporation
Inventor: Kevin Nealis , Anbang Yao , Xiaoming Chen , Elmoustapha Ould-Ahmed-Vall , Sara S. Baghsorkhi , Eriko Nurvitadhi , Balaji Vembu , Nicolas C. Galoppo Von Borries , Rajkishore Barik , Tsung-Han Lin , Kamal Sinha
CPC classification number: G06F9/3001 , G06F9/3851 , G06F9/3887 , G06F9/3893 , G06N3/044 , G06N3/045 , G06N3/063 , G06N3/084 , G06T1/20 , G06F2207/4824
Abstract: One embodiment provides for a compute apparatus comprising a decode unit to decode a single instruction into a decoded instruction that specifies multiple operands including a multi-bit input value and a ternary weight associated with a neural network and an arithmetic logic unit including a multiplier, an adder, and an accumulator register. To execute the decoded instruction, the multiplier is to perform a multiplication operation on the multi-bit input based on the ternary weight to generate an intermediate product and the adder is to add the intermediate product to a value stored in the accumulator register and update the value stored in the accumulator register.
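Functionally, the ternary-weight multiply-accumulate the abstract describes reduces to pass, zero, or negate followed by an add into the accumulator, which is why no full multiplier array is needed per weight. A software model of that datapath (not the hardware itself):

```python
def ternary_mac(acc, x, w):
    """Multiply-accumulate with a ternary weight w in {-1, 0, +1}: the
    'multiplication' is just pass, zero, or negate, and the intermediate
    product is added to the accumulator value."""
    assert w in (-1, 0, 1), "weight must be ternary"
    if w == 0:
        prod = 0
    elif w == 1:
        prod = x
    else:
        prod = -x
    return acc + prod

def ternary_dot(xs, ws):
    """Dot product built from ternary MACs, mirroring the loop over an
    accumulator register."""
    acc = 0
    for x, w in zip(xs, ws):
        acc = ternary_mac(acc, x, w)
    return acc
```

For example, `ternary_dot([3, 5, 7], [1, 0, -1])` passes 3, skips 5, and subtracts 7, leaving −4 in the accumulator.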