-
Publication (Announcement) No.: US12141228B2
Publication (Announcement) Date: 2024-11-12
Application No.: US17017600
Application Date: 2020-09-10
Inventor: Xiaozhang Gong , Jian Ouyang , Jing Wang , Wei Qi
Abstract: Embodiments of the present disclosure propose a deep learning processing apparatus, method, device, and storage medium, relating to the field of artificial intelligence. A deep learning processing apparatus includes: at least one matrix multiply-add module, configured to perform a matrix multiply-add operation of a convolution kernel parameter value matrix of a convolutional layer in a convolutional neural network and a first error gradient value matrix to obtain a plurality of intermediate matrices; a storage apparatus, configured to store the plurality of intermediate matrices without reshaping elements in the plurality of intermediate matrices; and a plurality of matrix accumulation modules, configured to read the plurality of intermediate matrices from the storage apparatus and perform a matrix accumulation operation based on the plurality of intermediate matrices according to a convolution scheme of the convolutional layer in parallel, to obtain a second error gradient value matrix for the convolutional layer.
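The computation the abstract describes can be pictured as a two-stage backward pass: a multiply-add stage that produces one intermediate matrix per output position, and an accumulation stage that scatters those intermediates into the input-side error gradient according to the convolution scheme. The NumPy sketch below is only an illustration of that dataflow under assumed single-channel shapes and stride handling; it is not the patented hardware design.

```python
# A minimal NumPy sketch of the two stages in the abstract: the kernel
# parameter matrix is multiplied with each element of the first error
# gradient matrix (multiply-add stage), and the resulting intermediate
# matrices are accumulated into the second error gradient matrix
# according to the convolution scheme (accumulation stage).
# Single-channel layout, stride, and shapes are illustrative assumptions.
import numpy as np

def conv_input_gradient(kernel, out_grad, in_h, in_w, stride=1):
    """kernel: (kh, kw); out_grad: (oh, ow) -> input error gradient (in_h, in_w)."""
    kh, kw = kernel.shape
    oh, ow = out_grad.shape
    in_grad = np.zeros((in_h, in_w))
    for i in range(oh):
        for j in range(ow):
            # multiply-add stage: one intermediate matrix per output position
            intermediate = kernel * out_grad[i, j]
            # accumulation stage: scatter-add following the convolution scheme
            in_grad[i * stride:i * stride + kh,
                    j * stride:j * stride + kw] += intermediate
    return in_grad

if __name__ == "__main__":
    k = np.arange(9, dtype=float).reshape(3, 3)
    g = np.ones((2, 2))
    print(conv_input_gradient(k, g, in_h=4, in_w=4))
```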
-
Publication (Announcement) No.: US11275683B2
Publication (Announcement) Date: 2022-03-15
Application No.: US16942434
Application Date: 2020-07-29
Inventor: Xianglun Leng , Yong Wang , Wei Qi , Zhengze Qiu , Yang Yan
IPC: G06F12/02 , G06F7/72 , G06F12/1045
Abstract: Example embodiments of the present disclosure provide a method, an apparatus, a device and a computer-readable storage medium for storage management. The method for storage management includes: obtaining an available channel mode of a plurality of channels in a memory of a data processing system, the available channel mode indicating availabilities of the plurality of channels, and each of the plurality of channels being associated with a set of addresses in the memory; obtaining a channel data-granularity of the plurality of channels, the channel data-granularity indicating a size of a data block that can be carried on each channel; obtaining a target address of data to be transmitted in the memory; and determining a translated address corresponding to the target address based on the available channel mode and the channel data-granularity.
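The translation step can be thought of as splitting the target address at the channel data-granularity and re-striping the resulting blocks over only the channels the available channel mode marks as usable. The sketch below is a hypothetical model of that idea; the field names and the simple linear striping rule are assumptions, not the claimed translation scheme.

```python
# A hypothetical sketch of the address translation in the abstract:
# the target address is split at the channel data-granularity and the
# blocks are striped over the channels marked available, so unavailable
# channels never receive traffic. The modulo striping rule is an
# illustrative assumption.

def translate_address(target_addr, available_mask, granularity):
    """available_mask: one boolean per channel in the memory."""
    available = [i for i, ok in enumerate(available_mask) if ok]
    if not available:
        raise ValueError("no available channel")
    block = target_addr // granularity           # which data block
    offset = target_addr % granularity           # offset inside the block
    channel = available[block % len(available)]  # stripe blocks over usable channels
    local_block = block // len(available)        # block index within that channel
    return channel, local_block * granularity + offset

if __name__ == "__main__":
    # 4 channels, channel 2 unavailable, 256-byte channel data-granularity
    print(translate_address(0x1234, [True, True, False, True], 256))
```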
-
Publication (Announcement) No.: US10951595B2
Publication (Announcement) Date: 2021-03-16
Application No.: US15618655
Application Date: 2017-06-09
Inventor: Wei Qi , Jian Ouyang , Yong Wang , Yichen Tu , Sijie Yang
Abstract: The present application discloses a method, system and apparatus for storing a website private key plaintext. A specific implementation of the method includes: receiving a public key sent from a terminal configured to perform encryption and decryption, wherein the public key is generated at random by the terminal; encrypting a website private key plaintext by using the public key to generate a website private key ciphertext, wherein the website private key plaintext is pre-acquired; and sending the website private key ciphertext to the terminal, so that the terminal decrypts the website private key ciphertext by using the private key to generate the website private key plaintext and stores the website private key plaintext in the terminal. This implementation improves the security of storage of the website private key plaintext.
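The flow is a standard key-wrapping exchange: the terminal generates a key pair, the server encrypts the website private key plaintext with the terminal's public key, and only the terminal can decrypt and store it. The sketch below uses the Python `cryptography` package; the RSA/OAEP choice and key size are assumptions made for illustration, since the abstract does not prescribe a specific algorithm.

```python
# A minimal sketch of the key-wrapping flow in the abstract, assuming
# RSA with OAEP padding (the patent does not fix the algorithm).
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import rsa, padding

# Terminal side: generate the key pair at random, share only the public key.
terminal_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
terminal_public_key = terminal_key.public_key()

# Server side: encrypt the pre-acquired website private key plaintext.
website_private_key_plaintext = b"-----BEGIN PRIVATE KEY----- ..."
oaep = padding.OAEP(mgf=padding.MGF1(algorithm=hashes.SHA256()),
                    algorithm=hashes.SHA256(), label=None)
ciphertext = terminal_public_key.encrypt(website_private_key_plaintext, oaep)

# Terminal side: decrypt the ciphertext and store the recovered plaintext.
recovered = terminal_key.decrypt(ciphertext, oaep)
assert recovered == website_private_key_plaintext
```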
-
Publication (Announcement) No.: US20190164254A1
Publication (Announcement) Date: 2019-05-30
Application No.: US16265566
Application Date: 2019-02-01
Inventor: Yichen Tu , Jian Ouyang , Wei Qi , Yong Wang
Abstract: A processor and method for scaling an image are disclosed. A specific embodiment of the processor includes: an off-chip memory, a communication circuit, a control circuit, and an array processor, wherein: the off-chip memory is configured for storing a to-be-scaled original image; the communication circuit is configured for receiving an image scaling instruction; the control circuit is configured for executing the image scaling instruction, and sending a calculation control signal to the array processor; and the array processor is configured for calculating in parallel channel values of N channels in a target pixel using N processing elements in the array processor under the control of the calculation control signal based on a width scaling factor, a height scaling factor, and channel values of N channels in extracted pixel data. The embodiment improves the processing speed of an image scaling operation.
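The per-pixel work the array processor parallelizes can be illustrated in software: for each target pixel, all N channel values are computed together from pixel data extracted at source coordinates derived from the width and height scaling factors. The NumPy sketch below vectorizes over the channel axis to mirror the N processing elements; nearest-neighbor sampling is an assumption, since the abstract does not fix the interpolation method.

```python
# An illustrative NumPy sketch: for each target pixel, compute the N
# channel values together (vectorized over the channel axis) from the
# extracted source pixel, using the width and height scaling factors.
# Nearest-neighbor sampling is an assumption of this sketch.
import numpy as np

def scale_image(image, width_factor, height_factor):
    """image: (H, W, N) array; returns the scaled (H', W', N) array."""
    src_h, src_w, channels = image.shape
    dst_h = int(src_h * height_factor)
    dst_w = int(src_w * width_factor)
    out = np.empty((dst_h, dst_w, channels), dtype=image.dtype)
    for y in range(dst_h):
        for x in range(dst_w):
            sy = min(int(y / height_factor), src_h - 1)
            sx = min(int(x / width_factor), src_w - 1)
            # all N channel values of the target pixel computed at once
            out[y, x, :] = image[sy, sx, :]
    return out

if __name__ == "__main__":
    img = np.arange(2 * 2 * 3).reshape(2, 2, 3)
    print(scale_image(img, width_factor=2.0, height_factor=2.0).shape)  # (4, 4, 3)
```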
-
Publication (Announcement) No.: US11360915B2
Publication (Announcement) Date: 2022-06-14
Application No.: US16904856
Application Date: 2020-06-18
Inventor: Xianglun Leng , Ningyi Xu , Yang Yan , Zhengze Qiu , Wei Qi
IPC: G06F3/06 , G06F13/16 , H04L41/0896
Abstract: According to embodiments of the present disclosure, there is provided a data transmission apparatus. The data transmission apparatus includes a plurality of first ports, a plurality of second ports, and a plurality of data channels. The plurality of first ports are coupled to a processing unit. The plurality of second ports are coupled to a plurality of memories. The plurality of data channels are disposed among the first ports and the second ports to form an interleaving network having a plurality of layers, and configured to transmit data between the processing unit and the plurality of memories, such that each layer in the interleaving network includes at least one interleaving sub-network.
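To give a feel for what a layered interleaving network does, the toy model below spreads consecutive address blocks across first ports and then across memories in two successive stages, so traffic does not pile up on a single memory. The two-stage layout and the modulo spreading rule are illustrative assumptions, not the patented topology.

```python
# A toy model of layered interleaving: transactions entering at the
# first ports are spread over the second ports (memories) by successive
# interleaving stages. The spreading rule below is an assumption made
# only to illustrate the idea.

def route(addr, num_first_ports, num_second_ports, granularity=64):
    """Return (first_port, second_port) chosen for a given address."""
    block = addr // granularity
    first_port = block % num_first_ports                          # layer 1: spread over first ports
    second_port = (block // num_first_ports) % num_second_ports   # layer 2: spread over memories
    return first_port, second_port

if __name__ == "__main__":
    for a in range(0, 8 * 64, 64):
        print(hex(a), route(a, num_first_ports=4, num_second_ports=8))
```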
-
Publication (Announcement) No.: US11087203B2
Publication (Announcement) Date: 2021-08-10
Application No.: US15618415
Application Date: 2017-06-09
Inventor: Yong Wang , Jian Ouyang , Wei Qi , Sizhong Li
Abstract: The present application discloses a method and apparatus for processing a data sequence. A specific implementation of the method includes: receiving an inputted to-be-processed data sequence; copying a weight matrix in a recurrent neural network model to an embedded block random access memory (RAM) of a field-programmable gate array (FPGA); processing sequentially each piece of to-be-processed data in the to-be-processed data sequence by using an activation function in the recurrent neural network model and the weight matrix stored in the embedded block RAM; and outputting a processed data sequence corresponding to the to-be-processed data sequence. This implementation improves the data sequence processing efficiency of the recurrent neural network model.
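The processing loop is straightforward to sketch in software: copy the recurrent weight matrices once into on-chip storage (modeled below as local arrays standing in for the FPGA's embedded block RAM), then apply the activation function to each element of the sequence in order. The plain tanh RNN cell is an assumption of this sketch; the patent covers recurrent models generally.

```python
# A NumPy sketch of the loop in the abstract: weights are copied once
# into "block RAM" (local arrays here), then each piece of to-be-processed
# data is processed sequentially with the model's activation function.
# The tanh RNN cell is an illustrative assumption.
import numpy as np

def run_rnn(sequence, w_in, w_hidden, activation=np.tanh):
    """sequence: list of input vectors; returns the list of hidden states."""
    # "Copy to embedded block RAM": keep the weights resident locally.
    w_in_bram = np.array(w_in, copy=True)
    w_hidden_bram = np.array(w_hidden, copy=True)

    hidden = np.zeros(w_hidden_bram.shape[0])
    outputs = []
    for x in sequence:                       # process each piece of data in order
        hidden = activation(w_in_bram @ x + w_hidden_bram @ hidden)
        outputs.append(hidden)
    return outputs

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    seq = [rng.standard_normal(4) for _ in range(3)]
    out = run_rnn(seq, rng.standard_normal((8, 4)), rng.standard_normal((8, 8)))
    print(len(out), out[0].shape)  # 3 (8,)
```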
-
Publication (Announcement) No.: US11023801B2
Publication (Announcement) Date: 2021-06-01
Application No.: US15618817
Application Date: 2017-06-09
Inventor: Jian Ouyang , Wei Qi , Yong Wang , Lin Liu
Abstract: The present application discloses a data processing method and apparatus. A specific implementation of the method includes: receiving floating point data sent from an electronic device; converting the received floating point data into fixed point data according to a data length and a value range of the received floating point data; performing calculation on the obtained fixed point data according to a preset algorithm to obtain result data in a fixed point form; and converting the obtained result data in the fixed point form into result data in a floating point form and sending the result data in the floating point form to the electronic device. This implementation improves the data processing efficiency.
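The round trip the abstract describes (float in, fixed-point compute, float out) can be sketched as choosing a scale from the value range of the incoming data and the target word length, running the computation on integers, and dividing the scale back out. Symmetric scaling and the 16-bit word length below are assumptions of the sketch, not parameters from the patent.

```python
# A minimal sketch of the float -> fixed -> float round trip: a scale is
# derived from the value range and a fixed-point word length, the
# computation runs on integers, and the result is converted back.
# The symmetric 16-bit quantization is an illustrative assumption.
import numpy as np

def to_fixed(x, bits=16):
    scale = (2 ** (bits - 1) - 1) / np.max(np.abs(x))   # derived from the value range
    return np.round(x * scale).astype(np.int32), scale

def to_float(q, scale):
    return q.astype(np.float64) / scale

if __name__ == "__main__":
    data = np.array([0.5, -1.25, 3.0])
    q, scale = to_fixed(data)
    q_result = q * 2                      # the "preset algorithm", done in fixed point
    print(to_float(q_result, scale))      # approximately [1.0, -2.5, 6.0]
```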
-
Publication (Announcement) No.: US20210049045A1
Publication (Announcement) Date: 2021-02-18
Application No.: US16809020
Application Date: 2020-03-04
Inventor: Xianglun Leng , Zhibiao Zhao , Jinchen Han , Jian Ouyang , Wei Qi , Yong Wang
Abstract: Embodiments of the present disclosure relate to a method and apparatus for resource management, an electronic device, and a computer-readable storage medium. The method may include: determining a plurality of virtual functions to be supported, where each of the plurality of virtual functions corresponds to a virtual machine running on a computing device. The method may further include: dividing a physical resource set into a plurality of physical resource subsets according to a predetermined ratio, a number of the physical resource subsets being identical to a number of the virtual functions. The method may further include: allocating the plurality of physical resource subsets to the plurality of virtual functions respectively.
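The division step amounts to splitting one physical resource set into as many subsets as there are virtual functions, sized by a predetermined ratio. The sketch below models resources as a flat list of IDs; that resource model and the ratio format are illustrative assumptions.

```python
# A small sketch of dividing a physical resource set among virtual
# functions according to a predetermined ratio, one subset per VF.
# Representing resources as a flat list of IDs is an assumption.

def partition_resources(resources, ratio):
    """resources: list of resource IDs; ratio: one weight per virtual function."""
    total = sum(ratio)
    subsets, start = [], 0
    for i, weight in enumerate(ratio):
        # the last VF absorbs any rounding remainder so nothing is left unallocated
        end = len(resources) if i == len(ratio) - 1 else start + len(resources) * weight // total
        subsets.append(resources[start:end])
        start = end
    return subsets   # number of subsets == number of virtual functions

if __name__ == "__main__":
    print(partition_resources(list(range(8)), ratio=[1, 1, 2]))
    # [[0, 1], [2, 3], [4, 5, 6, 7]]
```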
-
Publication (Announcement) No.: US20180129933A1
Publication (Announcement) Date: 2018-05-10
Application No.: US15618415
Application Date: 2017-06-09
Inventor: Yong Wang , Jian Ouyang , Wei Qi , Sizhong Li
CPC classification number: G06N3/0445 , G06N3/063
Abstract: The present application discloses a method and apparatus for processing a data sequence. A specific implementation of the method includes: receiving an inputted to-be-processed data sequence; copying a weight matrix in a recurrent neural network model to an embedded block random access memory (RAM) of a field-programmable gate array (FPGA); processing sequentially each piece of to-be-processed data in the to-be-processed data sequence by using an activation function in the recurrent neural network model and the weight matrix stored in the embedded block RAM; and outputting a processed data sequence corresponding to the to-be-processed data sequence. This implementation improves the data sequence processing efficiency of the recurrent neural network model.
-
Publication (Announcement) No.: US20180107630A1
Publication (Announcement) Date: 2018-04-19
Application No.: US15590798
Application Date: 2017-05-09
Inventor: Ni Zhou , Wei Qi , Yong Wang , Jian Ouyang
CPC classification number: G06F17/16 , G06F9/3895 , G06N99/005
Abstract: A processor and a method for executing a matrix multiplication operation on a processor. A specific implementation of the processor includes a data bus and an array processor having k processing units. The data bus is configured to sequentially read n columns of row vectors from an M×N multiplicand matrix and input same to each processing unit in the array processor, read an n×k submatrix from an N×K multiplier matrix and input each column vector of the submatrix to a corresponding processing unit in the array processor, and output a result obtained by each processing unit after executing a multiplication operation. Each processing unit in the array processor is configured to execute in parallel a vector multiplication operation on the input row and column vectors. Each processing unit includes a Wallace tree multiplier having n multipliers and n-1 adders. This implementation improves the processing efficiency of a matrix multiplication operation.
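The dataflow can be mirrored in software: the multiplicand's row vectors are broadcast to all k processing units, an n×k submatrix of the multiplier supplies one column vector per unit, and each unit performs the n-element dot product that the Wallace-tree multiplier-adder implements in hardware. The NumPy sketch below follows that blocking; the strip sizes n and k are parameters of the sketch, not values fixed by the patent.

```python
# A NumPy sketch of the dataflow in the abstract: row vectors of the
# multiplicand are fed to all k units, an n x k submatrix of the
# multiplier supplies one column vector per unit, and each unit
# accumulates an n-element dot product (the Wallace-tree multiplier's
# job in hardware). Strip sizes n and k are illustrative parameters.
import numpy as np

def array_matmul(a, b, n, k):
    """a: (M, N) multiplicand, b: (N, K) multiplier -> (M, K) product."""
    M, N = a.shape
    _, K = b.shape
    out = np.zeros((M, K))
    for row in range(M):                       # row vectors fed over the data bus
        for j0 in range(0, K, k):              # one n x k submatrix at a time
            for i0 in range(0, N, n):
                sub = b[i0:i0 + n, j0:j0 + k]  # k column vectors, one per unit
                row_vec = a[row, i0:i0 + n]
                # each of the k units accumulates one dot product in parallel
                out[row, j0:j0 + k] += row_vec @ sub
    return out

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x, y = rng.standard_normal((5, 6)), rng.standard_normal((6, 7))
    print(np.allclose(array_matmul(x, y, n=2, k=3), x @ y))  # True
```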