-
Publication No.: US20210241095A1
Publication Date: 2021-08-05
Application No.: US17017600
Application Date: 2020-09-10
Inventor: Xiaozhang Gong, Jian Ouyang, Jing Wang, Wei Qi
Abstract: Embodiments of the present disclosure propose a deep learning processing apparatus and method, device and storage medium, relating to the field of artificial intelligence. A deep learning processing apparatus includes: at least one matrix multiply-add module, configured to perform a matrix multiply-add operation of a convolution kernel parameter value matrix of a convolutional layer in a convolutional neural network and a first error gradient value matrix to obtain a plurality of intermediate matrices; a storage apparatus, configured to store the plurality of intermediate matrices without reshaping elements in the plurality of intermediate matrices; and a plurality of matrix accumulation modules, configured to read the plurality of intermediate matrices from the storage apparatus and perform a matrix accumulation operation based on the plurality of intermediate matrices according to a convolution scheme of the convolutional layer in parallel, to obtain a second error gradient value matrix for the convolutional layer.
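The two-stage flow in this abstract (multiply-add into intermediate matrices, then accumulation according to the convolution scheme) can be illustrated with a minimal pure-Python sketch. It assumes stride 1, no padding, and one intermediate matrix per kernel offset; the function name `conv_input_gradient` and the data layout are hypothetical and are not taken from the patent. The accumulation here runs sequentially, whereas the abstract describes parallel accumulation modules.

```python
def conv_input_gradient(kernel, grad_out, in_h, in_w):
    """Sketch: input-side error gradient of a convolutional layer.

    kernel[co][ci][i][j] holds the convolution kernel parameters and
    grad_out[co][y][x] holds the incoming (first) error gradient.
    Assumes stride 1 and no padding.
    """
    c_out, c_in = len(kernel), len(kernel[0])
    kh, kw = len(kernel[0][0]), len(kernel[0][0][0])
    out_h, out_w = in_h - kh + 1, in_w - kw + 1

    # Multiply-add stage: one intermediate matrix per kernel offset (i, j),
    # of shape c_in x (out_h * out_w). Stored as computed, with no reshaping.
    intermediates = {}
    for i in range(kh):
        for j in range(kw):
            intermediates[(i, j)] = [
                [
                    sum(kernel[co][ci][i][j] * grad_out[co][y][x]
                        for co in range(c_out))
                    for y in range(out_h)
                    for x in range(out_w)
                ]
                for ci in range(c_in)
            ]

    # Accumulation stage: each intermediate matrix is added into the output
    # gradient at the positions its kernel offset touched in the forward pass.
    grad_in = [[[0] * in_w for _ in range(in_h)] for _ in range(c_in)]
    for (i, j), matrix in intermediates.items():
        for ci in range(c_in):
            for y in range(out_h):
                for x in range(out_w):
                    grad_in[ci][y + i][x + j] += matrix[ci][y * out_w + x]
    return grad_in
```

Because each kernel offset has its own intermediate matrix, the accumulation loop over offsets is the part that separate hardware modules could split among themselves.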
-
Publication No.: US20200050450A1
Publication Date: 2020-02-13
Application No.: US16458381
Application Date: 2019-07-01
Inventor: Jing Wang, Wei Qi, Yupeng Li, Xiaozhang Gong
Abstract: Embodiments of the present disclosure relate to a method and apparatus for executing an instruction. A method may include: acquiring an instruction queue; acquiring a to-be-sent instruction from the instruction queue in a preset order, and performing the following sending steps: determining a type of the to-be-sent instruction; in response to determining that the to-be-sent instruction is an arithmetic instruction, determining, from an executing component set, an executing component to execute the to-be-sent instruction, and sending the to-be-sent instruction to the determined executing component; and, in response to determining that the to-be-sent instruction is a blocking instruction, acquiring a next to-be-sent instruction after receiving a signal indicating that an instruction associated with the to-be-sent instruction has been completely executed.
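The sending logic in this abstract can be sketched as a small cycle-based simulation. Everything below is an illustrative assumption rather than the patented design: the `Instruction` fields, the round-robin choice of executing component, the one-send-per-cycle limit, and the fixed latencies are all hypothetical.

```python
from collections import deque
from dataclasses import dataclass
from typing import Optional


@dataclass
class Instruction:
    name: str
    kind: str                        # "arithmetic" or "blocking"
    waits_for: Optional[str] = None  # blocking only: associated instruction
    latency: int = 1                 # arithmetic only: cycles to complete


def dispatch(instructions, num_units=2):
    """Send instructions in queue order and return a (cycle, event) log."""
    queue = deque(instructions)
    in_flight = {}    # instruction name -> cycles remaining
    finished = set()  # names whose completion signal has arrived
    log, cycle, next_unit = [], 0, 0

    while queue or in_flight:
        if queue:
            head = queue[0]
            if head.kind == "arithmetic":
                # Assumed policy: pick executing components round-robin.
                unit = next_unit
                next_unit = (next_unit + 1) % num_units
                in_flight[head.name] = head.latency
                log.append((cycle, f"send {head.name} -> unit {unit}"))
                queue.popleft()
            elif head.kind == "blocking":
                if head.waits_for in finished:
                    log.append((cycle, f"{head.name} released"))
                    queue.popleft()
                elif head.waits_for not in in_flight:
                    raise ValueError(f"{head.name} waits for an unsent instruction")
                # Otherwise stall: do not fetch the next instruction yet.
            else:
                raise ValueError(f"unknown instruction kind: {head.kind}")

        # Advance one cycle and collect completion signals.
        cycle += 1
        for name in list(in_flight):
            in_flight[name] -= 1
            if in_flight[name] == 0:
                del in_flight[name]
                finished.add(name)
    return log
```

With a three-cycle `mul0`, a blocking instruction that waits on it, and a following `add0`, the log shows `add0` being sent only after the blocking instruction is released.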
-
Publication No.: US12141228B2
Publication Date: 2024-11-12
Application No.: US17017600
Application Date: 2020-09-10
Inventor: Xiaozhang Gong, Jian Ouyang, Jing Wang, Wei Qi
Abstract: Embodiments of the present disclosure propose a deep learning processing apparatus and method, device and storage medium, relating to the field of artificial intelligence. A deep learning processing apparatus includes: at least one matrix multiply-add module, configured to perform a matrix multiply-add operation of a convolution kernel parameter value matrix of a convolutional layer in a convolutional neural network and a first error gradient value matrix to obtain a plurality of intermediate matrices; a storage apparatus, configured to store the plurality of intermediate matrices without reshaping elements in the plurality of intermediate matrices; and a plurality of matrix accumulation modules, configured to read the plurality of intermediate matrices from the storage apparatus and perform a matrix accumulation operation based on the plurality of intermediate matrices according to a convolution scheme of the convolutional layer in parallel, to obtain a second error gradient value matrix for the convolutional layer.
-
Publication No.: US11163714B2
Publication Date: 2021-11-02
Application No.: US16711196
Application Date: 2019-12-11
Inventor: Xianglun Leng, Hefei Zhu, Qingshu Chen, Zhibiao Zhao, Xiaozhang Gong
Abstract: Embodiments of the present disclosure relate to a method, an apparatus, an electronic device and a computer readable storage medium for determining connection relationships among a plurality of chips. The method includes determining identity information of a plurality of chips managed by a host, the plurality of chips being connected by respective inter-chip communication interfaces for inter-chip communication. The method further includes allowing one or more of the plurality of chips to acquire identity information of other chips connected to the inter-chip communication interface of the one or more chips. The method further includes reading identity information of the other chips by means of a management interface of the one or more chips with regard to communicating with the host, so as to determine connection relationships among the plurality of chips.
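As a rough illustration of the discovery flow in this abstract, the sketch below models chips that learn their neighbors' identities over inter-chip links and a host that reads those identities back through a management interface. The `Chip` class, its methods, and the `connect` helper are hypothetical stand-ins, not interfaces described in the patent.

```python
class Chip:
    """Toy chip model; all names here are illustrative assumptions."""

    def __init__(self, chip_id):
        self.chip_id = chip_id
        self.links = []         # physical peers, not directly visible to the host
        self.neighbor_ids = []  # register filled in by the ID exchange

    def exchange_ids(self):
        # Inter-chip communication: acquire the identity of each connected chip.
        self.neighbor_ids = [peer.chip_id for peer in self.links]

    def read_neighbor_ids(self):
        # Management interface: the host reads the recorded identities.
        return list(self.neighbor_ids)


def connect(a, b):
    """Wire two chips together over an inter-chip link."""
    a.links.append(b)
    b.links.append(a)


def discover_topology(chips):
    """Host side: trigger the ID exchange, then build the set of links."""
    for chip in chips:
        chip.exchange_ids()
    edges = set()
    for chip in chips:
        for neighbor_id in chip.read_neighbor_ids():
            edges.add(tuple(sorted((chip.chip_id, neighbor_id))))
    return sorted(edges)
```

The host never inspects `links` directly; it only sees what each chip reports, which mirrors the separation between the inter-chip interface and the management interface.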
-
Publication No.: US11520563B2
Publication Date: 2022-12-06
Application No.: US16711277
Application Date: 2019-12-11
Inventor: Xiaozhang Gong
Abstract: Disclosed are an apparatus and method for transforming a matrix, and a data processing system. The apparatus may include: a first shift unit, configured to receive matrix data and perform a first cyclic shift on the matrix data to generate first data; a cache unit, configured to have each row of the first data written into it in an order different from the order of the respective data in that row, so as to store the first data as second data; and a second shift unit, configured to read the second data from the cache unit and perform a second cyclic shift on the second data to generate transformed matrix data.
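One well-known transformation that fits this shift, cache, shift structure is a square-matrix transpose using diagonal skewing. The sketch below is an assumption about how such a pipeline could work; the abstract does not state the shift amounts, the cache write order, or that the transformation is a transpose, and `transpose_via_shifts` is a hypothetical name.

```python
def transpose_via_shifts(matrix):
    """Transpose an n x n matrix with two cyclic shifts and a banked cache."""
    n = len(matrix)
    if any(len(row) != n for row in matrix):
        raise ValueError("this sketch assumes a square matrix")

    # First cyclic shift: rotate row i right by i positions, so that each
    # column of the input ends up spread across n different banks.
    first = [[row[(c - i) % n] for c in range(n)] for i, row in enumerate(matrix)]

    # Cache write: bank c stores the element from row i at address (c - i) % n,
    # so one row is scattered over different addresses, not written in order.
    cache = [[None] * n for _ in range(n)]
    for i in range(n):
        for c in range(n):
            cache[(c - i) % n][c] = first[i][c]

    # Second cyclic shift: read address j across all banks, rotate left by j.
    return [[cache[j][(i + j) % n] for i in range(n)] for j in range(n)]
```

For example, `transpose_via_shifts([[1, 2, 3], [4, 5, 6], [7, 8, 9]])` returns `[[1, 4, 7], [2, 5, 8], [3, 6, 9]]`. The skew ensures that each read touches every bank exactly once, which is what makes single-cycle row reads possible in a banked memory.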
-
Publication No.: US11093388B2
Publication Date: 2021-08-17
Application No.: US16682868
Application Date: 2019-11-13
Inventor: Xiaozhang Gong, Jing Wang
IPC: G06F12/06
Abstract: The present disclosure relates to a method, an apparatus, an electronic device and a computer readable storage medium for accessing static random access memories. The method includes: receiving an access request for data associated with the static random access memories; writing a plurality of sections of the data into a plurality of different static random access memories in an interleaved manner in response to the access request being a write request for the data, each of the plurality of sections having its respective predetermined size; and reading the plurality of sections of the data from the plurality of static random access memories in an interleaved manner in response to the access request being a read request for the data, each of the plurality of sections having its respective predetermined size.
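The interleaved access pattern in this abstract can be sketched with byte strings standing in for the static random access memories. The sketch assumes a single uniform section size and round-robin placement; the abstract allows each section its own predetermined size, and the function names `interleave_write` and `interleave_read` are hypothetical.

```python
def interleave_write(data, num_banks, section_size):
    """Split data into fixed-size sections and spread them across banks."""
    banks = [bytearray() for _ in range(num_banks)]
    for index, start in enumerate(range(0, len(data), section_size)):
        # Section k goes to bank k mod num_banks (assumed placement rule).
        banks[index % num_banks] += data[start:start + section_size]
    return [bytes(bank) for bank in banks]


def interleave_read(banks, section_size, length):
    """Reassemble the first `length` bytes by visiting banks in the same order."""
    if length > sum(len(bank) for bank in banks):
        raise ValueError("requested more data than the banks hold")
    out = bytearray()
    offsets = [0] * len(banks)
    index = 0
    while len(out) < length:
        bank = index % len(banks)
        take = min(section_size, length - len(out))
        out += banks[bank][offsets[bank]:offsets[bank] + take]
        offsets[bank] += take
        index += 1
    return bytes(out)
```

Writing `b"ABCDEFGHIJ"` with three banks and two-byte sections places `b"ABGH"`, `b"CDIJ"`, and `b"EF"` in the banks, and reading ten bytes back restores the original order. Consecutive sections land in different banks, so neighboring accesses need not contend for the same memory.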