-
Publication No.: US20210319289A1
Publication Date: 2021-10-14
Application No.: US16846966
Filing Date: 2020-04-13
Inventors: Wei HAN, Xiaoxin FAN, Yuhao WANG
Abstract: The present disclosure relates to systems and methods involving a host device and a convolutional neural network hardware accelerator. The hardware accelerator can be configured, at least in part by the host device, to generate activation data from spatial-domain input data and spatial-domain weight data using frequency-domain operations. The hardware accelerator can include one or more discrete Fourier transform units configured to generate a frequency-domain representation of the input data. The hardware accelerator can include a multiplication unit configured to generate a frequency-domain representation of the activation data by element-wise complex multiplication of the frequency-domain representation of the input data and a frequency-domain representation of the weight data. The hardware accelerator can also include an inverse discrete Fourier transform unit configured to generate a spatial-domain representation of the activation data from the frequency-domain representation of the activation data.
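The data flow in this abstract (forward transform, element-wise complex multiplication, inverse transform) can be illustrated in software. The following is a minimal sketch, not the accelerator's actual design: it assumes a single-channel 2-D input, zero-padding to the full linear-convolution size, and NumPy's FFT routines standing in for the hardware transform units. The function names `conv2d_direct` and `conv2d_fft` are hypothetical.

```python
import numpy as np


def conv2d_direct(x, w):
    """Full linear 2-D convolution computed directly in the spatial domain."""
    H, W = x.shape
    kh, kw = w.shape
    out = np.zeros((H + kh - 1, W + kw - 1))
    for i in range(kh):
        for j in range(kw):
            # Each weight contributes a shifted, scaled copy of the input.
            out[i:i + H, j:j + W] += w[i, j] * x
    return out


def conv2d_fft(x, w):
    """Same convolution via the frequency domain (illustrative sketch)."""
    H, W = x.shape
    kh, kw = w.shape
    # Zero-pad both operands so circular convolution equals linear convolution.
    shape = (H + kh - 1, W + kw - 1)
    X = np.fft.fft2(x, s=shape)    # frequency-domain input data
    Wf = np.fft.fft2(w, s=shape)   # frequency-domain weight data
    Y = X * Wf                     # element-wise complex multiplication
    return np.real(np.fft.ifft2(Y))  # back to the spatial domain


rng = np.random.default_rng(0)
x = rng.standard_normal((6, 6))
w = rng.standard_normal((3, 3))
print(np.allclose(conv2d_direct(x, w), conv2d_fft(x, w)))
```

Note that CNN frameworks usually compute cross-correlation rather than true convolution; under this sketch's assumptions, flipping the kernel before the transform would account for that difference.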
-
Publication No.: US20210240684A1
Publication Date: 2021-08-05
Application No.: US16783069
Filing Date: 2020-02-05
Inventors: Zhibin XIAO, Xiaoxin FAN, Minghai QIN
IPC Classification: G06F16/22, G06F17/16, G06F16/174, G06F9/30, G06N3/02
Abstract: The present disclosure relates to a method and an apparatus for representation of a sparse matrix in a neural network. In some embodiments, an exemplary operation unit includes a buffer for storing a representation of a sparse matrix in a neural network, a sparse engine communicatively coupled with the buffer, and a processing array communicatively coupled with the sparse engine. The sparse engine includes circuitry to: read the representation of the sparse matrix from the buffer, the representation comprising a first level bitmap, a second level bitmap, and an element array; decompress the first level bitmap to determine whether a block of the sparse matrix comprises a non-zero element; and in response to the block comprising a non-zero element, decompress the second level bitmap using the element array to obtain the block of the sparse matrix. The processing array includes circuitry to execute the neural network with the sparse matrix.
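One way to picture the two-level representation is the sketch below. The abstract does not specify the exact layout, so this example makes several assumptions: square blocks of side `block`, one first-level bit per block, one second-level bit per element stored only for blocks that contain a non-zero, and matrix dimensions divisible by the block size. The function names `compress` and `decompress` are hypothetical.

```python
def compress(matrix, block):
    """Encode a 2-D list as (first-level bitmap, second-level bitmap, elements)."""
    rows, cols = len(matrix), len(matrix[0])
    level1, level2, elements = [], [], []
    for r0 in range(0, rows, block):
        for c0 in range(0, cols, block):
            vals = [matrix[r][c]
                    for r in range(r0, r0 + block)
                    for c in range(c0, c0 + block)]
            if any(v != 0 for v in vals):
                level1.append(1)  # block holds at least one non-zero
                level2.extend(1 if v != 0 else 0 for v in vals)
                elements.extend(v for v in vals if v != 0)
            else:
                level1.append(0)  # all-zero block: nothing else is stored
    return level1, level2, elements


def decompress(level1, level2, elements, rows, cols, block):
    """Rebuild the dense matrix from the two bitmaps and the element array."""
    out = [[0] * cols for _ in range(rows)]
    block_idx = bit_pos = elem_pos = 0
    for r0 in range(0, rows, block):
        for c0 in range(0, cols, block):
            if level1[block_idx]:  # skip zero blocks entirely
                for r in range(r0, r0 + block):
                    for c in range(c0, c0 + block):
                        if level2[bit_pos]:
                            out[r][c] = elements[elem_pos]
                            elem_pos += 1
                        bit_pos += 1
            block_idx += 1
    return out


dense = [[0, 0, 1, 0],
         [0, 0, 0, 2],
         [0, 0, 0, 0],
         [3, 0, 0, 0]]
l1, l2, elems = compress(dense, block=2)
print(l1, l2, elems)
print(decompress(l1, l2, elems, 4, 4, 2) == dense)
```

Under these assumptions, all-zero blocks cost a single bit, which is where the format saves space when non-zeros cluster in a few blocks.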
-
Publication No.: US20220101887A1
Publication Date: 2022-03-31
Application No.: US17037134
Filing Date: 2020-09-29
Inventors: Wei HAN, Shuangchen LI, Lide DUAN, Hongzhong ZHENG, Dimin NIU, Yuhao WANG, Xiaoxin FAN
IPC Classification: G11C5/06, G11C11/408, G11C11/4094
Abstract: The systems and methods are configured to efficiently and effectively include processing capabilities in memory. In one embodiment, a processing in memory (PIM) chip includes a memory array, logic components, and an interconnection network. The memory array is configured to store information. In one exemplary implementation, the memory array includes storage cells and array periphery components. The logic components can be configured to process information stored in the memory array. The interconnection network is configured to communicatively couple the logic components. The interconnection network can include interconnect wires, and a portion of the interconnect wires are located in a metal layer area that is located above the memory array.
-
Publication No.: US20210157647A1
Publication Date: 2021-05-27
Application No.: US16863954
Filing Date: 2020-04-30
Inventors: Shasha WEN, Pengcheng LI, Xiaoxin FAN, Li ZHAO
IPC Classification: G06F9/50, G06F12/0882, G06F12/0846, G06F13/16, G06F11/07
Abstract: Remote access latency in a non-uniform memory access (NUMA) system is substantially reduced by monitoring which NUMA nodes are accessing which local memories, and migrating memory pages from the local memory in a first NUMA node to the local memory in a hot NUMA node when the hot NUMA node is frequently accessing the local memory in the first NUMA node.
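The monitor-and-migrate idea can be modeled with a toy simulation. This is a sketch under stated assumptions, not the patented mechanism: it treats "frequently accessing" as a fixed count of remote accesses (`hot_threshold`), tracks counts per page and node, and moves a page as soon as one remote node reaches the threshold. The class `NumaMonitor` and all of its members are hypothetical.

```python
from collections import defaultdict


class NumaMonitor:
    """Toy model: count remote accesses and migrate pages to the hot node."""

    def __init__(self, page_home, hot_threshold):
        self.page_home = dict(page_home)     # page -> node that holds it now
        self.hot_threshold = hot_threshold   # assumed definition of "frequent"
        self.remote_hits = defaultdict(int)  # (page, node) -> remote accesses

    def access(self, node, page):
        """Record one access; return True if it triggered a migration."""
        if node == self.page_home[page]:
            return False  # local access, no remote latency to reduce
        self.remote_hits[(page, node)] += 1
        if self.remote_hits[(page, node)] < self.hot_threshold:
            return False
        # The accessing node is "hot" for this page: move the page to it.
        self.page_home[page] = node
        # Drop stale counters for the page so monitoring starts afresh.
        for key in [k for k in self.remote_hits if k[0] == page]:
            del self.remote_hits[key]
        return True


mon = NumaMonitor({"p0": 0}, hot_threshold=3)
results = [mon.access(1, "p0") for _ in range(3)]
print(results, mon.page_home["p0"])
```

A real system would also weigh migration cost and guard against pages bouncing between nodes; this sketch leaves both out for brevity.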
-