-
1.
Publication No.: US20240037394A1
Publication Date: 2024-02-01
Application No.: US18360140
Filing Date: 2023-07-27
Applicant: Deliang Fan , Fan Zhang , Li Yang
Inventor: Deliang Fan , Fan Zhang , Li Yang
Abstract: A neural network accelerator architecture for multiple task adaptation comprises a volatile memory comprising a plurality of subarrays, each subarray comprising M rows and N columns of volatile memory cells; a source line driver connected to a plurality of N source lines, each source line corresponding to a column in the subarray; a binary mask buffer memory having size at least N bits, each bit corresponding to a column in the subarray, where a 0 corresponds to turning off the column for a convolution operation and a 1 corresponds to turning on the column for the convolution operation; and a controller configured to selectively drive each of the N source lines with a corresponding value from the mask buffer; wherein each column in the subarray is configured to store a convolution kernel.
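The column-gating scheme in the abstract can be approximated in software. The sketch below is a minimal illustration, not the patented circuit: `masked_subarray_conv` and its shapes are assumptions, with each subarray column storing one flattened kernel and a binary mask bit deciding whether that column's source line is driven.

```python
import numpy as np

def masked_subarray_conv(inputs, kernels, mask):
    """Sketch of mask-gated convolution on one memory subarray.

    Each of the N columns stores one flattened kernel of M weights; the
    binary mask turns a column on (1) or off (0), so only active kernels
    contribute a dot product for the current task.
    """
    M, N = kernels.shape
    assert inputs.shape == (M,) and mask.shape == (N,)
    out = np.zeros(N)
    for col in range(N):
        if mask[col]:  # source line driven: column participates
            out[col] = inputs @ kernels[:, col]
        # mask bit 0: column stays off, output stays 0 (no compute)
    return out
```

A task switch then amounts to loading a different N-bit mask rather than rewriting the stored kernels.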
-
2.
Publication No.: US20230342604A1
Publication Date: 2023-10-26
Application No.: US18305097
Filing Date: 2023-04-21
Applicant: Li Yang , Deliang Fan , Adnan Siraj Rakin
Inventor: Li Yang , Deliang Fan , Adnan Siraj Rakin
IPC: G06N3/08
CPC classification number: G06N3/08
Abstract: Dynamic additive attention adaption for memory-efficient multi-domain on-device learning is provided. Almost all conventional methods for multi-domain learning in deep neural networks (DNNs) only focus on improving accuracy with minimal parameter update, while ignoring high computing and memory cost during training. This makes it difficult to deploy multi-domain learning into resource-limited edge devices, like mobile phones, internet-of-things (IoT) devices, embedded systems, and so on. To reduce training memory usage, while keeping the domain adaption accuracy performance, Dynamic Additive Attention Adaption (DA3) is proposed as a novel memory-efficient on-device multi-domain learning approach. Embodiments of DA3 learn a novel additive attention adaptor module, while freezing the weights of the pre-trained backbone model for each domain. This module not only mitigates activation memory buffering for reducing memory usage during training, but also serves as a dynamic gating mechanism to reduce the computation cost for fast inference.
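As a loose software analogy only (not the patented DA3 module; the function name, the mean-based gate, and the threshold are all illustrative assumptions), the additive-adaptor-plus-dynamic-gate idea can be sketched as:

```python
import numpy as np

def additive_attention_adaptor(backbone_out, adaptor_w, gate_w, threshold=0.5):
    """Sketch: a frozen backbone's output is combined *additively* with a
    small trainable adaptor branch; a learned sigmoid gate can skip the
    branch entirely, saving compute at inference time."""
    gate = 1.0 / (1.0 + np.exp(-(backbone_out.mean() * gate_w)))  # sigmoid
    if gate < threshold:           # dynamic gating: skip adaptor compute
        return backbone_out
    adapted = backbone_out @ adaptor_w   # lightweight trainable adaptor path
    return backbone_out + adapted        # additive; backbone weights frozen
```

Because only the small adaptor is trainable, per-domain training avoids buffering the backbone's activations for backpropagation, which is the memory saving the abstract targets.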
-
3.
Publication No.: US20220318628A1
Publication Date: 2022-10-06
Application No.: US17714677
Filing Date: 2022-04-06
Applicant: Sai Kiran Cherupally , Jian Meng , Shihui Yin , Deliang Fan , Jae-sun Seo
Inventor: Sai Kiran Cherupally , Jian Meng , Shihui Yin , Deliang Fan , Jae-sun Seo
Abstract: Hardware noise-aware training for improving accuracy of in-memory computing (IMC)-based deep neural network (DNN) hardware is provided. DNNs have been very successful in large-scale recognition tasks, but they exhibit large computation and memory requirements. To address the memory bottleneck of digital DNN hardware accelerators, IMC designs have been presented to perform analog DNN computations inside the memory. Recent IMC designs have demonstrated high energy-efficiency, but this is achieved by trading off the noise margin, which can degrade the DNN inference accuracy. The present disclosure proposes hardware noise-aware DNN training to largely improve the DNN inference accuracy of IMC hardware. During DNN training, embodiments perform noise injection at the partial sum level, which matches with the crossbar structure of IMC hardware, and the injected noise data is directly based on measurements of actual IMC prototype chips.
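The key point of the abstract is *where* noise is injected: at the partial-sum level, matching the crossbar/ADC granularity. A minimal sketch, assuming a hypothetical column-group size and Gaussian noise as a stand-in for the measured chip noise:

```python
import numpy as np

def noisy_partial_sums(x, w, cols_per_adc=4, noise_sigma=0.1, rng=None):
    """Sketch of partial-sum-level noise injection for IMC-aware training.
    The dot product is split into crossbar-sized partial sums, and noise
    is added to each partial sum *before* accumulation, mirroring where
    analog noise enters the real hardware."""
    rng = rng or np.random.default_rng(0)
    total = 0.0
    for start in range(0, len(x), cols_per_adc):
        ps = x[start:start + cols_per_adc] @ w[start:start + cols_per_adc]
        total += ps + rng.normal(0.0, noise_sigma)  # noise per partial sum
    return total
```

In the disclosed approach the noise statistics come from prototype-chip measurements rather than an assumed Gaussian; the Gaussian here is only a placeholder.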
-
4.
Publication No.: US20240232718A9
Publication Date: 2024-07-11
Application No.: US18494330
Filing Date: 2023-10-25
Applicant: Jae-sun Seo , Jian Meng , Li Yang , Deliang Fan
Inventor: Jae-sun Seo , Jian Meng , Li Yang , Deliang Fan
IPC: G06N20/00
CPC classification number: G06N20/00
Abstract: A method of training a machine learning algorithm comprises providing a set of input data, performing transforms on the input data to generate augmented data, to provide transformed base paths into machine learning algorithm encoders, segmenting the augmented data, calculating main base path outputs by applying a weighting to the segmented augmented data, calculating pruning masks from the input and augmented data to apply to the base paths of the machine learning algorithm encoders, the pruning masks having a binary value for each segment in the segmented augmented data, calculating sparse conditional path outputs by performing a computation on the segments of the segmented augmented data, and calculating a final output as a sum of the main base path outputs and the sparse conditional path outputs. A computer-implemented system for learning sparse features of a dataset is also disclosed.
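The final-output computation described in the abstract is a sum of a dense main base path and a sparse conditional path gated per segment by a binary pruning mask. A minimal sketch, with illustrative scalar weights standing in for the learned ones:

```python
import numpy as np

def sparse_dual_path(segments, base_w, cond_w, prune_mask):
    """Sketch of the dual-path sum: a main base path computed over every
    segment, plus a sparse conditional path computed only on segments
    whose pruning-mask bit is 1. Final output is their elementwise sum."""
    main = np.array([base_w * s.sum() for s in segments])   # main base path
    cond = np.array([cond_w * s.sum() if m else 0.0         # sparse path
                     for s, m in zip(segments, prune_mask)])
    return main + cond                                      # final output
```

Segments with a 0 mask bit skip the conditional computation entirely, which is where the sparsity saving comes from.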
-
5.
Publication No.: US20240145036A1
Publication Date: 2024-05-02
Application No.: US18187203
Filing Date: 2023-03-21
Applicant: Deliang Fan , Fan Zhang , Shaahin Angizi
Inventor: Deliang Fan , Fan Zhang , Shaahin Angizi
Abstract: A method of calculating an abundance of an mRNA sequence within a gene comprises storing an index table of the gene in a non-volatile memory, obtaining a short read of the mRNA sequence, generating a set of input fragments from the mRNA sequence, initializing a compatibility table in a volatile memory, for each input fragment in the set of input fragments, searching for an exact match of the input fragment in the index table, calculating a final result from the compatibility table, and calculating an abundance of the mRNA sequence in the gene by aggregating the transcripts compatible with the short read, wherein the calculating step is performed on the same integrated circuit as the non-volatile memory. A system for in-memory calculation of an abundance of an mRNA sequence within a gene is also disclosed.
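The index-table/compatibility-table flow described above can be sketched in software (the in-memory hardware aspect is omitted; the index-table contents and `k` below are illustrative, not from the patent):

```python
def transcript_abundance(read, index_table, k=4):
    """Sketch: split the short read into k-length fragments, look up each
    fragment's exact match in the gene index table (fragment -> set of
    compatible transcripts), AND the hits into a compatibility table, and
    return the transcripts that remain compatible with the whole read."""
    fragments = [read[i:i + k] for i in range(len(read) - k + 1)]
    compat = None                    # compatibility table, initially all-1s
    for frag in fragments:
        hits = index_table.get(frag, set())
        compat = hits if compat is None else compat & hits
    return compat or set()
```

Abundance then follows by aggregating, per transcript, how many reads it survives in; the claimed method performs these lookups and intersections on the same integrated circuit that stores the index table.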
-
6.
Publication No.: US20240095528A1
Publication Date: 2024-03-21
Application No.: US18463778
Filing Date: 2023-09-08
Applicant: Jae-sun Seo , Jian Meng , Li Yang , Deliang Fan
Inventor: Jae-sun Seo , Jian Meng , Li Yang , Deliang Fan
IPC: G06N3/08 , G06N3/0495
CPC classification number: G06N3/08 , G06N3/0495
Abstract: A method for increasing the temperature-resiliency of a neural network, the method comprising loading a neural network model into a resistive nonvolatile in-memory-computing chip, training the deep neural network model using a progressive knowledge distillation algorithm as a function of a teacher model, the algorithm comprising injecting, using a clean model as the teacher model, low-temperature noise values into a student model and changing, now using the student model as the teacher model, the low-temperature noises to high-temperature noises, and training the deep neural network model using a batch normalization adaptation algorithm, wherein the batch normalization adaptation algorithm includes training a plurality of batch normalization parameters with respect to a plurality of thermal variations.
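The two-stage noise schedule of the progressive distillation can be sketched as follows. This shows only the teacher/noise progression; the distillation loss, batch-normalization adaptation, and actual thermal-noise statistics are omitted, and the noise magnitudes are illustrative assumptions:

```python
import numpy as np

def progressive_distill(clean_w, low_sigma=0.01, high_sigma=0.05, rng=None):
    """Sketch of the two-stage schedule: stage 1 distills from the clean
    teacher under low-temperature noise; stage 2 promotes the stage-1
    student to teacher and raises the injected noise to the
    high-temperature level."""
    rng = rng or np.random.default_rng(0)
    # Stage 1: clean model is the teacher; inject low-temperature noise.
    student = clean_w + rng.normal(0.0, low_sigma, clean_w.shape)
    # Stage 2: stage-1 student becomes the teacher; noise level raised.
    student = student + rng.normal(0.0, high_sigma, clean_w.shape)
    return student
```

The gradual increase is the point: jumping straight to high-temperature noise makes training unstable, so the student is eased toward the worst-case thermal condition.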
-
7.
Publication No.: US20230078473A1
Publication Date: 2023-03-16
Application No.: US17932104
Filing Date: 2022-09-14
Applicant: Deliang Fan , Adnan Siraj Rakin , Li Yang , Chaitali Chakrabarti , Yu Cao , Jae-sun Seo , Jingtao Li
Inventor: Deliang Fan , Adnan Siraj Rakin , Li Yang , Chaitali Chakrabarti , Yu Cao , Jae-sun Seo , Jingtao Li
Abstract: A robust and accurate binary neural network, referred to as RA-BNN, is provided to simultaneously defend against adversarial noise injection and improve accuracy. Recently developed adversarial weight attack, a.k.a. bit-flip attack (BFA), has shown enormous success in compromising deep neural network (DNN) performance with an extremely small amount of model parameter perturbation. To defend against this threat, embodiments of RA-BNN adopt a complete binary neural network (BNN) to significantly improve DNN model robustness (defined as the number of bit-flips required to degrade the accuracy to as low as a random guess). To improve clean inference accuracy, a novel and efficient two-stage network growing method is proposed and referred to as early growth. Early growth selectively grows the channel size of each BNN layer based on channel-wise binary masks training with Gumbel-Sigmoid function. Apart from recovering the inference accuracy, the RA-BNN after growing also shows significantly higher resistance to BFA.
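The early-growth stage selects channels via binary masks trained with the Gumbel-Sigmoid trick. A minimal sketch of that sampling step (function name and temperature are assumptions; the growing schedule itself is not shown):

```python
import numpy as np

def gumbel_sigmoid_mask(logits, tau=1.0, rng=None, hard=True):
    """Sketch of channel-wise mask sampling with the Gumbel-Sigmoid trick.
    Each logit is a trainable per-channel score; adding the difference of
    two Gumbel samples and squashing with a temperature-scaled sigmoid
    gives a differentiable relaxation, binarized when hard=True."""
    rng = rng or np.random.default_rng(0)
    u1, u2 = rng.uniform(1e-9, 1.0, (2, len(logits)))
    g = np.log(np.log(u2) / np.log(u1))   # difference of two Gumbel samples
    soft = 1.0 / (1.0 + np.exp(-(logits + g) / tau))
    return (soft > 0.5).astype(float) if hard else soft
```

During training the soft relaxation lets gradients flow into the logits; channels whose mask settles at 1 are the ones the BNN layer grows.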
-
8.
Publication No.: US20240144998A1
Publication Date: 2024-05-02
Application No.: US18499643
Filing Date: 2023-11-01
Applicant: Deliang Fan , Shaahin Angizi
Inventor: Deliang Fan , Shaahin Angizi
IPC: G11C11/419 , G11C11/412 , H03K19/20
CPC classification number: G11C11/419 , G11C11/412 , H03K19/20
Abstract: A system for in-memory computing comprises a volatile memory comprising at least a first layered subarray, wherein each subarray comprises a plurality of memory cells, and a plurality of sub-sense amplifiers connected to a read bitline of the first subarray of the memory, configured to compare a measured voltage of the read bitline to at least one threshold and provide at least one binary output corresponding to a logic operation based on whether the voltage of the read bitline is above or below the threshold. A method for in-memory computing is also disclosed.
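The sensing principle can be modeled numerically: with two cells activated on one read bitline, the bitline voltage encodes how many cells hold '1', so comparing it against two references yields AND and OR in a single access. The voltage levels and threshold values below are illustrative assumptions, not the claimed circuit parameters:

```python
def sub_sense_amp(v_bitline, thresholds):
    """Sketch: each sub-sense amplifier compares the read-bitline voltage
    against one reference and emits a binary output."""
    return [1 if v_bitline > t else 0 for t in thresholds]

def inmem_and_or(cell_a, cell_b):
    """Two activated cells set the bitline voltage proportionally to the
    number of stored 1s (0.0, 0.5, or 1.0 here); a low threshold senses
    OR, a high threshold senses AND."""
    v = 0.5 * (cell_a + cell_b)
    or_out, and_out = sub_sense_amp(v, thresholds=[0.25, 0.75])
    return and_out, or_out
```

Other logic functions follow from the same idea with different reference placements or output combinations.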
-
9.
Publication No.: US20230297331A1
Publication Date: 2023-09-21
Application No.: US18187189
Filing Date: 2023-03-21
Applicant: Deliang Fan , Fan Zhang , Shaahin Angizi
Inventor: Deliang Fan , Fan Zhang , Shaahin Angizi
Abstract: A method of calculating a boundary value of a set of numerical values in a volatile memory comprises storing a set of numerical values in a volatile memory, initializing a comparison vector, initializing a matching vector, transpose-copying a first bit of each of the set of numerical values into a buffer, calculating a result vector, updating the matching vector, repeating the previous steps for each of the bits in the set of numerical values, and returning the matching vector, where the position of each 1 remaining in the matching vector corresponds to an index of the boundary value in the set of numerical values, wherein the computation and the memory storage take place on the same integrated circuit. A system for calculating a boundary value of a set of numerical values is also disclosed.
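The matching-vector iteration described above is a bit-serial search, processing one transposed bit plane at a time from the most significant bit down. A software sketch for the maximum-finding case (the in-memory transpose-copy and result-vector hardware steps collapse into array operations here; names are illustrative):

```python
def bitserial_max_indices(values, nbits):
    """Sketch of the bit-serial boundary search. 'matching' starts all-1;
    at each bit plane (MSB first), values whose bit is 0 are eliminated
    whenever some still-matching value has a 1 there. The surviving 1s
    mark the indices of the maximum value."""
    matching = [1] * len(values)
    for bit in range(nbits - 1, -1, -1):          # MSB -> LSB
        plane = [(v >> bit) & 1 for v in values]  # transposed bit row
        result = [m & p for m, p in zip(matching, plane)]
        if any(result):            # keep only values matching this 1-bit
            matching = result
    return matching
```

Finding the minimum is symmetric (eliminate on 1s instead of 0s), and because each step touches one bit plane of all values at once, it maps naturally onto column-parallel in-memory operations.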
-
10.
Publication No.: US20240135256A1
Publication Date: 2024-04-25
Application No.: US18494330
Filing Date: 2023-10-24
Applicant: Jae-sun Seo , Jian Meng , Li Yang , Deliang Fan
Inventor: Jae-sun Seo , Jian Meng , Li Yang , Deliang Fan
IPC: G06N20/00
CPC classification number: G06N20/00
Abstract: A method of training a machine learning algorithm comprises providing a set of input data, performing transforms on the input data to generate augmented data, to provide transformed base paths into machine learning algorithm encoders, segmenting the augmented data, calculating main base path outputs by applying a weighting to the segmented augmented data, calculating pruning masks from the input and augmented data to apply to the base paths of the machine learning algorithm encoders, the pruning masks having a binary value for each segment in the segmented augmented data, calculating sparse conditional path outputs by performing a computation on the segments of the segmented augmented data, and calculating a final output as a sum of the main base path outputs and the sparse conditional path outputs. A computer-implemented system for learning sparse features of a dataset is also disclosed.