-
Publication No.: WO2021252798A1
Publication Date: 2021-12-16
Application No.: PCT/US2021/036872
Filing Date: 2021-06-10
Applicant: LIFE TECHNOLOGIES CORPORATION
Inventor: CHU, Yong , SCHNEIDER, Stephanie, Jo , SCHAEFFER, Rylan , WOO, David
IPC: G16B40/10 , G16B30/00 , G06N3/02 , C12Q1/6869 , G06K9/6201 , G06K9/623 , G06K9/6256 , G06K9/6277 , G06N3/0445 , G06N3/0454 , G06N3/0472 , G06N3/0481 , G06N3/082 , G06N3/084 , G06N3/088 , G06N5/046 , G16B25/00 , G16B40/00
Abstract: A method of automatically sequencing or basecalling one or more DNA (deoxyribonucleic acid) molecules of a biological sample is described. The method comprises using a capillary electrophoresis genetic analyzer to measure the biological sample to obtain at least one input trace comprising digital data corresponding to fluorescence values for a plurality of scans. Scan labelling probabilities for the plurality of scans are generated using a trained artificial neural network comprising a plurality of layers including convolutional layers. A basecall sequence comprising a plurality of basecalls for the one or more DNA molecules based on the scan labelling probabilities for the plurality of scans is determined.
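The step from a fluorescence trace to per-scan labelling probabilities can be illustrated with a minimal sketch. It assumes a four-channel trace (one dye channel per base), a single 1D convolution with zero padding, and an extra "no base" label; none of these details are stated in the abstract, and the hand-set weights from the hypothetical `toy_kernels` helper stand in for a trained network.

```python
import math

# Assumed label set: four bases plus a "no base" label (not from the abstract).
LABELS = ["A", "C", "G", "T", "-"]


def conv1d_same(trace, kernels, biases):
    """Apply one 1D convolution over the scan axis with zero padding.

    trace:   list of scans, each a list of per-channel fluorescence values.
    kernels: one kernel per output label; each kernel is a list of taps,
             each tap a list of per-channel weights.
    Returns one row of logits per scan (output length == input length).
    """
    n_scans = len(trace)
    n_channels = len(trace[0])
    half = len(kernels[0]) // 2
    logits = []
    for t in range(n_scans):
        row = []
        for kernel, bias in zip(kernels, biases):
            acc = bias
            for k, tap in enumerate(kernel):
                s = t + k - half
                if 0 <= s < n_scans:  # scans outside the trace count as zero
                    for c in range(n_channels):
                        acc += tap[c] * trace[s][c]
            row.append(acc)
        logits.append(row)
    return logits


def softmax(row):
    """Turn one row of logits into probabilities that sum to 1."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def scan_label_probabilities(trace, kernels, biases):
    """Per-scan probabilities over LABELS."""
    return [softmax(row) for row in conv1d_same(trace, kernels, biases)]


def toy_kernels(n_channels=4, width=3):
    """Hypothetical hand-set weights: each base label sums its own channel."""
    kernels = [
        [[1.0 if c == label else 0.0 for c in range(n_channels)]
         for _ in range(width)]
        for label in range(n_channels)
    ]
    kernels.append([[0.0] * n_channels for _ in range(width)])  # "no base"
    biases = [0.0] * n_channels + [1.0]
    return kernels, biases


trace = [[0, 0, 0, 0], [5, 0, 0, 0], [0, 0, 0, 0], [0, 0, 4, 0]]
kernels, biases = toy_kernels()
probs = scan_label_probabilities(trace, kernels, biases)
```

A real model would stack several trained layers; the point here is only the shape of the data: one probability row per scan, which a later decoding step turns into basecalls.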
-
Publication No.: WO2020123552A1
Publication Date: 2020-06-18
Application No.: PCT/US2019/065540
Filing Date: 2019-12-10
Applicant: LIFE TECHNOLOGIES CORPORATION
Inventor: CHU, Yong , SCHNEIDER, Stephanie , SCHAEFFER, Rylan , WOO, David
Abstract: A deep basecaller system for Sanger sequencing and associated methods are provided. The methods use deep machine learning. A Deep Learning Model is used to determine scan labelling probabilities based on an analyzed trace. A Neural Network is trained to learn the optimal mapping function to minimize a Connectionist Temporal Classification (CTC) Loss function. The CTC function is used to calculate loss by matching a target sequence and predicted scan labelling probabilities. A Decoder generates a sequence with the maximum probability. A Basecall position finder using prefix beam search is used to walk through CTC labelling probabilities to find a scan range and then the scan position of peak labelling probability within the scan range for each called base. A Quality Value (QV) is determined using a feature vector calculated from CTC labelling probabilities as an index into a QV look-up table to find a quality score.
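The decoding, position-finding, and quality-lookup steps can be sketched roughly as follows. This is an illustration under stated assumptions, not the patented method: it uses greedy best-path CTC decoding in place of the prefix beam search named in the abstract, reduces the QV feature vector to a single feature (the peak probability), and uses a made-up look-up table. The names `greedy_basecalls`, `quality_value`, and `HYPOTHETICAL_QV_TABLE` are hypothetical.

```python
LABELS = "ACGT-"  # assumed label order; "-" is the CTC blank
BLANK = 4

# Made-up table for illustration only; real QV tables are calibrated on data.
HYPOTHETICAL_QV_TABLE = [5, 10, 20, 30, 40]


def greedy_basecalls(probs):
    """Greedy best-path CTC decoding with scan ranges and peak positions.

    probs: one row per scan, each row a probability per label in LABELS.
    Returns a list of (base, start_scan, end_scan, peak_scan, peak_prob).
    Consecutive scans with the same non-blank label collapse into one call;
    a blank between two identical labels separates them into two calls.
    """
    calls = []
    prev = BLANK
    for t, row in enumerate(probs):
        best = max(range(len(row)), key=row.__getitem__)
        if best != BLANK:
            if best == prev:
                # Same base as the previous scan: extend the current range
                # and move the peak if this scan is more confident.
                base, start, _, peak_scan, peak_prob = calls[-1]
                if row[best] > peak_prob:
                    peak_scan, peak_prob = t, row[best]
                calls[-1] = (base, start, t, peak_scan, peak_prob)
            else:
                calls.append((LABELS[best], t, t, t, row[best]))
        prev = best
    return calls


def quality_value(peak_prob, table=HYPOTHETICAL_QV_TABLE):
    """Bin a single feature (the peak probability) into a table index."""
    index = min(int(peak_prob * len(table)), len(table) - 1)
    return table[index]


probs = [
    [0.10, 0.10, 0.10, 0.10, 0.60],      # blank
    [0.70, 0.10, 0.10, 0.05, 0.05],      # A
    [0.90, 0.025, 0.025, 0.025, 0.025],  # A (peak of the first call)
    [0.10, 0.10, 0.10, 0.10, 0.60],      # blank separates the two A calls
    [0.90, 0.025, 0.025, 0.025, 0.025],  # A
    [0.05, 0.05, 0.55, 0.05, 0.30],      # G
]
calls = greedy_basecalls(probs)
sequence = "".join(call[0] for call in calls)
qvs = [quality_value(call[4]) for call in calls]
```

With these toy probabilities the decoded sequence is AAG, the first call spans scans 1 to 2 with its peak at scan 2, and the lower-confidence G receives a lower quality value than the two A calls.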
-