-
Publication No.: WO2021188136A1
Publication Date: 2021-09-23
Application No.: PCT/US2020/040530
Filing Date: 2020-07-01
Applicant: WESTERN DIGITAL TECHNOLOGIES, INC.
Inventor: KINNEY, Justin
Abstract: Methods and systems for processing a plurality of sample reads for genome sequencing include, for each sample read of the plurality of sample reads, comparing substring sequences from the sample read to reference sequences representing different portions of a reference genome. One or more reference sequences are identified that match one or more of the compared substring sequences, and a probabilistic location within the reference genome is determined for the sample read based on the one or more identified reference sequences. The plurality of sample reads is sorted into a plurality of sample groups based on the determined probabilistic locations of the respective sample reads.
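The flow in this abstract can be illustrated in software. The following is a minimal sketch only, not the claimed method: it assumes fixed-length substrings of `k` bases, a voting scheme in which each matching substring votes for the read start it implies, and sample groups defined by equal-width bins of `group_size` bases. The function names and all parameters are hypothetical.

```python
from collections import Counter, defaultdict


def build_index(reference, k):
    """Map each k-base reference sequence to the positions where it starts."""
    index = defaultdict(list)
    for pos in range(len(reference) - k + 1):
        index[reference[pos:pos + k]].append(pos)
    return index


def probable_location(read, index, k):
    """Estimate the read's start position; None if no substring matches."""
    votes = Counter()
    for offset in range(len(read) - k + 1):
        for pos in index.get(read[offset:offset + k], []):
            # A match at reference position `pos` implies this read start.
            votes[pos - offset] += 1
    if not votes:
        return None
    return votes.most_common(1)[0][0]


def sort_into_groups(reads, reference, k, group_size):
    """Group reads by the bin that contains their estimated location."""
    index = build_index(reference, k)
    groups = defaultdict(list)
    for read in reads:
        loc = probable_location(read, index, k)
        key = None if loc is None else loc // group_size
        groups[key].append(read)
    return dict(groups)
```

Reads with no matching substring are collected under the key `None` in this sketch so that they are not silently dropped.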
-
Publication No.: WO2021262252A1
Publication Date: 2021-12-30
Application No.: PCT/US2021/014952
Filing Date: 2021-01-25
Applicant: WESTERN DIGITAL TECHNOLOGIES, INC.
Inventor: MA, Wen , HOANG, Tung, Thanh , BEDAU, Daniel , KINNEY, Justin
IPC: H04L12/743 , G06F16/903 , C12Q1/6869 , G16B30/10 , G16B40/20
Abstract: A device includes arrays of Non-Volatile Memory (NVM) cells. Reference sequences representing portions of a genome are stored in respective groups of NVM cells. Exact matching phase substring sequences representing portions of at least one sample read are loaded into groups of NVM cells. One or more groups of NVM cells are identified where the stored reference sequence matches the loaded exact matching phase substring sequence using the arrays as Content Addressable Memories (CAMs). Approximate matching phase substring sequences are loaded into groups of NVM cells. One or more groups of NVM cells are identified where the stored reference sequence approximately matches the loaded approximate matching phase substring sequence using the arrays as Ternary CAMs (TCAMs). At least one of the reference sequence and the approximate matching phase substring sequence for each group of NVM cells includes at least one wildcard value when the arrays are used as TCAMs.
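The two matching phases can be emulated in software to show the difference between CAM and TCAM lookups. This is a sketch under stated assumptions, not the hardware described: each group of NVM cells is modeled as a Python string, the wildcard is written as `N` (a hypothetical choice), and the helper `mask` for placing wildcards in a substring is invented for illustration.

```python
WILDCARD = "N"  # hypothetical symbol for a "don't care" cell


def cam_search(stored, query):
    """Exact phase: indices of groups whose stored sequence equals the query."""
    return [i for i, ref in enumerate(stored) if ref == query]


def tcam_search(stored, query):
    """Approximate phase: a wildcard on either side matches any base."""
    def cell_matches(r, q):
        return r == q or r == WILDCARD or q == WILDCARD

    return [
        i for i, ref in enumerate(stored)
        if len(ref) == len(query)
        and all(cell_matches(r, q) for r, q in zip(ref, query))
    ]


def mask(sequence, positions):
    """Replace the bases at the given positions with wildcards."""
    return "".join(
        WILDCARD if i in positions else base
        for i, base in enumerate(sequence)
    )
```

For example, with `stored = ["ACGT", "ACCT", "TTGA"]`, the exact phase finds nothing for `"ACAT"`, while the approximate phase with the third base masked (`mask("ACAT", {2})`) identifies the first two groups.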
-
Publication No.: WO2021188138A1
Publication Date: 2021-09-23
Application No.: PCT/US2020/040570
Filing Date: 2020-07-01
Applicant: WESTERN DIGITAL TECHNOLOGIES, INC.
Inventor: KINNEY, Justin
Abstract: A device for locating a sample read with respect to a reference genome includes a plurality of groups of cells. Each group of cells stores a reference sequence representing reference bases from the reference genome corresponding to an order of cells in the respective group of cells. Each group of cells further stores a current substring sequence representing sample bases from the sample read corresponding to the order of the cells in the respective group of cells. Each group of cells stores the same current substring sequence and a reference sequence representing a portion of the reference genome that partially overlaps at least one other portion of the reference genome represented by one or more other reference sequences stored in one or more other groups of cells. Groups of cells are identified among the plurality of groups of cells where the stored reference sequence matches the current substring sequence.
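The arrangement of overlapping reference sequences can be sketched as a sliding window over the reference. This is an illustration only: it assumes each group holds a reference sequence of the same length as the current substring, that windows advance by a fixed `step` smaller than `width` so that neighboring groups overlap, and that the comparison is plain equality. The function names and parameters are hypothetical.

```python
def load_groups(reference, width, step):
    """Return (start, sequence) pairs; step < width makes the windows overlap."""
    return [
        (start, reference[start:start + width])
        for start in range(0, len(reference) - width + 1, step)
    ]


def matching_groups(groups, substring):
    """Compare the same substring against every group; return matching starts."""
    return [start for start, ref in groups if ref == substring]
```

In the described device the comparison happens in every group of cells at once. The list comprehension here performs the same comparisons one after another.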
-
Publication No.: WO2021188137A1
Publication Date: 2021-09-23
Application No.: PCT/US2020/040568
Filing Date: 2020-07-01
Applicant: WESTERN DIGITAL TECHNOLOGIES, INC.
Inventor: KINNEY, Justin
Abstract: Methods and systems for processing a plurality of sample reads for genome sequencing include, for each sample read of the plurality of sample reads, comparing substring sequences from the sample read to reference sequences representing different portions of a reference genome. One or more reference sequences are identified that match one or more of the compared substring sequences, and a probabilistic location within the reference genome is determined for the sample read based on the one or more identified reference sequences. The reference genome is partitioned for reference-aligned genome sequencing based on the determined probabilistic locations of the respective sample reads.
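The abstract does not say how the partition boundaries are chosen. One plausible reading, shown here purely as an assumption, is to cut the reference so that each partition receives roughly the same number of reads. The function name and parameters below are hypothetical, and `locations` stands for the probabilistic locations already determined for the sample reads.

```python
def partition_reference(locations, genome_length, num_partitions):
    """Return half-open (start, end) intervals covering [0, genome_length).

    Boundaries are placed at quantiles of the read locations so that the
    partitions hold roughly equal numbers of reads (an assumed criterion).
    """
    ordered = sorted(locations)
    if not ordered:
        # No located reads: keep the reference as a single partition.
        return [(0, genome_length)]
    boundaries = [0]
    for i in range(1, num_partitions):
        cut = ordered[i * len(ordered) // num_partitions]
        # Keep boundaries non-decreasing when many reads share a location.
        boundaries.append(max(cut, boundaries[-1]))
    boundaries.append(genome_length)
    return list(zip(boundaries[:-1], boundaries[1:]))
```

Densely covered regions end up in narrower partitions than sparsely covered ones. When many reads share one location, some intervals in this sketch can be empty.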
-