-
Publication No.: US20210241853A1
Publication Date: 2021-08-05
Application No.: US17168050
Filing Date: 2021-02-04
Applicant: 10X Genomics, Inc.
Inventor: Nicolaus Lance Hepler, Chaitanya Aluru, Patrick J. Marks, Niranjan Srinivas, Nigel Delaney
Abstract: Methods for index hopping sequence read filtering are provided. Each read in a plurality of reads from a multiplexed reaction comprises an insert portion, and first (molecular identifier) and second (sample index) non-insert portions. For each of a plurality of hashes, a hash data structure is formed with a representation of each read. Each representation comprises a hash of the first non-insert portion of the corresponding read. Read pairs are identified in the hash data structures. Each pair includes a first and second read sharing a common hash value but differing index values. An entry is added into a heterogeneous data structure, for each such pair, that includes the first and second non-insert portions of the first and second reads of the pair. Reads with first non-insert portion values appearing more than a threshold number of times in the heterogeneous data structure are removed from the plurality of reads.
-
Publication No.: US12112833B2
Publication Date: 2024-10-08
Application No.: US17168050
Filing Date: 2021-02-04
Applicant: 10X Genomics, Inc.
Inventor: Nicolaus Lance Hepler, Chaitanya Aluru, Patrick J. Marks, Niranjan Srinivas, Nigel Delaney
CPC classification number: G16B30/00, G06F16/2255, G06F16/2365
Abstract: Methods for index hopping sequence read filtering are provided. Each read in a plurality of reads from a multiplexed reaction comprises an insert portion, and first (molecular identifier) and second (sample index) non-insert portions. For each of a plurality of hashes, a hash data structure is formed with a representation of each read. Each representation comprises a hash of the first non-insert portion of the corresponding read. Read pairs are identified in the hash data structures. Each pair includes a first and second read sharing a common hash value but differing index values. An entry is added into a heterogeneous data structure, for each such pair, that includes the first and second non-insert portions of the first and second reads of the pair. Reads with first non-insert portion values appearing more than a threshold number of times in the heterogeneous data structure are removed from the plurality of reads.
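The filtering flow described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the `Read` type, the `salted_hash` and `filter_index_hopped` names, the use of salted BLAKE2b as the "plurality of hashes", the set of pair tuples standing in for the "heterogeneous data structure", and the default parameter values are all assumptions made for illustration.

```python
import hashlib
from collections import Counter, defaultdict
from itertools import combinations
from typing import NamedTuple


class Read(NamedTuple):
    """Hypothetical read record; field names are illustrative."""
    insert: str        # insert portion
    umi: str           # first non-insert portion (molecular identifier)
    sample_index: str  # second non-insert portion (sample index)


def salted_hash(value: str, salt: int) -> str:
    # Assumption: one salted BLAKE2b digest per hash in the plurality.
    digest = hashlib.blake2b(
        value.encode(), digest_size=8, salt=salt.to_bytes(8, "little")
    )
    return digest.hexdigest()


def filter_index_hopped(reads, num_hashes=2, threshold=1):
    """Drop reads whose molecular identifier appears in more than
    `threshold` cross-index pair entries."""
    pairs = set()  # stand-in for the heterogeneous data structure
    for salt in range(num_hashes):
        # One hash data structure per hash, keyed on the hashed identifier.
        buckets = defaultdict(list)
        for read in reads:
            buckets[salted_hash(read.umi, salt)].append(read)
        for bucket in buckets.values():
            for a, b in combinations(bucket, 2):
                # A pair shares a hash value but differs in sample index.
                if a.sample_index != b.sample_index:
                    entry = tuple(sorted([(a.umi, a.sample_index),
                                          (b.umi, b.sample_index)]))
                    pairs.add(entry)  # the set ignores repeats across hashes

    # Count, per molecular identifier, the entries it appears in.
    counts = Counter()
    for (umi_a, _), (umi_b, _) in pairs:
        for umi in {umi_a, umi_b}:
            counts[umi] += 1

    return [r for r in reads if counts[r.umi] <= threshold]
```

In this sketch, a molecular identifier observed under three different sample indices produces three pair entries, so with `threshold=1` all reads carrying it are removed, while an identifier seen under a single sample index produces no entries and is kept. The pairwise comparison within each bucket is quadratic in bucket size, which is acceptable only for small illustrative inputs.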
-