Systems and methods for index hopping filtering

    Publication number: US12112833B2

    Publication date: 2024-10-08

    Application number: US17168050

    Filing date: 2021-02-04

    CPC classification number: G16B30/00 G06F16/2255 G06F16/2365

    Abstract: Methods for index hopping sequence read filtering are provided. Each read in a plurality of reads from a multiplexed reaction comprises an insert portion, and first (molecular identifier) and second (sample index) non-insert portions. For each of a plurality of hashes, a hash data structure is formed with a representation of each read. Each representation comprises a hash of the first non-insert portion of the corresponding read. Read pairs are identified in the hash data structures. Each pair includes a first and second read sharing a common hash value but differing index values. An entry is added into a heterogeneous data structure, for each such pair, that includes the first and second non-insert portions of the first and second reads of the pair. Reads with first non-insert portion values appearing more than a threshold number of times in the heterogeneous data structure are removed from the plurality of reads.
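The filtering steps summarized in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. The `Read` type, the use of salted BLAKE2b as the "plurality of hashes", the tuple layout of the heterogeneous entries, the per-entry counting rule, and the default threshold of 0 are all assumptions made for illustration. The abstract does not say how hash collisions between different molecular identifiers are handled, so the sketch simply uses a 64-bit digest to make them improbable.

```python
import hashlib
from collections import Counter, defaultdict
from typing import NamedTuple


class Read(NamedTuple):
    umi: str           # first non-insert portion (molecular identifier)
    sample_index: str  # second non-insert portion (sample index)
    insert: str        # insert portion


def umi_hash(umi: str, seed: int) -> int:
    """Hash the molecular identifier; each seed yields a different hash function."""
    digest = hashlib.blake2b(
        umi.encode(), digest_size=8, salt=seed.to_bytes(8, "little")
    ).digest()
    return int.from_bytes(digest, "little")


def filter_index_hopping(reads, seeds=(0, 1, 2), threshold=0):
    """Drop reads whose identifier occurs in more than `threshold` entries."""
    hetero = set()  # entries: ((umi_a, index_a), (umi_b, index_b))
    for seed in seeds:
        # One hash data structure per hash: hash value -> read representations.
        table = defaultdict(set)
        for r in reads:
            table[umi_hash(r.umi, seed)].add((r.umi, r.sample_index))
        # Pairs that share a hash value but carry different sample indexes.
        for members in table.values():
            ordered = sorted(members)
            for i, a in enumerate(ordered):
                for b in ordered[i + 1:]:
                    if a[1] != b[1]:
                        hetero.add((a, b))
    # Count, per identifier, the entries in which it appears (assumed rule).
    counts = Counter()
    for a, b in hetero:
        for umi in {a[0], b[0]}:
            counts[umi] += 1
    flagged = {umi for umi, n in counts.items() if n > threshold}
    return [r for r in reads if r.umi not in flagged]


if __name__ == "__main__":
    demo = [
        Read("AAAA", "S1", "ACGT"),
        Read("AAAA", "S2", "ACGT"),  # same identifier under another index
        Read("CCCC", "S1", "TTGA"),
        Read("GGGG", "S2", "GGCA"),
    ]
    for read in filter_index_hopping(demo):
        print(read)
```

In this sketch, raising `threshold` keeps identifiers that are seen under different sample indexes only a few times, which trades sensitivity for fewer discarded reads.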

    SYSTEMS AND METHODS FOR INDEX HOPPING FILTERING

    Publication number: US20210241853A1

    Publication date: 2021-08-05

    Application number: US17168050

    Filing date: 2021-02-04

