SYSTEMS AND METHODS FOR IDENTIFYING DIFFERENTIAL ACCESSIBILITY OF GENE REGULATORY ELEMENTS AT SINGLE CELL RESOLUTION

    Publication No.: US20210332354A1

    Publication Date: 2021-10-28

    Application No.: US17231972

    Filing Date: 2021-04-15

    Abstract: A method for ascertaining differential accessibility of transcription factor (TF) binding motifs in open chromatin regions of cells, comprising: receiving a cell barcode genomic sequence dataset; aligning each of a plurality of fragment sequence reads to a reference sequence; identifying peaks defined by the aligned plurality of fragment sequence reads; generating a peak-barcode matrix comprising peaks for each cell barcode; clustering cells with peaks having similar chromatin accessibility profiles into one or more cell clusters; generating a TF barcode matrix that maps each peak in the peak-barcode matrix to one or more given TF binding motifs; performing a differential accessibility analysis that identifies differences in accessibility of peaks and TF binding motifs associated with each identified cell cluster relative to all other identified cell clusters; and generating an output of one or more refined cell clusters based on the differential accessibility analysis.
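
The matrix steps in this abstract can be illustrated with a minimal sketch. It is not the patented implementation: the function names, the binary motif-by-peak indicator matrix, and the pseudocounted log2 fold change used as the "one cluster versus all others" statistic are all assumptions chosen for illustration, and the clustering step is replaced by labels supplied by hand.

```python
import numpy as np

def peak_barcode_matrix(overlaps, n_peaks, barcodes):
    """Count fragments per (peak, barcode); rows are peaks, columns are barcodes."""
    col = {bc: j for j, bc in enumerate(barcodes)}
    mat = np.zeros((n_peaks, len(barcodes)), dtype=int)
    for peak_idx, bc in overlaps:
        mat[peak_idx, col[bc]] += 1
    return mat

def tf_barcode_matrix(motif_peak, mat):
    """Sum peak counts over the peaks that contain each motif.

    motif_peak is a hypothetical motifs x peaks 0/1 indicator matrix.
    """
    return motif_peak @ mat

def one_vs_rest_log2fc(mat, labels, cluster, pseudocount=1.0):
    """Log2 ratio of mean counts inside `cluster` versus all other cells."""
    in_cluster = np.asarray(labels) == cluster
    mean_in = mat[:, in_cluster].mean(axis=1)
    mean_out = mat[:, ~in_cluster].mean(axis=1)
    return np.log2((mean_in + pseudocount) / (mean_out + pseudocount))

# Toy data: (peak index, barcode) pairs for fragments overlapping a peak.
overlaps = [(0, "A"), (0, "A"), (0, "B"), (1, "C"), (1, "D")]
barcodes = ["A", "B", "C", "D"]
labels = [0, 0, 1, 1]                      # cluster labels, assumed given
motif_peak = np.array([[1, 1], [0, 1]])    # motif 0 in both peaks, motif 1 in peak 1

mat = peak_barcode_matrix(overlaps, n_peaks=2, barcodes=barcodes)
tf = tf_barcode_matrix(motif_peak, mat)
fc = one_vs_rest_log2fc(mat, labels, cluster=0)
```

The same `one_vs_rest_log2fc` call can be applied to `tf` to compare motif-level accessibility between clusters. A real analysis would replace the fold change with a statistical test and normalize for sequencing depth.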

    SYSTEMS AND METHODS FOR IDENTIFYING FEATURE LINKAGES IN MULTI-GENOMIC FEATURE DATA FROM SINGLE-CELL PARTITIONS

    Publication No.: US20220076784A1

    Publication Date: 2022-03-10

    Application No.: US17465725

    Filing Date: 2021-09-02

    Inventors: Li Wang, Yiming Kang

    Abstract: Methods and systems are provided for generating linkage correlations and linkage significances between a first genomic feature and a second genomic feature identified for each of a plurality of cells. For example, the method may comprise receiving a data matrix comprising a first genomic feature and a second genomic feature identified for each of a plurality of cells; smoothing the data matrix to generate a smoothed matrix; generating linkage correlations between the first genomic feature and the second genomic feature identified for each of the plurality of cells in the data matrix; generating linkage significances using multiplication of a plurality of linkage matrices; and outputting the linkage correlations and linkage significances for each of the plurality of cells in the data matrix.
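
The smoothing and correlation steps can be sketched as follows. The abstract does not say how smoothing is done or which correlation is used, so this example assumes smoothing by a row-normalized cell-by-cell weight matrix (for instance from a nearest-neighbor graph) and Pearson correlation computed through a single matrix product. The function names are hypothetical, and the significance step is not reproduced because the abstract gives too little detail.

```python
import numpy as np

def smooth(X, W):
    """Replace each cell's profile with a weighted average over cells.

    X: features x cells. W: cells x cells nonnegative weights, where row i
    holds the weights used to smooth cell i (assumed to have a nonzero sum).
    """
    W = W / W.sum(axis=1, keepdims=True)
    return X @ W.T

def linkage_correlations(A, B):
    """Pearson correlation of every row of A with every row of B across cells.

    Assumes no row is constant, since a zero standard deviation would
    divide by zero.
    """
    Za = (A - A.mean(axis=1, keepdims=True)) / A.std(axis=1, keepdims=True)
    Zb = (B - B.mean(axis=1, keepdims=True)) / B.std(axis=1, keepdims=True)
    return Za @ Zb.T / A.shape[1]

# Toy data: one feature of the first type, two of the second type, four cells.
peaks = np.array([[1.0, 2.0, 3.0, 4.0]])
genes = np.array([[2.0, 4.0, 6.0, 8.0],
                  [4.0, 3.0, 2.0, 1.0]])
corr = linkage_correlations(peaks, genes)

# Smoothing two cells with uniform weights averages them.
smoothed = smooth(np.array([[0.0, 2.0]]), np.ones((2, 2)))
```

Writing the correlation as a product of standardized matrices keeps the computation vectorized, which matters when both feature sets are large.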

    SYSTEMS AND METHODS FOR CORRECTING SAMPLE PREPARATION ARTIFACTS IN DROPLET-BASED SEQUENCING

    Publication No.: US20210324454A1

    Publication Date: 2021-10-21

    Application No.: US17232058

    Filing Date: 2021-04-15

    Abstract: A method for filtering open chromatin regions on a cell barcode genomic sequence dataset is provided. The method comprises receiving, by one or more processors, a cell barcode genomic sequence dataset, the dataset comprising a plurality of fragment sequence reads and barcodes associated with the plurality of fragment sequence reads. The method further comprises generating, by the one or more processors, an adjacency matrix that counts pairs of adjacent fragment sequence reads and the barcodes associated with each fragment sequence read. The method further comprises identifying, by the one or more processors, pairs of adjacent fragment sequence reads with different barcodes and annotating each such pair as a multiplet pair. The method further comprises filtering, by the one or more processors, one fragment sequence read from each of the identified multiplet pairs. The method further comprises generating, by the one or more processors, a multiplet filtered cell barcode genomic sequence dataset.
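
The filtering steps can be sketched under one loud assumption: the abstract does not define "adjacent", so this example treats two fragments as adjacent when one ends exactly where the other starts on the same chromosome. The function name, the tuple layout, and the choice to drop the downstream fragment of each multiplet pair are likewise illustrative, not taken from the patent.

```python
from collections import Counter, defaultdict

def filter_multiplet_pairs(fragments):
    """Flag adjacent fragments with different barcodes and drop one per pair.

    fragments: list of (chrom, start, end, barcode) with end > start.
    Returns (kept_fragments, pair_counts), where pair_counts plays the role
    of a sparse barcode-by-barcode adjacency matrix.
    """
    # Index fragments by where they start so adjacency is a dictionary lookup.
    by_start = defaultdict(list)
    for idx, (chrom, start, _end, _bc) in enumerate(fragments):
        by_start[(chrom, start)].append(idx)

    pair_counts = Counter()
    drop = set()
    for chrom, _start, end, bc in fragments:
        for other in by_start.get((chrom, end), []):
            other_bc = fragments[other][3]
            pair_counts[tuple(sorted((bc, other_bc)))] += 1
            if other_bc != bc:
                drop.add(other)  # multiplet pair: remove the downstream fragment
    kept = [f for i, f in enumerate(fragments) if i not in drop]
    return kept, pair_counts

fragments = [
    ("chr1", 100, 200, "A"),
    ("chr1", 200, 300, "B"),   # starts where the "A" fragment ends
    ("chr1", 500, 600, "A"),
    ("chr2", 10, 50, "C"),
]
kept, pair_counts = filter_multiplet_pairs(fragments)
```

Here the `"A"`/`"B"` pair is counted once and the `"B"` fragment is removed, leaving three fragments in the filtered dataset.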
