Methods and systems for predicting DNA accessibility in the pan-cancer genome
Abstract:
Techniques are provided for predicting DNA accessibility. DNase-seq data files and RNA-seq data files for a plurality of cell types are paired by assigning each DNase-seq data file to an RNA-seq data file that is at least within the same biotype. A neural network is configured to be trained using batches of the paired data files. Configuring the neural network comprises configuring convolutional layers that process a first input, comprising DNA sequence data from a paired data file, to generate a convolved output, and configuring fully connected layers, following the convolutional layers, that concatenate the convolved output with a second input, comprising gene expression levels derived from the RNA-seq data of the paired data file, and process the concatenation to generate a DNA accessibility prediction output. The trained neural network is then used to predict DNA accessibility in a genomic sample input comprising RNA-seq data and whole genome sequencing data for a new cell type.
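The pairing step and the two-input architecture described above can be illustrated with a minimal sketch in plain Python. It is not the claimed implementation. All function names, layer sizes, the random weights, and the rule of taking the first RNA-seq file of a matching biotype are assumptions made for illustration; the abstract specifies only that pairs share at least a biotype, that convolutional layers process the DNA sequence, and that fully connected layers process the concatenation of the convolved output with the gene expression levels. The sigmoid output and global max pooling are likewise assumed, and no training is shown.

```python
import math
import random

BASES = "ACGT"

def one_hot(seq):
    """Encode a DNA string as one-hot rows over A, C, G, T (other symbols -> zeros)."""
    return [[1.0 if base == b else 0.0 for b in BASES] for base in seq.upper()]

def pair_by_biotype(dnase_files, rna_files):
    """Pair each DNase-seq file with an RNA-seq file of the same biotype.

    Inputs are lists of (file_id, biotype) tuples. Assumption: the first
    matching RNA-seq file is used; DNase-seq files with no match are skipped.
    """
    rna_by_biotype = {}
    for file_id, biotype in rna_files:
        rna_by_biotype.setdefault(biotype, []).append(file_id)
    return [(file_id, rna_by_biotype[biotype][0])
            for file_id, biotype in dnase_files if biotype in rna_by_biotype]

def conv1d_relu_maxpool(x, filters):
    """Valid 1D convolution along the sequence, ReLU, then global max pooling."""
    pooled = []
    for filt in filters:  # each filter: `width` rows of 4 weights
        width = len(filt)
        best = 0.0  # starting at 0 applies the ReLU floor to the pooled maximum
        for start in range(len(x) - width + 1):
            act = sum(x[start + i][c] * filt[i][c]
                      for i in range(width) for c in range(4))
            best = max(best, act)
        pooled.append(best)
    return pooled

def dense(vec, weights, bias):
    """Fully connected layer: one output per weight row."""
    return [sum(v * w for v, w in zip(vec, row)) + b
            for row, b in zip(weights, bias)]

def predict_accessibility(seq, expression, params):
    """Forward pass: convolve the sequence, concatenate expression, apply dense layers."""
    convolved = conv1d_relu_maxpool(one_hot(seq), params["filters"])
    joined = convolved + list(expression)  # concatenation of the two inputs
    hidden = [max(0.0, h) for h in dense(joined, params["w1"], params["b1"])]
    logit = dense(hidden, params["w2"], params["b2"])[0]
    return 1.0 / (1.0 + math.exp(-logit))  # assumed sigmoid output in (0, 1)

def init_params(n_filters, width, n_genes, n_hidden, seed=0):
    """Random, untrained weights (hypothetical sizes, for illustration only)."""
    rng = random.Random(seed)
    def mat(rows, cols):
        return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]
    return {
        "filters": [mat(width, 4) for _ in range(n_filters)],
        "w1": mat(n_hidden, n_filters + n_genes),
        "b1": [0.0] * n_hidden,
        "w2": mat(1, n_hidden),
        "b2": [0.0],
    }

# Hypothetical file IDs and biotypes, not taken from any real dataset.
pairs = pair_by_biotype([("dnase_1", "liver"), ("dnase_2", "lung")],
                        [("rna_1", "lung"), ("rna_2", "liver")])
params = init_params(n_filters=3, width=4, n_genes=2, n_hidden=5)
score = predict_accessibility("ACGTTGCAAC", [0.7, 1.2], params)
print(pairs, score)
```

Because the weights are untrained, the printed score carries no biological meaning; the sketch only shows how the sequence branch and the expression input meet at the concatenation before the fully connected layers.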