1.
Publication No.: WO2022072878A1
Publication Date: 2022-04-07
Application No.: PCT/US2021/053235
Filing Date: 2021-10-01
Applicant: INSCRIPTA, INC.
Inventors: HALWEG-EDWARDS, Andrea; HRAHA, Thomas; YERRAMSETTY, Krishna; LAMBERT, Shea; GANDER, Miles; ESTES, Matthew David; SANADA, Chad Douglas; WAGNER, Isaac David; HARDENBOL, Paul
Abstract: Disclosed systems and methods relate to predicting the relative representation of genomic variants in an edited cell population, based on the representation of each editing cassette design in the editing cassette design library used to generate that population. A library of editing cassette designs is generated, and a feature vector, or sequence embedding, is developed for each design using natural language processing techniques. The feature vector may be based upon sequence attributes and editing kinetics of each cassette design, as well as attributes that describe the library context. Features may include sequence embeddings generated from a neural network, linguistic-type distances, and statistical distance summaries thereof. The feature vectors are classified using one or more machine learning models, and the classified feature vectors are used to predict the representation of each design in an edited cell population.
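
To make the "linguistic-type distances, and statistical distance summaries thereof" concrete, the following is a minimal illustrative sketch, not the method disclosed in the application. It assumes that Levenshtein edit distance stands in for the linguistic-type distance, and that the minimum, mean, median, and maximum distance from one design to every other design in the library serve as the library-context summaries. The function names, feature names, and toy sequences are all hypothetical.

```python
from statistics import mean, median


def levenshtein(a: str, b: str) -> int:
    """Edit distance between two strings (dynamic programming, two rows)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def library_context_features(library: list[str]) -> list[dict]:
    """For each design, summarize its distances to all other designs.

    Assumption: these summaries are one hypothetical way to describe
    how a design sits within the library as a whole.
    """
    features = []
    for i, seq in enumerate(library):
        dists = [levenshtein(seq, other)
                 for j, other in enumerate(library) if j != i]
        features.append({
            "min_dist": min(dists),
            "mean_dist": mean(dists),
            "median_dist": median(dists),
            "max_dist": max(dists),
        })
    return features


# Toy sequences, invented for illustration only.
toy_library = ["ACGTACGT", "ACGTACGA", "TTGTACGT", "GGGGCCCC"]
features = library_context_features(toy_library)
for seq, feats in zip(toy_library, features):
    print(seq, feats)
```

In a fuller pipeline, summaries like these would be concatenated with per-sequence features (for example, learned embeddings) before being passed to a machine learning model; that step is omitted here because the abstract does not specify the model.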