-
Publication No.: US20230122168A1
Publication date: 2023-04-20
Application No.: US17759838
Filing date: 2021-01-29
Abstract: Accurate function estimations and well-calibrated uncertainties are important for Bayesian optimization (BO). Most theoretical guarantees for BO are established for methods that model the objective function with a surrogate drawn from a Gaussian process (GP) prior. GP priors are poorly suited to discrete, high-dimensional, combinatorial spaces, such as biopolymer sequences. Using a neural network (NN) as the surrogate function can yield more accurate function estimates. An NN allows arbitrarily complex models, removes the GP prior assumption, and enables easy pretraining, which is beneficial in the low-data BO regime. However, a fully Bayesian treatment of uncertainty in NNs remains intractable, and existing approximate methods, like Monte Carlo dropout and variational inference, can produce badly miscalibrated uncertainty estimates. Conformal Inference Optimization (CI-OPT) uses confidence intervals calculated using conformal inference as a replacement for posterior uncertainties in certain BO acquisition functions. A conformal scoring function with properties amenable to optimization is effective on standard BO datasets and real-world protein datasets.
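Illustrative note: the abstract above does not disclose the scoring function or the acquisition functions, so the following is only a minimal sketch of the general idea, assuming split conformal prediction with a scaled absolute-residual score and an upper-confidence-bound style acquisition. The names `conformal_quantile`, `conformal_ucb`, `predict`, and `spread` are hypothetical and do not come from the patent.

```python
import math


def conformal_quantile(scores, alpha):
    """Finite-sample-corrected (1 - alpha) quantile of calibration scores."""
    n = len(scores)
    # Split conformal uses the ceil((n + 1) * (1 - alpha))-th smallest score.
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        # Too few calibration points for this alpha: the interval is unbounded.
        return math.inf
    return sorted(scores)[k - 1]


def conformal_ucb(predict, spread, calib_x, calib_y, candidates, alpha=0.1):
    """Return the candidate with the highest conformal upper bound.

    predict(x) -> point estimate from a surrogate (assumed, e.g. a trained NN).
    spread(x)  -> positive heuristic scale for x (assumed, e.g. ensemble spread).
    """
    # Conformal score on held-out calibration data: scaled absolute residual.
    scores = [abs(y - predict(x)) / spread(x) for x, y in zip(calib_x, calib_y)]
    q = conformal_quantile(scores, alpha)
    # The upper end of the conformal interval stands in for a posterior-based UCB.
    upper = [predict(x) + q * spread(x) for x in candidates]
    best = max(range(len(candidates)), key=upper.__getitem__)
    return candidates[best], upper[best], q
```

In a BO loop under these assumptions, the selected candidate would be evaluated, added to the data, and the surrogate and calibration scores refreshed before the next round.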
-
Publication No.: US20220270711A1
Publication date: 2022-08-25
Application No.: US17597844
Filing date: 2020-07-31
Abstract: Systems, apparatuses, software, and methods for engineering amino acid sequences configured to have specific protein functions or properties. The methods use machine learning to process an input seed sequence and generate as output an optimized sequence having the desired function or property.
-
Publication No.: US20220122692A1
Publication date: 2022-04-21
Application No.: US17428356
Filing date: 2020-02-10
Abstract: Systems, apparatuses, software, and methods for identifying associations between amino acid sequences and protein functions or properties. Machine learning is used to generate models that identify such associations based on input data such as amino acid sequence information. Various techniques, including transfer learning, can be used to enhance the accuracy of the associations.
-
Publication No.: US20220396813A1
Publication date: 2022-12-15
Application No.: US17577942
Filing date: 2022-01-18
Inventors: Jacob Feala, Yanfang Fu, Jacob Rosenblum Rubens, Robert James Citorik, Michael Travis Mee, Molly Krisann Gibson
Abstract: Methods and compositions for modulating a target genome are disclosed.
-