-
Publication No.: US20240086423A1
Publication date: 2024-03-14
Application No.: US17898236
Filing date: 2022-08-29
Applicant: X Development LLC
Inventor: Lance Co Ting Keh, Ivan Grubisic, Ryan Poplin, Jon Deaton, Hayley Weir
IPC: G06F16/28, G06N3/0455, G06N3/08
CPC classification number: G06F16/285, G06N3/0455, G06N3/08
Abstract: Some techniques relate to projecting aptamer representations into an embedding space and clustering the representations. A cluster-specific binding metric can be defined for each cluster based on aptamer-specific binding metrics of aptamers associated with the cluster. A subset of the clusters can be selected based on the cluster-specific binding metrics. Identifications of aptamers assigned to the subset of clusters can then be output.
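The selection step described in this abstract can be sketched as follows. This is a minimal illustration, not the method claimed in the application: it assumes embedding and clustering have already been done upstream and are available as a mapping from aptamer to cluster, it assumes the cluster-specific binding metric is the mean of the member aptamers' metrics (the abstract does not say which aggregation is used), and the function name, parameter names, and sample data are all hypothetical.

```python
from collections import defaultdict
from statistics import mean


def select_aptamers_by_cluster(cluster_of, binding_metric, top_k=1):
    """Pick the top_k clusters by mean binding metric and list their aptamers.

    cluster_of:     dict mapping aptamer id -> cluster id (assumed precomputed)
    binding_metric: dict mapping aptamer id -> aptamer-specific binding metric
    Returns (selected cluster ids, sorted aptamer ids in those clusters).
    """
    # Group aptamer ids by the cluster they were assigned to.
    members = defaultdict(list)
    for aptamer_id, cluster_id in cluster_of.items():
        members[cluster_id].append(aptamer_id)

    # Assumption: the cluster-specific metric is the mean of member metrics.
    cluster_metric = {
        cluster_id: mean(binding_metric[a] for a in ids)
        for cluster_id, ids in members.items()
    }

    # Keep the top_k clusters, highest cluster metric first.
    selected = sorted(cluster_metric, key=cluster_metric.get, reverse=True)[:top_k]

    # Output identifications of every aptamer in the selected clusters.
    chosen = sorted(a for c in selected for a in members[c])
    return selected, chosen


# Hypothetical example data: five aptamers in three clusters.
cluster_of = {"apt1": 0, "apt2": 0, "apt3": 1, "apt4": 1, "apt5": 2}
binding_metric = {"apt1": 0.9, "apt2": 0.7, "apt3": 0.2, "apt4": 0.4, "apt5": 0.6}

selected, chosen = select_aptamers_by_cluster(cluster_of, binding_metric, top_k=2)
print(selected)  # [0, 2]
print(chosen)    # ['apt1', 'apt2', 'apt5']
```

Other aggregations (median, maximum, or a fraction of members above a threshold) would fit the same structure by swapping out the `mean` call.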
-
Publication No.: US20230101523A1
Publication date: 2023-03-30
Application No.: US17936181
Filing date: 2022-09-28
Applicant: X Development LLC
Inventor: Ryan Poplin, Lance Co Ting Keh, Ivan Grubisic, Ray Nagatani
Abstract: The present disclosure relates to in vitro experiments and in silico computation and machine-learning based techniques to iteratively improve a process for identifying binders that can bind a target. Particularly, aspects of the present disclosure are directed to obtaining initial sequence data, identifying, by a first machine-learning model having model parameters learned from the initial sequence data, a first set of aptamer sequences, obtaining, using an in vitro binding selection process, subsequent sequence data including sequences from the first set of aptamer sequences, identifying, by a second machine-learning model having model parameters learned from the subsequent sequence data, a second set of aptamer sequences, determining, using one or more in vitro assays, analytical data for aptamers synthesized from the second set of aptamer sequences, and identifying a final set of aptamer sequences from the second set of aptamer sequences based on the analytical data associated with each aptamer.
-
Publication No.: US20230081439A1
Publication date: 2023-03-16
Application No.: US17471903
Filing date: 2021-09-10
Applicant: X Development LLC
Inventor: Ryan Poplin, Ivan Grubisic, Lance Co Ting Keh, Ray Nagatani
Abstract: A latent space is defined to represent sequences using training data and a machine-learning model. The training data identifies sequences of molecules and binding-approximation metrics that characterize whether the molecules bind to a particular target and/or that approximate an extent to which a molecule is more likely to bind to the particular target than some other molecules. Supplemental training data is accessed that identifies other sequences of other molecules and binding affinity scores quantifying binding strengths between the other molecules and the particular target. Representations of the other sequences in the supplemental training data are projected into the latent space using the binding affinity scores. An area or position of interest within the latent space is identified based on the projections. A particular sequence represented within the area of interest or at the position of interest is identified for downstream processing.
-
4.
Publication No.: US20220380753A1
Publication date: 2022-12-01
Application No.: US17333272
Filing date: 2021-05-28
Applicant: X Development LLC
Inventor: Ivan Grubisic, Ray Nagatani, Lance Co Ting Keh, Andrew Weitz, Kenneth Jung, Ryan Poplin
Abstract: The present disclosure relates to in vitro experiments and in silico computation and machine-learning based techniques to iteratively improve a process for identifying binders that can bind any given molecular target. Particularly, aspects of the present disclosure are directed to obtaining sequence data for aptamers that bind to a target, where the sequence data has a first signal to noise ratio, generating, by a search process, a first set of aptamer sequences derived from the sequence data, obtaining subsequent sequence data for subsequent aptamers that bind to the target, where the subsequent aptamers include aptamers synthesized from the first set of aptamer sequences, and the subsequent sequence data has a second signal to noise ratio greater than the first signal to noise ratio, generating, by a linear machine-learning model, a second set of aptamer sequences derived from the subsequent sequence data, and outputting the second set of aptamer sequences.
-
5.
Publication No.: US20240087682A1
Publication date: 2024-03-14
Application No.: US17932153
Filing date: 2022-09-14
Applicant: X Development LLC
Inventor: Jon Deaton, Hayley Weir, Ryan Poplin, Ivan Grubisic
Abstract: A multi-dimensional latent space (defined by an Encoder model) corresponds to projections of sequences of aptamers. An architecture of the Encoder model, a hyperparameter of the Encoder model, or a characteristic of a training data set used to train the Encoder model was selected using an assessment of an encoding efficiency of the Encoder model that is based on: a predicted extent to which representations in an embedding space are indicative of specific aptamer sequences, and an extent to which a probability distribution of the embedding space differs from a probability distribution of a source space that represents individual base-pairs. Projections are generated in the latent space using representations of aptamers and the Encoder model; one or more candidate aptamers for a particular target are identified using the projections and a Decoder model; and an identification of the one or more candidate aptamers is output.
-
6.
Publication No.: US20220383981A1
Publication date: 2022-12-01
Application No.: US17333287
Filing date: 2021-05-28
Applicant: X Development LLC
Inventor: Ivan Grubisic, Ray Nagatani, Lance Co Ting Keh, Andrew Weitz, Kenneth Jung, Ryan Poplin
Abstract: The present disclosure relates to in vitro experiments and in silico computation and machine-learning based techniques to iteratively improve a process for identifying binders that can bind any given molecular target. Particularly, aspects of the present disclosure are directed to obtaining initial sequence data for aptamers that bind to a target, measuring a first signal to noise ratio within the initial sequence data, provisioning, based on the first signal to noise ratio, a first machine-learning system, generating, by the first machine-learning system, a first set of aptamer sequences, obtaining subsequent sequence data for aptamers that bind to the target, measuring a second signal to noise ratio within the subsequent sequence data, provisioning, based on the second signal to noise ratio, a second machine-learning system, generating, by the second machine-learning system, a second set of aptamer sequences, and outputting the second set of aptamer sequences.