FACILITATION OF APTAMER SEQUENCE DESIGN USING ENCODING EFFICIENCY TO GUIDE CHOICE OF GENERATIVE MODELS

    Publication number: US20240087682A1

    Publication date: 2024-03-14

    Application number: US17932153

    Filing date: 2022-09-14

    CPC classification number: G16B40/00 G16B50/50

    Abstract: A multi-dimensional latent space, defined by an Encoder model, corresponds to projections of sequences of aptamers. An architecture of the Encoder model, a hyperparameter of the Encoder model, or a characteristic of a training data set used to train the Encoder model was selected using an assessment of an encoding efficiency of the Encoder model. The assessment is based on predicted extents to which representations in an embedding space are indicative of specific aptamer sequences and to which a probability distribution of the embedding space differs from a probability distribution of a source space that represents individual base-pairs. Projections in the latent space are generated using representations of aptamers and the Encoder model; one or more candidate aptamers for a particular target are identified using the projections and a Decoder model; and an identification of the one or more candidate aptamers is output.
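
    Illustrative sketch (not part of the patent record): the abstract does not state how the difference between the embedding-space and source-space probability distributions is measured. As one hypothetical proxy, the Python below compares the Shannon entropy of the per-base distribution in the source sequences with the Shannon entropy of embeddings after coarse quantization. The function names, the quantization step, and the bin_width parameter are all assumptions made for illustration and are not taken from the source.

```python
import math
from collections import Counter


def shannon_entropy(counts):
    """Shannon entropy, in bits, of an empirical distribution given as counts."""
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values() if c > 0)


def source_entropy(sequences):
    """Entropy of the per-base distribution pooled over all sequences (source space)."""
    return shannon_entropy(Counter(base for seq in sequences for base in seq))


def embedding_entropy(embeddings, bin_width=0.5):
    """Entropy of embeddings after quantizing each coordinate into grid cells.

    The grid quantization is an assumed, hypothetical way to obtain a discrete
    distribution over the embedding space; the source does not specify one.
    """
    cells = Counter(
        tuple(math.floor(x / bin_width) for x in vec) for vec in embeddings
    )
    return shannon_entropy(cells)


def entropy_gap(sequences, embeddings, bin_width=0.5):
    """Hypothetical proxy: embedding-space entropy minus source-space entropy."""
    return embedding_entropy(embeddings, bin_width) - source_entropy(sequences)


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    seqs = ["ACGT", "ACGT"]
    embs = [(0.1, 0.1), (0.2, 0.3), (1.2, 0.1), (1.3, 0.4)]
    print(source_entropy(seqs), embedding_entropy(embs), entropy_gap(seqs, embs))
```

    Under this assumed proxy, candidate Encoder configurations could be compared by their entropy gap on the same data; any real assessment would follow the measure defined in the patent's detailed description.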
