Identifying biosynthetic gene clusters

    Publication No.: US12009060B2

    Publication date: 2024-06-11

    Application No.: US16362236

    Filing date: 2019-03-22

    CPC classification number: G16B30/00 G06N3/044 G06N3/045

    Abstract: A BGC prediction system identifies candidate biosynthetic gene clusters (BGCs) within genomes using machine-learned models, such as a shallow neural network and a recurrent neural network (RNN). A set of domains within a genome sequence is identified, each domain corresponding to a set of domain identifiers. A shallow neural network block is applied to each set of domain identifiers to produce a set of vectors. An RNN block is applied to the set of vectors to produce a BGC class score for each domain. The RNN block was trained using an identified set of positive vectors, which represents known BGCs, and a synthesized set of negative vectors, which is unlikely to represent BGCs. Candidate BGCs are selected by averaging BGC class scores across genes within a domain and comparing the average BGC class scores to a threshold. The candidate BGCs are provided for display on a user interface.
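The selection step in this abstract (average the class scores, then compare to a threshold) can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented method: it reads the averaging as a per-gene mean of domain scores, and it assumes that consecutive genes above the threshold are merged into one candidate. The function name `select_candidates`, the `(gene_id, score)` input layout, and the default threshold of 0.5 are all hypothetical.

```python
from itertools import groupby
from statistics import mean


def select_candidates(domain_scores, threshold=0.5):
    """Return candidate clusters as lists of gene ids.

    domain_scores: list of (gene_id, score) pairs in genome order,
    one pair per domain (assumed layout, not from the source).
    """
    # Average the per-domain scores within each gene, keeping genome order.
    gene_averages = [
        (gene, mean(score for _, score in group))
        for gene, group in groupby(domain_scores, key=lambda pair: pair[0])
    ]
    # Assumption: consecutive genes at or above the threshold form one candidate.
    candidates = []
    for above, run in groupby(gene_averages, key=lambda pair: pair[1] >= threshold):
        if above:
            candidates.append([gene for gene, _ in run])
    return candidates


scores = [("g1", 0.9), ("g1", 0.7), ("g2", 0.2),
          ("g3", 0.8), ("g4", 0.6), ("g4", 0.8)]
print(select_candidates(scores))
```

In a real pipeline the scores would come from the trained RNN block; here they are hand-written placeholder values.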

    Clinical investigation timeliness predictor
    Invention publication

    Publication No.: US20240095636A1

    Publication date: 2024-03-21

    Application No.: US18465645

    Filing date: 2023-09-12

    CPC classification number: G06Q10/0635

    Abstract: A clinical investigation management system monitors clinical investigations performed across departments and clinical investigators. The system employs a method for predicting timeliness in completion of clinical investigations. The method includes monitoring data of a clinical investigation performed by a clinical investigator. The method includes applying a timeliness model to the data to determine a timeliness prediction of the clinical investigation. The method includes identifying one or more interventive actions based on the timeliness prediction. The method includes generating a notification including the timeliness prediction and the identified one or more interventive actions. The method includes transmitting the notification to a client device of a supervisor.

    Automated quality check and diagnosis for production model refresh

    Publication No.: US11605025B2

    Publication date: 2023-03-14

    Application No.: US16874232

    Filing date: 2020-05-14

    Abstract: As a data science project goes into the production stage, model maintenance to preserve model quality and predictive accuracy becomes a concern. Manual model maintenance by data scientists can become a time- and labor-intensive process, especially for large-scale data science projects. An early warning system addresses this by performing systematic statistical and algorithmic checks for prediction accuracy, stability, and model assumption validity. A diagnostic report is generated that helps data scientists assess the health of the model and identify sources of error as needed. Well-performing models can be automatically deployed without further human intervention, while poorly performing models trigger a warning or alert to the data scientists for further investigation and may be removed from production until the performance issues are addressed.
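A rough sketch of the kind of gate this abstract describes: run checks on a refreshed model, collect the results in a report, and either deploy or alert. It is an illustration only. The abstract does not name specific checks, so the accuracy-drop comparison and the population stability index (PSI) used here are swapped-in assumptions, as are the function names and the thresholds `max_acc_drop` and `max_psi`. Model assumption validity checks are not covered.

```python
import math


def psi(expected, actual, eps=1e-6):
    """Population stability index between two lists of bin proportions."""
    return sum(
        (a - e) * math.log((a + eps) / (e + eps))
        for e, a in zip(expected, actual)
    )


def check_refresh(new_acc, prod_acc, expected_bins, actual_bins,
                  max_acc_drop=0.02, max_psi=0.2):
    """Build a small diagnostic report and decide whether to deploy or alert."""
    report = {
        # Positive value means the refreshed model is less accurate.
        "accuracy_drop": prod_acc - new_acc,
        # Shift in the binned score distribution (hypothetical stability check).
        "psi": psi(expected_bins, actual_bins),
    }
    report["accuracy_ok"] = report["accuracy_drop"] <= max_acc_drop
    report["stability_ok"] = report["psi"] <= max_psi
    passed = report["accuracy_ok"] and report["stability_ok"]
    report["action"] = "deploy" if passed else "alert"
    return report


baseline = [0.25, 0.25, 0.25, 0.25]
print(check_refresh(0.91, 0.90, baseline, baseline)["action"])
print(check_refresh(0.80, 0.90, baseline, baseline)["action"])
```

Returning the full report rather than a bare pass or fail mirrors the abstract's diagnostic report, which is meant to help locate the source of a failure.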

    Identifying biosynthetic gene clusters
    Invention publication

    Publication No.: US20240290429A1

    Publication date: 2024-08-29

    Application No.: US18654581

    Filing date: 2024-05-03

    CPC classification number: G16B30/00 G06N3/044 G06N3/045

    Abstract: A biosynthetic gene cluster (BGC) prediction system identifies candidate BGCs within genomes using an iteratively trained machine-learned model. The system identifies, in a genome sequence, a set of domains, each identified domain corresponding to a set of domain identifiers. The set of domain identifiers corresponds to a set of vectors. The iteratively trained model is applied to the set of vectors to produce a BGC class score for each domain. The system selects candidate BGCs by averaging BGC class scores across genes within a domain and comparing the average BGC class scores to a threshold. The system predicts a molecular activity of biosynthetic products derived from the selected BGCs, and provides for display, on a user interface, the candidate BGCs and predicted molecular activity.
