AGGREGATING GENOME DATA INTO BINS WITH SUMMARY DATA AT VARIOUS LEVELS

    Publication No.: US20240203534A1

    Publication date: 2024-06-20

    Application No.: US18391014

    Filing date: 2023-12-20

    Applicant: Illumina, Inc.

    IPC classification: G16B50/10 G16B30/00

    CPC classification: G16B50/10 G16B30/00

    Abstract: Systems, methods, and apparatus are described herein for aggregating genome data into bins with summary data at various levels. As described herein, a computing device may be configured to receive genome data associated with a genome. The computing device may be configured to generate an aggregate file using the received genome data. The aggregate file may include a plurality of bins at a plurality of depths. The computing device may be configured to determine summary data for respective reads associated with one or more respective portions of the genome covered by respective bins of the plurality of bins. The computing device may be configured to store the summary data for the respective reads in respective bins of the plurality of bins. The computing device may be configured to display a portion of the summary data in response to a selection of a genomic region by a user.
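The binning idea in this abstract can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration only and is not taken from the patent: the bin size doubling at each depth, the half-open `(start, end, mapq)` read tuples, the choice of read count and mean mapping quality as the summary data, and the names `BinSummary`, `build_aggregate`, and `query`.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class BinSummary:
    """Hypothetical per-bin summary: read count and total mapping quality."""
    read_count: int = 0
    mapq_sum: int = 0

    @property
    def mean_mapq(self) -> float:
        return self.mapq_sum / self.read_count if self.read_count else 0.0


def build_aggregate(reads, base_bin_size=100, depths=3):
    """Summarize reads into bins at several depths.

    reads: iterable of (start, end, mapq) with half-open [start, end).
    Assumption: bins at depth d span base_bin_size * 2**d bases.
    """
    levels = [defaultdict(BinSummary) for _ in range(depths)]
    for start, end, mapq in reads:
        for depth, bins in enumerate(levels):
            size = base_bin_size * (2 ** depth)
            # A read contributes to every bin it overlaps at this depth.
            for idx in range(start // size, (end - 1) // size + 1):
                summary = bins[idx]
                summary.read_count += 1
                summary.mapq_sum += mapq
    return levels


def query(levels, depth, region_start, region_end, base_bin_size=100):
    """Return the non-empty bins at one depth that overlap a selected region."""
    size = base_bin_size * (2 ** depth)
    bins = levels[depth]
    first, last = region_start // size, (region_end - 1) // size
    return {idx: bins[idx] for idx in range(first, last + 1) if idx in bins}


reads = [(0, 50, 60), (90, 110, 20), (250, 260, 40)]
levels = build_aggregate(reads)
```

A viewer built on such a structure could pick a coarse depth when the user selects a wide region and a fine depth when the user zooms in, so that only a small number of precomputed summaries needs to be read for display.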

SYSTEMS AND METHODS FOR ONTOLOGY MATCHING
    Published invention application

    Publication No.: US20240087687A1

    Publication date: 2024-03-14

    Application No.: US18463902

    Filing date: 2023-09-08

    Applicant: Truveta, Inc.

    IPC classification: G16B50/10 G06F16/31 G16B50/20

    Abstract: Systems and methods for aligning ontologies, such as medical or related ontologies, are disclosed. Initially, ontology specifications are received, such as ontologies comprising a root node and a plurality of child nodes. Each node is assigned at least one synthetic identifier corresponding to its path(s) to the root node. In some cases, nodes may be clustered using one or more clustering algorithms. A translation model is pre-trained by applying one or more masked language models to the ontologies and the synthetic identifiers. Subsequently, each ontology is augmented by identifying nodes in different ontologies that match and assigning labels and/or other details across different ontologies. The translation model can then be fine-tuned using the augmented data. The fine-tuned translation model is then used to identify corresponding nodes in target ontologies in response to translation requests.
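The identifier-assignment step in this abstract can be sketched as follows. The abstract does not specify how the identifiers are encoded, so the dotted sibling-position scheme, the `children` mapping, and the function name `assign_synthetic_ids` are all hypothetical. The sketch also assumes the ontology is acyclic, and it does not cover the clustering or model-training steps.

```python
from collections import defaultdict


def assign_synthetic_ids(root, children):
    """Give each node one identifier per distinct path from the root.

    children: dict mapping a node to an ordered list of its child nodes.
    Assumption: an identifier is the dotted sequence of sibling positions
    along a path, starting with "0" for the root.
    """
    ids = defaultdict(list)
    stack = [(root, "0")]
    while stack:
        node, path_id = stack.pop()
        ids[node].append(path_id)
        for position, child in enumerate(children.get(node, [])):
            stack.append((child, f"{path_id}.{position}"))
    return {node: sorted(paths) for node, paths in ids.items()}


# Toy ontology (illustrative data only): "Endocarditis" has two parents,
# so it is reachable from the root by two paths.
children = {
    "Disease": ["Cardiac", "Infectious"],
    "Cardiac": ["Endocarditis"],
    "Infectious": ["Endocarditis", "Influenza"],
}
ids = assign_synthetic_ids("Disease", children)
```

In this toy example a node with two parents receives two identifiers, one per path, which matches the abstract's wording "at least one synthetic identifier corresponding to its path(s) to the root node".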