DATA DIVERSITY VISUALIZATION AND QUANTIFICATION FOR MACHINE LEARNING MODELS

    Publication No.: US20220351055A1

    Publication date: 2022-11-03

    Application No.: US17243046

    Filing date: 2021-04-28

    Abstract: Systems and techniques that facilitate data diversity visualization and/or quantification for machine learning models are provided. In various embodiments, a processor can access a first dataset and a second dataset, where a machine learning (ML) model is trained on the first dataset. In various instances, the processor can obtain a first set of latent activations generated by the ML model based on the first dataset, and a second set of latent activations generated by the ML model based on the second dataset. In various aspects, the processor can generate, via dimensionality reduction, a first set of compressed data points based on the first set of latent activations and a second set of compressed data points based on the second set of latent activations. In various instances, a diversity component can compute a diversity score based on the first set of compressed data points and the second set of compressed data points.
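
The flow in this abstract (latent activations, then dimensionality reduction, then a diversity score) can be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not say which reduction method or which score is used, so PCA via SVD and a mean nearest-neighbor distance are stand-ins, the activations are synthetic, and all function names are hypothetical.

```python
import numpy as np

def pca_fit(activations, n_components=2):
    """Fit PCA on the first (training) set; returns (mean, components)."""
    mean = activations.mean(axis=0)
    # Rows of vt are principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(activations - mean, full_matrices=False)
    return mean, vt[:n_components]

def pca_transform(activations, mean, components):
    """Project activations onto the fitted principal directions."""
    return (activations - mean) @ components.T

def diversity_score(first_pts, second_pts):
    """Assumed score (not from the patent): mean distance from each
    second-set point to its nearest first-set point."""
    diffs = second_pts[:, None, :] - first_pts[None, :, :]
    dists = np.linalg.norm(diffs, axis=2)
    return float(dists.min(axis=1).mean())

# Synthetic stand-ins for latent activations (16 features per sample).
rng = np.random.default_rng(0)
scales = np.array([5.0, 3.0] + [0.5] * 14)
train_acts = rng.normal(size=(200, 16)) * scales
similar_acts = rng.normal(size=(50, 16)) * scales
shifted_acts = rng.normal(size=(50, 16)) * scales
shifted_acts[:, 0] += 40.0  # move the second dataset far along one feature

mean, comps = pca_fit(train_acts)
train_pts = pca_transform(train_acts, mean, comps)
score_similar = diversity_score(train_pts, pca_transform(similar_acts, mean, comps))
score_shifted = diversity_score(train_pts, pca_transform(shifted_acts, mean, comps))
print(score_similar, score_shifted)
```

Under this assumed score, a second dataset drawn from the same distribution as the first yields a small value, while a shifted dataset yields a larger one.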

    MACHINE LEARNING MODEL DEVELOPMENT AND OPTIMIZATION PROCESS THAT ENSURES PERFORMANCE VALIDATION AND DATA SUFFICIENCY FOR REGULATORY APPROVAL

    Publication No.: US20230229972A1

    Publication date: 2023-07-20

    Application No.: US18176985

    Filing date: 2023-03-01

    Inventor: Marc T. Edgar

    CPC classification number: G06N20/00 G06N5/04

    Abstract: Machine learning model development and optimization tools are provided that ensure performance validation and data sufficiency for regulatory approval. According to an embodiment, a computer-implemented method can comprise training a machine learning model to perform an inferencing task on an initial set of data samples included in a sample population. In various embodiments, the model can include a medical AI model. The method further comprises determining, by the system, subgroup performance measures for subgroups of the data samples respectively associated with different metadata factors, wherein the subgroup performance measures reflect performance accuracy of the machine learning model with respect to the subgroups. The method further comprises determining, by the system, whether the machine learning model meets an acceptable level of performance for deployment in a field environment based on whether the subgroup performance measures respectively satisfy a threshold subgroup performance measure.
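
The subgroup check described in this abstract can be illustrated with a small sketch. It assumes accuracy as the performance measure and a single shared threshold; the metadata factors (`scanner`, `age_band`), the records, and the function names are hypothetical and are not taken from the patent.

```python
from collections import defaultdict

def subgroup_accuracy(records, factor):
    """Accuracy for each value of one metadata factor."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for record in records:
        key = record[factor]
        total[key] += 1
        correct[key] += int(record["label"] == record["prediction"])
    return {key: correct[key] / total[key] for key in total}

def meets_deployment_bar(records, factors, threshold):
    """Return (ok, failing), where failing maps (factor, value) to the
    accuracy of each subgroup that falls below the threshold."""
    failing = {}
    for factor in factors:
        for value, accuracy in subgroup_accuracy(records, factor).items():
            if accuracy < threshold:
                failing[(factor, value)] = accuracy
    return len(failing) == 0, failing

# Hypothetical evaluation records with two metadata factors.
records = [
    {"label": 1, "prediction": 1, "scanner": "A", "age_band": "adult"},
    {"label": 0, "prediction": 0, "scanner": "A", "age_band": "adult"},
    {"label": 1, "prediction": 1, "scanner": "A", "age_band": "child"},
    {"label": 1, "prediction": 0, "scanner": "B", "age_band": "child"},
    {"label": 0, "prediction": 0, "scanner": "B", "age_band": "adult"},
]
ok, failing = meets_deployment_bar(records, ["scanner", "age_band"], threshold=0.75)
print(ok, failing)
```

Here overall accuracy is 4 out of 5, yet the model does not pass, because the subgroups for scanner B and for the child age band each score 0.5. That is the kind of gap an aggregate metric alone would hide.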

    ANNOTATION PIPELINE FOR MACHINE LEARNING ALGORITHM TRAINING AND OPTIMIZATION

    Publication No.: US20210035015A1

    Publication date: 2021-02-04

    Application No.: US16528121

    Filing date: 2019-07-31

    Abstract: Techniques are provided for enhancing the efficiency and accuracy of annotating data samples for supervised machine learning algorithms using an advanced annotation pipeline. According to an embodiment, a method can comprise collecting, by a system comprising a processor, unannotated data samples for input to a machine learning model and storing the unannotated data samples in an annotation queue. The method further comprises determining, by the system, annotation priority levels for respective unannotated data samples of the unannotated data samples, and selecting, by the system from amongst different annotation techniques, one or more of the different annotation techniques for annotating the respective unannotated data samples based on the annotation priority levels associated with the respective unannotated data samples.
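
A minimal sketch of the queue-and-select idea in this abstract follows. The abstract does not state how priority levels are computed or which annotation techniques exist, so the uncertainty-based priority rule, the cutoffs, and the technique names below are all assumptions made for illustration.

```python
import heapq

def priority_level(confidence):
    """Assumed rule: the less confident the model, the higher the priority."""
    return 1.0 - confidence

def choose_technique(priority):
    """Hypothetical mapping from priority level to annotation technique."""
    if priority >= 0.5:
        return "expert_review"
    if priority >= 0.2:
        return "crowd_annotation"
    return "auto_label"

def build_queue(samples):
    """Store (sample_id, confidence) pairs in a priority queue."""
    # heapq is a min-heap, so priorities are negated to pop the highest first;
    # the insertion index breaks ties deterministically.
    heap = [(-priority_level(conf), index, sample_id)
            for index, (sample_id, conf) in enumerate(samples)]
    heapq.heapify(heap)
    return heap

def drain(heap):
    """Pop samples in priority order and assign each a technique."""
    plan = []
    while heap:
        negated_priority, _, sample_id = heapq.heappop(heap)
        plan.append((sample_id, choose_technique(-negated_priority)))
    return plan

# Hypothetical unannotated samples with model confidence scores.
samples = [("img_001", 0.95), ("img_002", 0.40), ("img_003", 0.70)]
plan = drain(build_queue(samples))
print(plan)
```

With these inputs the least confident sample, `img_002`, is handled first and routed to the most expensive technique, while the most confident one, `img_001`, is handled last and labeled automatically.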

    Machine learning model development and optimization process that ensures performance validation and data sufficiency for regulatory approval

    Publication No.: US11610152B2

    Publication date: 2023-03-21

    Application No.: US16728489

    Filing date: 2019-12-27

    Inventor: Marc T. Edgar

    Abstract: Machine learning model development and optimization tools are provided that ensure performance validation and data sufficiency for regulatory approval. According to an embodiment, a computer-implemented method can comprise training a machine learning model to perform an inferencing task on an initial set of data samples included in a sample population. In various embodiments, the model can include a medical AI model. The method further comprises determining, by the system, subgroup performance measures for subgroups of the data samples respectively associated with different metadata factors, wherein the subgroup performance measures reflect performance accuracy of the machine learning model with respect to the subgroups. The method further comprises determining, by the system, whether the machine learning model meets an acceptable level of performance for deployment in a field environment based on whether the subgroup performance measures respectively satisfy a threshold subgroup performance measure.

    MACHINE LEARNING MODEL DEVELOPMENT AND OPTIMIZATION PROCESS THAT ENSURES PERFORMANCE VALIDATION AND DATA SUFFICIENCY FOR REGULATORY APPROVAL

    Publication No.: US20210201190A1

    Publication date: 2021-07-01

    Application No.: US16728489

    Filing date: 2019-12-27

    Inventor: Marc T. Edgar

    Abstract: Machine learning model development and optimization tools are provided that ensure performance validation and data sufficiency for regulatory approval. According to an embodiment, a computer-implemented method can comprise training a machine learning model to perform an inferencing task on an initial set of data samples included in a sample population. In various embodiments, the model can include a medical AI model. The method further comprises determining, by the system, subgroup performance measures for subgroups of the data samples respectively associated with different metadata factors, wherein the subgroup performance measures reflect performance accuracy of the machine learning model with respect to the subgroups. The method further comprises determining, by the system, whether the machine learning model meets an acceptable level of performance for deployment in a field environment based on whether the subgroup performance measures respectively satisfy a threshold subgroup performance measure.

    Annotation pipeline for machine learning algorithm training and optimization

    Publication No.: US11475358B2

    Publication date: 2022-10-18

    Application No.: US16527965

    Filing date: 2019-07-31

    Abstract: Techniques are provided for enhancing the efficiency and accuracy of annotating data samples for supervised machine learning algorithms using an advanced annotation pipeline. According to an embodiment, a method can comprise collecting, by a system comprising a processor, unannotated data samples for input to a machine learning model and storing the unannotated data samples in an annotation queue. The method further comprises determining, by the system, annotation priority levels for respective unannotated data samples of the unannotated data samples, and selecting, by the system from amongst different annotation techniques, one or more of the different annotation techniques for annotating the respective unannotated data samples based on the annotation priority levels associated with the respective unannotated data samples.

    ANNOTATION PIPELINE FOR MACHINE LEARNING ALGORITHM TRAINING AND OPTIMIZATION

    Publication No.: US20210034920A1

    Publication date: 2021-02-04

    Application No.: US16527965

    Filing date: 2019-07-31

    Abstract: Techniques are provided for enhancing the efficiency and accuracy of annotating data samples for supervised machine learning algorithms using an advanced annotation pipeline. According to an embodiment, a method can comprise collecting, by a system comprising a processor, unannotated data samples for input to a machine learning model and storing the unannotated data samples in an annotation queue. The method further comprises determining, by the system, annotation priority levels for respective unannotated data samples of the unannotated data samples, and selecting, by the system from amongst different annotation techniques, one or more of the different annotation techniques for annotating the respective unannotated data samples based on the annotation priority levels associated with the respective unannotated data samples.
