-
Publication No.: US20210097431A1
Publication Date: 2021-04-01
Application No.: US16588913
Application Date: 2019-09-30
Applicant: Amazon Technologies, Inc.
Inventor: Andrea Olgiati, Lakshmi Naarayanan Ramakrishnan, Jeffrey John Geevarghese, Denis Davydenko, Vikas Kumar, Rahul Raghavendra Huilgol, Amol Ashok Lele, Stefano Stefani, Vladimir Zhukov
Abstract: Methods, systems, and computer-readable media for debugging and profiling of machine learning model training are disclosed. A machine learning analysis system receives data associated with training of a machine learning model. The data was collected by a machine learning training cluster. The machine learning analysis system performs analysis of the data associated with the training of the machine learning model. The machine learning analysis system detects one or more conditions associated with the training of the machine learning model based at least in part on the analysis. The machine learning analysis system generates one or more alarms describing the one or more conditions associated with the training of the machine learning model.
-
Publication No.: US10831519B2
Publication Date: 2020-11-10
Application No.: US15901751
Application Date: 2018-02-21
Applicant: Amazon Technologies, Inc.
Inventor: Thomas Albert Faulhaber, Jr., Gowda Dayananda Anjaneyapura Range, Jeffrey John Geevarghese, Taylor Goodhart, Charles Drummond Swan
Abstract: Techniques for packaging and deploying algorithms utilizing containers for flexible machine learning are described. In some embodiments, users can create or utilize simple containers adhering to a specification of a machine learning service in a provider network, where the containers include code for how a machine learning model is to be trained and/or executed. The machine learning service can automatically train a model and/or host a model using the containers. The containers can use a wide variety of algorithms and use a variety of types of languages, libraries, data types, etc. Users can thus implement machine learning training and/or hosting with extremely minimal knowledge of how the overall training and/or hosting is actually performed.
-
Publication No.: US12039415B2
Publication Date: 2024-07-16
Application No.: US16588913
Application Date: 2019-09-30
Applicant: Amazon Technologies, Inc.
Inventor: Andrea Olgiati, Lakshmi Naarayanan Ramakrishnan, Jeffrey John Geevarghese, Denis Davydenko, Vikas Kumar, Rahul Raghavendra Huilgol, Amol Ashok Lele, Stefano Stefani, Vladimir Zhukov
Abstract: Methods, systems, and computer-readable media for debugging and profiling of machine learning model training are disclosed. A machine learning analysis system receives data associated with training of a machine learning model. The data was collected by a machine learning training cluster. The machine learning analysis system performs analysis of the data associated with the training of the machine learning model. The machine learning analysis system detects one or more conditions associated with the training of the machine learning model based at least in part on the analysis. The machine learning analysis system generates one or more alarms describing the one or more conditions associated with the training of the machine learning model.
-
Publication No.: US11550614B2
Publication Date: 2023-01-10
Application No.: US17067285
Application Date: 2020-10-09
Applicant: Amazon Technologies, Inc.
Inventor: Thomas Albert Faulhaber, Jr., Gowda Dayananda Anjaneyapura Range, Jeffrey John Geevarghese, Taylor Goodhart, Charles Drummond Swan
Abstract: Techniques for packaging and deploying algorithms utilizing containers for flexible machine learning are described. In some embodiments, users can create or utilize simple containers adhering to a specification of a machine learning service in a provider network, where the containers include code for how a machine learning model is to be trained and/or executed. The machine learning service can automatically train a model and/or host a model using the containers. The containers can use a wide variety of algorithms and use a variety of types of languages, libraries, data types, etc. Users can thus implement machine learning training and/or hosting with extremely minimal knowledge of how the overall training and/or hosting is actually performed.
-