-
Publication No.: US12118324B1
Publication Date: 2024-10-15
Application No.: US17710727
Application Date: 2022-03-31
Applicant: Amazon Technologies, Inc.
Inventor: Li Zhang, Sanjiv Ranjan Das, Yue Zhao, Zhijiang He, Shenghua Yue, Zheng Zhang, Xin Huang, Sheng Zha, Shuai Zheng
IPC: G06F16/35, G06F16/34, G06F40/284, G06F40/40
CPC classification number: G06F40/40, G06F16/345, G06F16/355, G06F40/284
Abstract: Techniques for machine learning (ML) and natural language processing (NLP) are described. One technique enables the creation of a clean training dataset through just a few API calls. Another technique provides an automated process for generating a domain-specific lexicon, which is then used to generate ML training datasets, in a manner that requires little to no human labor. Another technique gathers ML training data from domain-specific public sources, which are more likely than typical public sources to contain focused terminology and to be free from errors, thus resulting in trained ML models that provide more accurate inferences.
-
Publication No.: US20240428082A1
Publication Date: 2024-12-26
Application No.: US18491604
Application Date: 2023-10-20
Applicant: Amazon Technologies, Inc.
Inventor: Zhuang Wang, Zhen Jia, Shuai Zheng, Zhen Zhang, Xinwei Fu, Yida Wang
IPC: G06N3/098
Abstract: A placement plan for training state checkpoints of a machine learning model is generated based at least in part on a number of training servers of a distributed training environment. The plan indicates, with respect to an individual server, one or more other servers at which replicas of training state checkpoints of the individual server are to be stored. During selected periods of one or more training iterations of the model, respective portions of a replica of a training state checkpoint of a first server are transmitted to a second server selected based on the placement plan. After an event causes disruption of the training iterations, one of the checkpoints generated at the first server is retrieved from the second server and used to resume the training iterations.
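Illustrative sketch (not taken from the application): the abstract above describes a placement plan derived from the number of training servers, piecewise transmission of checkpoint replicas to a peer, and recovery from that peer after a disruption. The Python below is one minimal way such a flow could look. The ring layout, the function names (build_placement_plan, split_into_portions, replicate, recover), and the in-memory dictionary standing in for the network and for peer storage are all assumptions made for illustration; the abstract does not specify any of them.

```python
from typing import Dict, List


def build_placement_plan(num_servers: int, replicas_per_server: int = 1) -> Dict[int, List[int]]:
    """Map each server index to the peers that store replicas of its checkpoints.

    Assumption: a simple ring, where server i replicates to i+1, i+2, ... (mod n).
    """
    if num_servers < 2:
        raise ValueError("need at least two servers to place a replica on a peer")
    if not 1 <= replicas_per_server < num_servers:
        raise ValueError("replicas_per_server must be in [1, num_servers - 1]")
    return {
        server: [(server + offset) % num_servers
                 for offset in range(1, replicas_per_server + 1)]
        for server in range(num_servers)
    }


def split_into_portions(checkpoint: bytes, num_portions: int) -> List[bytes]:
    """Split a serialized checkpoint into at most num_portions contiguous chunks."""
    if num_portions < 1:
        raise ValueError("num_portions must be positive")
    size = max(1, -(-len(checkpoint) // num_portions))  # ceiling division
    return [checkpoint[i:i + size] for i in range(0, len(checkpoint), size)] or [b""]


def replicate(owner: int, checkpoint: bytes, plan: Dict[int, List[int]],
              peer_store: Dict[int, Dict[int, bytes]], num_portions: int = 4) -> None:
    """Send a checkpoint to each planned peer, one portion at a time.

    Each loop step stands in for a transfer during a selected period of a
    training iteration; here it only appends to an in-memory buffer.
    """
    for peer in plan[owner]:
        buffer = bytearray()
        for portion in split_into_portions(checkpoint, num_portions):
            buffer.extend(portion)
        peer_store.setdefault(peer, {})[owner] = bytes(buffer)


def recover(owner: int, plan: Dict[int, List[int]],
            peer_store: Dict[int, Dict[int, bytes]]) -> bytes:
    """Fetch the owner's checkpoint from the first planned peer that holds it."""
    for peer in plan[owner]:
        replica = peer_store.get(peer, {}).get(owner)
        if replica is not None:
            return replica
    raise LookupError(f"no replica found for server {owner}")


if __name__ == "__main__":
    plan = build_placement_plan(num_servers=4)
    store: Dict[int, Dict[int, bytes]] = {}
    replicate(owner=0, checkpoint=b"model-and-optimizer-state", plan=plan, peer_store=store)
    # Server 0 is disrupted; its training state is read back from the planned peer.
    print(plan[0], recover(0, plan, store))
```

With four servers and one replica each, the assumed ring plan sends server 0's checkpoint to server 1, server 1's to server 2, and so on, so the loss of any single server leaves its latest replicated state available on a neighbor.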
-