Publication No.: US20240220831A1
Publication Date: 2024-07-04
Application No.: US18149248
Filing Date: 2023-01-03
Applicant: Nvidia Corporation
Inventor: J Wyman, Pritish Nahar, Dana Groff
Abstract: Approaches presented herein provide for the management of artificial intelligence (AI)-related resources in a distributed resource environment, such as may be used to support accelerated machine learning (ML) applications on behalf of different users. Management functionality can be provided using an AI manager, such as a management service, that can determine the requirements, capabilities, and limitations of various available AI-related components, such as those of a plurality of AI models, engines, and accelerators, as well as the hardware (e.g., graphics processing units (GPUs)) that run or make up these AI-related resources. An AI manager can determine a selection and configuration of resources that is not only appropriate for use with a specific AI model, but that can also be optimized for factors such as throughput, resource utilization, and inference latency. An AI manager can ensure compatibility of resources and configuration, and can enforce access control to models and data.
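The selection step described in the abstract can be illustrated with a minimal sketch. This is not the method claimed in the application. Every name here (`Accelerator`, `ModelSpec`, `select_accelerator`), the fields, and the scoring rule are hypothetical, chosen only to show an access check, a compatibility filter, and a ranking by throughput weighted by free capacity.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Accelerator:
    """Hypothetical description of one available hardware resource."""
    name: str
    memory_gb: float
    supported_engines: frozenset
    est_latency_ms: float   # assumed per-inference latency estimate
    est_throughput: float   # assumed inferences per second when idle
    utilization: float      # fraction currently in use, 0.0 to 1.0


@dataclass(frozen=True)
class ModelSpec:
    """Hypothetical requirements and access list for one AI model."""
    name: str
    min_memory_gb: float
    engine: str
    allowed_users: frozenset


def select_accelerator(
    model: ModelSpec,
    user: str,
    candidates: Sequence[Accelerator],
    max_latency_ms: float,
) -> Optional[Accelerator]:
    """Return the best compatible accelerator, or None if none qualifies."""
    # Access control: refuse before looking at any resources.
    if user not in model.allowed_users:
        raise PermissionError(f"{user} may not use {model.name}")

    # Compatibility: enough memory, supported engine, latency within budget.
    compatible = [
        a for a in candidates
        if a.memory_gb >= model.min_memory_gb
        and model.engine in a.supported_engines
        and a.est_latency_ms <= max_latency_ms
    ]
    if not compatible:
        return None

    # Illustrative scoring rule (an assumption): throughput scaled by free capacity.
    return max(compatible, key=lambda a: a.est_throughput * (1.0 - a.utilization))


model = ModelSpec("demo-model", 16.0, "engine-a", frozenset({"alice"}))
pool = [
    Accelerator("gpu-small", 8.0, frozenset({"engine-a"}), 5.0, 900.0, 0.1),
    Accelerator("gpu-busy", 24.0, frozenset({"engine-a"}), 8.0, 1000.0, 0.8),
    Accelerator("gpu-idle", 24.0, frozenset({"engine-a", "engine-b"}), 10.0, 600.0, 0.2),
    Accelerator("gpu-other", 48.0, frozenset({"engine-b"}), 4.0, 2000.0, 0.0),
]
chosen = select_accelerator(model, "alice", pool, max_latency_ms=20.0)
print(chosen.name if chosen else "no compatible accelerator")
```

In this made-up pool, `gpu-small` is filtered out for insufficient memory and `gpu-other` for lacking the required engine. Of the remaining two, the lightly loaded device outranks the faster but heavily utilized one. A real manager would weigh more factors than this single score.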