-
Publication No.: US20240220831A1
Publication date: 2024-07-04
Application No.: US18149248
Filing date: 2023-01-03
Applicant: NVIDIA Corporation
Inventor: J Wyman, Pritish Nahar, Dana Groff
Abstract: Approaches presented herein provide for the management of artificial intelligence (AI)-related resources in a distributed resource environment, such as may be used to support accelerated machine learning (ML) applications on behalf of different users. Management functionality can be provided using an AI manager, such as a management service, that can determine the requirements, capabilities, and limitations of various available AI-related components, such as those of a plurality of AI models, engines, and accelerators, as well as the hardware (e.g., graphics processing units (GPUs)) that run or make up these AI-related resources. An AI manager can determine a selection and configuration of resources that is not only appropriate for use with a specific AI model, but that can also be optimized for factors such as throughput, resource utilization, and inference latency. An AI manager can ensure compatibility of resources and configuration, and can enforce access control to models and data.
-
Publication No.: US20220180178A1
Publication date: 2022-06-09
Application No.: US17115631
Filing date: 2020-12-08
Applicant: NVIDIA Corporation
Inventor: Penn Tasinga, David B. Yastremsky, Jeremy Wyman, Alvin Ihsani, Pritish Nahar, Piyush Bhatt
Abstract: Apparatuses, systems, and techniques to allocate computing resources to perform inferences. In at least one embodiment, one or more neural networks cause computing resources to be identified based, at least in part, on performance requirements of one or more neural networks to perform inferences.
-