Patent search ap:("BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.") AND inv:"Yang QIAN"

1.

Invention application
METHOD OF PROVIDING MODEL SERVICES (in force)

Publication (announcement) number: US20240419991A1

Publication (announcement) date: 2024-12-19

Application number: US18747725

Filing date: 2024-06-19

Applicant: BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.

Inventor: Zhenfang CHU, Zhengyu QIAN, En SHI, Mingren HU, Zhengxiong YUAN, Jinqi LI, Yue HUANG, Yang LUO, Guobin WANG, Yang QIAN, Kuan WANG

IPC: G06N5/04

Abstract: A method is provided that includes: creating a plurality of first model instances of a first service model to be deployed; allocating an inference service for each of a plurality of first model instances from the plurality of inference services; calling, for each first model instance, a loading interface of the inference service allocated for the first model instance to mount a weight file; determining, in response to a user request for a target service model, a target model instance from a plurality of model instances of the target service model to respond to the user request; and calling a target inference service allocated for the target model instance to use computing resources configured for the target inference service to run, in the target model instance, a base model mounted with a target weight file, and obtain a request result of the user request.
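The flow in the abstract (create instances, allocate an inference service to each, mount a weight file through a loading interface, then route a user request to one instance) can be illustrated with a minimal Python sketch. Nothing here comes from the patent text beyond that sequence of steps: the class names `InferenceService`, `ModelInstance`, and `ModelServiceRouter`, the `toy_base_model` function, and the round-robin policies for allocation and instance selection are all hypothetical choices made for illustration, since the abstract does not say how services are allocated or how the target instance is chosen.

```python
from dataclasses import dataclass, field
from itertools import cycle
from typing import Callable, Dict, List


def toy_base_model(weight_file: str, prompt: str) -> str:
    """Stand-in for a base model; only tags the prompt with the mounted weights."""
    return f"[{weight_file}] {prompt.upper()}"


@dataclass
class InferenceService:
    """Hypothetical inference service that hosts a shared base model."""
    name: str
    base_model: Callable[[str, str], str]
    mounted: Dict[str, str] = field(default_factory=dict)  # instance id -> weight file

    def load(self, instance_id: str, weight_file: str) -> None:
        # Assumed "loading interface": mount a weight file for one instance.
        self.mounted[instance_id] = weight_file

    def infer(self, instance_id: str, prompt: str) -> str:
        # Run the base model with the weight file mounted for this instance.
        return self.base_model(self.mounted[instance_id], prompt)


@dataclass
class ModelInstance:
    instance_id: str
    service_model: str
    service: InferenceService  # inference service allocated to this instance


class ModelServiceRouter:
    def __init__(self, services: List[InferenceService]) -> None:
        self._next_service = cycle(services)  # assumed round-robin allocation
        self.instances: Dict[str, List[ModelInstance]] = {}
        self._cursor: Dict[str, int] = {}

    def deploy(self, service_model: str, weight_file: str, replicas: int) -> List[ModelInstance]:
        """Create instances, allocate a service to each, and mount the weight file."""
        pool = self.instances.setdefault(service_model, [])
        created = []
        for i in range(len(pool), len(pool) + replicas):
            service = next(self._next_service)
            instance = ModelInstance(f"{service_model}-{i}", service_model, service)
            service.load(instance.instance_id, weight_file)
            created.append(instance)
        pool.extend(created)
        return created

    def handle(self, service_model: str, prompt: str) -> str:
        """Pick a target instance (assumed round-robin) and call its service."""
        pool = self.instances[service_model]
        idx = self._cursor.get(service_model, 0) % len(pool)
        self._cursor[service_model] = idx + 1
        target = pool[idx]
        return target.service.infer(target.instance_id, prompt)


if __name__ == "__main__":
    services = [InferenceService("svc-a", toy_base_model),
                InferenceService("svc-b", toy_base_model)]
    router = ModelServiceRouter(services)
    router.deploy("chat", "chat-weights.bin", replicas=3)
    print(router.handle("chat", "hello"))
```

In this sketch the weight file is mounted once at deployment time, so handling a request involves only instance selection and a call into the already-loaded inference service. A real system would replace both round-robin policies with whatever scheduling the claims actually describe.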
