Patent search: ap:("BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.") AND inv:"HUANG, Yue" (page 1)

1.

Published invention application
INFERENCE SERVICE DEPLOYMENT METHOD AND APPARATUS, DEVICE, AND STORAGE MEDIUM [Under examination: published]

Publication number: EP4280051A1

Publication date: 2023-11-22

Application number: EP22204121.2

Filing date: 2022-10-27

Applicant: BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.

Inventors: YUAN, Zhengxiong; CHU, Zhenfang; LI, Jinqi; HU, Mingren; WANG, Guobin; LUO, Yang; HUANG, Yue; QIAN, Zhengyu; SHI, En

IPC classification: G06F8/60, G06F8/71, G06N3/063

Abstract: The present disclosure provides an inference service deployment method and apparatus, a device and a storage medium, relating to the field of artificial intelligence technology, and in particular to the field of machine learning and inference service technology. The specific implementation scheme provides an inference service deployment method, including: obtaining (S101) performance information of a runtime environment of a deployment end; selecting (S102) a target version of an inference service from a plurality of candidate versions of the inference service of a model according to the performance information of the runtime environment of the deployment end; and deploying (S103) the target version of the inference service to the deployment end. The present disclosure can improve the deployment efficiency of the inference service.
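The three steps in the abstract (S101 obtain performance information, S102 select a target version, S103 deploy) can be illustrated with a minimal Python sketch. The abstract does not say what the performance information contains or how a version is selected, so everything below is an assumption made for illustration: the `RuntimeInfo` and `CandidateVersion` fields, the feasibility check, and the "highest expected throughput" rule are hypothetical and are not taken from the patent.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class RuntimeInfo:
    """Hypothetical performance information of a deployment end (S101)."""
    cpu_cores: int
    memory_gb: float
    has_gpu: bool


@dataclass(frozen=True)
class CandidateVersion:
    """Hypothetical candidate version of an inference service."""
    name: str
    min_cpu_cores: int
    min_memory_gb: float
    needs_gpu: bool
    expected_qps: float  # assumed benchmark figure, for illustration only


def select_target_version(
    runtime: RuntimeInfo, candidates: Sequence[CandidateVersion]
) -> Optional[CandidateVersion]:
    """S102 (assumed rule): keep versions the runtime can host, pick the fastest."""
    feasible = [
        c for c in candidates
        if c.min_cpu_cores <= runtime.cpu_cores
        and c.min_memory_gb <= runtime.memory_gb
        and (runtime.has_gpu or not c.needs_gpu)
    ]
    if not feasible:
        return None
    return max(feasible, key=lambda c: c.expected_qps)


def deploy(runtime: RuntimeInfo, candidates: Sequence[CandidateVersion]) -> str:
    """S103 placeholder: a real system would push the chosen version to the deployment end."""
    target = select_target_version(runtime, candidates)
    if target is None:
        raise RuntimeError("no candidate version fits this runtime environment")
    return f"deployed {target.name}"


candidates = [
    CandidateVersion("gpu-fp16", 4, 16.0, True, 900.0),
    CandidateVersion("cpu-int8", 4, 8.0, False, 200.0),
    CandidateVersion("cpu-fp32", 2, 4.0, False, 80.0),
]
edge_box = RuntimeInfo(cpu_cores=4, memory_gb=8.0, has_gpu=False)
print(deploy(edge_box, candidates))
```

With the sample data, the GPU build is filtered out because the assumed deployment end has no GPU, and the faster of the two CPU builds is chosen.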

2.

Published invention application
METHOD, APPARATUS, DEVICE AND STORAGE MEDIUM FOR RUNNING INFERENCE SERVICE PLATFORM [Under examination: published]

Publication number: EP4060496A3

Publication date: 2023-01-04

Application number: EP22188822.5

Filing date: 2022-08-04

Applicant: Beijing Baidu Netcom Science Technology Co., Ltd.

Inventors: YUAN, Zhengxiong; QIAN, Zhengyu; SHI, En; HU, Mingren; LI, Jinqi; CHU, Zhenfang; LI, Runqing; HUANG, Yue

IPC classification: G06F9/50

Abstract: A method for running an inference service platform includes: determining inference tasks to be allocated for the inference service platform, in which the inference service platform includes two or more inference service groups, versions of the inference service groups are different, and the inference service groups are configured to perform a same type of inference service; determining a flow weight of each of the inference service groups, in which the flow weight indicates the proportion of the total number of inference tasks that needs to be allocated to the corresponding inference service group; allocating the corresponding number of inference tasks, from the inference tasks to be allocated, to each of the inference service groups based on the flow weight of each of the inference service groups; and performing the inference tasks by the inference service groups.
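The flow-weight allocation described in the abstract can be sketched in Python as follows. The abstract states only that each group receives a number of tasks proportional to its flow weight; how fractional shares are rounded is not specified, so the largest-remainder rounding used here is an assumption, as are the function name `allocate_by_flow_weight` and the group names.

```python
from typing import Dict, List, Sequence, TypeVar

T = TypeVar("T")


def allocate_by_flow_weight(
    tasks: Sequence[T], weights: Dict[str, float]
) -> Dict[str, List[T]]:
    """Split tasks across service groups in proportion to their flow weights.

    Rounding uses the largest-remainder method (an assumption, not from the
    abstract) so that the per-group counts always sum to len(tasks).
    """
    if any(w < 0 for w in weights.values()):
        raise ValueError("flow weights must be non-negative")
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("at least one flow weight must be positive")

    n = len(tasks)
    # Exact (possibly fractional) share of each group.
    exact = {g: n * w / total_weight for g, w in weights.items()}
    # Start from the integer part of each share.
    counts = {g: int(x) for g, x in exact.items()}
    # Hand the remaining tasks to the groups with the largest fractional parts.
    leftover = n - sum(counts.values())
    by_remainder = sorted(weights, key=lambda g: exact[g] - counts[g], reverse=True)
    for g in by_remainder[:leftover]:
        counts[g] += 1

    # Slice the task list into consecutive chunks, one per group.
    allocation: Dict[str, List[T]] = {}
    start = 0
    for g in weights:
        allocation[g] = list(tasks[start:start + counts[g]])
        start += counts[g]
    return allocation


# Two groups running different versions of the same service type (hypothetical names).
result = allocate_by_flow_weight(list(range(10)), {"v1": 80, "v2": 20})
print({group: len(assigned) for group, assigned in result.items()})
```

Weights need not sum to 1 or 100 because they are normalized by their total. A weighting such as 80/20 is the kind of split one might use to send a small share of traffic to a newer version, though the abstract itself does not name a use case.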

3.

Published invention application
METHOD, APPARATUS, DEVICE AND STORAGE MEDIUM FOR RUNNING INFERENCE SERVICE PLATFORM [Under examination: published]

Publication number: EP4060496A2

Publication date: 2022-09-21

Application number: EP22188822.5

Filing date: 2022-08-04

Applicant: Beijing Baidu Netcom Science Technology Co., Ltd.

Inventors: YUAN, Zhengxiong; QIAN, Zhengyu; SHI, En; HU, Mingren; LI, Jinqi; CHU, Zhenfang; LI, Runqing; HUANG, Yue

IPC classification: G06F9/50

Abstract: A method for running an inference service platform includes: determining inference tasks to be allocated for the inference service platform, in which the inference service platform includes two or more inference service groups, versions of the inference service groups are different, and the inference service groups are configured to perform a same type of inference service; determining a flow weight of each of the inference service groups, in which the flow weight indicates the proportion of the total number of inference tasks that needs to be allocated to the corresponding inference service group; allocating the corresponding number of inference tasks, from the inference tasks to be allocated, to each of the inference service groups based on the flow weight of each of the inference service groups; and performing the inference tasks by the inference service groups.