-
Publication No.: US20240412438A1
Publication Date: 2024-12-12
Application No.: US18747627
Application Date: 2024-06-19
Inventor: Xirui FAN, Yafei ZHAO, Zongcai DU, Yi CHEN, Zhiqiang WANG
Abstract: The present disclosure provides a mouth shape-based method for generating a face image, a method for training a model, and a device, which relate to the field of artificial intelligence, in particular to the fields of cloud computing and digital humans. The specific implementation is as follows: acquiring audio data to be recognized and a preset face image; determining an audio feature of the audio data to be recognized, where the audio feature includes a speech speed feature and a semantic feature; and processing the preset face image according to the speech speed feature and the semantic feature, to generate a face image having a mouth shape.
-
Publication No.: US20230048495A1
Publication Date: 2023-02-16
Application No.: US17974183
Application Date: 2022-10-26
Inventor: Qunyi XIE, Xiameng QIN, Mengyi EN, Dongdong ZHANG, Ju HUANG, Yangliu XU, Yi CHEN, Kun YAO
IPC: G06V30/413, G06V10/764, G06V10/24, G06V10/75, G06V30/414
Abstract: A method and a platform for generating a document, an electronic device, and a storage medium are provided, which relate to the field of artificial intelligence technology, in particular to the fields of computer vision and deep learning, and may be applied to text recognition and other scenarios. The method includes: performing category recognition on a document picture to obtain a target category result; determining a target structured model matched with the target category result; and performing, by using the target structured model, structure recognition on the document picture to obtain a structure recognition result, so as to generate an electronic document based on the structure recognition result, wherein the structure recognition result includes a field attribute recognition result and a field position recognition result.
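The flow in this abstract (recognize the category, pick the matching structured model, run structure recognition) can be illustrated with a minimal Python sketch. This is not the patented implementation: the names `FieldResult`, `STRUCTURED_MODELS`, `recognize_category`, and `generate_document` are hypothetical, and both the classifier and the per-category models are hard-coded stand-ins for trained networks.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class FieldResult:
    attribute: str                       # field attribute recognition result
    position: Tuple[int, int, int, int]  # field position (x0, y0, x1, y1)


StructuredModel = Callable[[bytes], List[FieldResult]]


def invoice_model(picture: bytes) -> List[FieldResult]:
    # Stand-in for a trained structured model; returns fixed fields.
    return [
        FieldResult("invoice_number", (10, 10, 200, 40)),
        FieldResult("total_amount", (10, 300, 200, 330)),
    ]


def id_card_model(picture: bytes) -> List[FieldResult]:
    # Stand-in for a second structured model.
    return [FieldResult("name", (20, 20, 180, 50))]


# Hypothetical registry mapping a category result to its structured model.
STRUCTURED_MODELS: Dict[str, StructuredModel] = {
    "invoice": invoice_model,
    "id_card": id_card_model,
}


def recognize_category(picture: bytes) -> str:
    # Toy classifier (assumption): inspects a byte prefix instead of pixels.
    return "invoice" if picture.startswith(b"INV") else "id_card"


def generate_document(picture: bytes) -> dict:
    category = recognize_category(picture)   # category recognition
    model = STRUCTURED_MODELS[category]      # select the matched model
    fields = model(picture)                  # structure recognition
    return {
        "category": category,
        "fields": [
            {"attribute": f.attribute, "position": f.position} for f in fields
        ],
    }
```

Keeping the models in a registry keyed by category means a new document type only needs a new entry, without changes to `generate_document`.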
-
Publication No.: US20230134615A1
Publication Date: 2023-05-04
Application No.: US18146839
Application Date: 2022-12-27
Inventor: Qunyi XIE, Dongdong ZHANG, Xiameng QIN, Mengyi EN, Yangliu XU, Yi CHEN, Ju HUANG, Kun YAO
IPC: G06F9/48, G06F40/205, G06F9/50
Abstract: A method of processing a task, an electronic device, and a storage medium are provided, which relate to the field of artificial intelligence, in particular to the fields of deep learning and computer vision, and may be applied to optical character recognition (OCR) and other scenarios. The method includes: parsing labeled data to be processed according to a task type identification, to obtain task labeled data, wherein tag information of the task labeled data matches the task type identification, and the task labeled data includes first task labeled data and second task labeled data; training a model using the first task labeled data, to obtain a plurality of candidate models, wherein the model is determined according to the task type identification; and determining a target model from the candidate models according to a performance evaluation result obtained by evaluating the candidate models using the second task labeled data.
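The steps in this abstract (filter labeled data by task type, train candidates on the first part, select the target model by evaluating on the second part) can be sketched in plain Python. Everything here is an illustrative assumption rather than the patented method: the record layout (`tag`, `x`, `y`), the function names, the one-dimensional threshold classifier, and the use of accuracy as the performance metric.

```python
from typing import Callable, Dict, List, Tuple

Sample = Tuple[float, int]        # (feature, binary label)
Model = Callable[[float], int]


def parse_labeled_data(records: List[Dict], task_type: str) -> List[Sample]:
    # Keep only records whose tag matches the task type identification.
    return [(r["x"], r["y"]) for r in records if r["tag"] == task_type]


def split(samples: List[Sample]) -> Tuple[List[Sample], List[Sample]]:
    # First half: first task labeled data; second half: second task labeled data.
    cut = len(samples) // 2
    return samples[:cut], samples[cut:]


def train_candidate(train: List[Sample], weight: float) -> Model:
    # Toy "training": place a threshold between the two class means.
    pos = [x for x, y in train if y == 1]
    neg = [x for x, y in train if y == 0]
    mean_pos = sum(pos) / len(pos)
    mean_neg = sum(neg) / len(neg)
    threshold = weight * mean_pos + (1 - weight) * mean_neg
    return lambda x: int(x >= threshold)


def accuracy(model: Model, data: List[Sample]) -> float:
    return sum(model(x) == y for x, y in data) / len(data)


def select_target_model(records: List[Dict], task_type: str,
                        weights: List[float]) -> Tuple[float, Model, float]:
    train, held_out = split(parse_labeled_data(records, task_type))
    # One candidate model per hyperparameter value.
    candidates = [(w, train_candidate(train, w)) for w in weights]
    # Score every candidate on the second task labeled data.
    scored = [(w, m, accuracy(m, held_out)) for w, m in candidates]
    # The target model is the best-scoring candidate.
    return max(scored, key=lambda item: item[2])
```

Calling `select_target_model(records, "cls", [0.1, 0.5, 0.9])` returns the hyperparameter, the model, and its held-out score; a real system would substitute actual training and a task-specific metric.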
-