-
Publication number: US20220005244A1
Publication date: 2022-01-06
Application number: US17479056
Application date: 2021-09-20
Inventor: Zhizhi GUO, Borong LIANG, Zhibin HONG, Junyu HAN
Abstract: The present disclosure relates to a field of artificial intelligence technology, in particular to a field of computer vision and deep learning technology, and more particularly, a method and an apparatus for changing a hairstyle of a character, a device, and a storage medium are provided. The method includes: determining an original feature vector of an original image containing the character, wherein the character in the original image has an original hairstyle; acquiring a boundary vector associated with the original hairstyle and a target hairstyle based on a hairstyle classification model; determining a target feature vector corresponding to the target hairstyle based on the original feature vector and the boundary vector; and generating a target image containing the character based on the target feature vector, wherein the character in the target image has the target hairstyle.
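The step of deriving the target feature vector from the original feature vector and the boundary vector can be illustrated with a minimal sketch. It assumes, hypothetically, that the target vector is obtained by shifting the original vector along the unit-normalised boundary vector; the abstract does not state the exact operation, and `shift_feature`, `step`, and the sample values are illustrative names only.

```python
import math


def normalize(vector):
    """Scale a vector to unit length (assumed preprocessing, not stated in the abstract)."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0:
        raise ValueError("boundary vector must be non-zero")
    return [x / norm for x in vector]


def shift_feature(original, boundary, step):
    """Move `original` by `step` units along the direction of `boundary`."""
    if len(original) != len(boundary):
        raise ValueError("vectors must have the same dimension")
    unit = normalize(boundary)
    return [o + step * b for o, b in zip(original, unit)]


# Toy 3-dimensional example; real feature vectors would be far larger.
original = [0.5, -1.0, 2.0]
boundary = [3.0, 0.0, 4.0]  # hypothetical boundary between two hairstyle classes
target = shift_feature(original, boundary, step=2.0)
```

In this sketch a larger `step` moves the feature further toward the target hairstyle side of the boundary; how the boundary vector is obtained from the hairstyle classification model is outside its scope.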
-
Publication number: US20210192725A1
Publication date: 2021-06-24
Application number: US17021114
Application date: 2020-09-15
Inventor: Zhizhi GUO, Yipeng SUN, Jingtuo LIU, Junyu HAN, Duo YANG, Yue DANG, Huichao WANG
IPC: G06T7/00
Abstract: The present disclosure discloses a method, apparatus and electronic device for determining skin smoothness, relating to the field of computer vision technologies. In the disclosed implementation, an image to be detected that includes a face area is obtained first; the image and its corresponding smoothness analysis mask image are then input into a deep learning model to obtain a plurality of feature vectors indicating the skin smoothness of the face. Because the smoothness analysis mask image excludes preset factors, including at least one of the facial features (the five sense organs), reflection and hair, the influence of those factors on the skin smoothness is avoided, so that the accuracy of the determined skin smoothness of the face is ensured to a certain extent.
-
Publication number: US20210192696A1
Publication date: 2021-06-24
Application number: US17151783
Application date: 2021-01-19
Inventor: Qunyi XIE, Xiameng QIN, Yulin LI, Junyu HAN, Shengxian ZHU
Abstract: Embodiments of the present disclosure provide a method and apparatus for correcting a distorted document image, where the method for correcting a distorted document image includes: obtaining a distorted document image; and inputting the distorted document image into a correction model, and obtaining a corrected image corresponding to the distorted document image; where the correction model is a model obtained by training with a set of image samples as inputs and a corrected image corresponding to each image sample in the set of image samples as an output, and the image samples are distorted. By inputting the distorted document image to be corrected into the correction model, the corrected image corresponding to the distorted document image can be obtained through the correction model, which realizes document image correction end-to-end, improves accuracy of the document image correction, and extends application scenarios of the document image correction.
-
Publication number: US20230386168A1
Publication date: 2023-11-30
Application number: US18192393
Application date: 2023-03-29
Inventor: Yipeng SUN, Mengjun CHENG, Longchao WANG, Xiongwei ZHU, Kun YAO, Junyu HAN, Jingtuo LIU, Errui DING, Jingdong WANG, Haifeng WANG
IPC: G06V10/42, G06F16/583, H04N19/176
CPC classification number: G06V10/42, G06F16/5846, H04N19/176
Abstract: A pre-training method for a Vision and Scene Text Aggregation model includes: acquiring a sample image-text pair; extracting a sample scene text from a sample image; inputting a sample text into a text encoding network to obtain a sample text feature; inputting the sample image and an initial sample aggregation feature into a visual encoding subnetwork and inputting the initial sample aggregation feature and the sample scene text into a scene encoding subnetwork to obtain a global image feature of the sample image and a learned sample aggregation feature; and pre-training the Vision and Scene Text Aggregation model according to the sample text feature, the global image feature of the sample image, and the learned sample aggregation feature.
-
Publication number: US20230206667A1
Publication date: 2023-06-29
Application number: US18147806
Application date: 2022-12-29
Inventor: Pengyuan LV, Liang WU, Shanshan LIU, Meina QIAO, Chengquan ZHANG, Kun YAO, Junyu HAN
CPC classification number: G06V30/19127, G06V30/16
Abstract: A method for recognizing text includes: obtaining a first feature map of an image; for each target feature unit, performing a feature enhancement process on a plurality of feature values of the target feature unit respectively based on the plurality of feature values of the target feature unit, in which the target feature unit is a feature unit in the first feature map along a feature enhancement direction; and performing a text recognition process on the image based on the first feature map after the feature enhancement process.
-
Publication number: US20220415071A1
Publication date: 2022-12-29
Application number: US17899712
Application date: 2022-08-31
Inventor: Chengquan ZHANG, Pengyuan LV, Shanshan LIU, Meina QIAO, Yangliu XU, Liang WU, Jingtuo LIU, Junyu HAN, Errui DING, Jingdong WANG
IPC: G06V30/19, G06V30/18, G06T9/00, G06V30/262, G06N20/00
Abstract: The present disclosure provides a training method of a text recognition model, a text recognition method, and an apparatus, relating to the technical field of artificial intelligence, and specifically to the technical fields of deep learning and computer vision, which can be applied in scenarios such as optical character recognition. The specific implementation solution is: performing mask prediction on visual features of an acquired sample image, to obtain a predicted visual feature; performing mask prediction on semantic features of acquired sample text, to obtain a predicted semantic feature, where the sample image includes text; determining a first loss value of the text of the sample image according to the predicted visual feature; determining a second loss value of the sample text according to the predicted semantic feature; and training, according to the first loss value and the second loss value, to obtain the text recognition model.
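How the two loss values might drive training can be sketched as follows. The abstract does not specify the loss functions or how they are combined, so this is only an illustration under stated assumptions: both branches are assumed to output per-character probability distributions, each loss is assumed to be a cross-entropy, and the combination is assumed to be a weighted sum. The names `cross_entropy`, `joint_loss`, `w_visual`, and `w_semantic` are hypothetical.

```python
import math


def cross_entropy(probs, target_ids):
    """Mean negative log-likelihood of the target character at each position."""
    return -sum(math.log(p[t]) for p, t in zip(probs, target_ids)) / len(target_ids)


def joint_loss(visual_probs, semantic_probs, target_ids, w_visual=1.0, w_semantic=1.0):
    """Combine the first (visual) and second (semantic) loss values by a weighted sum (assumption)."""
    first = cross_entropy(visual_probs, target_ids)
    second = cross_entropy(semantic_probs, target_ids)
    return w_visual * first + w_semantic * second, first, second


# Toy example: two character positions, a two-symbol vocabulary.
target_ids = [0, 1]
visual_probs = [[0.5, 0.5], [0.25, 0.75]]    # predictions from the masked visual branch
semantic_probs = [[0.8, 0.2], [0.4, 0.6]]    # predictions from the masked semantic branch
total, first_loss, second_loss = joint_loss(visual_probs, semantic_probs, target_ids)
```

In a real training loop the combined value would be minimised by gradient descent; the sketch only shows how two separately computed loss values can feed one objective.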
-
Publication number: US20220383626A1
Publication date: 2022-12-01
Application number: US17883248
Application date: 2022-08-08
Inventor: Jian WANG, Junyu HAN, Jinwen CHEN, Lufei LIU
IPC: G06V10/77, G06V10/771, G06V10/80, G06V10/50, G06V10/82
Abstract: An image processing method includes: obtaining a first categorical feature and M first image features corresponding to M first images respectively, each first image being associated with a task index, task indices associated with different first images being different from each other, M being a positive integer; fusing the M first image features with the first categorical feature respectively so as to obtain M first target features; performing feature extraction on the M first target features so as to obtain M second categorical features; selecting a second categorical feature corresponding to each task index from the M second categorical features, and performing regularization corresponding to the task index on the second categorical feature, to obtain a third categorical feature corresponding to the task index; and performing image processing in accordance with M third categorical features so as to obtain M first image processing results of the M first images.
-
Publication number: US20210406468A1
Publication date: 2021-12-30
Application number: US17161466
Application date: 2021-01-28
Inventor: Xiameng QIN, Yulin LI, Qunyi XIE, Ju HUANG, Junyu HAN
IPC: G06F40/279, G06N3/08, G06N3/04, G06F16/532, G06F16/583, G06K9/20, G06K9/62, G06K9/46
Abstract: The present disclosure provides a method for visual question answering, which relates to a field of computer vision and natural language processing. The method includes: acquiring an input image and an input question; constructing a Visual Graph based on the input image, wherein the Visual Graph comprises a Node Feature and an Edge Feature; updating the Node Feature by using the Node Feature and the Edge Feature to obtain an updated Visual Graph; determining a question feature based on the input question; fusing the updated Visual Graph and the question feature to obtain a fused feature; and generating a predicted answer for the input image and the input question based on the fused feature. The present disclosure further provides an apparatus for visual question answering, a computer device and a non-transitory computer-readable storage medium.
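The step of updating the Node Feature from the Node Feature and the Edge Feature can be illustrated with a small message-passing sketch. It assumes, hypothetically, that each edge feature is a single non-negative weight and that a node is updated by adding the weighted average of its neighbours' features; the abstract does not give the actual update rule, and `update_nodes` is an illustrative name.

```python
def update_nodes(node_features, edge_weights):
    """Add to each node the edge-weighted average of the other nodes' features."""
    count = len(node_features)
    dim = len(node_features[0])
    updated = []
    for i in range(count):
        total = sum(edge_weights[i])
        message = [0.0] * dim
        if total > 0:  # isolated nodes receive no message and stay unchanged
            for j in range(count):
                weight = edge_weights[i][j] / total
                for k in range(dim):
                    message[k] += weight * node_features[j][k]
        updated.append([node_features[i][k] + message[k] for k in range(dim)])
    return updated


# Toy Visual Graph: three nodes with 2-dimensional features.
nodes = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
edges = [
    [0.0, 1.0, 1.0],  # node 0 listens to nodes 1 and 2
    [1.0, 0.0, 0.0],  # node 1 listens to node 0
    [0.0, 0.0, 0.0],  # node 2 has no incoming edges
]
updated = update_nodes(nodes, edges)
```

The updated node features would then be fused with the question feature to predict an answer; that fusion step is not shown here.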