-
Publication number: US20230215136A1
Publication date: 2023-07-06
Application number: US18113826
Application date: 2023-02-24
Inventor: Haoran WANG , Dongliang HE , Fu LI , Errui DING
CPC classification number: G06V10/761 , G06V10/7715
Abstract: The present disclosure provides a method and apparatus for training a multi-modal data matching degree calculation model, a method and apparatus for calculating a multi-modal data matching degree, an electronic device, a computer readable storage medium and a computer program product, and relates to the field of artificial intelligence technology such as deep learning, image processing and computer vision. The method comprises: acquiring first sample data and second sample data that are different in modalities; constructing a contrastive learning loss function comprising a semantic perplexity parameter, the semantic perplexity parameter being determined based on a semantic feature distance between the first sample data and the second sample data; and training, by using the contrastive learning loss function, an initial multi-modal data matching degree calculation model through a contrastive learning approach, to obtain a target multi-modal data matching degree calculation model.
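The abstract does not say how the semantic perplexity parameter enters the loss. The following is only a minimal sketch under stated assumptions: the loss is an InfoNCE-style contrastive loss over matched pairs, the "semantic feature distance" is cosine distance, and the perplexity parameter (hypothetical form `1 + distance`) divides each pair's loss term. The function names and the `temperature` parameter are illustrative, not taken from the patent.

```python
import math


def cosine_distance(a, b):
    """Cosine distance between two feature vectors (assumed distance measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def perplexity_weighted_contrastive_loss(first_feats, second_feats, temperature=0.1):
    """InfoNCE-style loss where pair (i, i) is the positive pair.

    Each pair's term is scaled by a hypothetical semantic perplexity
    parameter derived from the distance between the matched features.
    """
    n = len(first_feats)
    total = 0.0
    for i in range(n):
        # Similarity of sample i in the first modality to every sample in the second.
        logits = [
            (1.0 - cosine_distance(first_feats[i], second_feats[j])) / temperature
            for j in range(n)
        ]
        # Numerically stable log-sum-exp for the softmax denominator.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(v - peak) for v in logits))
        negative_log_likelihood = log_denominator - logits[i]
        # Assumption: perplexity grows with the matched pair's feature distance.
        perplexity = 1.0 + cosine_distance(first_feats[i], second_feats[i])
        total += negative_log_likelihood / perplexity
    return total / n
```

With this weighting, pairs whose features are far apart contribute a damped loss term; whether the patent damps or amplifies such pairs is not stated in the abstract.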
-
Publication number: US20230101704A1
Publication date: 2023-03-30
Application number: US17898704
Application date: 2022-08-30
Inventor: Ruifeng DENG , Tianwei LIN , Fu LI
Abstract: The present disclosure discloses a video generation method and apparatus, an electronic device and a readable storage medium, and relates to the field of artificial intelligence, and in particular, to computer vision and deep learning technologies, which may specifically be used in 3D visual scenarios. A specific implementation scheme involves: determining a reference portrait in an original image; performing posture change processing on the reference portrait in the original image by using a nonlinear function, to obtain at least one change image; and generating a dynamic video of the reference portrait according to the original image and the at least one change image.
-
Publication number: US20220327757A1
Publication date: 2022-10-13
Application number: US17849225
Application date: 2022-06-24
Inventor: Ruifeng DENG , Tianwei LIN , Fu LI
Abstract: An apparatus, an electronic device, and a storage medium may implement a method for generating a dynamic video of a character. The method includes: identifying a character contour area from a first picture containing a character image; acquiring a plurality of sampling points in the first picture based on the character contour area, and dividing the first picture into a plurality of triangles by each of the sampling points; deforming at least a portion of the plurality of triangles in the first picture to obtain a second picture; and acquiring at least one intermediate picture between the first picture and the second picture, and generating the dynamic video of the character comprising the first picture, the second picture and the at least one intermediate picture.
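One way to picture the "intermediate picture" step is to interpolate the positions of the sampling points (triangle vertices) between the first and second pictures. The sketch below assumes plain linear interpolation and covers only the vertex positions; the triangulation itself and the per-triangle pixel warping are omitted. The function name `interpolate_vertices` is hypothetical.

```python
def interpolate_vertices(first_points, second_points, num_intermediate):
    """Return vertex positions for each intermediate frame.

    first_points / second_points: lists of (x, y) sampling points in the
    first picture and in the deformed second picture, in the same order.
    Linear interpolation is an assumption; the abstract names no scheme.
    """
    frames = []
    for k in range(1, num_intermediate + 1):
        # t runs strictly between 0 (first picture) and 1 (second picture).
        t = k / (num_intermediate + 1)
        frames.append([
            ((1 - t) * x0 + t * x1, (1 - t) * y0 + t * y1)
            for (x0, y0), (x1, y1) in zip(first_points, second_points)
        ])
    return frames
```

For example, `interpolate_vertices([(0, 0), (4, 0)], [(0, 2), (4, 4)], 1)` yields a single frame whose vertices sit halfway between the two pictures.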
-
Publication number: US20220027661A1
Publication date: 2022-01-27
Application number: US17479872
Application date: 2021-09-20
Inventor: Ruifeng DENG , Tianwei LIN , Xin LI , Fu LI
Abstract: There is provided a method and an apparatus of processing image, an electronic device, and a storage medium, which relates to a field of artificial intelligence technology, and specifically relates to a computer vision and deep learning technology applied to an image acquisition scene. The method includes performing a saliency detection on an original image to obtain a saliency map of the original image; performing a semantic segmentation on the original image to obtain a semantic segmentation map of the original image; modifying the saliency map by using the semantic segmentation map, so as to obtain a target map containing a target object; and cropping the original image based on a position of the target object in the target map.
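The four steps in this abstract can be sketched as follows. The saliency and segmentation maps are taken as given inputs, and the "modification" rule is an assumption: a pixel belongs to the target only if it is salient and carries a non-background segmentation label. The names `crop_to_target`, `threshold`, and `background_label` are illustrative only.

```python
def crop_to_target(image, saliency, segmentation, threshold=0.5, background_label=0):
    """Crop `image` to the bounding box of the target object.

    image, saliency, segmentation: 2D lists of the same shape.
    A pixel is kept in the target map when its saliency is at least
    `threshold` and its segmentation label is not the background
    (an assumed rule, not one stated in the abstract).
    """
    rows, cols = len(image), len(image[0])
    target_pixels = [
        (r, c)
        for r in range(rows)
        for c in range(cols)
        if saliency[r][c] >= threshold and segmentation[r][c] != background_label
    ]
    if not target_pixels:
        # No target object found: leave the image uncropped.
        return image
    top = min(r for r, _ in target_pixels)
    bottom = max(r for r, _ in target_pixels)
    left = min(c for _, c in target_pixels)
    right = max(c for _, c in target_pixels)
    return [row[left:right + 1] for row in image[top:bottom + 1]]
```

In this sketch the segmentation map suppresses salient pixels that fall on the background, so an isolated bright spot outside the object does not enlarge the crop.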
-
Publication number: US20240013558A1
Publication date: 2024-01-11
Application number: US18113266
Application date: 2023-02-23
Inventor: Haoran WANG , Dongliang HE , Fu LI , Errui DING
IPC: G06V20/70 , G06V10/774 , G06V20/40 , G06F40/30 , G06F40/279
CPC classification number: G06V20/70 , G06V10/774 , G06V20/46 , G06F40/30 , G06F40/279
Abstract: There is provided cross-modal feature extraction, retrieval, and model training methods and apparatuses, and a medium, which relates to the field of artificial intelligence (AI) technologies, and specifically to fields of deep learning, image processing, and computer vision technologies. A specific implementation solution involves: acquiring to-be-processed data, the to-be-processed data corresponding to at least two types of first modalities; determining first data of a second modality in the to-be-processed data, the second modality being any of the types of the first modalities; performing semantic entity extraction on the first data to obtain semantic entities; and acquiring semantic coding features of the first data based on the first data and the semantic entities and by using a pre-trained cross-modal feature extraction model.
-
Publication number: US20230047748A1
Publication date: 2023-02-16
Application number: US17974073
Application date: 2022-10-26
Inventor: Fu LI , Tianwei LIN
IPC: G06V10/774 , G06T5/50 , G06V10/82
Abstract: A method of fusing an image, a method of training an image fusion model, an electronic device, and a storage medium. The method of fusing the image includes: encoding a stitched image obtained by stitching a foreground image and a background image, so as to obtain a feature map; and decoding the feature map to obtain a fused image, wherein the feature map is decoded by: performing a weighting on the feature map by using an attention mechanism, so as to obtain a weighted feature map; performing a fusion on the feature map according to feature statistical data of the weighted feature map, so as to obtain a fused feature; and decoding the fused feature to obtain the fused image.
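The decoding step can be illustrated on a single one-dimensional feature channel. This sketch assumes softmax attention weights and an AdaIN-style fusion in which the feature map is normalized and then re-scaled with the mean and standard deviation of the weighted feature map. Both choices are assumptions, since the abstract specifies neither the attention mechanism nor which statistics are used. All function names are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of attention scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def mean_and_std(values, eps=1e-8):
    """Mean and (eps-stabilized) standard deviation of a list."""
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, math.sqrt(variance + eps)


def fuse_with_weighted_statistics(feature, attention_scores):
    """Weight the feature map, then fuse using the weighted map's statistics."""
    weights = softmax(attention_scores)
    weighted = [w * f for w, f in zip(weights, feature)]
    weighted_mean, weighted_std = mean_and_std(weighted)
    feature_mean, feature_std = mean_and_std(feature)
    # Assumed AdaIN-style fusion: normalize, then apply the weighted statistics.
    return [
        (f - feature_mean) / feature_std * weighted_std + weighted_mean
        for f in feature
    ]
```

The fused feature keeps the spatial pattern of the original feature map while taking its mean and spread from the attention-weighted version.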