-
Publication number: US20240303774A1
Publication date: 2024-09-12
Application number: US18020918
Application date: 2022-06-10
Inventor: Changyong SHU , Jiaming LIU , Zhibin HONG , Junyu HAN
CPC classification number: G06T5/50 , G06T5/60 , G06T7/40 , G06T7/55 , G06T7/90 , G06V40/172 , G06T2207/10024 , G06T2207/20081 , G06T2207/20084 , G06T2207/20221 , G06T2207/30201
Abstract: A method of processing an image, an electronic device and a storage medium. The method includes: generating a to-be-processed image according to a first target image and a second target image, where identity information of an object in the to-be-processed image matches identity information of an object in the first target image; generating a set of disentangled images according to the second target image and the to-be-processed image, where the set of disentangled images includes a head-disentangled image and a disentangled repair image; and generating a fusion image according to the set of disentangled images, where identity information and texture information of an object in the fusion image match the identity information and the texture information of the object in the to-be-processed image, respectively, and to-be-repaired information related to the object in the fusion image is repaired.
-
Publication number: US20230010031A1
Publication date: 2023-01-12
Application number: US17946464
Application date: 2022-09-16
Inventor: Pengyuan LYU , Sen FAN , Xiaoyan WANG , Yuechen YU , Chengquan ZHANG , Kun YAO , Junyu HAN
Abstract: A method for recognizing a text, an electronic device and a storage medium. An implementation of the method comprises: obtaining a multi-dimensional first feature map of a to-be-recognized image; performing, based on feature values in the first feature map, feature enhancement processing on each feature value in the first feature map; and performing a text recognition on the to-be-recognized image based on the first feature map after the enhancement processing.
-
Publication number: US20220392205A1
Publication date: 2022-12-08
Application number: US17892669
Application date: 2022-08-22
Inventor: Yipeng SUN , Rongqiao AN , Xiang WEI , Longchao WANG , Kun YAO , Junyu HAN , Jingtuo LIU , Errui DING
Abstract: Embodiments of the present disclosure provide a method and apparatus for training an image recognition model based on semantic enhancement, a method and apparatus for recognizing an image, an electronic device, and a computer readable storage medium. The method for training an image recognition model based on semantic enhancement comprises: extracting, from an inputted first image that is unannotated and has no textual description, a first feature representation of the first image; calculating a first loss function based on the first feature representation; extracting, from an inputted second image that is unannotated and has an original textual description, a second feature representation of the second image; calculating a second loss function based on the second feature representation; and training an image recognition model based on a fusion of the first loss function and the second loss function.
-
Publication number: US20230145443A1
Publication date: 2023-05-11
Application number: US17959727
Application date: 2022-10-04
Inventor: Tianshu HU , Hanqi GUO , Junyu HAN , Zhibin HONG
IPC: G06T3/40
CPC classification number: G06T3/4038
Abstract: Provided are a video stitching method and an apparatus, an electronic device, and a storage medium. In the video stitching method, an intermediate frame is inserted between a last image frame of a first video and a first image frame of a second video. L image frames are sequentially selected in order from back to front from the first video and L image frames are sequentially selected in order from front to back from the second video separately, and L is a natural number greater than 1. The first video and the second video are stitched together to form a target video according to the intermediate frame, the L image frames in the first video, and the L image frames in the second video.
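The stitching flow in the abstract above can be sketched as follows. This is only an illustration under stated assumptions, not the patented method: the abstract does not say how the intermediate frame is produced or how the L selected frames are used, so here the intermediate frame is assumed to be the average of the two boundary frames, and each selected frame is assumed to be blended toward it with a weight that grows near the seam. The names `Frame`, `blend` and `stitch` are hypothetical.

```python
from typing import List

# A flattened grayscale frame; a stand-in for real image data.
Frame = List[float]


def blend(a: Frame, b: Frame, w: float) -> Frame:
    """Return (1 - w) * a + w * b, element-wise."""
    return [(1.0 - w) * x + w * y for x, y in zip(a, b)]


def stitch(first: List[Frame], second: List[Frame], L: int) -> List[Frame]:
    """Join two videos around an intermediate frame (illustrative only)."""
    if L <= 1 or L > len(first) or L > len(second):
        raise ValueError("L must be greater than 1 and fit inside both videos")

    # Assumption: the intermediate frame is the average of the last frame
    # of the first video and the first frame of the second video.
    mid = blend(first[-1], second[0], 0.5)

    tail = first[-L:]   # the L frames counted from the back of the first video
    head = second[:L]   # the L frames counted from the front of the second video

    # Assumption: frames closer to the seam are pulled harder toward `mid`.
    tail_adj = [blend(f, mid, (i + 1) / (L + 1)) for i, f in enumerate(tail)]
    head_adj = [blend(f, mid, (L - i) / (L + 1)) for i, f in enumerate(head)]

    return first[:-L] + tail_adj + [mid] + head_adj + second[L:]


if __name__ == "__main__":
    dark = [[0.0] for _ in range(4)]    # four all-black one-pixel frames
    light = [[1.0] for _ in range(4)]   # four all-white one-pixel frames
    for frame in stitch(dark, light, L=2):
        print(round(frame[0], 3))
```

With this weighting, the frames on either side of the seam change gradually rather than jumping from one video to the other; a real system would likely use a learned frame-interpolation model instead of linear blending.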
-
Publication number: US20230045715A1
Publication date: 2023-02-09
Application number: US17966112
Application date: 2022-10-14
Inventor: Chengquan ZHANG , Pengyuan LV , Sen FAN , Kun YAO , Junyu HAN , Jingtuo LIU
Abstract: The present disclosure provides a text detection method, a text recognition method and an apparatus, which relate to the field of artificial intelligence technology, in particular to the fields of deep learning and computer vision, and can be applied to scenarios such as optical character recognition. The text detection method includes: acquiring an image feature of a text strip in a to-be-recognized image; performing visual enhancement processing on the to-be-recognized image to obtain an enhanced feature map of the to-be-recognized image; and comparing the image feature of the text strip with the enhanced feature map for similarity to obtain a target bounding box of the text strip on the enhanced feature map.
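The similarity comparison step described above can be illustrated with a small sketch. The abstract does not specify the similarity measure or how the bounding box is derived, so this example assumes cosine similarity between the text-strip feature and each cell of the feature map, followed by a threshold and an enclosing box over the matching cells. The names `cosine`, `locate_text_strip` and the `threshold` parameter are hypothetical.

```python
import math
from typing import List, Optional, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def locate_text_strip(
    strip_feature: Sequence[float],
    feature_map: List[List[Sequence[float]]],
    threshold: float = 0.8,
) -> Optional[Tuple[int, int, int, int]]:
    """Return (top, left, bottom, right) in feature-map cells, or None.

    Assumption: a cell belongs to the text strip when its feature vector
    is at least `threshold` similar to the strip feature.
    """
    hits = [
        (r, c)
        for r, row in enumerate(feature_map)
        for c, vec in enumerate(row)
        if cosine(strip_feature, vec) >= threshold
    ]
    if not hits:
        return None
    rows = [r for r, _ in hits]
    cols = [c for _, c in hits]
    return (min(rows), min(cols), max(rows), max(cols))


if __name__ == "__main__":
    background = [0.0, 1.0]
    text = [1.0, 0.0]
    fmap = [[background] * 4 for _ in range(3)]
    fmap[1] = [background, text, text, background]
    print(locate_text_strip([1.0, 0.0], fmap))
```

The returned box is in feature-map coordinates; mapping it back to pixel coordinates would depend on the stride of the feature extractor, which the abstract does not state.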
-
Publication number: US20220027611A1
Publication date: 2022-01-27
Application number: US17498226
Application date: 2021-10-11
Inventor: Yuechen YU , Chengquan ZHANG , Yulin LI , Xiaoqiang ZHANG , Ju HUANG , Xiameng QIN , Kun YAO , Jingtuo LIU , Junyu HAN , Errui DING
Abstract: Provided are an image classification method and apparatus, an electronic device and a storage medium, relating to the field of artificial intelligence and, in particular, to computer vision and deep learning. The method includes inputting a to-be-classified document image into a pretrained neural network and obtaining a feature submap of each text box of the to-be-classified document image by use of the neural network; inputting the feature submap of each text box, a semantic feature corresponding to preobtained text information of each text box and a position feature corresponding to preobtained position information of each text box into a pretrained multimodal feature fusion model and fusing, by use of the multimodal feature fusion model, the three into a multimodal feature corresponding to each text box; and classifying the to-be-classified document image based on the multimodal feature corresponding to each text box.
-
Publication number: US20210406592A1
Publication date: 2021-12-30
Application number: US17182987
Application date: 2021-02-23
Inventor: Yulin LI , Xiameng QIN , Ju HUANG , Qunyi XIE , Junyu HAN
IPC: G06K9/62 , G06K9/46 , G06F40/279
Abstract: The present disclosure provides a method for visual question answering. The method includes: acquiring an input image and an input question; constructing a visual graph based on the input image, wherein the visual graph comprises a first node feature and a first edge feature; constructing a question graph based on the input question, wherein the question graph comprises a second node feature and a second edge feature; performing a multimodal fusion on the visual graph and the question graph to obtain an updated visual graph and an updated question graph; determining a question feature based on the input question; determining a fusion feature based on the updated visual graph, the updated question graph and the question feature; and generating a predicted answer for the input image and the input question. The present disclosure further provides an apparatus for visual question answering, a computer device and a medium.
-
Publication number: US20240281609A1
Publication date: 2024-08-22
Application number: US18041207
Application date: 2022-05-16
Inventor: Pengyuan LV , Jingquan LI , Chengquan ZHANG , Kun YAO , Jingtuo LIU , Junyu HAN
Abstract: The present application provides a method of training a text recognition model. The method includes: inputting a first sample image into the visual feature extraction sub-model to obtain a first visual feature and a first predicted text, wherein the first sample image contains a text and a tag indicating a first actual text; obtaining, by using the semantic feature extraction sub-model, a first semantic feature based on the first predicted text; obtaining, by using the sequence sub-model, a second predicted text based on the first visual feature and the first semantic feature; and training the text recognition model based on the first predicted text, the second predicted text and the first actual text. The present disclosure further provides a method of recognizing a text, an electronic device, and a storage medium.
-
Publication number: US20230124389A1
Publication date: 2023-04-20
Application number: US17887690
Application date: 2022-08-15
Inventor: Longchao WANG , Yipeng SUN , Kun YAO , Junyu HAN , Jingtuo LIU , Errui DING
IPC: G06V10/70 , G06V10/774
Abstract: A model determination method and an electronic device are provided, which relate to the technical field of artificial intelligence and, in particular, to the fields of computer vision and deep learning, and can be applied to image processing, image identification and other scenarios. A specific implementation solution includes: an image sample and a text sample are acquired, wherein text data in the text sample is used for providing a text description of target image data in the image sample; at least one image feature in the image sample is stored to a first queue, and at least one text feature in the text sample is stored to a second queue; the first queue and the second queue are trained to obtain a first target model; and the first target model is determined as an initialization model for a second target model.
-
Publication number: US20220292131A1
Publication date: 2022-09-15
Application number: US17826760
Application date: 2022-05-27
Inventor: Ruibin BAI , Xiang WEI , Yipeng SUN , Kun YAO , Jingtuo LIU , Junyu HAN
IPC: G06F16/583 , G06V10/74 , G06V10/44 , G06F16/535
Abstract: A method, apparatus and system for retrieving an image are provided. The method comprises: detecting, in response to receiving a query request comprising a target image, a target subject from the target image; extracting a subject feature from the target subject if a confidence level of a detection box of the detected target subject is greater than a first threshold, the subject feature comprising an identical feature, a similar feature and a category; performing matching on the subject feature of the target image and a subject feature of a candidate image pre-stored in a database, to obtain a similarity score and an identicalness score of the candidate image; and selecting, according to the similarity score and the identicalness score, a predetermined number of candidate images as a search result for output.
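The retrieval steps above can be sketched roughly as follows. The abstract does not define how the two scores are computed or combined, so this example makes several assumptions: both scores are cosine similarities, candidates in a different category are skipped, and results are ranked by identicalness score first and similarity score second. `SubjectFeature`, `retrieve`, `first_threshold` and `top_k` are hypothetical names.

```python
import math
from dataclasses import dataclass
from typing import Dict, List, Sequence


@dataclass
class SubjectFeature:
    identical: Sequence[float]  # feature used to judge "same item"
    similar: Sequence[float]    # feature used to judge "similar item"
    category: str


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def retrieve(
    query: SubjectFeature,
    box_confidence: float,
    database: Dict[str, SubjectFeature],
    first_threshold: float = 0.5,
    top_k: int = 2,
) -> List[str]:
    """Return up to `top_k` candidate names (illustrative ranking only)."""
    # Features are only used when the detection box is confident enough.
    if box_confidence <= first_threshold:
        return []

    scored = []
    for name, cand in database.items():
        # Assumption: candidates from another category are not matched.
        if cand.category != query.category:
            continue
        identicalness = cosine(query.identical, cand.identical)
        similarity = cosine(query.similar, cand.similar)
        scored.append((identicalness, similarity, name))

    # Assumption: identical matches outrank merely similar ones.
    scored.sort(key=lambda t: (t[0], t[1]), reverse=True)
    return [name for _, _, name in scored[:top_k]]


if __name__ == "__main__":
    db = {
        "a": SubjectFeature([1.0, 0.0], [0.0, 1.0], "shoe"),
        "b": SubjectFeature([0.0, 1.0], [1.0, 0.0], "shoe"),
        "c": SubjectFeature([1.0, 0.0], [1.0, 0.0], "bag"),
    }
    q = SubjectFeature([1.0, 0.0], [1.0, 0.0], "shoe")
    print(retrieve(q, box_confidence=0.9, database=db))
```

Sorting on a tuple keeps the two scores separate rather than collapsing them into a single weighted sum; a weighted combination would be an equally plausible reading of the abstract.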