-
Publication No.: US20210224526A1
Publication Date: 2021-07-22
Application No.: US17164613
Filing Date: 2021-02-01
Inventor: Mingyuan MAO , Yuan FENG , Ying XIN , Pengcheng YUAN , Bin ZHANG , Shufei LIN , Xiaodi WANG , Shumin HAN , Yingbo XU , Jingwei LIU , Shilei WEN , Hongwu Zhang , Errui DING
Abstract: The present application discloses a method and an apparatus for detecting wearing of a safety helmet, a device and a storage medium. The method for detecting wearing of a safety helmet includes: acquiring a first image collected by a camera device, where the first image includes at least one human body image; determining the at least one human body image and at least one head image in the first image; determining a human body image corresponding to each head image in the at least one human body image according to an area where the at least one human body image is located and an area where the at least one head image is located; and processing the human body image corresponding to the at least one head image according to a type of the at least one head image.
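The core of the method above is associating each detected head box with the human body box that best contains it, then acting on bodies whose head type indicates a missing helmet. Below is a minimal Python sketch of that association step; the box format, the overlap threshold, and the "no_helmet" label are illustrative assumptions, not the patented implementation.

```python
# Minimal sketch: assign each head box to the body box whose area best contains it,
# then flag bodies whose associated head was classified as not wearing a helmet.
from typing import List, Optional, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixel coordinates

def overlap_ratio(head: Box, body: Box) -> float:
    """Intersection area divided by the head-box area."""
    ix1, iy1 = max(head[0], body[0]), max(head[1], body[1])
    ix2, iy2 = min(head[2], body[2]), min(head[3], body[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    head_area = (head[2] - head[0]) * (head[3] - head[1])
    return inter / head_area if head_area > 0 else 0.0

def match_heads_to_bodies(heads: List[Box], bodies: List[Box],
                          min_overlap: float = 0.5) -> List[Optional[int]]:
    """For each head box, return the index of the body box that best contains it."""
    matches: List[Optional[int]] = []
    for head in heads:
        if not bodies:
            matches.append(None)
            continue
        ratios = [overlap_ratio(head, body) for body in bodies]
        best = max(range(len(bodies)), key=lambda i: ratios[i])
        matches.append(best if ratios[best] >= min_overlap else None)
    return matches

# Usage: flag body regions whose associated head was classified as "no_helmet".
heads = [(120, 40, 170, 90), (400, 60, 450, 110)]
head_types = ["helmet", "no_helmet"]              # assumed output of a head classifier
bodies = [(100, 30, 220, 300), (380, 50, 500, 320)]
for h_idx, b_idx in enumerate(match_heads_to_bodies(heads, bodies)):
    if b_idx is not None and head_types[h_idx] == "no_helmet":
        print(f"alert: body region {bodies[b_idx]} has no safety helmet")
```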
-
12.
Publication No.: US20240221346A1
Publication Date: 2024-07-04
Application No.: US17800880
Filing Date: 2022-01-29
Inventor: Zhigang WANG , Jian WANG , Hao SUN , Errui DING
IPC: G06V10/44 , G06T9/00 , G06V10/74 , G06V10/762 , G06V10/80
CPC classification number: G06V10/44 , G06T9/00 , G06V10/761 , G06V10/762 , G06V10/806
Abstract: The present disclosure provides a model training method and apparatus, a pedestrian re-identification method and apparatus, and an electronic device, and relates to the field of artificial intelligence, and specifically to computer vision and deep learning technologies, which can be applied to smart city scenarios. A specific implementation solution is: performing, by using a first encoder, feature extraction on a first pedestrian image and a second pedestrian image in a sample dataset, to obtain an image feature of the first pedestrian image and an image feature of the second pedestrian image; fusing the image feature of the first pedestrian image and the image feature of the second pedestrian image, to obtain a fused feature; performing, by using a first decoder, feature decoding on the fused feature, to obtain a third pedestrian image; and determining the third pedestrian image as a negative sample image of the first pedestrian image, and using the first pedestrian image and the negative sample image to train a first preset model to convergence, to obtain a pedestrian re-identification model. The embodiments of the present disclosure can improve the effect of the model in distinguishing between pedestrians with similar appearances but different identities.
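As a rough illustration of the negative-sample synthesis described in this abstract, the PyTorch sketch below encodes two pedestrian images, averages their features, and decodes the fused feature into a third image used as a hard negative. The tiny encoder and decoder and the averaging fusion are assumptions made only for the sketch, not the patented networks.

```python
# Minimal sketch: encode two pedestrian images, fuse the features, decode a hard negative.
import torch
import torch.nn as nn

class Encoder(nn.Module):          # stand-in for the "first encoder"
    def __init__(self, dim=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, dim, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(dim, dim, 4, stride=2, padding=1), nn.ReLU())

    def forward(self, x):
        return self.net(x)

class Decoder(nn.Module):          # stand-in for the "first decoder"
    def __init__(self, dim=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.ConvTranspose2d(dim, dim, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(dim, 3, 4, stride=2, padding=1), nn.Sigmoid())

    def forward(self, z):
        return self.net(z)

encoder, decoder = Encoder(), Decoder()
img_a = torch.rand(1, 3, 256, 128)        # first pedestrian image
img_b = torch.rand(1, 3, 256, 128)        # second pedestrian image (different identity)

feat_a, feat_b = encoder(img_a), encoder(img_b)
fused = 0.5 * (feat_a + feat_b)           # feature fusion (assumed: simple average)
negative = decoder(fused)                 # synthesized hard negative for img_a

# The negative keeps an appearance close to img_a but a mixed identity, so a
# re-identification model trained to push img_a away from it must learn finer cues.
print(negative.shape)                     # torch.Size([1, 3, 256, 128])
```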
-
13.
Publication No.: US20240013558A1
Publication Date: 2024-01-11
Application No.: US18113266
Filing Date: 2023-02-23
Inventor: Haoran WANG , Dongliang HE , Fu LI , Errui DING
IPC: G06V20/70 , G06V10/774 , G06V20/40 , G06F40/30 , G06F40/279
CPC classification number: G06V20/70 , G06V10/774 , G06V20/46 , G06F40/30 , G06F40/279
Abstract: Provided are cross-modal feature extraction, retrieval, and model training methods and apparatuses, and a medium, which relate to the field of artificial intelligence (AI) technologies, and specifically to the fields of deep learning, image processing, and computer vision technologies. A specific implementation solution involves: acquiring to-be-processed data, the to-be-processed data corresponding to at least two types of first modalities; determining first data of a second modality in the to-be-processed data, the second modality being any of the types of the first modalities; performing semantic entity extraction on the first data to obtain semantic entities; and acquiring semantic coding features of the first data based on the first data and the semantic entities and by using a pre-trained cross-modal feature extraction model.
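The sketch below illustrates the described flow for the text modality: semantic entities are extracted from the data, and a coding feature is then produced from the data together with its entities. The keyword-based entity extractor and the bag-of-words encoder are toy stand-ins, not the pre-trained cross-modal model referred to in the abstract.

```python
# Minimal sketch: extract semantic entities from text data, then encode data plus entities.
from collections import Counter
from typing import List

ENTITY_VOCAB = {"dog", "ball", "park", "person", "car"}   # assumed entity list

def extract_semantic_entities(text: str) -> List[str]:
    """Return the tokens of the text that act as semantic entities."""
    return [tok for tok in text.lower().split() if tok in ENTITY_VOCAB]

def encode(tokens: List[str], vocab: List[str]) -> List[int]:
    """Toy encoder: bag-of-words counts over a fixed vocabulary."""
    counts = Counter(tokens)
    return [counts[word] for word in vocab]

text_data = "a dog chasing a ball in the park"
entities = extract_semantic_entities(text_data)            # ['dog', 'ball', 'park']

# The semantic coding feature is produced from the data plus its entities, so the
# entity words are emphasised in the resulting representation.
vocab = sorted(ENTITY_VOCAB)
feature = encode(text_data.lower().split() + entities, vocab)
print(entities, feature)
```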
-
14.
Publication No.: US20230419610A1
Publication Date: 2023-12-28
Application No.: US18185359
Filing Date: 2023-03-16
Inventor: Xing LIU , Ruizhi CHEN , Yan ZHANG , Chen ZHAO , Hao SUN , Jingtuo LIU , Errui DING , Tian WU , Haifeng WANG
CPC classification number: G06T17/20 , G06T5/50 , G06V10/26 , G06V10/60 , G06T2207/10028 , G06T2207/20221
Abstract: An image rendering method includes the steps below. A model of an environmental object is rendered to obtain an image of the environmental object in a target perspective. An image of a target object in the target perspective and a model of the target object are determined according to a neural radiance field of the target object. The image of the target object is fused and rendered into the image of the environmental object according to the model of the target object.
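The last step, fusing the rendered target object into the image of the environmental object according to the model of the target object, can be pictured as depth-aware compositing. The NumPy sketch below shows one such fusion under that assumption; the depth test and the synthetic inputs are illustrative only.

```python
# Minimal sketch: composite a rendered target object into an environment image
# using per-pixel depth so that occlusion is respected.
import numpy as np

h, w = 120, 160
env_rgb = np.zeros((h, w, 3), dtype=np.float32)            # rendered environment image
env_depth = np.full((h, w), 10.0, dtype=np.float32)        # environment depth map

obj_rgb = np.zeros((h, w, 3), dtype=np.float32)            # target object rendered from its
obj_rgb[40:80, 60:100] = (1.0, 0.5, 0.2)                   #   neural radiance field
obj_depth = np.full((h, w), np.inf, dtype=np.float32)
obj_depth[40:80, 60:100] = 4.0                             # object sits in front of the scene
obj_mask = np.isfinite(obj_depth)

def fuse(env_rgb, env_depth, obj_rgb, obj_depth, obj_mask):
    """Keep the object pixel wherever it is present and closer than the scene."""
    visible = obj_mask & (obj_depth < env_depth)
    out = env_rgb.copy()
    out[visible] = obj_rgb[visible]
    return out

composite = fuse(env_rgb, env_depth, obj_rgb, obj_depth, obj_mask)
print(composite.shape, composite[60, 80])                  # object colour shows at (60, 80)
```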
-
15.
Publication No.: US20230289402A1
Publication Date: 2023-09-14
Application No.: US18055393
Filing Date: 2022-11-14
Inventor: Jian WANG , Xiangbo SU , Qiman WU , Zhigang WANG , Hao SUN , Errui DING , Jingdong WANG , Tian WU , Haifeng WANG
IPC: G06K9/62
CPC classification number: G06K9/62 , G06K9/6288
Abstract: Provided are a joint perception model training method, a joint perception method, a device, and a storage medium. The joint perception model training method includes: acquiring sample images and perception tags of the sample images; acquiring a preset joint perception model, where the joint perception model includes a feature extraction network and a joint perception network; performing feature extraction on the sample images through the feature extraction network to obtain target sample features; performing joint perception through the joint perception network according to the target sample features to obtain perception prediction results; and training the preset joint perception model according to the perception prediction results and the perception tags, where the joint perception includes executing at least two perception tasks.
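A minimal PyTorch sketch of the model shape described here: a shared feature extraction network feeding two task heads whose losses are summed. The backbone, the two perception tasks (classification and attribute regression), and the unit loss weights are assumptions for illustration, not the patented architecture.

```python
# Minimal sketch: shared feature extraction network plus a joint perception network
# with two task heads, trained on a summed multi-task loss.
import torch
import torch.nn as nn

class JointPerceptionModel(nn.Module):
    def __init__(self, num_classes=10, num_attrs=5):
        super().__init__()
        self.feature_extractor = nn.Sequential(        # shared feature extraction network
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten())
        self.class_head = nn.Linear(32, num_classes)   # perception task 1
        self.attr_head = nn.Linear(32, num_attrs)      # perception task 2

    def forward(self, x):
        feat = self.feature_extractor(x)               # target sample features
        return self.class_head(feat), self.attr_head(feat)

model = JointPerceptionModel()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

images = torch.rand(8, 3, 64, 64)                      # sample images
class_tags = torch.randint(0, 10, (8,))                # perception tags, task 1
attr_tags = torch.rand(8, 5)                           # perception tags, task 2

class_logits, attr_pred = model(images)
loss = nn.functional.cross_entropy(class_logits, class_tags) \
     + nn.functional.mse_loss(attr_pred, attr_tags)    # joint (multi-task) training loss
loss.backward()
optimizer.step()
print(float(loss))
```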
-
16.
Publication No.: US20230124389A1
Publication Date: 2023-04-20
Application No.: US17887690
Filing Date: 2022-08-15
Inventor: Longchao WANG , Yipeng SUN , Kun YAO , Junyu HAN , Jingtuo LIU , Errui DING
IPC: G06V10/70 , G06V10/774
Abstract: A model determination method and an electronic device are provided, relating to the technical field of artificial intelligence and, in particular, to the fields of computer vision and deep learning, and applicable to image processing, image identification and other scenarios. A specific implementation solution includes: an image sample and a text sample are acquired, where text data in the text sample is used for providing a text description of target image data in the image sample; at least one image feature in the image sample is stored in a first queue, and at least one text feature in the text sample is stored in a second queue; the first queue and the second queue are trained to obtain a first target model; and the first target model is determined as an initialization model for a second target model.
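The sketch below illustrates how two feature queues of the kind described here can serve a contrastive pre-training objective: image features fill a first queue, text features a second, and queued features act as extra negatives. The encoders, queue size, and InfoNCE-style loss are assumptions for illustration, not the patented training procedure.

```python
# Minimal sketch: store image and text features in two queues and use the queued
# features as extra negatives for a contrastive objective.
import torch
import torch.nn.functional as F

queue_size, dim = 1024, 128
image_queue = F.normalize(torch.randn(queue_size, dim), dim=1)   # first queue
text_queue = F.normalize(torch.randn(queue_size, dim), dim=1)    # second queue

def contrastive_loss(img_feat, txt_feat, txt_queue, temperature=0.07):
    """Image-to-text InfoNCE: the paired text is positive, queued texts are negatives."""
    img_feat = F.normalize(img_feat, dim=1)
    txt_feat = F.normalize(txt_feat, dim=1)
    pos = (img_feat * txt_feat).sum(dim=1, keepdim=True)          # B x 1 positive scores
    neg = img_feat @ txt_queue.t()                                # B x Q negative scores
    logits = torch.cat([pos, neg], dim=1) / temperature
    targets = torch.zeros(img_feat.size(0), dtype=torch.long)     # positive sits at index 0
    return F.cross_entropy(logits, targets)

# One training step with stand-in encoder outputs for a batch of 32 image-text pairs.
img_feat = torch.randn(32, dim, requires_grad=True)
txt_feat = torch.randn(32, dim, requires_grad=True)
loss = contrastive_loss(img_feat, txt_feat, text_queue)
loss.backward()

# Enqueue the newest features and drop the oldest, keeping the queue size fixed; the
# resulting first target model then initialises the second target model.
text_queue = torch.cat([F.normalize(txt_feat.detach(), dim=1), text_queue])[:queue_size]
image_queue = torch.cat([F.normalize(img_feat.detach(), dim=1), image_queue])[:queue_size]
print(float(loss))
```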
-
17.
Publication No.: US20230386168A1
Publication Date: 2023-11-30
Application No.: US18192393
Filing Date: 2023-03-29
Inventor: Yipeng SUN , Mengjun CHENG , Longchao WANG , Xiongwei ZHU , Kun YAO , Junyu HAN , Jingtuo LIU , Errui DING , Jingdong WANG , Haifeng Wang
IPC: G06V10/42 , G06F16/583 , H04N19/176
CPC classification number: G06V10/42 , G06F16/5846 , H04N19/176
Abstract: A pre-training method for a Vision and Scene Text Aggregation model includes: acquiring a sample image-text pair; extracting a sample scene text from a sample image; inputting a sample text into a text encoding network to obtain a sample text feature; inputting the sample image and an initial sample aggregation feature into a visual encoding subnetwork and inputting the initial sample aggregation feature and the sample scene text into a scene encoding subnetwork to obtain a global image feature of the sample image and a learned sample aggregation feature; and pre-training the Vision and Scene Text Aggregation model according to the sample text feature, the global image feature of the sample image, and the learned sample aggregation feature.
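The PyTorch sketch below mirrors the forward flow in this abstract: a learnable aggregation feature is fed, together with the image, into a visual encoding subnetwork and, together with the scene text, into a scene encoding subnetwork, and the resulting features are aligned with the text feature. All layer sizes, the stand-in embeddings, and the cosine alignment objective are assumptions made for the sketch.

```python
# Minimal sketch: a shared aggregation token flows through a visual subnetwork (with the
# image) and a scene subnetwork (with the scene text); outputs are aligned with the caption.
import torch
import torch.nn as nn
import torch.nn.functional as F

dim = 256
visual_encoder = nn.TransformerEncoder(nn.TransformerEncoderLayer(dim, nhead=4, batch_first=True), num_layers=2)
scene_encoder = nn.TransformerEncoder(nn.TransformerEncoderLayer(dim, nhead=4, batch_first=True), num_layers=2)
text_encoder = nn.TransformerEncoder(nn.TransformerEncoderLayer(dim, nhead=4, batch_first=True), num_layers=2)

agg_token = nn.Parameter(torch.randn(1, 1, dim))          # initial sample aggregation feature

image_patches = torch.randn(1, 49, dim)                   # embedded sample image patches
scene_text_tokens = torch.randn(1, 8, dim)                # embedded sample scene text (extracted from the image)
caption_tokens = torch.randn(1, 12, dim)                  # embedded sample text (caption)

vis_out = visual_encoder(torch.cat([agg_token, image_patches], dim=1))
global_image_feature = vis_out[:, 1:].mean(dim=1)         # global image feature
scene_out = scene_encoder(torch.cat([agg_token, scene_text_tokens], dim=1))
learned_agg_feature = scene_out[:, 0]                     # learned sample aggregation feature
text_feature = text_encoder(caption_tokens).mean(dim=1)   # sample text feature

# Pre-training objective (assumed): pull the image-side features toward the text feature.
fused = global_image_feature + learned_agg_feature
loss = 1 - F.cosine_similarity(fused, text_feature).mean()
loss.backward()
print(float(loss))
```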
-
18.
Publication No.: US20230215203A1
Publication Date: 2023-07-06
Application No.: US18168759
Filing Date: 2023-02-14
Inventor: Pengyuan LV , Chengquan ZHANG , Shanshan LIU , Meina QIAO , Yangliu XU , Liang WU , Xiaoyan WANG , Kun YAO , Junyu Han , Errui DING , Jingdong WANG , Tian WU , Haifeng WANG
IPC: G06V30/19
CPC classification number: G06V30/19147 , G06V30/19167
Abstract: The present disclosure provides a character recognition model training method and apparatus, a character recognition method and apparatus, a device and a medium, relating to the technical field of artificial intelligence, and specifically to the technical fields of deep learning, image processing and computer vision, which can be applied to scenarios such as character detection and recognition technology. The specific implementing solution is: partitioning an untagged training sample into at least two sub-sample images; dividing the at least two sub-sample images into a first training set and a second training set; where the first training set includes a first sub-sample image with a visible attribute, and the second training set includes a second sub-sample image with an invisible attribute; performing self-supervised training on a to-be-trained encoder by taking the second training set as a tag of the first training set, to obtain a target encoder.
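The self-supervised scheme here resembles masked image modelling: the untagged sample is partitioned into patch sub-images, the visible subset is encoded, and the invisible subset acts as the training tag. The sketch below shows that idea under assumed patch sizes and toy networks; it is not the patented encoder.

```python
# Minimal sketch: partition an untagged image into patch sub-images, encode the visible
# half, and use the invisible half as the reconstruction target (the "tag").
import torch
import torch.nn as nn

patch, img = 16, torch.rand(1, 3, 64, 64)                       # untagged training sample
patches = img.unfold(2, patch, patch).unfold(3, patch, patch)   # 1 x 3 x 4 x 4 x 16 x 16
patches = patches.reshape(1, 3, 16, patch * patch).permute(0, 2, 1, 3).reshape(1, 16, -1)

perm = torch.randperm(16)
visible_idx, invisible_idx = perm[:8], perm[8:]                 # first / second training set
first_set = patches[:, visible_idx]                             # visible sub-sample images
second_set = patches[:, invisible_idx]                          # invisible sub-samples = the tag

encoder = nn.Sequential(nn.Linear(3 * patch * patch, 256), nn.ReLU(), nn.Linear(256, 256))
decoder = nn.Linear(256, 3 * patch * patch)

latent = encoder(first_set).mean(dim=1, keepdim=True)           # summarise the visible set
pred = decoder(latent).expand_as(second_set)                    # predict the hidden content
loss = nn.functional.mse_loss(pred, second_set)                 # second set supervises the first
loss.backward()
print(float(loss))
```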
-
19.
Publication No.: US20230215136A1
Publication Date: 2023-07-06
Application No.: US18113826
Filing Date: 2023-02-24
Inventor: Haoran WANG , Dongliang HE , Fu LI , Errui DING
CPC classification number: G06V10/761 , G06V10/7715
Abstract: The present disclosure provides a method and apparatus for training a multi-modal data matching degree calculation model, a method and apparatus for calculating a multi-modal data matching degree, an electronic device, a computer readable storage medium and a computer program product, and relates to the field of artificial intelligence technology such as deep learning, image processing and computer vision. The method comprises: acquiring first sample data and second sample data that are different in modalities; constructing a contrastive learning loss function comprising a semantic perplexity parameter, the semantic perplexity parameter being determined based on a semantic feature distance between the first sample data and the second sample data; and training, by using the contrastive learning loss function, an initial multi-modal data matching degree calculation model through a contrastive learning approach, to obtain a target multi-modal data matching degree calculation model.
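The sketch below shows one way a contrastive loss can carry a semantic perplexity term derived from the feature distance between the two modalities, with semantically close (confusable) pairs weighted more heavily. How the perplexity enters the loss is an assumption for illustration; only the general idea follows the abstract.

```python
# Minimal sketch: InfoNCE over a batch of image-text pairs, re-weighted by a
# "semantic perplexity" term derived from the cross-modal feature distance.
import torch
import torch.nn.functional as F

def perplexity_weighted_contrastive_loss(img_feat, txt_feat, temperature=0.07):
    img_feat = F.normalize(img_feat, dim=1)
    txt_feat = F.normalize(txt_feat, dim=1)

    # Semantic perplexity (assumed form): derived from the feature distance between
    # the two modalities; close, confusable pairs get a larger weight.
    distance = torch.cdist(img_feat, txt_feat)            # B x B feature distances
    perplexity = torch.softmax(-distance, dim=1)          # high where features are close

    logits = (img_feat @ txt_feat.t()) / temperature
    targets = torch.arange(img_feat.size(0))
    per_sample = F.cross_entropy(logits, targets, reduction="none")
    weights = perplexity.diagonal().detach()              # perplexity of each positive pair
    return (weights * per_sample).mean()

# One step on stand-in encoder outputs for a batch of 16 image-text pairs.
img_feat = torch.randn(16, 128, requires_grad=True)       # first sample data (image modality)
txt_feat = torch.randn(16, 128, requires_grad=True)       # second sample data (text modality)
loss = perplexity_weighted_contrastive_loss(img_feat, txt_feat)
loss.backward()
print(float(loss))
```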
-
20.
Publication No.: US20220415071A1
Publication Date: 2022-12-29
Application No.: US17899712
Filing Date: 2022-08-31
Inventor: Chengquan ZHANG , Pengyuan LV , Shanshan LIU , Meina QIAO , Yangliu XU , Liang WU , Jingtuo LIU , Junyu HAN , Errui DING , Jingdong WANG
IPC: G06V30/19 , G06V30/18 , G06T9/00 , G06V30/262 , G06N20/00
Abstract: The present disclosure provides a training method of a text recognition model, a text recognition method, and an apparatus, relating to the technical field of artificial intelligence, and specifically to the technical fields of deep learning and computer vision, which can be applied in scenarios such as optical character recognition. The specific implementation solution is: performing mask prediction on visual features of an acquired sample image, to obtain a predicted visual feature; performing mask prediction on semantic features of acquired sample text, to obtain a predicted semantic feature, where the sample image includes text; determining a first loss value of the text of the sample image according to the predicted visual feature; determining a second loss value of the sample text according to the predicted semantic feature; and training, according to the first loss value and the second loss value, to obtain the text recognition model.
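As a rough illustration, the sketch below masks part of a visual feature sequence and part of a semantic feature sequence, predicts the hidden entries with small networks, and sums the two losses, mirroring the first and second loss values described here. The toy predictors, the stride mask, and the MSE objectives are assumptions.

```python
# Minimal sketch: mask prediction on visual features of an image and on semantic
# features of its text, with the two losses combined for training.
import torch
import torch.nn as nn

seq_len, dim = 20, 128
visual_feats = torch.randn(1, seq_len, dim)          # visual features of the sample image
semantic_feats = torch.randn(1, seq_len, dim)        # semantic features of the sample text

visual_predictor = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))
semantic_predictor = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))

def masked_prediction_loss(feats, predictor):
    """Hide every fourth position, predict it, and score only the hidden positions."""
    mask = torch.zeros(feats.shape[:2], dtype=torch.bool)
    mask[:, ::4] = True                               # positions to hide (assumed stride mask)
    masked_input = feats.masked_fill(mask.unsqueeze(-1), 0.0)
    pred = predictor(masked_input)
    return nn.functional.mse_loss(pred[mask], feats[mask])

first_loss = masked_prediction_loss(visual_feats, visual_predictor)      # image branch
second_loss = masked_prediction_loss(semantic_feats, semantic_predictor) # text branch
(first_loss + second_loss).backward()                 # joint objective for the recognition model
print(float(first_loss), float(second_loss))
```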