-
Publication No.: US20210019531A1
Publication date: 2021-01-21
Application No.: US16830895
Application date: 2020-03-26
Inventor: Xiang Long, Dongliang He, Fu Li, Zhizhen Chi, Zhichao Zhou, Xiang Zhao, Ping Wang, Hao Sun, Shilei Wen, Errui Ding
Abstract: A method and an apparatus for classifying a video are provided. The method may include: acquiring a to-be-classified video; extracting a set of multimodal features of the to-be-classified video; inputting the set of multimodal features into a post-fusion model corresponding to each modality respectively, to obtain multimodal category information of the to-be-classified video; and fusing the multimodal category information of the to-be-classified video, to obtain category information of the to-be-classified video. This embodiment improves the accuracy of video classification.
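The final fusing step might be sketched as follows. This is a minimal illustration, not the patented method: it assumes each per-modality model has already produced a list of class scores, and it assumes a weighted average as the fusion rule. The function names `fuse_category_scores` and `predict_category`, and the modality names in the usage note, are hypothetical.

```python
from typing import Dict, List, Optional


def fuse_category_scores(
    per_modality: Dict[str, List[float]],
    weights: Optional[Dict[str, float]] = None,
) -> List[float]:
    """Combine per-modality class scores with a weighted average (assumed rule)."""
    if not per_modality:
        raise ValueError("at least one modality is required")
    names = list(per_modality)
    num_classes = len(per_modality[names[0]])
    if any(len(per_modality[m]) != num_classes for m in names):
        raise ValueError("all modalities must score the same classes")
    if weights is None:
        # Equal weighting is an assumption; the abstract does not specify one.
        weights = {m: 1.0 for m in names}
    total = sum(weights[m] for m in names)
    return [
        sum(weights[m] * per_modality[m][c] for m in names) / total
        for c in range(num_classes)
    ]


def predict_category(fused: List[float]) -> int:
    """Return the index of the highest fused score."""
    return max(range(len(fused)), key=lambda c: fused[c])
```

For example, `fuse_category_scores({"rgb": [0.2, 0.8], "audio": [0.6, 0.4]})` averages the two score lists class by class, and `predict_category` then picks the class with the largest fused score.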
-
12.
Publication No.: US10861133B1
Publication date: 2020-12-08
Application No.: US16810986
Application date: 2020-03-06
Inventor: Chao Li, Dongliang He, Xiao Liu, Yukang Ding, Shilei Wen, Errui Ding, Henan Zhang, Hao Sun
IPC: G06T3/40
Abstract: A super-resolution video reconstruction method, device, apparatus and a computer-readable storage medium are provided. The method includes: extracting a hypergraph from consecutive frames of an original video; inputting a hypergraph vector of the hypergraph into a residual convolutional neural network to obtain an output result of the residual convolutional neural network; and inputting the output result of the residual convolutional neural network into a spatial upsampling network to obtain a super-resolution frame, wherein a super-resolution video of the original video is formed by multiple super-resolution frames.
-
Publication No.: US11776155B2
Publication date: 2023-10-03
Application No.: US16894123
Application date: 2020-06-05
Inventor: Xiaoqing Ye, Xiao Tan, Wei Zhang, Hao Sun, Errui Ding
IPC: G06T7/73, G06N3/04, G06N3/08, G06T7/70, G06T11/20, G06V10/764, G06V10/82, G06V20/58, G06V20/64, G06V10/25, G06F18/24, G06F18/214
CPC classification number: G06T7/73, G06F18/214, G06F18/24, G06N3/04, G06N3/08, G06T7/70, G06T11/20, G06V10/764, G06V10/82, G06V20/58, G06V20/647, G06T2207/20081, G06T2207/20084, G06T2210/12
Abstract: Embodiments of the present disclosure provide a method and apparatus for detecting a target object in an image. The method includes performing the following prediction operations using a pre-trained neural network: detecting a target object in a two-dimensional image to determine a two-dimensional bounding box of the target object; and determining a relative position constraint relationship between the two-dimensional bounding box of the target object and a three-dimensional projection bounding box obtained by projecting a three-dimensional bounding box of the target object into the two-dimensional image. The method further includes: determining the three-dimensional projection bounding box of the target object, based on the two-dimensional bounding box of the target object and the relative position constraint relationship between the two-dimensional bounding box of the target object and the three-dimensional projection bounding box.
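The last step, recovering the projection bounding box from the 2D box and the relative position constraint, can be illustrated with a small sketch. The encoding used here is an assumption, since the abstract does not say how the constraint is represented: each projected vertex is stored as an offset from the 2D box's top-left corner, normalized by the box width and height. The name `decode_projection_box` is hypothetical.

```python
from typing import List, Tuple

Box2D = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)
Point = Tuple[float, float]


def decode_projection_box(box2d: Box2D, rel_offsets: List[Point]) -> List[Point]:
    """Map normalized vertex offsets back to image coordinates.

    rel_offsets holds one (dx, dy) pair per projected vertex, expressed as a
    fraction of the 2D box width and height (an assumed encoding).
    """
    x_min, y_min, x_max, y_max = box2d
    width, height = x_max - x_min, y_max - y_min
    if width <= 0 or height <= 0:
        raise ValueError("box2d must have positive width and height")
    return [(x_min + dx * width, y_min + dy * height) for dx, dy in rel_offsets]
```

With this encoding, an offset of `(0.0, 0.0)` decodes to the top-left corner of the 2D box and `(1.0, 1.0)` to its bottom-right corner; a full projection box would use eight such offsets, one per vertex.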
-
14.
Publication No.: US11538286B2
Publication date: 2022-12-27
Application No.: US16710464
Application date: 2019-12-11
Inventor: Wei Zhang, Xiao Tan, Hao Sun, Shilei Wen, Errui Ding
Abstract: A method and apparatus for vehicle damage assessment, an electronic device, and a computer-readable storage medium are provided. The method may include: extracting, from an input image, a first feature characterizing a part of a vehicle and a second feature characterizing a damage type of the vehicle; integrating the first feature and the second feature to generate a third feature characterizing a corresponding relation between the part and the damage type; converting the third feature into a characteristic vector; and determining a damage recognition result based on the characteristic vector. According to the technical solution of the disclosure, users can rapidly and accurately learn about the damage condition of the vehicle by providing pictures or videos of the damaged vehicle, thus providing an objective basis for subsequent damage assessment, claim settlement, and repair.
-
Publication No.: US11482023B2
Publication date: 2022-10-25
Application No.: US16710528
Application date: 2019-12-11
Inventor: Chengquan Zhang, Zuming Huang, Mengyi En, Junyu Han, Errui Ding
IPC: G06V30/262, G06N20/00, G06V10/22, G06V30/148, G06V30/10
Abstract: A method and apparatus for detecting text regions in an image, a device, and a medium are provided. The method may include: detecting, based on feature representation of an image, a first text region in the image, where the first text region covers a text in the image, a region occupied by the text being of a certain shape; determining, based on a feature block of the first text region, text geometry information associated with the text, where the text geometry information includes a text centerline of the text and distance information of the centerline from the upper and lower borders of the text; and adjusting, based on the text geometry information associated with the text, the first text region to a second text region, where the second text region also covers the text and is smaller than the first text region.
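The adjustment from the first text region to the tighter second region might look like the sketch below. It is only an illustration under simplifying assumptions: the centerline is given as sampled points ordered left to right, the border distances are measured vertically rather than along the centerline normal, and image y coordinates grow downward. The name `refine_text_region` is hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def refine_text_region(
    centerline: List[Point],
    upper_dist: List[float],
    lower_dist: List[float],
) -> List[Point]:
    """Build a tight polygon from centerline points and border distances.

    Returns the upper border left to right followed by the lower border right
    to left, so the vertices trace the polygon outline in order.
    """
    if not (len(centerline) == len(upper_dist) == len(lower_dist)):
        raise ValueError("centerline and distance lists must have equal length")
    if len(centerline) < 2:
        raise ValueError("at least two centerline points are required")
    # Image coordinates: y grows downward, so the upper border is y - distance.
    upper = [(x, y - u) for (x, y), u in zip(centerline, upper_dist)]
    lower = [(x, y + d) for (x, y), d in zip(centerline, lower_dist)]
    return upper + lower[::-1]
```

Because the polygon follows the centerline, it can hug curved or slanted text more closely than an axis-aligned rectangle around the same text.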
-
Publication No.: US11379696B2
Publication date: 2022-07-05
Application No.: US16817419
Application date: 2020-03-12
Inventor: Zhigang Wang, Jian Wang, Shilei Wen, Errui Ding, Hao Sun
Abstract: The present disclosure provides a pedestrian re-identification method and apparatus, a computer device, and a readable medium. The method comprises: collecting a target image and a to-be-identified image including a pedestrian image; obtaining a feature expression of the target image and a feature expression of the to-be-identified image, respectively, based on a pre-trained feature extraction model, wherein the feature extraction model is trained based on a self-attention feature of a base image as well as a co-attention feature of the base image relative to a reference image; and identifying whether a pedestrian in the to-be-identified image is the same pedestrian as that in the target image according to the feature expression of the target image and the feature expression of the to-be-identified image. According to the pedestrian re-identification method of the present disclosure, the accuracy of pedestrian re-identification can be effectively improved when the feature extraction model is used to perform the re-identification.
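The final identification step, comparing the two feature expressions, could be sketched as below. The abstract does not state the comparison rule, so cosine similarity with a fixed threshold is an assumption here, as are the names `cosine_similarity` and `is_same_pedestrian` and the default threshold of 0.8. The feature vectors are assumed to come from the feature extraction model, which is not shown.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two feature vectors (0.0 if either is zero)."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)


def is_same_pedestrian(
    target_feat: Sequence[float],
    query_feat: Sequence[float],
    threshold: float = 0.8,  # hypothetical value; tune on validation data
) -> bool:
    """Decide identity by thresholding feature similarity (assumed rule)."""
    return cosine_similarity(target_feat, query_feat) >= threshold
```

In a gallery search setting, the same similarity could instead be used to rank candidate images rather than to make a yes/no decision per pair.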
-
17.
Publication No.: US20210201182A1
Publication date: 2021-07-01
Application No.: US17200448
Application date: 2021-03-12
Inventor: Yulin LI, Xiameng Qin, Chengquan Zhang, Junyu Han, Errui Ding, Tian Wu, Haifeng Wang
IPC: G06N5/04, G06N3/04, G06F16/901
Abstract: Embodiments of the present disclosure provide a method and apparatus for performing a structured extraction on a text, a device and a storage medium. The method may include: performing a text detection on an entity text image to obtain a position and content of a text line of the entity text image; extracting multivariate information of the text line based on the position and the content of the text line; performing a feature fusion on the multivariate information of the text line to obtain a multimodal fusion feature of the text line; performing category and relationship reasoning based on the multimodal fusion feature of the text line to obtain a category and a relationship probability matrix of the text line; and constructing structured information of the entity text image based on the category and the relationship probability matrix of the text line.
-
Publication No.: US11908219B2
Publication date: 2024-02-20
Application No.: US17244291
Application date: 2021-04-29
Inventor: Zihan Ni, Yipeng Sun, Kun Yao, Junyu Han, Errui Ding, Jingtuo Liu, Haifeng Wang
IPC: G06V30/413, G06F40/30, G06V30/414, G06V10/70
CPC classification number: G06V30/413, G06F40/30, G06V10/70, G06V30/414
Abstract: The disclosure provides a method and a device for processing information, an electronic device, and a storage medium, belonging to a field of artificial intelligence including computer vision, deep learning, and natural language processing. In the method, a computing device recognizes multiple text items in an image. The computing device classifies the multiple text items into a first set of name text items and a second set of content text items based on semantics of the text items. The computing device performs a matching operation between the first set and the second set based on a layout of the text items in the image, and determines matched name-content text items. The matched name-content text items include a name text item in the first set and a content text item in the second set that matches the name text item. The computing device outputs the matched name-content text items.
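The layout-based matching step could be sketched as follows. This is an assumed heuristic, not the rule claimed in the patent: each name item is greedily paired with the nearest unused content item that lies to its right and/or below it, using item center points. The semantic classification into name and content sets is taken as already done, and the name `match_name_content` is hypothetical.

```python
import math
from typing import Dict, List, Tuple

TextItem = Tuple[str, float, float]  # (text, center_x, center_y)


def match_name_content(
    names: List[TextItem], contents: List[TextItem]
) -> Dict[str, str]:
    """Greedily pair each name item with its nearest eligible content item."""
    matches: Dict[str, str] = {}
    used = set()
    for name, nx, ny in names:
        best, best_dist = None, math.inf
        for i, (_, cx, cy) in enumerate(contents):
            # Skip items already matched, or located left of / above the name.
            if i in used or cx < nx or cy < ny:
                continue
            dist = math.hypot(cx - nx, cy - ny)
            if dist < best_dist:
                best, best_dist = i, dist
        if best is not None:
            used.add(best)
            matches[name] = contents[best][0]
    return matches
```

A greedy pass is order-dependent; a form with dense or irregular layout would likely need a global assignment or additional layout cues, which this sketch leaves out.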
-
19.
Publication No.: US11615140B2
Publication date: 2023-03-28
Application No.: US17144523
Application date: 2021-01-08
Inventor: Xiang Long, Dongliang He, Fu Li, Xiang Zhao, Tianwei Lin, Hao Sun, Shilei Wen, Errui Ding
IPC: G06F16/738, G06V20/40, G06F18/214, G06F18/25
Abstract: A method includes screening, by a video-clip screening module in a video description model, a plurality of video proposal clips acquired from a video to be analyzed, to acquire a plurality of video clips suitable for description; each screened video clip is then described by a video-clip describing module. Only the screened video clips, which have strong correlation with the video and are suitable for description, are described, rather than all the video proposal clips. This removes the interference of descriptions of unsuitable video clips in the description of the video, guarantees the accuracy of the final descriptions of the video clips, and improves the quality of those descriptions.
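The screen-then-describe flow can be sketched in a few lines. Both modules are stand-ins here: `score_clip` and `describe_clip` are hypothetical callables representing the screening and describing modules, and keeping clips whose score meets a fixed threshold is an assumed screening rule.

```python
from typing import Callable, List, Tuple

Clip = Tuple[float, float]  # (start_sec, end_sec)


def screen_and_describe(
    proposals: List[Clip],
    score_clip: Callable[[Clip], float],
    describe_clip: Callable[[Clip], str],
    threshold: float = 0.5,  # hypothetical cutoff
) -> List[Tuple[Clip, str]]:
    """Describe only the proposal clips that pass screening."""
    kept = [clip for clip in proposals if score_clip(clip) >= threshold]
    # describe_clip runs only on the screened clips, never on rejected ones.
    return [(clip, describe_clip(clip)) for clip in kept]
```

The point of the ordering is that the describing module is never called on rejected proposals, so their descriptions cannot pollute the final output.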
-
Publication No.: US20210271870A1
Publication date: 2021-09-02
Application No.: US17244291
Application date: 2021-04-29
Inventor: Zihan Ni, Yipeng Sun, Kun Yao, Junyu Han, Errui Ding, Jingtuo Liu, Haifeng Wang
Abstract: The disclosure provides a method and a device for processing information, an electronic device, and a storage medium, belonging to a field of artificial intelligence including computer vision, deep learning, and natural language processing. In the method, a computing device recognizes multiple text items in an image. The computing device classifies the multiple text items into a first set of name text items and a second set of content text items based on semantics of the text items. The computing device performs a matching operation between the first set and the second set based on a layout of the text items in the image, and determines matched name-content text items. The matched name-content text items include a name text item in the first set and a content text item in the second set that matches the name text item. The computing device outputs the matched name-content text items.
-