-
Publication No.: US20240420458A1
Publication Date: 2024-12-19
Application No.: US18744418
Filing Date: 2024-06-14
Applicant: Lemon Inc. , BEIJING ZITIAO NETWORK TECHNOLOGY CO., LTD. , INSTITUTE OF AUTOMATION CHINESE ACADEMY OF SCIENCES
Inventor: Xiaojie JIN , Sihan CHEN , Jiashi FENG , Xingjian HE , Handong LI , Jing LIU
Abstract: The disclosure provides a cross-modal data processing method and apparatus, a device, a storage medium, and a program product. The method comprises: obtaining first modal data to be processed; obtaining a first modal data feature by performing feature extraction based on the first modal data; and obtaining second modal data based on the first modal data feature and a cross-modal processing model, the first modal data and the second modal data having different modalities, wherein the cross-modal processing model needs to be pre-trained based on a concatenated training sample, and the concatenated training sample comprises a concatenated image sample and a corresponding concatenated text sample.
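The abstract does not say how a concatenated training sample is built. As a minimal sketch, assuming that several image-text pairs are simply joined in order, the idea might look like the following. The `Pair` class, the `concatenate_pairs` function, and the string stand-ins for image data are all hypothetical and are not taken from the patent.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Pair:
    """One image-text training pair (hypothetical container)."""
    frames: List[str]  # stand-ins for image data, e.g. image identifiers
    text: str


def concatenate_pairs(pairs: Sequence[Pair], sep: str = " ") -> Pair:
    """Join several pairs into one concatenated sample, preserving order.

    Assumption: images are concatenated as a sequence and captions are
    joined with a separator, so the i-th image still lines up with the
    i-th caption.
    """
    frames = [frame for pair in pairs for frame in pair.frames]
    text = sep.join(pair.text for pair in pairs)
    return Pair(frames=frames, text=text)


sample = concatenate_pairs([
    Pair(["img_a"], "a dog runs"),
    Pair(["img_b"], "a cat sleeps"),
])
print(sample.frames)  # ['img_a', 'img_b']
print(sample.text)    # a dog runs a cat sleeps
```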
-
52.
Publication No.: US20240320514A1
Publication Date: 2024-09-26
Application No.: US18732399
Filing Date: 2024-06-03
Inventor: Zhenan SUN , Yunlong WANG , Zhengquan LUO , Kunbo ZHANG , Qi LI , Yong HE
IPC: G06N3/098
CPC classification number: G06N3/098
Abstract: Disclosed is a method for updating a node model that resists discrimination propagation in federated learning. The method includes: obtaining a node model corresponding to a data node; calculating a mean value of the distribution of class features and a quantity ratio corresponding to training data of the data node, calculating a distribution weighted aggregation model based on the node model, the mean value of the distribution of class features and the quantity ratio; calculating a regularization term corresponding to the data node based on the node model and the distribution weighted aggregation model; calculating a variance of the distribution of the class features corresponding to the data node, calculating a class balanced complementary term by using a cross-domain feature generator; and updating the node model based on the distribution weighted aggregation model, the regularization term, and the class balanced complementary term.
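The abstract names a distribution weighted aggregation model and a per-node regularization term but gives no formulas. The sketch below is a simplification under stated assumptions: aggregation weights use only the quantity ratio (the role of the class-feature means is not specified in the abstract and is omitted here), and the regularization term is a squared-distance penalty between the node model and the aggregated model. The function names and the coefficient `mu` are hypothetical.

```python
from typing import List


def weighted_aggregate(node_params: List[List[float]],
                       ratios: List[float]) -> List[float]:
    """Average node parameter vectors, weighted by each node's data ratio.

    Assumption: weights depend only on the quantity ratio; the patent's
    weighting also involves class-feature distribution means.
    """
    total = sum(ratios)
    dim = len(node_params[0])
    return [
        sum(r * params[i] for params, r in zip(node_params, ratios)) / total
        for i in range(dim)
    ]


def proximal_term(node: List[float], aggregated: List[float],
                  mu: float = 0.1) -> float:
    """Squared-distance penalty pulling a node model toward the aggregate."""
    return 0.5 * mu * sum((a - b) ** 2 for a, b in zip(node, aggregated))


nodes = [[1.0, 2.0], [3.0, 4.0]]
ratios = [0.25, 0.75]
agg = weighted_aggregate(nodes, ratios)
print(agg)  # [2.5, 3.5]
print(proximal_term(nodes[0], agg))
```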
-
Publication No.: US11978470B2
Publication Date: 2024-05-07
Application No.: US17980473
Filing Date: 2022-11-03
Inventor: Jiaming Xu , Jian Cui , Bo Xu
IPC: G10L21/0272 , G10L17/02 , G10L17/04 , G10L17/06 , G10L21/028 , H04S1/00
CPC classification number: G10L21/028 , G10L17/02 , G10L17/04 , G10L17/06 , H04S1/007
Abstract: Disclosed are a target speaker separation system, an electronic device and a storage medium. The system includes: first, performing joint unified modeling on a plurality of cues based on a masked pre-training strategy, to boost the inference capability of a model for missing cues and enhance the representation accuracy of disturbed cues; and second, constructing a hierarchical cue modulation module. A spatial cue is introduced into a primary cue modulation module for directional enhancement of a speech of a speaker; in an intermediate cue modulation module, the speech of the speaker is enhanced on the basis of temporal coherence of a dynamic cue and an auditory signal component; a steady-state cue is introduced into an advanced cue modulation module for selective filtering; and finally, the supervised learning capability of simulation data and the unsupervised learning effect of real mixed data are sufficiently utilized.
-
Publication No.: US11963771B2
Publication Date: 2024-04-23
Application No.: US17472191
Filing Date: 2021-09-10
Inventor: Jianhua Tao , Cong Cai , Bin Liu , Mingyue Niu
IPC: A61B5/16 , A61B5/00 , G06F18/25 , G06N3/044 , G06N3/045 , G06N3/048 , G06N3/08 , G06T7/00 , G06V10/80 , G06V20/40 , G10L25/30 , G10L25/57 , G10L25/63 , G10L25/66
CPC classification number: A61B5/165 , A61B5/4803 , A61B5/7275 , G06F18/253 , G06N3/08 , G06T7/0012 , G06V20/46 , G06V20/49 , G10L25/30 , G10L25/57 , G10L25/63 , G10L25/66 , G06T2207/10016
Abstract: Disclosed is an automatic depression detection method using audio-video, including: acquiring original data containing two modalities, a long-term audio file and a long-term video file, from an audio-video file; dividing the long-term audio file into several audio segments, and meanwhile dividing the long-term video file into a plurality of video segments; inputting each audio segment/each video segment into an audio feature extraction network/a video feature extraction network to obtain in-depth audio features/in-depth video features; calculating the in-depth audio features and the in-depth video features by using a multi-head attention mechanism so as to obtain attention audio features and attention video features; aggregating the attention audio features and the attention video features into audio-video features; and inputting the audio-video features into a decision network to predict a depression level of an individual in the audio-video file.
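The segmentation step in this abstract can be illustrated with a short sketch. It assumes fixed-length, non-overlapping segments, with a shorter final segment kept as-is; the abstract does not state the segment length or whether segments overlap. The function name `split_into_segments` is hypothetical.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def split_into_segments(samples: Sequence[T], segment_len: int) -> List[List[T]]:
    """Split a long sequence (audio samples or video frames) into segments.

    Assumption: segments are non-overlapping and of equal length, except
    that the last segment may be shorter.
    """
    if segment_len <= 0:
        raise ValueError("segment_len must be positive")
    return [
        list(samples[start:start + segment_len])
        for start in range(0, len(samples), segment_len)
    ]


segments = split_into_segments(list(range(7)), 3)
print(segments)  # [[0, 1, 2], [3, 4, 5], [6]]
```

Each segment would then be passed to the corresponding feature extraction network, which is outside the scope of this sketch.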
-
55.
Publication No.: US11954599B2
Publication Date: 2024-04-09
Application No.: US17347608
Filing Date: 2021-06-15
Inventor: Zhaoxiang Zhang , Tieniu Tan , Chunfeng Song , Wenkai Dong
IPC: G06V40/10 , G06F18/21 , G06F18/214 , G06F18/2415 , G06N3/045 , G06N3/08 , G06N3/084 , G06V20/40
CPC classification number: G06N3/084 , G06F18/2148 , G06F18/2193 , G06F18/2415 , G06N3/045 , G06N3/08 , G06V20/40 , G06V40/103
Abstract: A bi-directional interaction network (BINet)-based person search method, system, and apparatus are provided. The method includes: obtaining, as an input image, a t-th frame image of an input video; and normalizing the input image, and obtaining a search result of a to-be-searched target person by using a pre-trained person search model, where the person search model is constructed based on a residual network, and a new classification layer is added to a classification and regression layer of the residual network to obtain an identity classification probability of the target person. The method improves the accuracy of the person search.
-
Publication No.: US11944447B2
Publication Date: 2024-04-02
Application No.: US16961700
Filing Date: 2016-11-03
Inventor: Tianzi Jiang , Xin Zhang , Nianming Zuo , Juanning Si
CPC classification number: A61B5/378 , A61B5/0075 , A61B5/0261 , A61B5/374 , A61B5/7207 , A61B5/7282 , A61B5/7285
Abstract: A neurovascular coupling analytical method based on an electroencephalogram and functional near-infrared spectroscopy includes: S100: acquiring an electroencephalogram signal and a brain hemodynamic signal; S110: extracting an event-related potential signal from the electroencephalogram signal; S120: extracting a time characteristic from the event-related potential signal; S130: extracting a hemodynamic response function from the brain hemodynamic signal; S140: extracting an amplitude characteristic and time characteristics from the hemodynamic response function; and S150: analyzing influence of the time characteristic of the event-related potential signal on the amplitude characteristic and the time characteristics of the hemodynamic response function to obtain a coupling result. The time characteristic of the event-related potential signal is a delay. The amplitude characteristic of the hemodynamic response function is a peak amplitude, and the time characteristics of the hemodynamic response function comprise a rising delay, a peak time, and a full width at half maximum.
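Step S140 can be illustrated for three of the named characteristics: peak amplitude, peak time, and full width at half maximum. The sketch below assumes a sampled, single-peaked response with a zero baseline and a positive peak, and it locates the half-maximum crossings by linear interpolation. The function name `hrf_features` is hypothetical, and the rising delay is left out because the abstract does not define it.

```python
from typing import Sequence, Tuple


def hrf_features(t: Sequence[float],
                 y: Sequence[float]) -> Tuple[float, float, float]:
    """Return (peak_amplitude, peak_time, fwhm) of a sampled response curve.

    Assumptions: one dominant positive peak and a baseline of zero.
    """
    k = max(range(len(y)), key=lambda i: y[i])  # index of the peak sample
    peak = y[k]
    half = peak / 2.0

    def crossing(i: int, j: int) -> float:
        # Time at which the line through samples i and j reaches `half`.
        return t[i] + (half - y[i]) * (t[j] - t[i]) / (y[j] - y[i])

    # Walk outward from the peak to the first sample below half maximum.
    left = k
    while left > 0 and y[left] >= half:
        left -= 1
    right = k
    while right < len(y) - 1 and y[right] >= half:
        right += 1

    # If the curve never drops below half, fall back to the record edge.
    t_left = crossing(left, left + 1) if y[left] < half else t[left]
    t_right = crossing(right - 1, right) if y[right] < half else t[right]
    return peak, t[k], t_right - t_left


# Triangular test curve: peak 2 at t=2, half maximum reached at t=1 and t=3.
print(hrf_features([0, 1, 2, 3, 4], [0, 1, 2, 1, 0]))  # (2, 2, 2.0)
```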
-
Publication No.: US11926515B2
Publication Date: 2024-03-12
Application No.: US16960301
Filing Date: 2018-12-19
Inventor: Jinguo Liu , Yuanzheng Tian , Zhenxin Li , Hongye Han
CPC classification number: B66F9/125 , B65D19/0097 , B65D2519/00298 , B65D2519/00338 , B65D2519/00815
Abstract: The present invention relates to ground support equipment for aerospace engineering, and particularly relates to an assembly and test operation robot for a space station experimental cabinet. The assembly and test operation robot comprises a mobile lifting platform, a comprehensive monitoring system, a rotating clamping mechanism, a multifunctional adapter and a science experimental cabinet, wherein the mobile lifting platform is used for regulating the horizontal position and the height position of the science experimental cabinet to realize assembly and transportation functions of the experimental cabinet; the rotating clamping mechanism is installed on the mobile lifting platform to realize clamping and rotation functions of the science experimental cabinet; the multifunctional adapter is installed on the rotating clamping mechanism to carry the science experimental cabinet; and the comprehensive monitoring system is used to monitor the assembly state of the science experimental cabinet in real time. The present invention realizes integrated operation functions of transportation, flipping, assembly and parking in the ground assembly and test process of the space station experimental cabinet, so as to achieve the purpose of safe, efficient and accurate assembly and test of the space station experimental cabinet.
-
Publication No.: US11836619B2
Publication Date: 2023-12-05
Application No.: US17039544
Filing Date: 2020-09-30
Inventor: Bailan Feng , Chunfeng Yao , Kaiqi Huang , Zhang Zhang , Xiaotang Chen , Houjing Huang , Dangwei Li
IPC: G06N3/08 , G06N20/00 , G06T3/40 , G06F18/213 , G06V10/774 , G06V20/52 , G06N3/084 , G06N5/046
CPC classification number: G06N3/08 , G06F18/213 , G06N20/00 , G06T3/4007 , G06V10/774 , G06V20/52 , G06N3/084 , G06N5/046
Abstract: An image processing method, a related device, and a computer storage medium are provided. The method includes: obtaining a feature intensity image corresponding to a training image, where an intensity value of a pixel in the feature intensity image is used to indicate importance of the pixel for recognizing the training image, and resolution of the training image is the same as resolution of the feature intensity image; and occluding, based on the feature intensity image, a to-be-occluded region in the training image by using a preset window, to obtain a new image, where the to-be-occluded region includes a to-be-occluded pixel, and the new image is used to update an image recognition model. According to the embodiments of the present application, a prior-art problem that a model has low accuracy and relatively poor generalization performance because of limited training data can be resolved.
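The occlusion step can be sketched as follows. The abstract does not say how the to-be-occluded pixel is chosen from the feature intensity image, so this example assumes it is the pixel with the highest intensity and that the preset window is a square centered on it, clipped at the image border and filled with a constant. The function name `occlude_most_important` and the parameters `window` and `fill` are hypothetical.

```python
from typing import List

Image = List[List[float]]


def occlude_most_important(image: Image, intensity: Image,
                           window: int = 3, fill: float = 0) -> Image:
    """Return a copy of `image` with a window occluded around the pixel
    that has the highest value in `intensity` (same resolution as `image`).
    """
    h, w = len(image), len(image[0])
    # Assumption: the to-be-occluded pixel is the arg-max of the intensity map.
    r, c = max(
        ((i, j) for i in range(h) for j in range(w)),
        key=lambda p: intensity[p[0]][p[1]],
    )
    half = window // 2
    out = [row[:] for row in image]  # leave the training image untouched
    for i in range(max(0, r - half), min(h, r + half + 1)):
        for j in range(max(0, c - half), min(w, c + half + 1)):
            out[i][j] = fill
    return out


img = [[9] * 4 for _ in range(4)]
heat = [[0.0] * 4 for _ in range(4)]
heat[0][0] = 1.0  # most important pixel is the top-left corner
new_img = occlude_most_important(img, heat, window=3)
print(new_img)  # [[0, 0, 9, 9], [0, 0, 9, 9], [9, 9, 9, 9], [9, 9, 9, 9]]
```

The returned image plays the role of the "new image" that is added to the training data; the original image is not modified.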
-
Publication No.: US11835602B1
Publication Date: 2023-12-05
Application No.: US18144256
Filing Date: 2023-05-08
Inventor: Yang Du , Jie Tian , Zhengyao Peng , Lin Yin , Qian Liang
CPC classification number: G01R33/1276 , G06N3/091 , G06T11/003
Abstract: An MPI reconstruction method, device, and system based on a RecNet model include obtaining a one-dimensional (1D) MPI signal on which imaging reconstruction is to be performed, taking the 1D MPI signal as an input signal, and inputting the input signal and a velocity signal of an FFP corresponding to the input signal into a trained magnetic particle reconstruction model RecNet for image reconstruction to obtain a two-dimensional (2D) MPI image, where the magnetic particle reconstruction model RecNet is constructed based on a domain conversion network and an improved UNet network. The MPI reconstruction method, device, and system obtain a high-quality and clear magnetic particle distribution image without obtaining the system matrix.
-
60.
Publication No.: US20230335148A1
Publication Date: 2023-10-19
Application No.: US18026960
Filing Date: 2021-08-24
Inventor: Henghui Lu , Lei Qin , Peng Zhang , Jiaming Xu , Bo Xu
IPC: G10L21/0208 , G06V20/40 , G10L21/055
CPC classification number: G10L21/0208 , G06V20/46 , G10L21/055
Abstract: A speech separation method is provided, and relates to the field of speech. The method includes: obtaining, in a speaking process of a user, audio information including a user speech and video information including a user face; coding the audio information to obtain a mixed acoustic feature; extracting a visual semantic feature of the user from the video information; inputting the mixed acoustic feature and the visual semantic feature into a preset visual speech separation network to obtain an acoustic feature of the user; and decoding the acoustic feature of the user to obtain a speech signal of the user. An electronic device, a chip, and a computer-readable storage medium are provided.
-