-
1.
Publication (Announcement) No.: US12100418B2
Publication (Announcement) Date: 2024-09-24
Application No.: US17472511
Filing Date: 2021-09-10
Inventor: Jianhua Tao , Zheng Lian , Bin Liu , Xuefei Liu
IPC: G10L25/63 , G06F18/25 , G06F40/166 , G06F40/211 , G06F40/216 , G06F40/284 , G06F40/289 , G06F40/30 , G06N20/00 , G06N20/20 , G06V20/40 , G06V40/16 , G10L15/02 , G10L15/26 , G10L25/30
CPC classification number: G10L25/63 , G06F18/253 , G06F40/166 , G06F40/211 , G06F40/216 , G06F40/284 , G06F40/289 , G06F40/30 , G06N20/00 , G06N20/20 , G06V20/41 , G06V40/166 , G06V40/168 , G10L15/02 , G10L15/26 , G10L25/30
Abstract: Disclosed is a dialogue emotion correction method based on a graph neural network, including: extracting acoustic features, text features, and image features from a video file and fusing them into multi-modal features; obtaining an emotion prediction result for each sentence of a dialogue in the video file by using the multi-modal features; fusing the emotion prediction result of each sentence with the interaction information between talkers in the video file to obtain interaction-information-fused emotion features; combining the interaction-information-fused emotion features with the context-dependence relationship in the dialogue to obtain time-series-information-fused emotion features; and correcting, by using the time-series-information-fused emotion features, the previously obtained emotion prediction result of each sentence so as to obtain a more accurate emotion recognition result.
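The pipeline the abstract describes maps naturally onto a small graph-plus-recurrent model. Below is a minimal PyTorch sketch, assuming 128-dimensional fused features, seven emotion classes, and a row-normalized adjacency matrix encoding talker interaction; the class name, layer choices, and all dimensions are illustrative assumptions, not the patented architecture.

```python
import torch
import torch.nn as nn

class EmotionCorrector(nn.Module):
    def __init__(self, feat_dim=128, n_emotions=7):
        super().__init__()
        self.gcn = nn.Linear(feat_dim, feat_dim)                 # one GCN-style layer
        self.gru = nn.GRU(feat_dim, feat_dim, batch_first=True)  # context over time
        self.correct = nn.Linear(feat_dim + n_emotions, n_emotions)

    def forward(self, utter_feats, init_logits, adj):
        # utter_feats: (B, T, D) fused multi-modal features, one row per sentence
        # init_logits: (B, T, E) first-pass emotion predictions to be corrected
        # adj:         (B, T, T) row-normalized talker-interaction graph
        interact = torch.relu(self.gcn(adj @ utter_feats))  # interaction-fused
        temporal, _ = self.gru(interact)                    # time-series-fused
        return self.correct(torch.cat([temporal, init_logits], dim=-1))

# Toy usage: one dialogue, 5 sentences, 128-d features, 7 emotion classes.
x, logits = torch.randn(1, 5, 128), torch.randn(1, 5, 7)
adj = torch.softmax(torch.randn(1, 5, 5), dim=-1)
corrected = EmotionCorrector()(x, logits, adj)  # (1, 5, 7) refined logits
```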
-
2.
Publication (Announcement) No.: US11281945B1
Publication (Announcement) Date: 2022-03-22
Application No.: US17468994
Filing Date: 2021-09-08
Inventor: Jianhua Tao , Licai Sun , Bin Liu , Zheng Lian
Abstract: A multimodal dimensional emotion recognition method includes: acquiring a frame-level audio feature, a frame-level video feature, and a frame-level text feature from an audio, a video, and a corresponding text of a sample to be tested; performing temporal contextual modeling on the frame-level audio feature, the frame-level video feature, and the frame-level text feature respectively by using a temporal convolutional network to obtain a contextual audio feature, a contextual video feature, and a contextual text feature; performing weighted fusion on these three features by using a gated attention mechanism to obtain a multimodal feature; splicing the multimodal feature and these three features together to obtain a spliced feature, and then performing further temporal contextual modeling on the spliced feature by using a temporal convolutional network to obtain a contextual spliced feature; and performing regression prediction on the contextual spliced feature to obtain a final dimensional emotion prediction result.
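As a rough illustration, the following PyTorch sketch uses a single dilated convolution with a residual connection as a stand-in for each temporal convolutional network and a learned softmax gate for the weighted fusion; `TCNBlock`, `GatedFusion`, and all sizes are assumptions, not the patented design.

```python
import torch
import torch.nn as nn

class TCNBlock(nn.Module):
    """A single dilated convolution plus residual; a stand-in for a full TCN."""
    def __init__(self, dim, dilation=1):
        super().__init__()
        self.conv = nn.Conv1d(dim, dim, 3, padding=dilation, dilation=dilation)

    def forward(self, x):                  # x: (B, T, D)
        return torch.relu(self.conv(x.transpose(1, 2))).transpose(1, 2) + x

class GatedFusion(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.gate = nn.Linear(3 * dim, 3)  # one fusion weight per modality

    def forward(self, a, v, t):            # each (B, T, D)
        w = torch.softmax(self.gate(torch.cat([a, v, t], -1)), -1)
        return w[..., 0:1] * a + w[..., 1:2] * v + w[..., 2:3] * t

a, v, t = (torch.randn(2, 50, 64) for _ in range(3))  # frame-level features
ctx = [TCNBlock(64)(x) for x in (a, v, t)]            # contextual features
m = GatedFusion(64)(*ctx)                             # fused multimodal feature
spliced = torch.cat([m] + ctx, dim=-1)                # (2, 50, 256) spliced
pred = nn.Linear(256, 1)(TCNBlock(256)(spliced))      # per-frame regression
```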
-
3.
Publication (Announcement) No.: US11908240B2
Publication (Announcement) Date: 2024-02-20
Application No.: US17471384
Filing Date: 2021-09-10
Inventor: Jianhua Tao , Hao Zhang , Bin Liu , Wenxiang She
IPC: G06V40/16 , G06N3/049 , G06F18/214
CPC classification number: G06V40/176 , G06F18/2148 , G06N3/049 , G06V40/168 , G06V40/172
Abstract: Disclosed is a micro-expression recognition method based on a multi-scale spatiotemporal feature neural network, in which spatial features and temporal features of a micro-expression are obtained from micro-expression video frames and combined to form more robust micro-expression features. At the same time, since a micro-expression occurs in local areas of the face, the active local areas of the face during the occurrence of the micro-expression and the overall area of the face are combined for micro-expression recognition.
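A minimal PyTorch sketch of the idea, assuming two 3-D convolution kernel scales as the "multi-scale" component and an eye region as one active local area; the branch structure, class count, and dimensions are illustrative only.

```python
import torch
import torch.nn as nn

class SpatioTemporalBranch(nn.Module):
    """Two 3-D convolution scales capture spatial and temporal cues jointly."""
    def __init__(self):
        super().__init__()
        self.small = nn.Conv3d(3, 8, kernel_size=3, padding=1)
        self.large = nn.Conv3d(3, 8, kernel_size=5, padding=2)
        self.pool = nn.AdaptiveAvgPool3d(1)

    def forward(self, clip):                        # clip: (B, 3, T, H, W)
        f = torch.cat([self.small(clip), self.large(clip)], dim=1)
        return self.pool(torch.relu(f)).flatten(1)  # (B, 16)

global_branch, local_branch = SpatioTemporalBranch(), SpatioTemporalBranch()
face = torch.randn(4, 3, 16, 64, 64)   # whole-face clips, 16 frames
eyes = torch.randn(4, 3, 16, 24, 48)   # an assumed active local region
feats = torch.cat([global_branch(face), local_branch(eyes)], dim=-1)
logits = nn.Linear(32, 5)(feats)       # 5 micro-expression classes (assumed)
```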
-
4.
Publication (Announcement) No.: US11238289B1
Publication (Announcement) Date: 2022-02-01
Application No.: US17389364
Filing Date: 2021-07-30
Inventor: Jianhua Tao , Zheng Lian , Bin Liu , Licai Sun
IPC: G06K9/00 , G06N3/04 , G06F16/783 , A61B5/16
Abstract: An automatic lie detection method and apparatus for interactive scenarios, a device, and a medium to improve the accuracy of automatic lie detection are provided. The method includes: segmenting three modalities, namely a video, an audio, and a text, of a to-be-detected sample; extracting short-term features of the three modalities; integrating the short-term features of the three modalities in the to-be-detected sample to obtain long-term features of the three modalities corresponding to each dialogue; integrating the long-term features of the three modalities by a self-attention mechanism to obtain a multi-modal feature of each dialogue; integrating the multi-modal feature of each dialogue with interactive information by a graph neural network to obtain a multi-modal feature integrated with the interactive information; and predicting a lie level of each dialogue according to the multi-modal feature integrated with the interactive information.
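The following PyTorch sketch illustrates the last two integration steps, assuming 64-dimensional long-term features, self-attention across the three modality vectors of each dialogue turn, and a who-responds-to-whom adjacency with one linear graph layer; none of these choices is claimed to be the patented network.

```python
import torch
import torch.nn as nn

class InteractiveLieDetector(nn.Module):
    def __init__(self, dim=64, levels=3):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)
        self.graph = nn.Linear(dim, dim)    # one message-passing graph layer
        self.head = nn.Linear(dim, levels)  # lie level per dialogue turn

    def forward(self, a, v, t, adj):
        # a, v, t: (N, D) long-term features for N dialogue turns
        modal = torch.stack([a, v, t], dim=1)      # (N, 3, D)
        fused, _ = self.attn(modal, modal, modal)  # self-attention fusion
        x = fused.mean(dim=1)                      # (N, D) multi-modal feature
        x = torch.relu(self.graph(adj @ x)) + x    # integrate interaction info
        return self.head(x)

a, v, t = (torch.randn(6, 64) for _ in range(3))   # 6 dialogue turns
adj = torch.eye(6) + torch.diag(torch.ones(5), 1)  # who-responds-to-whom links
pred = InteractiveLieDetector()(a, v, t, adj / adj.sum(1, keepdim=True))
```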
-
5.
Publication (Announcement) No.: US11194972B1
Publication (Announcement) Date: 2021-12-07
Application No.: US17464421
Filing Date: 2021-09-01
Inventor: Jianhua Tao , Ke Xu , Bin Liu , Yongwei Li
IPC: G06F17/00 , G06F40/30 , G06F40/284 , G06N3/04
Abstract: Disclosed is a semantic sentiment analysis method fusing in-depth features and time sequence models, including: converting a text into a uniformly formatted matrix of word vectors; extracting local semantic emotional text features and contextual semantic emotional text features from the matrix of word vectors; weighting the local semantic emotional text features and the contextual semantic emotional text features by using an attention mechanism to generate fused semantic emotional text features; connecting the local semantic emotional text features, the contextual semantic emotional text features and the fused semantic emotional text features to generate global semantic emotional text features; and performing final text emotional semantic analysis and recognition by using a softmax classifier and taking the global semantic emotional text features as input.
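A minimal PyTorch sketch of the feature flow, assuming a 1-D convolution extracts the local features and a BiLSTM serves as the time-sequence model for the contextual features; `SentimentNet` and its dimensions are illustrative assumptions.

```python
import torch
import torch.nn as nn

class SentimentNet(nn.Module):
    def __init__(self, emb=100, dim=64, classes=3):
        super().__init__()
        self.conv = nn.Conv1d(emb, dim, kernel_size=3, padding=1)  # local features
        self.lstm = nn.LSTM(emb, dim // 2, batch_first=True,
                            bidirectional=True)                    # contextual features
        self.attn = nn.Linear(2 * dim, 2)      # attention weights over the two views
        self.clf = nn.Linear(3 * dim, classes)

    def forward(self, words):                  # words: (B, T, emb) word vectors
        local = torch.relu(self.conv(words.transpose(1, 2))).transpose(1, 2)
        ctx, _ = self.lstm(words)              # (B, T, dim) each
        w = torch.softmax(self.attn(torch.cat([local, ctx], -1)), -1)
        fused = w[..., 0:1] * local + w[..., 1:2] * ctx        # attention fusion
        glob = torch.cat([local, ctx, fused], -1).mean(dim=1)  # global features
        return torch.softmax(self.clf(glob), dim=-1)           # softmax classifier

probs = SentimentNet()(torch.randn(2, 20, 100))  # 2 texts, 20 word vectors each
```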
-
6.
Publication (Announcement) No.: US11266338B1
Publication (Announcement) Date: 2022-03-08
Application No.: US17389381
Filing Date: 2021-07-30
Inventor: Jianhua Tao , Mingyue Niu , Bin Liu , Qifei Li
Abstract: An automatic depression detection method includes the following steps: inputting audio and video files that contain original data in both audio and video modalities; conducting segmentation and feature extraction on the audio and video files to obtain a plurality of audio segment-level features and video segment-level features; combining the segment-level features into an audio-level feature and a video-level feature respectively by utilizing a feature evolution pooling objective function; conducting attentional computation on the segment-level features to obtain a video-attention audio feature and an audio-attention video feature; splicing the audio-level feature, the video-level feature, the video-attention audio feature, and the audio-attention video feature to form a multimodal spatio-temporal representation; and inputting the multimodal spatio-temporal representation into support vector regression to predict the depression level of individuals in the input audio and video files.
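A numpy/scikit-learn sketch of the representation building. The feature evolution pooling objective is approximated here by the leading singular direction of each segment sequence, and the attentional computation by scaled dot-product weights; both are assumptions for illustration, as are all dimensions and the toy data.

```python
import numpy as np
from sklearn.svm import SVR

def evolution_pool(segs):
    """Collapse (n_segments, dim) into one (dim,) vector via the leading
    singular direction, a stand-in for feature evolution pooling."""
    _, _, vt = np.linalg.svd(segs - segs.mean(0), full_matrices=False)
    v = vt[0]
    return v if v @ segs.mean(0) >= 0 else -v  # resolve the sign ambiguity

def cross_attention(q_segs, kv_segs):
    """Scaled dot-product attention of one modality's segments over the other's."""
    scores = q_segs @ kv_segs.T / np.sqrt(kv_segs.shape[1])
    scores -= scores.max(axis=1, keepdims=True)  # numerical stability
    w = np.exp(scores) / np.exp(scores).sum(axis=1, keepdims=True)
    return (w @ kv_segs).mean(axis=0)

rng = np.random.default_rng(0)
audio = rng.standard_normal((12, 32))    # 12 audio segment-level features
video = rng.standard_normal((12, 32))    # 12 video segment-level features
rep = np.concatenate([evolution_pool(audio), evolution_pool(video),
                      cross_attention(video, audio),   # video-attention audio
                      cross_attention(audio, video)])  # audio-attention video
X = rng.standard_normal((20, rep.size))  # stand-in training representations
y = rng.random(20) * 24                  # toy depression scores
model = SVR().fit(X, y)                  # support vector regression
print(model.predict(rep[None, :]))
```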
-
7.
Publication (Announcement) No.: US11244119B1
Publication (Announcement) Date: 2022-02-08
Application No.: US17389383
Filing Date: 2021-07-30
Inventor: Jianhua Tao , Licai Sun , Bin Liu , Zheng Lian
Abstract: A multi-modal lie detection method and apparatus, and a device to improve the accuracy of automatic lie detection are provided. The multi-modal lie detection method includes: inputting original data of three modalities, namely a to-be-detected audio, a to-be-detected video, and a to-be-detected text; performing feature extraction on the input contents to obtain deep features of the three modalities; explicitly depicting first-order, second-order, and third-order interactive relationships of the deep features of the three modalities to obtain an integrated multi-modal feature of each word; performing context modeling on the integrated multi-modal feature of each word to obtain a final feature of each word; and pooling the final feature of each word to obtain global features, and then obtaining a lie classification result by a fully-connected layer.
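One plausible reading of "explicitly depicting first-order, second-order, and third-order interactive relationships" is tensor-fusion-style outer products, sketched below in PyTorch with deliberately small per-modality dimensions; this interpretation, the GRU context model, and all sizes are assumptions.

```python
import torch
import torch.nn as nn

def interactions(a, v, t):
    """First-, second-, and third-order interactions per word via outer products."""
    first = torch.cat([a, v, t], -1)
    second = torch.cat([torch.einsum('...i,...j->...ij', a, v).flatten(-2),
                        torch.einsum('...i,...j->...ij', a, t).flatten(-2),
                        torch.einsum('...i,...j->...ij', v, t).flatten(-2)], -1)
    third = torch.einsum('...i,...j,...k->...ijk', a, v, t).flatten(-3)
    return torch.cat([first, second, third], -1)

d = 8                                                  # small per-modality dim
a, v, t = (torch.randn(2, 12, d) for _ in range(3))    # 12 words, 3 modalities
x = interactions(a, v, t)                              # (2, 12, 3d + 3d^2 + d^3)
ctx, _ = nn.GRU(x.shape[-1], 32, batch_first=True)(x)  # context per word
logits = nn.Linear(32, 2)(ctx.mean(dim=1))             # pooled -> lie / truth
```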
-
8.
Publication (Announcement) No.: US11216652B1
Publication (Announcement) Date: 2022-01-04
Application No.: US17470135
Filing Date: 2021-09-09
Inventor: Jianhua Tao , Mingyuan Xiao , Bin Liu , Zheng Lian
Abstract: An expression recognition method for natural scenes comprises: converting an input video into a video frame sequence at a specified frame rate, and performing facial expression labeling on the video frame sequence to obtain a labeled video frame sequence; removing the impact of natural light, non-face areas, and head posture on facial expression from the labeled video frame sequence to obtain an expression video frame sequence; augmenting the expression video frame sequence to obtain a preprocessed video frame sequence; from the preprocessed video frame sequence, extracting HOG features that characterize facial appearance and shape, extracting second-order features that describe the degree of face creasing, and extracting facial pixel-level deep neural network features by using a deep neural network; performing vector fusion on these three types of features to obtain facial feature fusion vectors for training; and inputting the facial feature fusion vectors into a support vector machine for expression classification.
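A scikit-image/scikit-learn sketch of the fused-feature classifier. The second-order crease descriptor is approximated by gradient covariance statistics, and the deep branch by a fixed random projection; both placeholders, like the HOG parameters and toy data, are assumptions rather than the patent's networks.

```python
import numpy as np
from skimage.feature import hog
from sklearn.svm import SVC

def crease_features(face):
    """Covariance statistics of the image gradients, a rough second-order
    stand-in for the face-creasing descriptor."""
    gy, gx = np.gradient(face)
    return np.cov(np.stack([gx.ravel(), gy.ravel()])).ravel()  # 4 values

rng = np.random.default_rng(0)
proj = rng.standard_normal((32, 64 * 64))  # placeholder for "deep" features

def fuse(face):
    h = hog(face, orientations=9, pixels_per_cell=(16, 16))    # HOG branch
    return np.concatenate([h, crease_features(face), proj @ face.ravel()])

faces = rng.random((10, 64, 64))           # preprocessed face frames (toy)
X = np.stack([fuse(f) for f in faces])     # facial feature fusion vectors
y = rng.integers(0, 7, size=10)            # 7 expression labels (assumed)
clf = SVC().fit(X, y)                      # SVM expression classifier
print(clf.predict(X[:2]))
```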
-
9.
Publication (Announcement) No.: US11963771B2
Publication (Announcement) Date: 2024-04-23
Application No.: US17472191
Filing Date: 2021-09-10
Inventor: Jianhua Tao , Cong Cai , Bin Liu , Mingyue Niu
IPC: A61B5/16 , A61B5/00 , G06F18/25 , G06N3/044 , G06N3/045 , G06N3/048 , G06N3/08 , G06T7/00 , G06V10/80 , G06V20/40 , G10L25/30 , G10L25/57 , G10L25/63 , G10L25/66
CPC classification number: A61B5/165 , A61B5/4803 , A61B5/7275 , G06F18/253 , G06N3/08 , G06T7/0012 , G06V20/46 , G06V20/49 , G10L25/30 , G10L25/57 , G10L25/63 , G10L25/66 , G06T2207/10016
Abstract: Disclosed is an automatic depression detection method using audio and video, including: acquiring original data containing two modalities, a long-term audio file and a long-term video file, from an audio-video file; dividing the long-term audio file into several audio segments, and dividing the long-term video file into a plurality of video segments; inputting each audio segment/each video segment into an audio feature extraction network/a video feature extraction network to obtain in-depth audio features/in-depth video features; calculating the in-depth audio features and the in-depth video features by using a multi-head attention mechanism so as to obtain attention audio features and attention video features; aggregating the attention audio features and the attention video features into audio-video features; and inputting the audio-video features into a decision network to predict the depression level of an individual in the audio-video file.
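A minimal PyTorch sketch of the cross-modal attention and decision steps, assuming 64-dimensional in-depth segment features and a shared multi-head attention module for both directions; separate modules and other sizes would be equally plausible readings.

```python
import torch
import torch.nn as nn

class AVDepressionNet(nn.Module):
    def __init__(self, dim=64):
        super().__init__()
        # One attention module is shared across directions here for brevity.
        self.mha = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)
        self.decision = nn.Sequential(nn.Linear(2 * dim, dim), nn.ReLU(),
                                      nn.Linear(dim, 1))  # decision network

    def forward(self, audio, video):              # (B, Na, D) and (B, Nv, D)
        att_a, _ = self.mha(audio, video, video)  # audio attends to video
        att_v, _ = self.mha(video, audio, audio)  # video attends to audio
        agg = torch.cat([att_a.mean(1), att_v.mean(1)], -1)  # aggregation
        return self.decision(agg).squeeze(-1)     # predicted depression level

a = torch.randn(2, 10, 64)  # in-depth features of 10 audio segments
v = torch.randn(2, 8, 64)   # in-depth features of 8 video segments
score = AVDepressionNet()(a, v)
```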
-
10.
Publication (Announcement) No.: US11227161B1
Publication (Announcement) Date: 2022-01-18
Application No.: US17471485
Filing Date: 2021-09-10
Inventor: Jianhua Tao , Yu He , Bin Liu , Licai Sun
Abstract: A physiological signal prediction method includes: collecting a video file, the video file containing long-term videos whose contents include facial data of a single person and ground-truth physiological signal data; segmenting a single long-term video into multiple short-term video clips; extracting, from each frame of image in each of the short-term video clips, features of the regions of interest used for identifying physiological signals, so as to form single-frame region-of-interest features; splicing, for each of the short-term video clips, the region-of-interest features of all fixed frames corresponding to the short-term video clip into multi-frame region-of-interest features, and converting the multi-frame region-of-interest features into a spatio-temporal graph; and inputting the spatio-temporal graph into a deep learning model for training, and using the trained deep learning model to predict physiological signal parameters.
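The PyTorch sketch below interprets the spatio-temporal graph as a region-by-time map of average ROI colors and feeds it to a small convolutional regressor; the ROI boxes, clip size, and network are illustrative assumptions, not the patented design.

```python
import torch
import torch.nn as nn

def spatiotemporal_map(clip, rois):
    """Average each region of interest per frame: an (R, T, 3) map standing
    in for the spatio-temporal graph built from multi-frame ROI features."""
    # clip: (T, H, W, 3) frames; rois: list of (top, left, height, width)
    rows = [torch.stack([clip[f, y:y + h, x:x + w].mean(dim=(0, 1))
                         for f in range(clip.shape[0])])
            for (y, x, h, w) in rois]
    return torch.stack(rows)

clip = torch.rand(75, 72, 72, 3)  # 3 s of face video at 25 fps (toy data)
rois = [(10, 10, 20, 20), (10, 42, 20, 20), (40, 26, 20, 20)]  # e.g. cheeks, chin
stmap = spatiotemporal_map(clip, rois).permute(2, 0, 1)[None]  # (1, 3, R, T)

model = nn.Sequential(nn.Conv2d(3, 16, (3, 5), padding=(1, 2)), nn.ReLU(),
                      nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                      nn.Linear(16, 1))  # regress e.g. heart rate
bpm = model(stmap)
```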