-
Publication No.: US09672829B2
Publication date: 2017-06-06
Application No.: US14665592
Filing date: 2015-03-23
Inventors: Ye Q. Chen, Wen J. Nie, Ting Wu, Zhao Yang
IPC classification: G10L15/26, G10L17/02, H04N7/15, G10L25/57, H04L12/18, G10L25/87, H04N7/14, G10L21/10, G10L17/00
CPC classification: G10L17/02, G10L15/26, G10L17/00, G10L21/10, G10L25/57, G10L25/87, H04L12/1827, H04L12/1831, H04N7/147, H04N7/15
Abstract: Embodiments of the present invention disclose a method, system, and computer program product for speech summarization. A computer receives audio and video components from a video conference. The computer determines which participant is speaking based on comparing images of the participants with template images of speaking and non-speaking faces. The computer determines the voiceprint of the speaking participant by applying a Hidden Markov Model to a brief recording of the voice waveform of the participant and associates the determined voiceprint with the face of the speaking participant. The computer recognizes and transcribes the content of statements made by the speaker, determines the key points, and displays them over the face of the participant in the video conference.
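Illustrative sketch (not part of the patent record): the speaker-detection and voiceprint-to-face association steps named in the abstract could look roughly like the Python below. Everything here is an assumption made for illustration: the names `distance`, `is_speaking`, and `ConferenceState` are hypothetical, face images are reduced to short lists of grayscale values, the template comparison is a plain nearest-template mean squared difference, and the voiceprint is an opaque ID standing in for the output of the Hidden Markov Model step, which is not implemented. Speech recognition and key-point selection are likewise left out.

```python
from dataclasses import dataclass, field


def distance(a, b):
    """Mean squared difference between two equal-length grayscale vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def is_speaking(face, speaking_templates, silent_templates):
    """Return True if the nearest template to `face` is a speaking one."""
    best_speaking = min(distance(face, t) for t in speaking_templates)
    best_silent = min(distance(face, t) for t in silent_templates)
    return best_speaking < best_silent


@dataclass
class ConferenceState:
    """Hypothetical bookkeeping: voiceprint -> face, and face -> key points."""
    voiceprint_to_face: dict = field(default_factory=dict)
    key_points: dict = field(default_factory=dict)

    def associate(self, voiceprint_id, face_id):
        # Link a voiceprint (assumed to come from an HMM step, not shown)
        # to the face that was detected as speaking.
        self.voiceprint_to_face[voiceprint_id] = face_id

    def add_key_point(self, voiceprint_id, text):
        # Route a key point to the face that should display it.
        face_id = self.voiceprint_to_face[voiceprint_id]
        self.key_points.setdefault(face_id, []).append(text)
        return face_id
```

In use, a caller would run `is_speaking` on each participant's face image, call `associate` once for the detected speaker, and then call `add_key_point` for each later statement matched to that voiceprint so the text can be overlaid on the right face.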
-
Publication No.: US20160284354A1
Publication date: 2016-09-29
Application No.: US14665592
Filing date: 2015-03-23
Inventors: Ye Q. Chen, Wen J. Nie, Ting Wu, Zhao Yang
CPC classification: G10L17/02, G10L15/26, G10L17/00, G10L21/10, G10L25/57, G10L25/87, H04L12/1827, H04L12/1831, H04N7/147, H04N7/15
Abstract: Embodiments of the present invention disclose a method, system, and computer program product for speech summarization. A computer receives audio and video components from a video conference. The computer determines which participant is speaking based on comparing images of the participants with template images of speaking and non-speaking faces. The computer determines the voiceprint of the speaking participant by applying a Hidden Markov Model to a brief recording of the voice waveform of the participant and associates the determined voiceprint with the face of the speaking participant. The computer recognizes and transcribes the content of statements made by the speaker, determines the key points, and displays them over the face of the participant in the video conference.
-