Method and apparatus for predicting events in video conferencing and other applications
    Patent application
    Legal status: Lapsed

    Publication No.: US20020101505A1

    Publication date: 2002-08-01

    Application No.: US09730204

    Filing date: 2000-12-05

    IPC classification: H04N007/14

    CPC classification: H04N7/15

    Abstract: Methods and apparatus are disclosed for predicting events using acoustic and visual cues. The present invention processes audio and video information to identify one or more (i) acoustic cues, such as intonation patterns, pitch and loudness, (ii) visual cues, such as gaze, facial pose, body postures, hand gestures and facial expressions, or (iii) a combination of the foregoing, that are typically associated with an event, such as behavior exhibited by a video conference participant before he or she speaks. In this manner, the present invention allows the video processing system to predict events, such as the identity of the next speaker. The predictive speaker identifier operates in a learning mode to learn the characteristic profile of each participant in terms of the concept that the participant "will speak" or "will not speak" under the presence or absence of one or more predefined visual or acoustic cues. The predictive speaker identifier operates in a predictive mode to compare the learned characteristics embodied in the characteristic profile to the audio and video information and thereby predict the next speaker.

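The two modes described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the characteristic profile is represented or learned, so the version below is an assumption: it swaps in simple per-cue frequency counts with Laplace smoothing. The class name, method names, and cue labels such as `gaze_at_camera` are hypothetical and do not come from the patent.

```python
class PredictiveSpeakerIdentifier:
    """Toy cue-frequency model (an illustrative assumption, not the patented method)."""

    def __init__(self):
        # profile[participant][cue] = [times cue was seen, times the participant then spoke]
        self._profile = {}

    def learn(self, participant, cues, spoke):
        """Learning mode: record whether the participant spoke after showing these cues."""
        cue_counts = self._profile.setdefault(participant, {})
        for cue in cues:
            counts = cue_counts.setdefault(cue, [0, 0])
            counts[0] += 1
            if spoke:
                counts[1] += 1

    def score(self, participant, cues):
        """Mean Laplace-smoothed estimate of P("will speak" | cue) over the given cues."""
        if not cues:
            return 0.0
        cue_counts = self._profile.get(participant, {})
        total = 0.0
        for cue in cues:
            seen, spoke = cue_counts.get(cue, (0, 0))
            total += (spoke + 1) / (seen + 2)
        return total / len(cues)

    def predict_next_speaker(self, observed):
        """Predictive mode: `observed` maps each participant to the cues detected now."""
        if not observed:
            return None
        return max(observed, key=lambda p: self.score(p, observed[p]))


# Hypothetical training observations: (participant, cues shown, did they speak next?)
model = PredictiveSpeakerIdentifier()
model.learn("alice", {"gaze_at_camera", "raised_hand"}, spoke=True)
model.learn("alice", {"gaze_at_camera"}, spoke=True)
model.learn("bob", {"gaze_at_camera"}, spoke=False)
model.learn("bob", {"leaning_back"}, spoke=False)

# Both participants show the same cue, but only alice's profile links it to speaking.
current = {"alice": {"gaze_at_camera"}, "bob": {"gaze_at_camera"}}
print(model.predict_next_speaker(current))  # prints "alice"
```

Keeping a separate profile per participant mirrors the abstract's point that the same cue may signal "will speak" for one person and "will not speak" for another. A real system would also need the audio and video front end that detects the cues, which this sketch takes as given.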