-
Publication number: EP3226245A1
Publication date: 2017-10-04
Application number: EP17163723.4
Filing date: 2017-03-30
Abstract: A system and method for inserting visual subtitles into videos is described. The method comprises segmenting an input video signal to extract the speech segments and music segments. Next, a speaker representation is associated with each speech segment, corresponding to a speaker visible in the frame. The speech segments are then analysed to compute the phones and the duration of each phone. Each phone is mapped to a corresponding viseme, and a viseme-based language model is created with a corresponding score. The most relevant viseme is selected for each speech segment by computing a total viseme score. A speaker representation sequence is then created such that the phones and emotions in the speech segments are represented as reconstructed lip movements and eyebrow movements. Finally, the speaker representation sequence is integrated with the music segments and superimposed on the input video signal to create subtitles.
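The phone-to-viseme mapping and viseme selection steps named in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the patent: the viseme class names, the mapping scores in `PHONE_TO_VISEMES`, the bigram table `VISEME_BIGRAM`, the weighted sum used as the "total viseme score", and the greedy left-to-right selection.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical candidate visemes per phone, each with a mapping score.
PHONE_TO_VISEMES: Dict[str, Dict[str, float]] = {
    "p": {"bilabial": 0.9, "open": 0.1},
    "m": {"bilabial": 0.8, "open": 0.2},
    "aa": {"open": 0.7, "rounded": 0.3},
    "uw": {"rounded": 0.8, "open": 0.2},
}

# Hypothetical bigram viseme language model: score of (previous, current).
VISEME_BIGRAM: Dict[Tuple[str, str], float] = {
    ("bilabial", "open"): 0.6,
    ("bilabial", "rounded"): 0.3,
    ("open", "bilabial"): 0.5,
    ("open", "open"): 0.2,
    ("rounded", "bilabial"): 0.4,
}
DEFAULT_LM_SCORE = 0.1  # assumed fallback for unseen viseme pairs


def select_visemes(
    phones: List[Tuple[str, float]], lm_weight: float = 0.5
) -> List[Tuple[str, float]]:
    """Pick one viseme per (phone, duration) pair, greedily, left to right.

    The total score is an assumed weighted sum of the mapping score and
    the language-model score given the previously chosen viseme.
    """
    selected: List[Tuple[str, float]] = []
    previous: Optional[str] = None
    for phone, duration in phones:
        best_viseme, best_total = None, float("-inf")
        for viseme, map_score in PHONE_TO_VISEMES[phone].items():
            if previous is None:
                lm_score = 1.0  # no context for the first phone
            else:
                lm_score = VISEME_BIGRAM.get((previous, viseme), DEFAULT_LM_SCORE)
            total = (1.0 - lm_weight) * map_score + lm_weight * lm_score
            if total > best_total:
                best_viseme, best_total = viseme, total
        selected.append((best_viseme, duration))  # duration carries over
        previous = best_viseme
    return selected


timeline = select_visemes([("p", 0.08), ("aa", 0.20), ("m", 0.10)])
```

Each phone's duration is passed through unchanged, so the resulting list could drive the timing of a lip animation. A real system might score whole sequences (for example with dynamic programming) rather than choosing greedily.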
-
Publication number: EP3226245B1
Publication date: 2020-07-29
Application number: EP17163723.4
Filing date: 2017-03-30
-