SYSTEM AND METHOD TO INSERT VISUAL SUBTITLES IN VIDEOS
    1.
    Invention Publication
    SYSTEM AND METHOD TO INSERT VISUAL SUBTITLES IN VIDEOS (Pending examination, published)

    Publication No.: EP3226245A1

    Publication date: 2017-10-04

    Application No.: EP17163723.4

    Filing date: 2017-03-30

    IPC classes: G10L25/81 G10L15/26 G10L21/10

    Abstract: A system and method to insert visual subtitles in videos is described. The method comprises segmenting an input video signal to extract the speech segments and music segments. Next, a speaker representation is associated with each speech segment corresponding to a speaker visible in the frame. The speech segments are then analysed to compute the phones and the duration of each phone. Each phone is mapped to a corresponding viseme, and a viseme-based language model is created with a corresponding score. The most relevant viseme is selected for each speech segment by computing a total viseme score. Further, a speaker representation sequence is created such that phones and emotions in the speech segments are represented as reconstructed lip movements and eyebrow movements. The speaker representation sequence is then integrated with the music segments and superimposed on the input video signal to create subtitles.

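    The phone-to-viseme mapping and scoring steps described in the abstract can be sketched as follows. The phone labels, viseme class names, and the linear combination of an acoustic score with the language-model score are illustrative assumptions; the patent does not specify them.

```python
# Hypothetical sketch of the phone-to-viseme step; the table below is a
# small, assumed subset, not the mapping used in the patent.
PHONE_TO_VISEME = {
    "p": "BILABIAL", "b": "BILABIAL", "m": "BILABIAL",
    "f": "LABIODENTAL", "v": "LABIODENTAL",
    "t": "ALVEOLAR", "d": "ALVEOLAR", "s": "ALVEOLAR", "z": "ALVEOLAR",
    "aa": "OPEN_VOWEL", "ae": "OPEN_VOWEL",
    "iy": "SPREAD_VOWEL", "ih": "SPREAD_VOWEL",
}

def viseme_sequence(phones):
    """Map (phone, duration) pairs to (viseme, duration) pairs.

    Unknown phones fall back to a NEUTRAL mouth shape (an assumption)."""
    return [(PHONE_TO_VISEME.get(p, "NEUTRAL"), d) for p, d in phones]

def total_viseme_score(seq, lm_score, acoustic_weight=0.6):
    """Combine a duration-based score with a viseme language-model score.

    The linear interpolation and its weight are assumptions made for
    illustration; the patent only states that a total score is computed."""
    acoustic = sum(d for _, d in seq) / max(len(seq), 1)
    return acoustic_weight * acoustic + (1 - acoustic_weight) * lm_score
```

    The duration of each selected viseme would then drive the reconstructed lip movements for the speaker representation.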

    A SYSTEM AND METHOD FOR IDENTIFYING AND ANALYZING PERSONAL CONTEXT OF A USER
    2.
    Invention Publication
    A SYSTEM AND METHOD FOR IDENTIFYING AND ANALYZING PERSONAL CONTEXT OF A USER (Pending examination, published)

    Publication No.: EP2810426A2

    Publication date: 2014-12-10

    Application No.: EP13746227.1

    Filing date: 2013-01-22

    IPC classes: H04M3/42

    Abstract: A method and system for identifying the personal context of a user carrying a portable mobile communication device at a particular location, for deriving the user's social interaction information. A user within a predefined range is identified using the user's personal context at the particular location, and the identified personal context is assigned a confidence value. The user's current location information within the particular location is obtained by fusing the assigned confidence values. The proximity of users at the current location is estimated by finding the accurate straight-line distance between them. Two users having similar current location information at the particular location are grouped together according to predefined density criteria. Finally, the social interaction information of the user is derived by multimodal sensor data fusion at the fusion engine and represented using a human network graph.

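    The straight-line distance and proximity-grouping steps above can be sketched as below. The 2D position representation, the grouping radius, and the single-link grouping pass are assumptions for illustration; the patent's actual density criteria are not specified here.

```python
import math

def straight_line_distance(p1, p2):
    """Euclidean distance between two position fixes (x, y), e.g. in metres."""
    return math.hypot(p1[0] - p2[0], p1[1] - p2[1])

def group_users(positions, radius=2.0):
    """Group users whose fused position estimates lie within `radius` of a
    group member. A simple single-link pass over a {user_id: (x, y)} dict;
    the threshold and linkage rule are illustrative assumptions."""
    groups = []
    for uid, pos in positions.items():
        placed = False
        for g in groups:
            if any(straight_line_distance(pos, positions[o]) <= radius for o in g):
                g.append(uid)
                placed = True
                break
        if not placed:
            groups.append([uid])
    return groups
```

    The resulting groups would then feed the fusion engine that builds the human network graph.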

    SYSTEMS AND METHODS FOR AUTOMATIC REPAIR OF SPEECH RECOGNITION ENGINE OUTPUT
    5.
    Invention Publication
    SYSTEMS AND METHODS FOR AUTOMATIC REPAIR OF SPEECH RECOGNITION ENGINE OUTPUT (Pending examination, published)

    Publication No.: EP3270374A1

    Publication date: 2018-01-17

    Application No.: EP17181149.0

    Filing date: 2017-07-13

    Abstract: Text output of speech recognition engines tends to be erroneous when the spoken data contains domain-specific terms. The present disclosure facilitates automatic correction of errors in speech-to-text conversion using abstractions of evolutionary development and artificial development. The words in a speech recognition engine's text output are treated as a set of injured genes in a biological cell that need repair; these are repaired into genotypes, which are then repaired into phenotypes through a series of repair steps based on matching, mapping, and linguistic repair governed by a fitness criterion. A basic genetic-level repair involves a phonetic MATCHING function together with a FITNESS function to select the best among the matching genes. A second genetic-level repair involves a contextual MAPPING function for repairing the remaining 'injured' genes of the speech recognition engine output. Finally, a genotype-to-phenotype repair uses linguistic rules and semantic rules of the domain.

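    The phonetic MATCHING and FITNESS steps can be sketched as follows. A crude Soundex-like code stands in for the phonetic representation, and a string-similarity ratio stands in for the fitness function; both are illustrative assumptions, not the patent's actual functions.

```python
import difflib

def phonetic_key(word):
    """Crude Soundex-like phonetic code; a stand-in for the MATCHING step."""
    codes = {**dict.fromkeys("bfpv", "1"), **dict.fromkeys("cgjkqsxz", "2"),
             **dict.fromkeys("dt", "3"), "l": "4",
             **dict.fromkeys("mn", "5"), "r": "6"}
    word = word.lower()
    key = word[0]
    for ch in word[1:]:
        c = codes.get(ch, "")
        if c and key[-1] != c:  # skip vowels, collapse repeated codes
            key += c
    return (key + "000")[:4]

def repair_word(word, lexicon):
    """Replace an ASR output word by the domain term with the same phonetic
    key; ties are broken by a string-similarity FITNESS score. Returns the
    word unchanged when no domain term matches phonetically."""
    matches = [t for t in lexicon if phonetic_key(t) == phonetic_key(word)]
    if not matches:
        return word
    return max(matches, key=lambda t: difflib.SequenceMatcher(None, word, t).ratio())
```

    For example, a misrecognized "cathetar" would be repaired to the domain term "catheter" because both share the phonetic key, while words with no phonetic match in the domain lexicon would pass to the later contextual MAPPING stage.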