Entity based temporal segmentation of video streams

    Publication No.: US09607224B2

    Publication date: 2017-03-28

    Application No.: US14712071

    Filing date: 2015-05-14

    Applicant: Google Inc.

    CPC classification number: G06K9/00765 G06K9/6269 G06K9/66 H04N5/91

    Abstract: A solution is provided for temporally segmenting a video based on analysis of entities identified in the video frames of the video. The video is decoded into multiple video frames and multiple video frames are selected for annotation. The annotation process identifies entities present in a sample video frame, and each identified entity has a timestamp and a confidence score indicating the likelihood that the entity is accurately identified. For each identified entity, a time series comprising timestamps and corresponding confidence scores is generated and smoothed to reduce annotation noise. One or more segments containing an entity over the length of the video are obtained by detecting boundaries of the segments in the time series of the entity. From the individual temporal segmentation for each identified entity in the video, an overall temporal segmentation for the video is generated, where the overall temporal segmentation reflects the semantics of the video.
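
    The smoothing and boundary-detection steps named in this abstract can be illustrated with a minimal Python sketch. The abstract does not say which smoothing filter or boundary rule is used, so the moving average, the 0.5 threshold, and the function names `smooth` and `entity_segments` are assumptions made purely for illustration. Merging per-entity segments into an overall segmentation is not sketched.

```python
from typing import List, Sequence, Tuple


def smooth(scores: Sequence[float], window: int = 3) -> List[float]:
    """Centered moving average; the window shrinks at both ends of the series.

    Assumption: the abstract only says the series is "smoothed"; a moving
    average is one simple choice, not necessarily the patented one.
    """
    half = window // 2
    out = []
    for i in range(len(scores)):
        lo, hi = max(0, i - half), min(len(scores), i + half + 1)
        out.append(sum(scores[lo:hi]) / (hi - lo))
    return out


def entity_segments(
    timestamps: Sequence[float],
    scores: Sequence[float],
    threshold: float = 0.5,
) -> List[Tuple[float, float]]:
    """Return (start, end) timestamps of runs where the score is >= threshold.

    Assumption: a boundary is a threshold crossing in the time series.
    """
    result = []
    start = None
    prev_t = None
    for t, s in zip(timestamps, scores):
        if s >= threshold and start is None:
            start = t  # segment opens at the first confident sample
        elif s < threshold and start is not None:
            result.append((start, prev_t))  # segment closed at the last confident sample
            start = None
        prev_t = t
    if start is not None:
        result.append((start, prev_t))  # segment still open at the end of the video
    return result


# Hypothetical confidence scores for one entity, sampled once per second.
timestamps = [0, 1, 2, 3, 4, 5, 6, 7]
dog_scores = [0.9, 0.8, 0.1, 0.9, 0.85, 0.1, 0.0, 0.1]
print(entity_segments(timestamps, dog_scores))          # [(0, 1), (3, 4)]
print(entity_segments(timestamps, smooth(dog_scores)))  # [(0, 4)]
```

    In this example the isolated low score at t=2 splits the raw series into two segments, while the smoothed series yields a single segment, which is the kind of annotation noise the abstract says smoothing is meant to reduce.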

    MUSIC SOUNDTRACK RECOMMENDATION ENGINE FOR VIDEOS
    Patent application (status: in force)

    Publication No.: US20140212106A1

    Publication date: 2014-07-31

    Application No.: US14243692

    Filing date: 2014-04-02

    Applicant: Google Inc.

    Abstract: A system and method provide a soundtrack recommendation service for recommending one or more soundtracks for a video (i.e., a probe video). A feature extractor of the recommendation service extracts a set of content features of the probe video and generates a set of semantic features represented by a signature vector of the probe video. A video search module of the recommendation service is configured to search for a number of video candidates, each of which is semantically similar to the probe video and has an associated soundtrack. A video outlier identification module of the recommendation service identifies video candidates having an atypical use of their soundtracks and ranks the video candidates based on the typicality of their soundtrack usage. A soundtrack recommendation module selects the soundtracks of the top-ranked video candidates as the soundtrack recommendations for the probe video.
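
    The search-then-rank flow in this abstract can be sketched in Python as follows. The abstract does not define the similarity measure or how typicality is computed, so cosine similarity, the peer-based typicality score, the dictionary layout, and the function names are all assumptions for illustration only.

```python
import math
from typing import Dict, List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two signature vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def recommend(probe: Sequence[float], corpus: List[Dict], k: int = 3, top: int = 2) -> List[str]:
    """Return up to `top` soundtrack names for the probe signature vector."""
    # Step 1 (video search): keep the k videos most similar to the probe.
    candidates = sorted(
        corpus, key=lambda v: cosine(probe, v["signature"]), reverse=True
    )[:k]

    # Step 2 (outlier ranking). Assumption: a candidate's soundtrack use is
    # "typical" when other videos with the same soundtrack look similar to it.
    def typicality(v: Dict) -> float:
        peers = [u for u in corpus if u is not v and u["soundtrack"] == v["soundtrack"]]
        if not peers:
            return 0.0
        return sum(cosine(v["signature"], u["signature"]) for u in peers) / len(peers)

    ranked = sorted(candidates, key=typicality, reverse=True)

    # Step 3 (recommendation): soundtracks of the top-ranked candidates, de-duplicated.
    picks: List[str] = []
    for v in ranked:
        if v["soundtrack"] not in picks:
            picks.append(v["soundtrack"])
    return picks[:top]


# Hypothetical two-dimensional signatures and soundtrack names.
corpus = [
    {"signature": [1.0, 0.0], "soundtrack": "surf_rock"},
    {"signature": [0.9, 0.1], "soundtrack": "surf_rock"},
    {"signature": [0.8, 0.2], "soundtrack": "lullaby"},
    {"signature": [0.0, 1.0], "soundtrack": "lullaby"},
]
print(recommend([1.0, 0.0], corpus, k=3, top=1))  # ['surf_rock']
```

    Here the third video is similar to the probe, but its soundtrack is otherwise used on a very different video, so it ranks below the two candidates whose soundtrack usage agrees with each other.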

    Mobile device audio playback
    Granted patent

    Publication No.: US09235203B1

    Publication date: 2016-01-12

    Application No.: US14505036

    Filing date: 2014-10-02

    Applicant: Google Inc.

    Abstract: This disclosure is directed to providing audio playback to a mobile device user. According to one aspect of this disclosure, a mobile device may be configured to modify audio playback in response to detecting an inclination of the mobile device (and thereby a user) with respect to a reference plane. According to another aspect of this disclosure, a mobile device may be configured to automatically identify an audible sound that may be motivational to a user, and store an indication of the audible sound in response to the identification. According to another aspect of this disclosure, a mobile device may automatically play back a previously identified motivational song in response to detection of user movement.
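
    As a rough illustration of the first aspect, the Python sketch below estimates an inclination from a gravity reading and maps it to a playback change. The abstract specifies neither the sensor, the axis convention, nor what modification is made, so the accelerometer input, the choice of the device y-axis, the 5-degree threshold, and the 1.1 tempo factor are all hypothetical.

```python
import math


def inclination_degrees(ax: float, ay: float, az: float) -> float:
    """Angle between the device's y-axis and the horizontal reference plane.

    Assumption: (ax, ay, az) is a gravity reading in device coordinates;
    axis conventions differ between platforms.
    """
    norm = math.sqrt(ax * ax + ay * ay + az * az)
    if norm == 0.0:
        raise ValueError("gravity vector must be non-zero")
    # Clamp to guard against rounding just outside asin's domain.
    ratio = max(-1.0, min(1.0, ay / norm))
    return math.degrees(math.asin(ratio))


def playback_rate(inclination: float, threshold: float = 5.0, boost: float = 1.1) -> float:
    """Hypothetical policy: speed playback up slightly on a sufficiently steep incline."""
    return boost if inclination >= threshold else 1.0


print(playback_rate(inclination_degrees(0.0, 0.0, 9.81)))  # device lying flat: 1.0
```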

    Predicting video start times for maximizing user engagement

    Publication No.: US10390067B1

    Publication date: 2019-08-20

    Application No.: US15593448

    Filing date: 2017-05-12

    Applicant: Google Inc.

    Abstract: Implementations disclose predicting video start times for maximizing user engagement. A method includes receiving a first content item comprising content item segments, processing the first content item using a trained machine learning model that is trained based on interaction signals and audio-visual content features of a training set of training segments of training content items, and obtaining, based on the processing of the first content item using the trained machine learning model, one or more outputs comprising salience scores for the content item segments, the salience scores indicating which content item segment of the content item segments is to be selected as a starting point for playback of the first content item.

    Predicting video start times for maximizing user engagement

    Publication No.: US09659218B1

    Publication date: 2017-05-23

    Application No.: US14699243

    Filing date: 2015-04-29

    Applicant: Google Inc.

    CPC classification number: G06K9/00744

    Abstract: Implementations disclose predicting video start times for maximizing user engagement. A method includes applying a machine-learned model to audio-visual content features of segments of a target content item, the machine-learned model trained based on user interaction signals and audio-visual content features of a training set of content item segments, calculating, based on applying the machine-learned model, a salience score for each of the segments of the target content item, and selecting, based on the calculated salience scores, one of the segments of the target content item as a starting point for playback of the target content item.
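
    The score-then-select step in this abstract can be sketched in Python. The abstract does not describe the model architecture, so the logistic model below is only a stand-in for the trained machine-learned model, and the feature vectors, weights, and function names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def salience(features: Sequence[float], weights: Sequence[float], bias: float = 0.0) -> float:
    """Stand-in model: logistic score in (0, 1) from a segment's content features."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


def pick_start(segments: List[Tuple[float, Sequence[float]]], weights: Sequence[float]) -> float:
    """Return the start time (seconds) of the segment with the highest salience score.

    `segments` holds (start_seconds, feature_vector) pairs.
    """
    best = max(segments, key=lambda seg: salience(seg[1], weights))
    return best[0]


# Hypothetical audio-visual features for three 30-second segments.
segments = [
    (0.0, [0.1, 0.2]),
    (30.0, [0.9, 0.8]),
    (60.0, [0.4, 0.1]),
]
print(pick_start(segments, weights=[1.0, 2.0]))  # 30.0
```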

    ENTITY BASED TEMPORAL SEGMENTATION OF VIDEO STREAMS
    Patent application (status: in force)

    Publication No.: US20160335499A1

    Publication date: 2016-11-17

    Application No.: US14712071

    Filing date: 2015-05-14

    Applicant: Google Inc.

    CPC classification number: G06K9/00765 G06K9/6269 G06K9/66 H04N5/91

    Abstract: A solution is provided for temporally segmenting a video based on analysis of entities identified in the video frames of the video. The video is decoded into multiple video frames and multiple video frames are selected for annotation. The annotation process identifies entities present in a sample video frame, and each identified entity has a timestamp and a confidence score indicating the likelihood that the entity is accurately identified. For each identified entity, a time series comprising timestamps and corresponding confidence scores is generated and smoothed to reduce annotation noise. One or more segments containing an entity over the length of the video are obtained by detecting boundaries of the segments in the time series of the entity. From the individual temporal segmentation for each identified entity in the video, an overall temporal segmentation for the video is generated, where the overall temporal segmentation reflects the semantics of the video.

    Selecting and Presenting Representative Frames for Video Previews
    Patent application (status: in force)

    Publication No.: US20160070962A1

    Publication date: 2016-03-10

    Application No.: US14848216

    Filing date: 2015-09-08

    Applicant: Google Inc.

    Abstract: A computer-implemented method for selecting representative frames for videos is provided. The method includes receiving a video and identifying a set of features for each of the frames of the video. The features include frame-based features and semantic features; the semantic features identify likelihoods of semantic concepts being present as content in the frames of the video. A set of video segments for the video is subsequently generated. Each video segment includes a chronological subset of frames from the video, and each frame is associated with at least one of the semantic features. The method generates a score for each frame of the subset of frames for each video segment based at least on the semantic features, and selects a representative frame for each video segment based on the scores of the frames in the video segment. The representative frame represents and summarizes the video segment.
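
    The per-segment selection described in this abstract can be sketched in Python. The abstract does not give the scoring formula, so scoring a frame by its highest semantic-concept likelihood is an assumption, as are the data layout and the function names.

```python
from typing import Dict, List, Tuple


def frame_score(semantic: Dict[str, float]) -> float:
    """Assumed scoring rule: the frame's highest concept likelihood (0.0 if none)."""
    return max(semantic.values(), default=0.0)


def representative_frames(
    frames: List[Dict[str, float]],
    segments: List[Tuple[int, int]],
) -> List[int]:
    """Return, for each segment, the index of its best-scoring frame.

    `frames[i]` maps concept name to likelihood for frame i; each segment is an
    inclusive (first_index, last_index) range of chronologically ordered frames.
    """
    return [
        max(range(first, last + 1), key=lambda i: frame_score(frames[i]))
        for first, last in segments
    ]


# Hypothetical semantic features for five frames split into two segments.
frames = [
    {"cat": 0.2},
    {"cat": 0.9, "sofa": 0.4},
    {"beach": 0.3},
    {"beach": 0.7},
    {"beach": 0.6},
]
print(representative_frames(frames, [(0, 1), (2, 4)]))  # [1, 3]
```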

    Music soundtrack recommendation engine for videos
    Granted patent (status: in force)

    Publication No.: US09148619B2

    Publication date: 2015-09-29

    Application No.: US14243692

    Filing date: 2014-04-02

    Applicant: Google Inc.

    Abstract: A system and method provide a soundtrack recommendation service for recommending one or more soundtracks for a video (i.e., a probe video). A feature extractor of the recommendation service extracts a set of content features of the probe video and generates a set of semantic features represented by a signature vector of the probe video. A video search module of the recommendation service is configured to search for a number of video candidates, each of which is semantically similar to the probe video and has an associated soundtrack. A video outlier identification module of the recommendation service identifies video candidates having an atypical use of their soundtracks and ranks the video candidates based on the typicality of their soundtrack usage. A soundtrack recommendation module selects the soundtracks of the top-ranked video candidates as the soundtrack recommendations for the probe video.
