Unsupervised speaker segmentation of multi-speaker speech data
    101.
    Granted patent
    Unsupervised speaker segmentation of multi-speaker speech data (status: in force)

    Publication No.: US07930179B1

    Publication date: 2011-04-19

    Application No.: US11866125

    Filing date: 2007-10-02

    IPC classification: G10L17/00

    CPC classification: G10L17/12

    Abstract: Systems and methods are described for unsupervised segmentation of multi-speaker speech or audio data by speaker. A front-end analysis is applied to the input speech data to obtain feature vectors. The speech data is initially segmented, and the segments are then clustered into groups that correspond to different speakers. The clusters are iteratively modeled and resegmented to obtain stable speaker segmentations. The overlap between segmentation sets is checked to ensure successful speaker segmentation; overlapping segments are combined, remodeled, and resegmented. Optionally, the speech data is processed to produce a segmentation lattice that maximizes the overall segmentation likelihood.
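
The model-and-resegment loop in this abstract can be illustrated with a minimal Python sketch. It is not the patented method: fixed-length initial segments, mean-vector cluster models, and nearest-mean reassignment are simplifying assumptions that stand in for the front end, speaker models, and resegmentation step, none of which the abstract specifies. The names `segment_speakers`, `seg_len`, and `n_speakers` are hypothetical.

```python
from typing import List, Optional, Sequence


def mean_vec(vectors: Sequence[Sequence[float]]) -> List[float]:
    """Component-wise mean of a non-empty list of equal-length vectors."""
    dim = len(vectors[0])
    return [sum(v[d] for v in vectors) / len(vectors) for d in range(dim)]


def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def segment_speakers(features: Sequence[Sequence[float]], seg_len: int,
                     n_speakers: int = 2, max_iters: int = 20) -> List[int]:
    """Return one cluster label per fixed-length segment (assumed scheme)."""
    if not features:
        return []
    # Initial segmentation: fixed-length chunks of feature vectors (assumption).
    segments = [features[i:i + seg_len] for i in range(0, len(features), seg_len)]
    seg_means = [mean_vec(s) for s in segments]
    # Initial cluster models: evenly spaced segment means (assumption).
    models = [seg_means[k * len(seg_means) // n_speakers] for k in range(n_speakers)]
    labels: Optional[List[int]] = None
    for _ in range(max_iters):
        # Resegment: assign each segment to the closest cluster model.
        new_labels = [min(range(n_speakers), key=lambda k: sq_dist(m, models[k]))
                      for m in seg_means]
        if new_labels == labels:
            break  # segmentation is stable
        labels = new_labels
        # Remodel: recompute each cluster model from its current members.
        for k in range(n_speakers):
            members = [m for m, lab in zip(seg_means, labels) if lab == k]
            if members:
                models[k] = mean_vec(members)
    return labels if labels is not None else []
```

The loop stops as soon as a resegmentation pass leaves every label unchanged, which mirrors the "stable speaker segmentations" criterion in the abstract. The overlap check and the segmentation lattice are not modeled here.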


    Environment Delivery Network
    102.
    Patent application
    Environment Delivery Network (status: in force)

    Publication No.: US20100185891A1

    Publication date: 2010-07-22

    Application No.: US12355085

    Filing date: 2009-01-16

    IPC classification: G06F11/07 G06F15/16

    Abstract: A method for an environment delivery network prioritizes groups of data for transmission based on various factors, such as synchronization requirements, endpoint configuration, and the fidelity of sensory stimuli reproduction. A device detects data missing from a group of data received from a server and replaces the missing data with replacement data based on a predetermined value. The predetermined value may be based on a default value specific to the sensory stimulus of the missing data, on data received prior to the missing data, or on data received both before and after the missing data.
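
The three replacement strategies named in this abstract can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the claimed method: missing values are represented as `None`, the "before and after" case is taken to be a simple average of the nearest received neighbors, and the function name `fill_missing` and its `default` parameter are hypothetical.

```python
from typing import List, Optional, Sequence


def fill_missing(samples: Sequence[Optional[float]],
                 default: float = 0.0) -> List[float]:
    """Replace None entries using a default, the prior value, or both neighbors."""
    filled: List[float] = []
    n = len(samples)
    for i, value in enumerate(samples):
        if value is not None:
            filled.append(value)
            continue
        # Nearest value actually received before and after the gap.
        prev = next((samples[j] for j in range(i - 1, -1, -1)
                     if samples[j] is not None), None)
        nxt = next((samples[j] for j in range(i + 1, n)
                    if samples[j] is not None), None)
        if prev is not None and nxt is not None:
            # Data before and after: average the two (assumed interpolation).
            filled.append((prev + nxt) / 2.0)
        elif prev is not None:
            # Only prior data: repeat the last received value.
            filled.append(prev)
        else:
            # Nothing received earlier: fall back to the stimulus default.
            filled.append(default)
    return filled
```

In practice the default would differ per stimulus type (for example, a neutral temperature versus silence for audio), which is why it is passed in as a parameter here.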


    SYSTEM AND METHOD FOR CREATING AND MANIPULATING SYNTHETIC ENVIRONMENTS
    103.
    Patent application
    SYSTEM AND METHOD FOR CREATING AND MANIPULATING SYNTHETIC ENVIRONMENTS (status: in force)

    Publication No.: US20100157063A1

    Publication date: 2010-06-24

    Application No.: US12343114

    Filing date: 2008-12-23

    IPC classification: H04N5/225 H04N7/00

    Abstract: Disclosed herein are systems, computer-implemented methods, and tangible computer-readable media for synthesizing a virtual window. The method includes receiving an environment feed, selecting video elements of the environment feed, displaying the selected video elements on a virtual window in a window casing, selecting non-video elements of the environment feed, and outputting the selected non-video elements coordinated with the displayed video elements. Environment feeds can include synthetic and natural elements. The method can further toggle the virtual window between displaying the selected elements and being transparent. The method can track user motion and adapt the displayed selected elements on the virtual window based on the tracked user motion. The method can further detect a user in close proximity to the virtual window, receive an interaction from the detected user, and adapt the displayed selected elements on the virtual window based on the received interaction.


    METHOD AND SYSTEM FOR INFORMATION QUERYING
    104.
    Patent application
    METHOD AND SYSTEM FOR INFORMATION QUERYING (status: in force)

    Publication No.: US20090070305A1

    Publication date: 2009-03-12

    Application No.: US11851254

    Filing date: 2007-09-06

    IPC classification: G06F17/30

    Abstract: Methods and systems for information querying are described. At least one recent image of a video signal may be accessed. Recent text associated with the at least one recent image may be accessed. A presentation image may be provided from the at least one recent image for presentation on a display. An original portion of the recent text may be identified within the presentation image. A selection of a user portion of the recent text may be received. An information source may be queried with the selection of the user portion of the recent text. The information source may be capable of using the selection to provide a result.


    Method and apparatus for segmenting a multi-media program based upon audio events
    106.
    Granted patent
    Method and apparatus for segmenting a multi-media program based upon audio events (status: lapsed)

    Publication No.: US07319964B1

    Publication date: 2008-01-15

    Application No.: US10862728

    Filing date: 2004-06-07

    Applicants: Qian Huang, Zhu Liu

    Inventors: Qian Huang, Zhu Liu

    IPC classification: G10L21/00

    CPC classification: G10L25/48

    Abstract: The present invention provides a method and apparatus for segmenting a multi-media program based upon audio events. In an embodiment, a method of classifying an audio stream is provided. The method includes receiving an audio stream, sampling the audio stream at a predetermined rate, and then combining a predetermined number of samples into a clip. A plurality of features is then determined for the clip and analyzed using a linear approximation algorithm. The clip is then characterized based upon the results of that analysis.
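
The clip-based pipeline in this abstract (group samples into clips, compute features per clip, characterize each clip) can be sketched as follows. The abstract does not say which features or which linear approximation algorithm are used, so everything specific here is an assumption: RMS energy and zero-crossing rate as the features, a plain linear decision function as the classifier, and the hypothetical labels `class_a` and `class_b`.

```python
import math
from typing import List, Sequence, Tuple


def make_clips(samples: Sequence[float], clip_size: int) -> List[Sequence[float]]:
    """Group consecutive samples into full clips of clip_size samples each."""
    return [samples[i:i + clip_size]
            for i in range(0, len(samples) - clip_size + 1, clip_size)]


def clip_features(clip: Sequence[float]) -> Tuple[float, float]:
    """Return (RMS energy, zero-crossing rate) for one clip (assumed features)."""
    rms = math.sqrt(sum(x * x for x in clip) / len(clip))
    crossings = sum(1 for a, b in zip(clip, clip[1:]) if (a < 0) != (b < 0))
    zcr = crossings / max(len(clip) - 1, 1)
    return rms, zcr


def classify_clip(features: Sequence[float], weights: Sequence[float],
                  bias: float) -> str:
    """Characterize a clip with a linear decision function (assumed stand-in)."""
    score = sum(w * f for w, f in zip(weights, features)) + bias
    return "class_a" if score > 0 else "class_b"
```

A caller would run `make_clips` over the sampled stream, then apply `clip_features` and `classify_clip` to each clip in turn. The weights and bias would come from training data, which the abstract does not describe.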


    Method and apparatus for segmenting a multi-media program based upon audio events
    107.
    Granted patent
    Method and apparatus for segmenting a multi-media program based upon audio events (status: in force)

    Publication No.: US06801895B1

    Publication date: 2004-10-05

    Application No.: US09455492

    Filing date: 1999-12-06

    Applicants: Qian Huang, Zhu Liu

    Inventors: Qian Huang, Zhu Liu

    IPC classification: G10L21/00

    CPC classification: G10L25/48

    Abstract: The present invention provides a method and apparatus for segmenting a multi-media program based upon audio events. In an embodiment, a method of classifying an audio stream is provided. The method includes receiving an audio stream, sampling the audio stream at a predetermined rate, and then combining a predetermined number of samples into a clip. A plurality of features is then determined for the clip and analyzed using a linear approximation algorithm. The clip is then characterized based upon the results of that analysis.


    Automated content detection, analysis, visual synthesis and repurposing
    110.
    Granted patent
    Automated content detection, analysis, visual synthesis and repurposing (status: in force)

    Publication No.: US09167189B2

    Publication date: 2015-10-20

    Application No.: US12579993

    Filing date: 2009-10-15

    Abstract: A content summary is generated by determining a relevance of each of a plurality of scenes, removing at least one of the plurality of scenes based on the determined relevance, and creating a scene summary based on the plurality of scenes. The scene summary is output to a graphical user interface, which may be a three-dimensional interface. The plurality of scenes is automatically detected in a source video and a scene summary is created with user input to modify the scene summary. A synthetic frame representation is formed by determining a sentiment of at least one frame object in a plurality of frame objects and creating a synthetic representation of the at least one frame object based at least in part on the determined sentiment. The relevance of the frame object may be determined and the synthetic representation is then created based on the determined relevance and the determined sentiment.
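
The relevance-based summarization step in this abstract can be sketched as below. The abstract does not say how relevance is computed or how many scenes are removed, so the `Scene` record, the precomputed `relevance` scores, and the top-`keep` selection rule are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Scene:
    start: float      # scene start time in seconds
    end: float        # scene end time in seconds
    relevance: float  # precomputed relevance score (assumed to be given)


def summarize(scenes: Sequence[Scene], keep: int) -> List[Scene]:
    """Drop the least relevant scenes and return the rest in playback order."""
    # Rank by relevance and keep only the top `keep` scenes.
    ranked = sorted(scenes, key=lambda s: s.relevance, reverse=True)[:keep]
    # Restore chronological order so the summary plays back coherently.
    return sorted(ranked, key=lambda s: s.start)
```

Choosing `keep` smaller than the number of detected scenes guarantees that at least one scene is removed, matching the wording of the abstract. Sentiment analysis and the synthetic frame representation are not modeled here.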
