Systems and methods for the automatic extraction of audio excerpts
    1.
    Granted invention patent
    Systems and methods for the automatic extraction of audio excerpts (Status: lapsed)

    Publication number: US07260439B2

    Publication date: 2007-08-21

    Application number: US09985073

    Filing date: 2001-11-01

    CPC classification number: G11B27/28

    Abstract: A method of extracting audio excerpts comprises: segmenting audio data into a plurality of audio data segments; setting a fitness criteria for the plurality of audio data segments; analyzing the plurality of audio data segments based on the fitness criteria; and selecting one of the plurality of audio data segments that satisfies the fitness criteria. In various exemplary embodiments, the method of extracting audio excerpts further comprises associating the selected one of the plurality of audio data segments with video data. In such embodiments, associating the selected one of the plurality of audio data segments with video data may comprise associating the selected one of the plurality of audio data segments with a keyframe.
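
    The four steps named in this abstract (segment, set a fitness criterion, analyze, select) can be sketched as follows. This is an illustration only, not the patented method: splitting on silence, the length bounds, and the mean-energy score are assumptions, and all function names are hypothetical.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Segment = Tuple[int, int]  # (start, end) sample indices, end exclusive


def segment_audio(samples: Sequence[float], silence_threshold: float) -> List[Segment]:
    """Split the signal into runs of consecutive non-silent samples."""
    segments: List[Segment] = []
    start: Optional[int] = None
    for i, value in enumerate(samples):
        loud = abs(value) > silence_threshold
        if loud and start is None:
            start = i                      # a new segment begins
        elif not loud and start is not None:
            segments.append((start, i))    # silence closes the segment
            start = None
    if start is not None:
        segments.append((start, len(samples)))
    return segments


def duration_and_energy_fitness(
    min_len: int, max_len: int
) -> Callable[[Sequence[float], Segment], float]:
    """Assumed criterion: length within bounds, scored by mean energy."""
    def fitness(samples: Sequence[float], seg: Segment) -> float:
        start, end = seg
        length = end - start
        if not (min_len <= length <= max_len):
            return 0.0                     # fails the criterion outright
        return sum(s * s for s in samples[start:end]) / length
    return fitness


def select_excerpt(
    samples: Sequence[float],
    segments: Sequence[Segment],
    fitness: Callable[[Sequence[float], Segment], float],
) -> Optional[Segment]:
    """Return the best-scoring segment, or None if none has a positive score."""
    best: Optional[Segment] = None
    best_score = 0.0
    for seg in segments:
        score = fitness(samples, seg)
        if score > best_score:
            best, best_score = seg, score
    return best
```

    Calling `select_excerpt(samples, segment_audio(samples, 0.1), duration_and_energy_fitness(2, 5))` returns the index range of the chosen excerpt, which could then be associated with a keyframe or other video data.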


    Capturing and producing shared multi-resolution video
    3.
    Granted invention patent
    Capturing and producing shared multi-resolution video (Status: in force)

    Publication number: US06839067B2

    Publication date: 2005-01-04

    Application number: US10205739

    Filing date: 2002-07-26

    CPC classification number: G08B13/19643 H04N7/142 H04N7/147 H04N7/15

    Abstract: A method and apparatus for providing multi-resolution video to multiple users under hybrid human and automatic control. Initial environment and close-up images are captured using a first camera and a PTZ camera. The initial images are then stored in memory. Current environment and close-up images are captured, and an estimated difference between the initial and current images and the true image is determined. The estimated differences are weighted and compared and the stored images are updated. A close-up image is then provided to each user of the system. The close-up camera is then directed to a portion of the environment image having high distortion, and current environment and close-up images are captured again.
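
    The control loop in this abstract (estimate distortion per region, weight it, point the close-up camera at the worst region, update the stored image) can be sketched as below. The sketch makes several assumptions that the abstract does not state: the environment image is split into a square grid, distortion is approximated by the mean squared difference between the stored and current pixels (the true image is not available), and the weights stand in for user interest. All names are hypothetical.

```python
from typing import List, Sequence, Tuple

Image = Sequence[Sequence[float]]  # grayscale image as rows of pixel values


def region_distortion(stored: Image, current: Image, top: int, left: int, size: int) -> float:
    """Mean squared difference between stored and current pixels in one square region."""
    total = 0.0
    for r in range(top, top + size):
        for c in range(left, left + size):
            diff = current[r][c] - stored[r][c]
            total += diff * diff
    return total / (size * size)


def pick_closeup_target(
    stored: Image, current: Image, weights: Sequence[Sequence[float]], size: int
) -> Tuple[int, int]:
    """Return the (row, col) grid cell with the largest weighted distortion."""
    best = (0, 0)
    best_score = -1.0
    for grid_row, weight_row in enumerate(weights):
        for grid_col, weight in enumerate(weight_row):
            score = weight * region_distortion(
                stored, current, grid_row * size, grid_col * size, size
            )
            if score > best_score:
                best, best_score = (grid_row, grid_col), score
    return best


def update_region(
    stored: List[List[float]], current: Image, cell: Tuple[int, int], size: int
) -> None:
    """Overwrite one grid cell of the stored image with the current pixels."""
    top, left = cell[0] * size, cell[1] * size
    for r in range(top, top + size):
        stored[r][left:left + size] = list(current[r][left:left + size])
```

    In a real system the chosen cell would be converted to pan, tilt, and zoom commands for the PTZ camera, and the update would use the newly captured close-up image rather than the low-resolution pixels used here.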


    Methods and apparatuses for segmenting an audio-visual recording using image similarity searching and audio speaker recognition
    4.
    Granted invention patent
    Methods and apparatuses for segmenting an audio-visual recording using image similarity searching and audio speaker recognition (Status: in force)

    Publication number: US06404925B1

    Publication date: 2002-06-11

    Application number: US09266561

    Filing date: 1999-03-11

    Abstract: Methods for segmenting audio-video recordings of meetings containing slide presentations by one or more speakers are described. These segments serve as indexes into the recorded meeting. If an agenda is provided for the meeting, these segments can be labeled using information from the agenda. The system automatically detects intervals of video that correspond to presentation slides. Under the assumption that only one person is speaking during an interval when slides are displayed in the video, possible speaker intervals are extracted from the audio soundtrack by finding these regions. Since the same speaker may talk across multiple slide intervals, the acoustic data from these intervals is clustered to yield an estimate of the number of distinct speakers and their order. Merged clustered audio intervals corresponding to a single speaker are then used as training data for a speaker segmentation system. Using speaker identification techniques, the full video is then segmented into individual presentations based on the extent of each presenter's speech. The speaker identification system optionally includes the construction of a hidden Markov model trained on the audio data from each slide interval. A Viterbi assignment then segments the audio according to speaker.
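
    The final step in this abstract, a Viterbi assignment of audio to speakers, can be sketched as follows. This is a simplified stand-in rather than the patented system: each speaker is modeled by a single one-dimensional Gaussian (mean, variance) instead of a trained hidden Markov model, and a fixed log-domain `switch_penalty` replaces learned transition probabilities. All names are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple


def gaussian_log_likelihood(x: float, mean: float, var: float) -> float:
    """Log density of x under a one-dimensional Gaussian."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)


def viterbi_speaker_labels(
    frames: Sequence[float],
    models: Dict[str, Tuple[float, float]],
    switch_penalty: float,
) -> List[str]:
    """Most likely speaker per frame, paying switch_penalty for each speaker change."""
    if not frames:
        return []
    speakers = list(models)
    # score[s] is the best log score of any labeling that ends in speaker s.
    score = {s: gaussian_log_likelihood(frames[0], *models[s]) for s in speakers}
    backpointers: List[Dict[str, str]] = []
    for x in frames[1:]:
        new_score: Dict[str, float] = {}
        pointers: Dict[str, str] = {}
        for s in speakers:
            def arrive(prev: str, target: str = s) -> float:
                return score[prev] - (0.0 if prev == target else switch_penalty)
            prev_best = max(speakers, key=arrive)
            new_score[s] = arrive(prev_best) + gaussian_log_likelihood(x, *models[s])
            pointers[s] = prev_best
        backpointers.append(pointers)
        score = new_score
    # Trace the best path backwards from the best final speaker.
    path = [max(speakers, key=lambda s: score[s])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


def segments_from_labels(labels: Sequence[str]) -> List[Tuple[str, int, int]]:
    """Collapse per-frame labels into (speaker, start, end) runs, end exclusive."""
    segments: List[Tuple[str, int, int]] = []
    for i, label in enumerate(labels):
        if segments and segments[-1][0] == label:
            segments[-1] = (label, segments[-1][1], i + 1)
        else:
            segments.append((label, i, i + 1))
    return segments
```

    The penalty discourages rapid switching, so a single ambiguous frame inside one speaker's interval does not create a spurious segment boundary.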


    Force-feedback stylus and applications to freeform ink
    6.
    Granted invention patent
    Force-feedback stylus and applications to freeform ink (Status: in force)

    Publication number: US07508382B2

    Publication date: 2009-03-24

    Application number: US10833062

    Filing date: 2004-04-28

    CPC classification number: G06F3/04883 G06F3/016 G06F3/03545 G06F3/03546

    Abstract: This invention relates to a force-feedback apparatus which includes a stylus that is equipped with an electromagnetic device or a freely rotating ball. The stylus is functionally coupled to a controller which is capable of exerting a magnetic field to the electromagnetic device or to the rotating ball, which results in a force being created between the stylus and a surface. This invention also relates to a method of using a force-feedback stylus including moving a force-feedback stylus over a surface, controlling a force-feedback device via a controller coupled to the force-feedback stylus and applying a force to the force-feedback stylus via the force-feedback device, the force being determined for at least features on the surface.


    Systems and methods for media summarization
    7.
    Granted invention patent
    Systems and methods for media summarization (Status: in force)

    Publication number: US07424150B2

    Publication date: 2008-09-09

    Application number: US10728777

    Filing date: 2003-12-08

    CPC classification number: G11B27/28

    Abstract: A stream of ordered information, such as, for example, audio, video and/or text data, can be windowed and parameterized. A similarity between the parameterized and windowed stream of ordered information can be determined, and a probabilistic decomposition or probabilistic matrix factorization, such as non-negative matrix factorization, can be applied to the similarity matrix. The component matrices resulting from the decomposition indicate major components or segments of the ordered information. Excerpts can then be extracted from the stream of ordered information based on the component matrices to generate a summary of the stream of ordered information.
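
    The pipeline in this abstract (parameterized windows, a similarity matrix, then non-negative matrix factorization) can be sketched in plain Python as below. The abstract does not fix the similarity measure or the factorization algorithm, so cosine similarity over non-negative feature vectors and the standard multiplicative-update rule are assumptions here, as are all function names and default parameters.

```python
import math
import random
from typing import List, Sequence, Tuple

Matrix = List[List[float]]


def cosine_similarity_matrix(features: Sequence[Sequence[float]]) -> Matrix:
    """Pairwise cosine similarity between per-window feature vectors."""
    norms = [math.sqrt(sum(v * v for v in f)) or 1.0 for f in features]
    return [
        [sum(a * b for a, b in zip(fi, fj)) / (ni * nj) for fj, nj in zip(features, norms)]
        for fi, ni in zip(features, norms)
    ]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m: Matrix) -> Matrix:
    return [list(col) for col in zip(*m)]


def frobenius_error(s: Matrix, w: Matrix, h: Matrix) -> float:
    """Frobenius norm of s - w*h."""
    approx = matmul(w, h)
    return math.sqrt(sum((x - y) ** 2 for rs, ra in zip(s, approx) for x, y in zip(rs, ra)))


def nmf(s: Matrix, rank: int, iterations: int = 200, seed: int = 0,
        eps: float = 1e-9) -> Tuple[Matrix, Matrix]:
    """Factor a non-negative matrix s into w*h using multiplicative updates."""
    rng = random.Random(seed)
    n, m = len(s), len(s[0])
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(iterations):
        wt = transpose(w)
        numer, denom = matmul(wt, s), matmul(matmul(wt, w), h)
        h = [[h[k][j] * numer[k][j] / (denom[k][j] + eps) for j in range(m)]
             for k in range(rank)]
        ht = transpose(h)
        numer, denom = matmul(s, ht), matmul(w, matmul(h, ht))
        w = [[w[i][k] * numer[i][k] / (denom[i][k] + eps) for k in range(rank)]
             for i in range(n)]
    return w, h


def dominant_component(w: Matrix) -> List[int]:
    """Index of the strongest component for each window (row of w)."""
    return [max(range(len(row)), key=lambda k: row[k]) for row in w]
```

    Windows that load mainly on the same component are candidates for the same major segment, and an excerpt for the summary could be taken from each component's strongest windows. The factorization depends on the random initialization, so results may differ between seeds.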


    Image classifying systems and methods
    8.
    Granted invention patent
    Image classifying systems and methods (Status: in force)

    Publication number: US07327347B2

    Publication date: 2008-02-05

    Application number: US10325913

    Filing date: 2002-12-23

    CPC classification number: G06F17/30265

    Abstract: Methods and systems for classifying images, such as photographs, allow a user to incorporate subjective judgments regarding photograph qualities when making classification decisions. A slide-show interface allows a user to classify and advance photographs with a one-key action or a single interaction event. The interface presents related information relevant to a displayed photograph that is to be classified, such as contiguous photographs, similar photographs, and other versions of the same photograph. The methods and systems provide an overview interface which allows a user to review and refine classification decisions in the context of the original sequence of photographs.


    Methods and apparatuses for interactive similarity searching, retrieval and browsing of video
    9.
    Granted invention patent
    Methods and apparatuses for interactive similarity searching, retrieval and browsing of video (Status: in force)

    Publication number: US07246314B2

    Publication date: 2007-07-17

    Application number: US10859832

    Filing date: 2004-06-03

    CPC classification number: G06K9/00758 G06F17/30814 G06F17/30825 G06F17/3084

    Abstract: Methods for interactively selecting video queries consisting of training images from a video for a video similarity search and for displaying the results of the similarity search are disclosed. The user selects a time interval in the video as a query definition of training images for training an image class statistical model. Time intervals can be as short as one frame or consist of disjoint segments or shots. A statistical model of the image class defined by the training images is calculated on-the-fly from feature vectors extracted from transforms of the training images. For each frame in the video, a feature vector is extracted from the transform of the frame, and a similarity measure is calculated using the feature vector and the image class statistical model. The similarity measure is derived from the likelihood of a Gaussian model producing the frame. The similarity is then presented graphically, which allows the time structure of the video to be visualized and browsed. Similarity can be rapidly calculated for other video files as well, which enables content-based retrieval by example. A content-aware video browser featuring interactive similarity measurement is presented. A method for selecting training segments involves mouse click-and-drag operations over a time bar representing the duration of the video; similarity results are displayed as shades in the time bar. Another method involves selecting periodic frames of the video as endpoints for the training segment.
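
    The similarity measure in this abstract, the likelihood of a Gaussian model producing each frame, can be sketched as below. The sketch assumes the feature vectors have already been extracted from the frame transforms, uses a diagonal covariance with a variance floor, and thresholds the log-likelihood to find similar intervals. None of these choices is stated in the abstract, and all names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def train_gaussian(
    training: Sequence[Sequence[float]], min_var: float = 1e-3
) -> Tuple[List[float], List[float]]:
    """Fit per-dimension mean and variance to the training feature vectors."""
    n = len(training)
    dims = len(training[0])
    means = [sum(v[d] for v in training) / n for d in range(dims)]
    variances = [
        max(sum((v[d] - means[d]) ** 2 for v in training) / n, min_var)
        for d in range(dims)
    ]
    return means, variances


def log_likelihood(
    x: Sequence[float], means: Sequence[float], variances: Sequence[float]
) -> float:
    """Log density of feature vector x under the diagonal Gaussian."""
    return sum(
        -0.5 * (math.log(2 * math.pi * var) + (xi - mu) ** 2 / var)
        for xi, mu, var in zip(x, means, variances)
    )


def similarity_track(
    frames: Sequence[Sequence[float]], means: Sequence[float], variances: Sequence[float]
) -> List[float]:
    """One similarity score per frame, suitable for shading a time bar."""
    return [log_likelihood(f, means, variances) for f in frames]


def similar_intervals(scores: Sequence[float], threshold: float) -> List[Tuple[int, int]]:
    """Runs of consecutive frames scoring at or above threshold, end exclusive."""
    intervals: List[Tuple[int, int]] = []
    start = None
    for i, score in enumerate(scores):
        if score >= threshold and start is None:
            start = i
        elif score < threshold and start is not None:
            intervals.append((start, i))
            start = None
    if start is not None:
        intervals.append((start, len(scores)))
    return intervals
```

    Because the model is only a mean and a variance per dimension, it can be retrained each time the user drags out a new query interval, and the same model can score frames from other video files.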


    Telepresence system and method for video teleconferencing
    10.
    Granted invention patent
    Telepresence system and method for video teleconferencing (Status: in force)

    Publication number: US07154526B2

    Publication date: 2006-12-26

    Application number: US10617549

    Filing date: 2003-07-11

    CPC classification number: H04N7/142

    Abstract: A system in accordance with one embodiment of the present invention comprises a device for facilitating video communication between a remote participant and another location. The device can comprise a screen adapted to display the remote participant, the screen having a posture adapted to be controlled by the remote participant. A camera can be mounted adjacent to the screen, and can allow the subject to view a selected conference participant or a desired location such that when the camera is trained on the selected participant or desired location a gaze of the remote participant displayed by the screen appears substantially directed at the selected participant or desired location.

