Systems and methods for the automatic segmentation and clustering of ordered information
    2.
    Granted invention patent
    Status: Expired

    Publication number: US06915009B2

    Publication date: 2005-07-05

    Application number: US09947385

    Filing date: 2001-09-07

    CPC classification number: G06K9/00711 G06K9/00718 G06K9/00765 G06K9/6218

    Abstract: Techniques for segmenting ordered information such as audio, video and text are provided by windowing and parameterizing an ordered information stream and storing the parameterized and windowed information in a two-dimensional representation such as a matrix. The similarity between the parameter vectors is determined and an orthogonal matrix decomposition such as singular value decomposition is applied to the similarity matrix. The singular values or eigenvalues of the resulting decomposition indicate major components or segments of the ordered information. The boundaries of the major components may be determined using the determined singular vectors to provide, for example, smart cut-and-paste of ordered information in which boundaries are automatically identified by the singular vectors; automatic categorization and retrieval of ordered information; and automatic summarization of ordered information.
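The pipeline in this abstract (window, parameterize, build a similarity matrix, decompose) can be illustrated with a minimal sketch. It assumes NumPy is available, uses cosine similarity as the similarity measure, and labels each window by the leading singular vector that dominates it. The function names and the labeling rule are illustrative assumptions, not the patented method.

```python
import numpy as np


def similarity_matrix(features):
    """Cosine similarity between every pair of window feature vectors."""
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    unit = features / np.maximum(norms, 1e-12)  # guard against zero vectors
    return unit @ unit.T


def segment_by_svd(features, n_segments):
    """Assign each window to one of the n_segments leading components.

    Assumption: a window belongs to the component whose singular vector,
    scaled by its singular value, has the largest magnitude at that window.
    """
    sim = similarity_matrix(np.asarray(features, dtype=float))
    u, s, _ = np.linalg.svd(sim)
    weighted = np.abs(u[:, :n_segments]) * s[:n_segments]
    labels = np.argmax(weighted, axis=1)
    # A boundary is any index where the dominant component changes.
    boundaries = [i for i in range(1, len(labels)) if labels[i] != labels[i - 1]]
    return labels, boundaries, s
```

For a stream whose first six windows share one feature vector and whose last four share an orthogonal one, the two largest singular values equal the segment lengths and the single boundary falls at window 6. Real feature streams are noisier, so the labels would normally need smoothing.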


    Systems and methods for the automatic extraction of audio excerpts
    3.
    Granted invention patent
    Status: Expired

    Publication number: US07260439B2

    Publication date: 2007-08-21

    Application number: US09985073

    Filing date: 2001-11-01

    CPC classification number: G11B27/28

    Abstract: A method of extracting audio excerpts comprises: segmenting audio data into a plurality of audio data segments; setting a fitness criteria for the plurality of audio data segments; analyzing the plurality of audio data segments based on the fitness criteria; and selecting one of the plurality of audio data segments that satisfies the fitness criteria. In various exemplary embodiments, the method of extracting audio excerpts further comprises associating the selected one of the plurality of audio data segments with video data. In such embodiments, associating the selected one of the plurality of audio data segments with video data may comprise associating the selected one of the plurality of audio data segments with a keyframe.
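The four claimed steps (segment, set a fitness criterion, analyze, select) can be sketched as follows. The `AudioSegment` type, its `energy` feature, and the duration-plus-energy criterion are hypothetical choices made for illustration; the abstract does not say which criteria are used.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass(frozen=True)
class AudioSegment:
    start: float   # seconds
    end: float     # seconds
    energy: float  # hypothetical per-segment feature

    @property
    def duration(self) -> float:
        return self.end - self.start


Fitness = Callable[[AudioSegment], Optional[float]]


def duration_and_energy(min_s: float, max_s: float) -> Fitness:
    """Build a criterion: reject segments outside [min_s, max_s], else score by energy."""
    def fitness(seg: AudioSegment) -> Optional[float]:
        if not (min_s <= seg.duration <= max_s):
            return None  # segment does not satisfy the criterion
        return seg.energy
    return fitness


def select_excerpt(segments: Sequence[AudioSegment], fitness: Fitness) -> Optional[AudioSegment]:
    """Analyze every segment and return the best one that satisfies the criterion."""
    scored = [(fitness(seg), seg) for seg in segments]
    valid = [(score, seg) for score, seg in scored if score is not None]
    if not valid:
        return None
    return max(valid, key=lambda pair: pair[0])[1]
```

Passing the criterion as a function keeps the selection step independent of how fitness is defined, so a different criterion can be swapped in without touching `select_excerpt`.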


    Capturing and producing shared multi-resolution video
    5.
    Granted invention patent
    Status: In force

    Publication number: US06839067B2

    Publication date: 2005-01-04

    Application number: US10205739

    Filing date: 2002-07-26

    CPC classification number: G08B13/19643 H04N7/142 H04N7/147 H04N7/15

    Abstract: A method and apparatus for providing multi-resolution video to multiple users under hybrid human and automatic control. Initial environment and close-up images are captured using a first camera and a PTZ camera. The initial images are then stored in memory. Current environment and close-up images are captured, and an estimated difference between the initial and current images and the true image is determined. The estimated differences are weighted and compared, and the stored images are updated. A close-up image is then provided to each user of the system. The close-up camera is then directed to a portion of the environment image having high distortion, and current environment and close-up images are captured again.


    Methods and apparatuses for segmenting an audio-visual recording using image similarity searching and audio speaker recognition
    6.
    Granted invention patent
    Status: In force

    Publication number: US06404925B1

    Publication date: 2002-06-11

    Application number: US09266561

    Filing date: 1999-03-11

    Abstract: Methods for segmenting audio-video recordings of meetings that contain slide presentations by one or more speakers are described. These segments serve as indexes into the recorded meeting. If an agenda is provided for the meeting, the segments can be labeled using information from the agenda. The system automatically detects intervals of video that correspond to presentation slides. Under the assumption that only one person is speaking during an interval when slides are displayed in the video, possible speaker intervals are extracted from the audio soundtrack by finding these regions. Since the same speaker may talk across multiple slide intervals, the acoustic data from these intervals is clustered to yield an estimate of the number of distinct speakers and their order. Merged clustered audio intervals corresponding to a single speaker are then used as training data for a speaker segmentation system. Using speaker identification techniques, the full video is then segmented into individual presentations based on the extent of each presenter's speech. The speaker identification system optionally includes the construction of a hidden Markov model trained on the audio data from each slide interval. A Viterbi assignment then segments the audio according to speaker.


    System and method for detecting and ranking images in order of usefulness based on vignette score
    7.
    Granted invention patent
    Status: In force

    Publication number: US07492921B2

    Publication date: 2009-02-17

    Application number: US11032576

    Filing date: 2005-01-10

    CPC classification number: G06F17/30247

    Abstract: A system and method for detecting useful images and for ranking images in order of usefulness based on a vignette score describing how closely each one resembles a “vignette,” or a central object or image surrounded by a featureless or deemphasized background. Several methods for determining an image's vignette score are disclosed as examples. Variance ratio analysis entails calculation of the ratio of variance between the edge region of the image and the entire image. Statistical model analysis entails developing a statistical classifier capable of determining a statistical model of each image class based on pre-entered training data. Spatial frequency analysis involves estimating the energy at different spatial frequencies in the central and edge regions and in the image as a whole. A vignette score is calculated as the ratio of mid-frequency energies in the edge region to the mid-frequency energies of the entire image.
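Of the three scoring methods named in this abstract, variance ratio analysis is the simplest to illustrate. The sketch below assumes a grayscale image stored as a list of rows, a fixed-width border as the "edge region", and population variance. The function name, the border width, and the handling of a completely flat image are assumptions, not details from the patent.

```python
from statistics import pvariance


def variance_ratio(image, border=1):
    """Ratio of edge-region variance to whole-image variance.

    A value near 0 suggests a featureless border around a central object,
    which is the vignette-like layout the abstract describes.
    """
    rows, cols = len(image), len(image[0])
    if 2 * border >= min(rows, cols):
        raise ValueError("border leaves no central region")
    edge = [
        image[r][c]
        for r in range(rows)
        for c in range(cols)
        if r < border or r >= rows - border or c < border or c >= cols - border
    ]
    whole = [value for row in image for value in row]
    total = pvariance(whole)
    if total == 0:
        return 1.0  # assumption: a flat image has no distinguishable center
    return pvariance(edge) / total
```

A 5x5 image with a uniform border and a single bright center pixel scores 0.0, while a checkerboard, whose border is as busy as its interior, scores close to 1.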


    Method, apparatus, and system for remotely annotating a target
    8.
    Granted invention patent
    Status: In force

    Publication number: US07333135B2

    Publication date: 2008-02-19

    Application number: US10271133

    Filing date: 2002-10-15

    CPC classification number: H04N7/18

    Abstract: A system, method and apparatus for remotely annotating an object. An embodiment of the present invention includes a video camera projector that captures video images of a local object and projects annotations made by a user at a remote location onto said local object.


    Summarization of digital files
    9.
    Granted invention patent
    Status: In force

    Publication number: US07284004B2

    Publication date: 2007-10-16

    Application number: US10271407

    Filing date: 2002-10-15

    Abstract: Embodiments of the present invention provide a method for producing a summary of a digital file on one or more computers. The method includes segmenting the digital file into a plurality of segments, clustering said segments into a plurality of clusters, and selecting a cluster from said plurality of clusters, wherein said selected cluster includes segments representative of said digital file. Upon selection of a cluster, a segment of the cluster is provided as a summary of said digital file.
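The segment, cluster, select sequence can be sketched in a few lines. This is a rough illustration under strong assumptions: each segment is reduced to a single scalar feature, clustering is a greedy single pass with a distance threshold, the largest cluster is treated as the representative one, and the segment nearest the cluster centroid is returned. None of these choices come from the abstract.

```python
def cluster_segments(features, threshold):
    """Greedy clustering: a segment joins the first cluster whose centroid is close enough."""
    clusters = []  # each cluster is a list of segment indices
    for idx, value in enumerate(features):
        for members in clusters:
            centroid = sum(features[i] for i in members) / len(members)
            if abs(value - centroid) <= threshold:
                members.append(idx)
                break
        else:
            clusters.append([idx])  # no existing cluster fits; start a new one
    return clusters


def summarize(features, threshold):
    """Return the index of the segment chosen as the summary."""
    clusters = cluster_segments(features, threshold)
    chosen = max(clusters, key=len)  # assumption: largest cluster is representative
    centroid = sum(features[i] for i in chosen) / len(chosen)
    return min(chosen, key=lambda i: abs(features[i] - centroid))
```

With features `[0.1, 0.9, 0.15, 0.12, 0.95, 0.11]` and a threshold of 0.2, segments 0, 2, 3 and 5 form the largest cluster and segment 3 lies closest to its centroid.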


    Method for automatically producing optimal summaries of linear media

    Publication number: US07068723B2

    Publication date: 2006-06-27

    Application number: US10086817

    Filing date: 2002-02-28

    Abstract: Optimal summaries of a linear media source are automatically produced by parameterizing a linear media source. The parameterized linear media source is used to create a similarity array in which each array element includes the value of a similarity measurement between two portions of the parameterized media signal. A segment fitness function, adapted for measuring the similarity between a segment of the parameterized media signal and the entire parameterized media signal, is optimized to find an optimal segment location. The portion of the linear media source corresponding to the optimal segment location is selected as the optimal summary. This method produces optimal summaries of any type of linear media, such as video, audio, or text information.
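One way to read the segment fitness function is as the average similarity between the frames of a candidate segment and every frame of the signal. The sketch below takes that reading, assumes one scalar feature per frame, and uses 1 / (1 + |a - b|) as the similarity measure. All of these are illustrative assumptions, and the search is a plain exhaustive scan over start positions for a fixed segment length.

```python
def similarity_array(features):
    """S[i][j] holds the similarity between frames i and j (assumed measure)."""
    return [[1.0 / (1.0 + abs(a - b)) for b in features] for a in features]


def best_summary_start(sim, length):
    """Find the start of the fixed-length segment most similar to the whole signal."""
    n = len(sim)
    if not 0 < length <= n:
        raise ValueError("length must be between 1 and the number of frames")
    # Mean similarity of each frame to the entire signal.
    row_means = [sum(row) / n for row in sim]
    best_start, best_score = 0, float("-inf")
    for start in range(n - length + 1):
        score = sum(row_means[start:start + length]) / length
        if score > best_score:  # strict comparison keeps the earliest best segment
            best_start, best_score = start, score
    return best_start, best_score
```

For the frame features `[0, 0, 5, 5, 5, 5, 0, 9]` and a segment length of 2, the scan settles on start index 2, the first segment that lies entirely inside the dominant run of similar frames.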
