Video and multimedia browsing while switching between views
    1.
    Type: granted invention patent
    Status: in force

    Publication No.: US06907570B2

    Publication date: 2005-06-14

    Application No.: US09822035

    Filing date: 2001-03-29

    IPC classification: G06F17/30 G09G5/00

    Abstract: Preferred implementations of the invention permit a user to seamlessly switch from a first media stream to a second media stream in a synchronized way, such that the second media stream picks up where the first media stream left off. In this way, the user experiences events chronologically but without interruption. In a preferred implementation, a user watching a skim video switches to a full-length video when, for example, the skim video reaches a frame that is of particular interest to the user. The full-length video begins at a point corresponding to the frame in the skim video that is of interest to the user, without skipping over video segments, so that the user does not experience any time gaps in the story line.
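
The synchronized switch described in this abstract can be illustrated with a small time-mapping sketch. It is not taken from the patent: it assumes the skim video is stored as a list of segments, each recording where it starts in the skim, where it starts in the full-length video, and how long it runs. The `Segment` layout and the name `skim_to_full_time` are hypothetical.

```python
from bisect import bisect_right
from typing import List, Tuple

# Hypothetical layout: (skim_start, full_start, duration), all in seconds.
Segment = Tuple[float, float, float]


def skim_to_full_time(segments: List[Segment], skim_t: float) -> float:
    """Map a playback position in the skim video to the full-length video.

    `segments` must be sorted by skim_start.
    """
    if not segments:
        raise ValueError("segments must not be empty")
    starts = [seg[0] for seg in segments]
    # Find the last segment that starts at or before skim_t.
    i = max(bisect_right(starts, skim_t) - 1, 0)
    skim_start, full_start, duration = segments[i]
    # Clamp the offset so the result stays inside the chosen segment.
    offset = min(max(skim_t - skim_start, 0.0), duration)
    return full_start + offset


# Three 5-second skim segments drawn from different parts of the full video.
segments = [(0.0, 10.0, 5.0), (5.0, 60.0, 5.0), (10.0, 200.0, 5.0)]
print(skim_to_full_time(segments, 7.5))  # position at which to resume the full video
```

A player using such a table would seek the full-length stream to the returned position at the moment the user switches views, so playback continues from the frame of interest.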


    System and method for automatic audio content analysis for word spotting, indexing, classification and retrieval
    3.
    Type: granted invention patent
    Status: in force

    Publication No.: US06185527B2

    Publication date: 2001-02-06

    Application No.: US09234663

    Filing date: 1999-01-19

    IPC classification: G10L15/08

    Abstract: A system and method for indexing an audio stream for subsequent information retrieval and for skimming, gisting, and summarizing the audio stream includes using special audio prefiltering such that only relevant speech segments that are generated by a speech recognition engine are indexed. Specific indexing features are disclosed that improve the precision and recall of an information retrieval system used after indexing for word spotting. The invention includes rendering the audio stream into intervals, with each interval including one or more segments. For each segment of an interval it is determined whether the segment exhibits one or more predetermined audio features such as a particular range of zero crossing rates, a particular range of energy, and a particular range of spectral energy concentration. The audio features are heuristically determined to represent respective audio events including silence, music, speech, and speech on music. Also, it is determined whether a group of intervals matches a heuristically predefined meta pattern such as continuous uninterrupted speech, concluding ideas, hesitations and emphasis in speech, and so on, and the audio stream is then indexed based on the interval classification and meta pattern matching, with only relevant features being indexed to improve subsequent precision of information retrieval. Also, alternatives for longer terms generated by the speech recognition engine are indexed along with respective weights, to improve subsequent recall.
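
Two of the segment features named in this abstract, zero crossing rate and energy, are simple to compute. The sketch below shows one way to do so and to apply a threshold heuristic. The threshold values, the class labels, and the function names are illustrative assumptions, not values from the patent, and spectral energy concentration is left out.

```python
from typing import List, Tuple


def zero_crossing_rate(samples: List[float]) -> float:
    """Fraction of adjacent sample pairs whose signs differ."""
    if len(samples) < 2:
        return 0.0
    crossings = sum(
        1 for a, b in zip(samples, samples[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(samples) - 1)


def mean_energy(samples: List[float]) -> float:
    """Average squared amplitude of the segment."""
    if not samples:
        return 0.0
    return sum(x * x for x in samples) / len(samples)


def classify_segment(
    samples: List[float],
    silence_energy: float = 1e-4,                   # assumed threshold
    speech_zcr: Tuple[float, float] = (0.02, 0.35),  # assumed range
) -> str:
    """Label a segment with a simple threshold heuristic."""
    if mean_energy(samples) < silence_energy:
        return "silence"
    low, high = speech_zcr
    if low <= zero_crossing_rate(samples) <= high:
        return "speech-like"
    return "other"


quiet = [0.0] * 100
slow_square_wave = ([0.5] * 5 + [-0.5] * 5) * 10
print(classify_segment(quiet), classify_segment(slow_square_wave))
```

A real system would tune the ranges on labeled audio and combine more features before deciding between music, speech, and speech on music.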


    System and method for the automatic discovery of salient segments in speech transcripts
    4.
    Type: granted invention patent
    Status: in force

    Publication No.: US06928407B2

    Publication date: 2005-08-09

    Application No.: US10109960

    Filing date: 2002-03-29

    IPC classification: G10L15/18 G10L15/04 G06F17/27

    CPC classification: G10L15/1822 Y10S707/99937

    Abstract: A system and associated method automatically discover salient segments in a speech transcript and focus on the segmentation of an audio/video source into topically cohesive segments based on Automatic Speech Recognition (ASR) transcriptions. The word n-grams are extracted from the speech transcript using a three-phase segmentation algorithm based on the following sequence or combination of boundary-based and content-based methods: a boundary-based method; a rate of arrival of feature method; and a content-based method. In the first two segmentation passes, the temporal proximity and the rate of arrival of features are analyzed to compute an initial segmentation. In the third segmentation pass, changes in the set of content-bearing words used by adjacent segments are detected, to validate the initial segments for merging them, to prevent over-segmentation.
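
The third pass described in this abstract compares the content-bearing words of adjacent segments and merges segments that do not really differ. The sketch below shows that idea only. It swaps in Jaccard overlap of word sets as the similarity measure, and the stopword list, the threshold, and the name `merge_similar` are assumptions made for illustration, not details from the patent.

```python
from typing import List, Set

# Tiny illustrative stopword list; a real system would use a fuller one.
STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "it", "that"}


def content_words(text: str) -> Set[str]:
    """Lowercased whitespace tokens with stopwords removed (no punctuation handling)."""
    return {w for w in text.lower().split() if w not in STOPWORDS}


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Overlap of two word sets, from 0.0 (disjoint) to 1.0 (identical)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def merge_similar(segments: List[str], threshold: float = 0.3) -> List[str]:
    """Merge each segment into its predecessor when their vocabularies overlap enough."""
    merged: List[str] = []
    for seg in segments:
        if merged and jaccard(content_words(merged[-1]), content_words(seg)) >= threshold:
            merged[-1] = merged[-1] + " " + seg
        else:
            merged.append(seg)
    return merged


initial = [
    "the budget vote is today",
    "budget vote delayed today",
    "rain expected in the north",
]
print(merge_similar(initial))
```

Here the first two segments share most of their vocabulary and are merged, while the third introduces a new topic and stays separate.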


    Fast video playback with automatic content based variable speed
    5.
    Type: granted invention patent
    Status: in force

    Publication No.: US06760536B1

    Publication date: 2004-07-06

    Application No.: US09572136

    Filing date: 2000-05-16

    IPC classification: H04N5/91

    Abstract: Browsing of digital video data is performed using a fast forward or fast reverse play mode. The digital video is analyzed and processed to produce a content-based variable-rate video playback sequence for fast browsing. To create the playback sequence, each shot in a video is sped up at a relatively slow rate at the beginning of the shot by selecting many frames, and then the speedup rate is increased as the shot progresses by selecting progressively fewer frames. This method and apparatus of variable-rate frame selection can be used to add an index to a video, play an original video in fast forward/backward mode, or create a new video: a fast forward playback video summary.
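
The frame selection described in this abstract, dense at the start of a shot and sparser as the shot progresses, can be sketched with a growing step size. The geometric growth rule, the cap, the parameter names, and the function `select_frames` are assumptions chosen for illustration; the abstract does not specify the actual schedule.

```python
from typing import List


def select_frames(
    shot_start: int,
    shot_end: int,              # exclusive
    initial_step: float = 1.0,
    growth: float = 1.5,
    max_step: float = 16.0,
) -> List[int]:
    """Pick frame indices within one shot, with gaps that widen over time."""
    if initial_step <= 0 or growth < 1.0:
        raise ValueError("initial_step must be > 0 and growth must be >= 1")
    frames: List[int] = []
    pos = float(shot_start)
    step = initial_step
    while pos < shot_end:
        idx = int(pos)
        if not frames or idx != frames[-1]:   # avoid duplicate indices
            frames.append(idx)
        pos += step
        step = min(step * growth, max_step)   # speed up, up to a cap
    return frames


# A 30-frame shot: many frames early, fewer later.
print(select_frames(0, 30, initial_step=1.0, growth=2.0, max_step=8.0))
```

Running the selection once per detected shot and concatenating the results would give a playback sequence in which every shot is visible at its start before the skim accelerates.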


    System and method for visualizing and navigating dynamic content in a graphical user interface
    6.
    Type: granted invention patent
    Status: lapsed

    Publication No.: US08010903B2

    Publication date: 2011-08-30

    Application No.: US10034499

    Filing date: 2001-12-28

    IPC classification: G06F3/00

    CPC classification: G06F17/30994 G06F3/0485

    Abstract: A system and method for visualizing and navigating dynamic documents including data from an ongoing process and including instances of specified search terms. A summary view including a condensed abstract representation of a dynamic document provides a global overview of the distribution of search terms. The invention updates the document and aggregates the instances of search terms when the representation includes a nonlinear scale or uses multiple display regions having different resolution levels. The invention supports rapid skimming of dynamic documents and dynamic document collections, including enhancements triggered by cursor brushing, while keeping the user in context. Navigation to a segment of the dynamic document by selecting a corresponding portion of the summary view can replace the use of conventional scrolling techniques.
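
The aggregation of search-term instances into a condensed summary view can be sketched as a histogram over the document. This is a simplified illustration with a linear scale and equal-width bins. The abstract also covers nonlinear scales and multiple resolution levels, which are not shown, and `term_histogram` is a hypothetical name.

```python
from typing import List


def term_histogram(lines: List[str], term: str, bins: int) -> List[int]:
    """Count case-insensitive occurrences of `term` per summary bin."""
    if bins <= 0:
        raise ValueError("bins must be positive")
    counts = [0] * bins
    n = len(lines)
    needle = term.lower()
    for i, line in enumerate(lines):
        b = i * bins // n          # map line index to a bin on a linear scale
        counts[b] += line.lower().count(needle)
    return counts


log = ["error a", "ok", "error error", "ok"]
print(term_histogram(log, "error", bins=2))
```

For a dynamic document, one simple approach is to recompute the histogram whenever new lines arrive; selecting a bin in the summary view would then scroll to the first line that maps to it.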


    System and method for hierarchical segmentation with latent semantic indexing in scale space
    8.
    Type: granted invention patent
    Status: in force

    Publication No.: US07137062B2

    Publication date: 2006-11-14

    Application No.: US10034523

    Filing date: 2001-12-28

    IPC classification: G06F3/00

    CPC classification: G06F17/2745

    Abstract: A system and method for automatically generating a hierarchical table of contents or outline for indexing a document and identifying clusters of related information in the document. The document may comprise text, audio, video, or a multimedia presentation. The invention combines latent semantic indexing techniques, used to identify related blocks and major topic changes within the document, with scale space segmentation techniques, used to identify self-similar blocks within the document and thus find topic changes of various sizes at block edges. The invention then produces a visual presentation of the semantic structure of the document.
