Region extraction in vector images
    1.
    Granted patent
    Region extraction in vector images (In force)

    Publication No.: US07088845B2

    Publication date: 2006-08-08

    Application No.: US10767135

    Filing date: 2004-01-28

    IPC classification: G06K9/00

    Abstract: A semantic object tracking method tracks general semantic objects with multiple non-rigid motion, disconnected components and multiple colors throughout a vector image sequence. The method accurately tracks these general semantic objects by spatially segmenting image regions from a current frame and then classifying these regions as to which semantic object they originated from in the previous frame. To classify each region, the method performs a region based motion estimation between each spatially segmented region and the previous frame to compute the position of a predicted region in the previous frame. The method then classifies each region in the current frame as being part of a semantic object based on which semantic object in the previous frame contains the most overlapping points of the predicted region. Using this method, each region in the current image is tracked to one semantic object from the previous frame, with no gaps or overlaps. The method propagates few or no errors because it projects regions into a frame where the semantic object boundaries are previously computed rather than trying to project and adjust a boundary in a frame where the object's boundary is unknown.
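
    A minimal sketch of the classification step described in this abstract is shown below. It assumes the spatial segmentation and the region-based motion estimation have already been run, and it approximates each region's motion with a single translation; the names (region_labels, motion_vectors, prev_object_mask) are illustrative, not from the patent.

```python
import numpy as np

def classify_regions(region_labels, motion_vectors, prev_object_mask):
    """Assign each segmented region of the current frame to the previous-frame
    semantic object that contains the most points of its predicted position.

    region_labels    : 2D int array, region id per pixel of the current frame
    motion_vectors   : dict {region_id: (dy, dx)}, per-region translation
                       standing in for the region-based motion estimate
    prev_object_mask : 2D int array, semantic object id per pixel of the
                       previous frame
    """
    h, w = prev_object_mask.shape
    assignment = {}
    for region_id, (dy, dx) in motion_vectors.items():
        ys, xs = np.nonzero(region_labels == region_id)
        # Predict where this region came from in the previous frame.
        py = np.clip(ys + int(round(dy)), 0, h - 1)
        px = np.clip(xs + int(round(dx)), 0, w - 1)
        # Vote: the previous-frame object covering the most predicted points wins.
        votes = np.bincount(prev_object_mask[py, px])
        assignment[region_id] = int(votes.argmax())
    return assignment
```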


    Tracking semantic objects in vector image sequences
    2.
    Granted patent
    Tracking semantic objects in vector image sequences (In force)

    Publication No.: US07162055B2

    Publication date: 2007-01-09

    Application No.: US11171448

    Filing date: 2005-06-29

    IPC classification: G06K9/00 G06K9/34 H04N5/225

    Abstract: A semantic object tracking method tracks general semantic objects with multiple non-rigid motion, disconnected components and multiple colors throughout a vector image sequence. The method accurately tracks these general semantic objects by spatially segmenting image regions from a current frame and then classifying these regions as to which semantic object they originated from in the previous frame. To classify each region, the method performs a region based motion estimation between each spatially segmented region and the previous frame to compute the position of a predicted region in the previous frame. The method then classifies each region in the current frame as being part of a semantic object based on which semantic object in the previous frame contains the most overlapping points of the predicted region. Using this method, each region in the current image is tracked to one semantic object from the previous frame, with no gaps or overlaps. The method propagates few or no errors because it projects regions into a frame where the semantic object boundaries are previously computed rather than trying to project and adjust a boundary in a frame where the object's boundary is unknown.


    Semantic video object segmentation and tracking
    3.

    Publication No.: US06400831B1

    Publication date: 2002-06-04

    Application No.: US09054280

    Filing date: 1998-04-02

    IPC classification: G06K9/00

    Abstract: A semantic video object extraction system using mathematical morphology and perspective motion modeling. A user indicates a rough outline around an image feature of interest for a first frame in a video sequence. Without further user assistance, the rough outline is processed by a morphological segmentation tool to snap the rough outline into a precise boundary surrounding the image feature. Motion modeling is performed on the image feature to track its movement into a subsequent video frame. The motion model is applied to the precise boundary to warp the precise outline into a new rough outline for the image feature in the subsequent video frame. This new rough outline is then snapped to locate a new precise boundary. Automatic processing is repeated for subsequent video frames.
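
    The warping step in this abstract can be illustrated with a small sketch: the precise boundary found in one frame is warped by a perspective motion model into a rough outline for the next frame, which the morphological segmentation tool would then snap back to a precise boundary. The 3x3 matrix H and the point format are assumptions for the example, not the patented implementation.

```python
import numpy as np

def warp_boundary(boundary_xy, H):
    """Warp a precise boundary from frame t into a rough outline for frame t+1
    using a perspective (homography) motion model.

    boundary_xy : (N, 2) array of (x, y) boundary points from frame t
    H           : 3x3 perspective motion matrix estimated between the frames
    """
    pts = np.hstack([boundary_xy, np.ones((len(boundary_xy), 1))])  # homogeneous coords
    warped = pts @ H.T
    return warped[:, :2] / warped[:, 2:3]  # perspective divide back to (x, y)
```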

    Tracking semantic objects in vector image sequences
    4.
    Patent application
    Tracking semantic objects in vector image sequences (In force)

    Publication No.: US20050240629A1

    Publication date: 2005-10-27

    Application No.: US11171448

    Filing date: 2005-06-29

    Abstract: A semantic object tracking method tracks general semantic objects with multiple non-rigid motion, disconnected components and multiple colors throughout a vector image sequence. The method accurately tracks these general semantic objects by spatially segmenting image regions from a current frame and then classifying these regions as to which semantic object they originated from in the previous frame. To classify each region, the method performs a region based motion estimation between each spatially segmented region and the previous frame to compute the position of a predicted region in the previous frame. The method then classifies each region in the current frame as being part of a semantic object based on which semantic object in the previous frame contains the most overlapping points of the predicted region. Using this method, each region in the current image is tracked to one semantic object from the previous frame, with no gaps or overlaps. The method propagates few or no errors because it projects regions into a frame where the semantic object boundaries are previously computed rather than trying to project and adjust a boundary in a frame where the object's boundary is unknown.


    Tracking semantic objects in vector image sequences
    5.
    Granted patent
    Tracking semantic objects in vector image sequences (In force)

    Publication No.: US06711278B1

    Publication date: 2004-03-23

    Application No.: US09151368

    Filing date: 1998-09-10

    IPC classification: G06K9/00

    Abstract: A semantic object tracking method tracks general semantic objects with multiple non-rigid motion, disconnected components and multiple colors throughout a vector image sequence. The method accurately tracks these general semantic objects by spatially segmenting image regions from a current frame and then classifying these regions as to which semantic object they originated from in the previous frame. To classify each region, the method performs a region based motion estimation between each spatially segmented region and the previous frame to compute the position of a predicted region in the previous frame. The method then classifies each region in the current frame as being part of a semantic object based on which semantic object in the previous frame contains the most overlapping points of the predicted region. Using this method, each region in the current image is tracked to one semantic object from the previous frame, with no gaps or overlaps. The method propagates few or no errors because it projects regions into a frame where the semantic object boundaries are previously computed rather than trying to project and adjust a boundary in a frame where the object's boundary is unknown.


    Morphological pure speech detection using valley percentage
    6.
    Granted patent
    Morphological pure speech detection using valley percentage (In force)

    Publication No.: US06205422B1

    Publication date: 2001-03-20

    Application No.: US09201705

    Filing date: 1998-11-30

    IPC classification: G10L11/02

    CPC classification: G10L25/78

    Abstract: A human speech detection method detects pure-speech signals in an audio signal containing a mixture of pure-speech and non-speech or mixed-speech signals. The method accurately detects the pure-speech signals by computing a novel Valley Percentage feature from the audio signal and then classifying the audio signals into pure-speech and non-speech (or mixed-speech) classifications. The Valley Percentage is a measurement of the low energy parts of the audio signal (the valley) in comparison to the high energy parts of the audio signal (the mountain). To classify the audio signal, the method performs a threshold decision on the value of the Valley Percentage. Using a binary mask, a high Valley Percentage is classified as pure-speech and a low Valley Percentage is classified as non-speech (or mixed-speech). The method further employs morphological filters to improve the accuracy of human speech detection. Before detection, a morphological closing filter may be employed to eliminate unwanted noise from the audio signal. After detection, a combination of morphological closing and opening filters may be employed to remove aberrant pure-speech and non-speech classifications from the binary mask resulting from impulsive audio signals in order to more accurately detect the boundaries between the pure-speech and non-speech portions of the audio signal. A number of parameters may be employed by the method to further improve the accuracy of human speech detection. For implementation in supervised digital audio signal applications, these parameters may be optimized by training the application a priori. For implementation in an unsupervised environment, adaptive determination of these parameters is also possible.
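
    The detection pipeline described above (Valley Percentage feature, threshold decision, morphological clean-up of the binary mask) can be sketched roughly as follows. The window sizes, the valley threshold, and the use of a local-maximum filter to estimate the "mountain" level are all assumptions for illustration, not values from the patent.

```python
import numpy as np
from scipy.ndimage import (maximum_filter1d, uniform_filter1d,
                           binary_closing, binary_opening)

def detect_pure_speech(energy, win=200, valley_frac=0.1, vp_thresh=0.4, struct_len=25):
    """Classify each sample of a short-time energy envelope as pure speech or
    non-speech from its Valley Percentage, then clean the binary mask with
    morphological closing and opening. Parameter values are illustrative.
    """
    # "Mountain" level: local maximum of the energy over a sliding window.
    mountain = maximum_filter1d(energy, size=win)
    # A sample is a "valley" if its energy is well below the local mountain.
    is_valley = energy < valley_frac * mountain
    # Valley Percentage: share of valley samples inside each window.
    vp = uniform_filter1d(is_valley.astype(float), size=win)
    # Threshold decision: a high Valley Percentage indicates pure speech.
    mask = vp > vp_thresh
    # Morphological clean-up of aberrant short runs in the binary mask.
    mask = binary_closing(mask, structure=np.ones(struct_len, dtype=bool))
    mask = binary_opening(mask, structure=np.ones(struct_len, dtype=bool))
    return mask
```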


    Method for generating sprites for object-based coding systems using masks and rounding average
    7.
    Granted patent
    Method for generating sprites for object-based coding systems using masks and rounding average (Expired)

    Publication No.: US06037988A

    Publication date: 2000-03-14

    Application No.: US881901

    Filing date: 1997-06-23

    Abstract: A sprite generation method used in video coding generates a sprite from the video objects in the frames of a video sequence. The method estimates the motion between a video object in a current frame and a sprite constructed from video objects for previous frames. Specifically, the method computes motion coefficients of a 2D transform that minimizes the intensity errors between pixels in the video object and corresponding pixels inside the sprite. The method uses the motion coefficients from the previous frame as a starting point for minimizing the intensity errors. After estimating the motion parameters for an object in the current frame, the method transforms the video object to the coordinate system of the sprite. The method blends the warped pixels of the video object with the pixels at corresponding positions in the sprite using rounding average such that each video object in the video sequence provides substantially the same contribution to the sprite.
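
    A minimal sketch of the blending step is shown below: after a video object has been warped into sprite coordinates, its pixels are merged with the sprite using a running rounding average, so that each frame's object contributes roughly equally. The per-pixel contribution counter and the integer rounding formula are assumptions for the example; the motion estimation and warping are taken as already done.

```python
import numpy as np

def blend_into_sprite(sprite, count, warped_obj, warped_mask):
    """Merge a motion-warped video object into the sprite with a running
    rounding average.

    sprite      : 2D int array, current sprite intensities
    count       : 2D int array, number of objects blended so far per pixel
    warped_obj  : 2D int array, object pixels warped into sprite coordinates
    warped_mask : 2D bool array, valid pixels of the warped object
    """
    m = warped_mask
    sprite = sprite.astype(np.int64)
    n = count[m].astype(np.int64)
    # Rounding average: new = round((old * n + incoming) / (n + 1)).
    sprite[m] = (sprite[m] * n + warped_obj[m] + (n + 1) // 2) // (n + 1)
    return sprite, count + m.astype(count.dtype)
```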


    Method and system for filtering compressed video images
    8.
    Granted patent
    Method and system for filtering compressed video images (Expired)

    Publication No.: US5787203A

    Publication date: 1998-07-28

    Application No.: US588056

    Filing date: 1996-01-19

    IPC classification: G06T9/00 H04N7/26 G06K9/00

    CPC classification: H04N19/23 H04N19/80

    Abstract: A video compression error signal in a video compression scheme is affected by random and high frequency impulse noise. An error signal suppressor containing two filters is applied to the video compression error signal. The first filter reduces or eliminates random noise. The second filter eliminates high frequency impulse noise. Random and high frequency noise is reduced or eliminated from frequencies that are unimportant to human visual perception. The error signal suppressor reduces the overall video compression bitrate by between 10% and 20%, which provides corresponding increases in video compression and transmission efficiency. The error signal suppressor is used in video compression encoding schemes such as MPEG to reduce random and high frequency noise.
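
    A rough sketch of an error-signal suppressor of the kind described above is shown below: one filter for random noise and one for high-frequency impulse noise, applied to the prediction-error (residual) image before it is encoded. The choice of a Gaussian filter followed by a median filter, and the parameter values, are assumptions for illustration rather than the patented design.

```python
import numpy as np
from scipy.ndimage import gaussian_filter, median_filter

def suppress_error_signal(error, sigma=0.8, median_size=3):
    """Filter a video-compression error (residual) image before encoding."""
    smoothed = gaussian_filter(error.astype(float), sigma=sigma)  # reduce random noise
    return median_filter(smoothed, size=median_size)              # remove impulse noise
```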


    Generation and provision of media metadata
    9.
    Granted patent
    Generation and provision of media metadata (In force)

    Publication No.: US08763068B2

    Publication date: 2014-06-24

    Application No.: US12964597

    Filing date: 2010-12-09

    IPC classification: H04N7/16

    Abstract: Various embodiments related to the generation and provision of media metadata are disclosed. For example, one disclosed embodiment provides a computing device having a logic subsystem configured to execute instructions, and a data holding subsystem comprising instructions stored thereon that are executable by the processor to receive an input of a video and/or audio content item, and to compare the content item to one or more object descriptors each representing an object for locating within the content item to locate instances of one or more of the objects in the content item. The instructions are further executable to generate metadata for each object located in the video content item, and to receive a validating user input related to whether the metadata generated for a selected object is correct.
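
    The flow in this abstract — compare a content item against object descriptors, emit metadata for each located instance, and leave room for a validating user input — could look roughly like the following sketch. The record fields, the match_fn matcher, and the threshold are all illustrative assumptions, not the patented design.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Sequence

@dataclass
class ObjectMetadata:
    object_name: str
    time_sec: float
    confidence: float
    validated: Optional[bool] = None   # set later from the validating user input

def locate_objects(frames: Sequence, descriptors: Dict[str, object],
                   match_fn: Callable[[object, object], float],
                   threshold: float = 0.8) -> List[ObjectMetadata]:
    """Compare each frame of the content item against every object descriptor
    and emit a metadata record for each located instance."""
    records = []
    for t, frame in enumerate(frames):
        for name, descriptor in descriptors.items():
            score = match_fn(descriptor, frame)
            if score >= threshold:
                records.append(ObjectMetadata(name, time_sec=float(t), confidence=score))
    return records
```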


    Rotation and scaling optimization for mobile devices
    10.
    Granted patent
    Rotation and scaling optimization for mobile devices (Expired)

    Publication No.: US07710434B2

    Publication date: 2010-05-04

    Application No.: US11755082

    Filing date: 2007-05-30

    Applicant: Chuang Gu

    Inventor: Chuang Gu

    Abstract: Image processing in mobile devices is optimized by combining at least two of the color conversion, rotation, and scaling operations. Received images, such as still images or frames of a video stream, are subjected to a combined transformation after decoding, where each pixel is color converted (e.g. from YUV to RGB), rotated, and scaled as needed. By combining two or three of the processes into one, read/write operations consuming significant processing and memory resources are reduced, enabling processing of higher resolution images and/or savings in power and processing resources.
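
    A naive single-pass sketch of the combined transform is shown below: each output pixel is inverse-mapped through the rotation and scaling, fetched once from the YUV source, converted to RGB, and written once, instead of three separate read/write passes. Nearest-neighbour sampling, 4:4:4 chroma (u and v at full resolution), and BT.601 conversion coefficients are assumptions for illustration.

```python
import numpy as np

def yuv_rotate_scale(y, u, v, angle_deg, scale):
    """Convert a YUV image to RGB while rotating and scaling it in one pass."""
    h, w = y.shape
    out_h, out_w = int(h * scale), int(w * scale)
    out = np.zeros((out_h, out_w, 3), dtype=np.uint8)
    a = np.deg2rad(angle_deg)
    cos_a, sin_a = np.cos(a), np.sin(a)
    cy, cx = (out_h - 1) / 2.0, (out_w - 1) / 2.0      # output center
    scy, scx = (h - 1) / 2.0, (w - 1) / 2.0            # source center
    for oy in range(out_h):
        for ox in range(out_w):
            # Inverse rotation and scaling back into source coordinates.
            dx, dy = (ox - cx) / scale, (oy - cy) / scale
            sx = cos_a * dx + sin_a * dy + scx
            sy = -sin_a * dx + cos_a * dy + scy
            ix, iy = int(round(sx)), int(round(sy))
            if 0 <= ix < w and 0 <= iy < h:
                yy = float(y[iy, ix])
                uu = float(u[iy, ix]) - 128.0
                vv = float(v[iy, ix]) - 128.0
                # BT.601 YUV -> RGB conversion applied in the same pass.
                r = yy + 1.402 * vv
                g = yy - 0.344136 * uu - 0.714136 * vv
                b = yy + 1.772 * uu
                out[oy, ox] = np.clip([r, g, b], 0, 255).astype(np.uint8)
    return out
```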
