Learning image enhancement
    71.
    Granted invention patent
    Learning image enhancement (status: in force)

    Publication No.: US08175382B2

    Publication date: 2012-05-08

    Application No.: US11801620

    Filing date: 2007-05-10

    IPC classification: G06K9/00

    Abstract: Image enhancement techniques are described to enhance an image in accordance with a set of training images. In an implementation, an image color tone map is generated for a facial region included in an image. The image color tone map may be normalized to a color tone map for a set of training images so that the image color tone map matches the map for the training images. The normalized color tone map may be applied to the image to enhance the in-question image. In further implementations, the procedure may be updated when the average color intensity in non-facial regions differs from an accumulated mean by a threshold amount.

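    Illustrative sketch (not taken from the patent text): one simple way to read "normalizing a color tone map to that of the training images" is to match the per-channel mean and standard deviation of the facial region to statistics gathered from the training set, then apply the same mapping to every pixel. The summary of a tone map as (mean, std) pairs, and the names `channel_stats`, `normalize_to_training`, and `training_stats`, are assumptions made for this example.

```python
from statistics import mean, pstdev


def channel_stats(pixels):
    """Return [(mean, std), ...] per channel for a list of (r, g, b) tuples."""
    channels = list(zip(*pixels))
    return [(mean(c), pstdev(c)) for c in channels]


def normalize_to_training(image, face_pixels, training_stats):
    """Shift and scale every pixel so the face statistics match training_stats.

    image          -- list of (r, g, b) tuples for the whole image
    face_pixels    -- list of (r, g, b) tuples inside the detected face region
    training_stats -- assumed per-channel (mean, std) of the training faces
    """
    face_stats = channel_stats(face_pixels)
    result = []
    for pixel in image:
        mapped_pixel = []
        for value, (mu, sigma), (t_mu, t_sigma) in zip(pixel, face_stats, training_stats):
            # Avoid dividing by zero when the face channel is perfectly flat.
            scale = t_sigma / sigma if sigma > 0 else 1.0
            mapped = (value - mu) * scale + t_mu
            # Clamp to the valid 8-bit range.
            mapped_pixel.append(min(255.0, max(0.0, mapped)))
        result.append(tuple(mapped_pixel))
    return result
```

    The mapping is estimated from the face only but applied to the whole image, which mirrors the abstract's description of generating the map for the facial region and then applying it to the image.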

    HIERARCHICAL FILTERED MOTION FIELD FOR ACTION RECOGNITION
    72.
    Invention application
    HIERARCHICAL FILTERED MOTION FIELD FOR ACTION RECOGNITION (status: in force)

    Publication No.: US20110311137A1

    Publication date: 2011-12-22

    Application No.: US12820143

    Filing date: 2010-06-22

    IPC classification: G06K9/34

    Abstract: Described is a hierarchical filtered motion field technology such as for use in recognizing actions in videos with crowded backgrounds. Interest points are detected, e.g., as 2D Harris corners with recent motion, e.g., locations with high intensities in a motion history image (MHI). A global spatial motion smoothing filter is applied to the gradients of MHI to eliminate low intensity corners that are likely isolated, unreliable or noisy motions. At each remaining interest point, a local motion field filter is applied to the smoothed gradients by computing a structure proximity between sets of pixels in the local region and the interest point. The motion at a pixel/pixel set is enhanced or weakened based on its structure proximity with the interest point (nearer pixels are enhanced).

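    Illustrative sketch (not taken from the patent text): the abstract relies on a motion history image (MHI), in which pixels that moved recently hold high values that decay over time. The snippet below shows the commonly used MHI update rule only; it does not implement the Harris corner detection or the global and local filters the abstract describes. The function name `update_mhi` and the parameters `tau` and `diff_threshold` are assumptions chosen for this example.

```python
def update_mhi(mhi, prev_frame, frame, tau=10, diff_threshold=15):
    """Return the next motion history image.

    mhi, prev_frame, frame -- 2D lists (rows of grayscale intensities)
    tau                    -- value assigned to pixels that just moved
    diff_threshold         -- minimum frame difference counted as motion
    """
    next_mhi = []
    for mhi_row, prev_row, cur_row in zip(mhi, prev_frame, frame):
        row = []
        for history, previous, current in zip(mhi_row, prev_row, cur_row):
            if abs(current - previous) > diff_threshold:
                # Fresh motion: reset the history to its maximum.
                row.append(tau)
            else:
                # No motion: let the history decay toward zero.
                row.append(max(0, history - 1))
        next_mhi.append(row)
    return next_mhi
```

    High values in the returned grid correspond to the "locations with high intensities" that the abstract treats as candidates for interest points.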

    Multimodal authentication
    73.
    Granted invention patent
    Multimodal authentication (status: in force)

    Publication No.: US08079079B2

    Publication date: 2011-12-13

    Application No.: US11171145

    Filing date: 2005-06-29

    Abstract: A multimodal system that employs a plurality of sensing modalities which can be processed concurrently to increase confidence in connection with authentication. The multimodal system and/or set of various devices can provide several points of information entry in connection with authentication. Authentication can be improved, for example, by combining face recognition, biometrics, speech recognition, handwriting recognition, gait recognition, retina scan, thumb/hand prints, or subsets thereof. Additionally, portable multimodal devices (e.g., a smartphone) can be used as credit cards, and authentication in connection with such use can mitigate unauthorized transactions.


    Recovering parameters from a sub-optimal image
    74.
    Granted invention patent
    Recovering parameters from a sub-optimal image (status: in force)

    Publication No.: US08009880B2

    Publication date: 2011-08-30

    Application No.: US11747695

    Filing date: 2007-05-11

    IPC classification: G06K9/00 G06K9/56 G09G5/00

    Abstract: A subregion-based image parameter recovery system and method for recovering image parameters from a single image containing a face taken under sub-optimal illumination conditions. The recovered image parameters (including albedo, illumination, and face geometry) can be used to generate face images under a new lighting environment. The method includes dividing the face in the image into numerous smaller regions, generating an albedo morphable model for each region, and using a Markov Random Fields (MRF)-based framework to model the spatial dependence between neighboring regions. Different types of regions are defined, including saturated, shadow, regular, and occluded regions. Each pixel in the image is classified and assigned to a region based on intensity, and then weighted based on its classification. The method decouples the texture from the geometry and illumination models, and then generates an objective function that is iteratively solved using an energy minimization technique to recover the image parameters.

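    Illustrative sketch (not taken from the patent text): the abstract states that each pixel is classified by intensity and then weighted according to its class. The snippet below shows that idea for three of the four named region types. The thresholds, the weight values, and the names `classify_pixel`, `REGION_WEIGHTS`, and `weight_pixels` are hypothetical; occluded regions are left out because they cannot be identified from intensity alone in this simplified form.

```python
# Hypothetical weights: unreliable pixels contribute less to the fit.
REGION_WEIGHTS = {"saturated": 0.0, "shadow": 0.3, "regular": 1.0}


def classify_pixel(intensity, shadow_max=30, saturated_min=250):
    """Label an 8-bit intensity as 'saturated', 'shadow', or 'regular'."""
    if intensity >= saturated_min:
        return "saturated"
    if intensity <= shadow_max:
        return "shadow"
    return "regular"


def weight_pixels(intensities):
    """Return a (label, weight) pair for each intensity in the input list."""
    labelled = []
    for value in intensities:
        label = classify_pixel(value)
        labelled.append((label, REGION_WEIGHTS[label]))
    return labelled
```

    In an energy-minimization setting, weights like these would typically scale each pixel's term in the objective so that saturated or shadowed pixels influence the recovered parameters less.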

    Energy-based sound source localization and gain normalization
    76.
    Granted invention patent
    Energy-based sound source localization and gain normalization (status: in force)

    Publication No.: US07924655B2

    Publication date: 2011-04-12

    Application No.: US11623643

    Filing date: 2007-01-16

    IPC classification: H04R5/02

    Abstract: An energy-based technique to estimate the positions of people speaking from an ad hoc network of microphones. The present technique does not require accurate synchronization of the microphones. In addition, a technique to normalize the gains of the microphones based on people's speech is presented, which allows aggregation of various audio channels from the ad hoc microphone network into a single stream for audio conferencing. The technique is invariant to the speakers' volumes, thus making the system easy to deploy in practice.

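    Illustrative sketch (not taken from the patent text): a toy one-dimensional case shows why an energy-based estimate can be independent of how loudly someone speaks. It assumes two microphones with equal gains, free-field inverse-square attenuation, and a speaker located on the line segment between them. The function names `mean_energy` and `locate_between_two_mics` are made up for this example, and the patent's actual formulation for ad hoc networks is not reproduced here.

```python
import math


def mean_energy(samples):
    """Average energy of a block of audio samples."""
    return sum(s * s for s in samples) / len(samples)


def locate_between_two_mics(energy_1, energy_2, mic_spacing):
    """Estimate the speaker's distance from microphone 1.

    Under inverse-square attenuation, energy_1 / energy_2 = (d2 / d1) ** 2,
    and d1 + d2 = mic_spacing when the speaker sits between the microphones.
    """
    if energy_1 <= 0 or energy_2 <= 0:
        raise ValueError("energies must be positive")
    distance_ratio = math.sqrt(energy_1 / energy_2)  # equals d2 / d1
    return mic_spacing / (1.0 + distance_ratio)
```

    Only the ratio of the two energies enters the estimate, so multiplying both by the same factor (a louder or quieter speaker) leaves the result unchanged. No sample-level time alignment between the microphones is needed either.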

    Calibration based beamforming, non-linear adaptive filtering, and multi-sensor headset
    77.
    Granted invention patent
    Calibration based beamforming, non-linear adaptive filtering, and multi-sensor headset (status: in force)

    Publication No.: US07813923B2

    Publication date: 2010-10-12

    Application No.: US11251164

    Filing date: 2005-10-14

    IPC classification: G10L15/20 G10L21/02 H04R15/00

    Abstract: A first set of signals from an array of one or more microphones, and a second signal from a reference microphone are used to calibrate a set of filter parameters such that the filter parameters minimize a difference between the second signal and a beamformer output signal that is based on the first set of signals. Once calibrated, the filter parameters are used to form a beamformer output signal that is filtered using a non-linear adaptive filter that is adapted based on portions of a signal that do not contain speech, as determined by a speech detection sensor.

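    Illustrative sketch (not taken from the patent text): the calibration step in the abstract chooses filter parameters that minimize the difference between the reference signal and the beamformer output. The snippet below shows the simplest instance of that idea, assuming a single scalar weight per microphone and a least-squares criterion solved through the normal equations. The names `solve_linear`, `calibrate_weights`, and `beamform` are introduced for this example; multi-tap filters and the non-linear adaptive stage are not covered.

```python
def solve_linear(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting.

    Raises ZeroDivisionError if the matrix is singular.
    """
    n = len(rhs)
    aug = [row[:] + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - known) / aug[r][r]
    return x


def calibrate_weights(mic_signals, reference):
    """Least-squares weights so that sum_i w_i * mic_i approximates reference."""
    n = len(mic_signals)
    gram = [[sum(a * b for a, b in zip(mic_signals[i], mic_signals[j]))
             for j in range(n)] for i in range(n)]
    cross = [sum(a * b for a, b in zip(mic_signals[i], reference))
             for i in range(n)]
    return solve_linear(gram, cross)


def beamform(mic_signals, weights):
    """Weighted sum of the microphone signals, sample by sample."""
    return [sum(w * s for w, s in zip(weights, samples))
            for samples in zip(*mic_signals)]
```

    Once `calibrate_weights` has been run on a calibration recording, `beamform` can be applied to new array signals without the reference microphone.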

    Multi-Device Capture and Spatial Browsing of Conferences
    78.
    Invention application
    Multi-Device Capture and Spatial Browsing of Conferences (status: in force)

    Publication No.: US20100085416A1

    Publication date: 2010-04-08

    Application No.: US12245774

    Filing date: 2008-10-06

    IPC classification: H04N7/14

    CPC classification: H04N7/157 H04N7/147

    Abstract: Multi-device capture and spatial browsing of conferences is described. In one implementation, a system detects cameras and microphones, such as the webcams on participants' notebook computers, in a conference room, group meeting, or table game, and enlists an ad-hoc array of available devices to capture each participant and the spatial relationships between participants. A video stream composited from the array is browsable by a user to navigate a 3-dimensional representation of the meeting. Each participant may be represented by a video pane, a foreground object, or a 3-D geometric model of the participant's face or body displayed in spatial relation to the other participants in a 3-dimensional arrangement analogous to the spatial arrangement of the meeting. The system may automatically re-orient the 3-dimensional representation as needed to best show the currently interesting event such as current speaker or may extend navigation controls to a user for manually viewing selected participants or nuanced interactions between participants.


    Method and system for video clip compression
    79.
    Granted invention patent
    Method and system for video clip compression (status: in force)

    Publication No.: US07612832B2

    Publication date: 2009-11-03

    Application No.: US11092389

    Filing date: 2005-03-29

    IPC classification: H04N9/64 H04N7/12

    Abstract: In a method for compressing a video clip containing audio content and image content, an image and/or an audio portion of individual video frames of the video clip are analyzed. Next, frame scores are calculated for the video frames. Each frame score is based on at least one image attribute of the image of the video frame, and/or an audio attribute of the audio portion of the video frame. Next, key frames are identified that have a frame score that exceeds a threshold frame score. Finally, a compressed video clip is formed in which the images of non-key frames are removed. A system for implementing the method is also disclosed.

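    Illustrative sketch (not taken from the patent text): the steps in the abstract (score each frame, keep frames whose score exceeds a threshold as key frames, drop the images of the rest) can be outlined as follows. The frame dictionary layout, the attribute names `image_change` and `audio_level`, the equal weighting, and the function names are all assumptions for this example.

```python
def frame_score(frame, image_weight=0.5, audio_weight=0.5):
    """Combine a hypothetical image attribute and audio attribute into a score."""
    return image_weight * frame["image_change"] + audio_weight * frame["audio_level"]


def compress_clip(frames, threshold):
    """Keep audio for every frame, but keep images only for key frames."""
    compressed = []
    for frame in frames:
        is_key = frame_score(frame) > threshold
        compressed.append({
            "image": frame["image"] if is_key else None,  # drop non-key images
            "audio": frame["audio"],
            "is_key": is_key,
        })
    return compressed
```

    Retaining the audio of non-key frames is a choice made here to match the abstract's wording that only the images of non-key frames are removed.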

    Learning image enhancement
    80.
    Invention application
    Learning image enhancement (status: in force)

    Publication No.: US20080279467A1

    Publication date: 2008-11-13

    Application No.: US11801620

    Filing date: 2007-05-10

    IPC classification: G06K9/40

    Abstract: Image enhancement techniques are described to enhance an image in accordance with a set of training images. In an implementation, an image color tone map is generated for a facial region included in an image. The image color tone map may be normalized to a color tone map for a set of training images so that the image color tone map matches the map for the training images. The normalized color tone map may be applied to the image to enhance the in-question image. In further implementations, the procedure may be updated when the average color intensity in non-facial regions differs from an accumulated mean by a threshold amount.
