Multi-modal gender recognition including depth data
    62.
    Granted invention patent
    Legal status: In force

    Publication No.: US08675981B2

    Publication date: 2014-03-18

    Application No.: US12813675

    Filing date: 2010-06-11

    IPC classification: G06K9/40 G06K9/62

    Abstract: Gender recognition is performed using two or more modalities. For example, depth image data and one or more types of data other than depth image data are received. The data pertains to a person. The different types of data are fused together to automatically determine the gender of the person. A computing system can subsequently interact with the person based on the determination of gender.
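One simple way to read "fused together" is score-level fusion: each modality has its own classifier that emits a probability, and the probabilities are combined with a weighted average. The sketch below assumes that reading. The abstract does not say which fusion method is used, and the function name, modality names, and weights are hypothetical.

```python
from typing import Mapping


def fuse_scores(scores: Mapping[str, float], weights: Mapping[str, float]) -> float:
    """Weighted average of per-modality probabilities for one class.

    Assumption: each modality (e.g. depth, color image, voice) already has
    its own classifier that outputs a probability in [0, 1] for the same class.
    """
    if not scores:
        raise ValueError("at least one modality score is required")
    total_weight = sum(weights[name] for name in scores)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(weights[name] * p for name, p in scores.items()) / total_weight


# Hypothetical per-modality outputs and weights, for illustration only.
fused = fuse_scores(
    {"depth": 0.75, "color": 0.25},
    {"depth": 1.0, "color": 1.0},
)
print(fused)  # 0.5
```

Fusing at the score level keeps each modality's classifier independent, so a modality can be added or dropped by changing only the weight table.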

    Immersive Remote Conferencing
    63.
    Invention patent application
    Legal status: In force

    Publication No.: US20120281059A1

    Publication date: 2012-11-08

    Application No.: US13100504

    Filing date: 2011-05-04

    IPC classification: H04N7/15

    Abstract: The subject disclosure is directed towards an immersive conference, in which participants in separate locations are brought together into a common virtual environment (scene), such that they appear to each other to be in a common space, with geometry, appearance, and real-time natural interaction (e.g., gestures) preserved. In one aspect, depth data and video data are processed to place remote participants in the common scene from the first-person point of view of a local participant. Sound data may be spatially controlled, and parallax computed to provide a realistic experience. The scene may be augmented with various data, videos and other effects/animations.

    PHOTO-REALISTIC SYNTHESIS OF THREE DIMENSIONAL ANIMATION WITH FACIAL FEATURES SYNCHRONIZED WITH SPEECH
    64.
    Invention patent application
    Legal status: In force

    Publication No.: US20120280974A1

    Publication date: 2012-11-08

    Application No.: US13099387

    Filing date: 2011-05-03

    IPC classification: G06T13/40 G06T15/00

    Abstract: Dynamic texture mapping is used to create a photorealistic three-dimensional animation of an individual with facial features synchronized with desired speech. Audiovisual data of an individual reading a known script is obtained and stored in an audio library and an image library. The audiovisual data is processed to extract feature vectors used to train a statistical model. An input audio feature vector corresponding to desired speech with which the animation will be synchronized is provided. The statistical model is used to generate a trajectory of visual feature vectors that corresponds to the input audio feature vector. These visual feature vectors are used to identify a matching image sequence from the image library. The resulting sequence of images, concatenated from the image library, provides a photorealistic image sequence with facial features, such as lip movements, synchronized with the desired speech. This image sequence is applied to the three-dimensional model.

    Multi-modal device power/mode management
    65.
    Granted invention patent
    Legal status: In force

    Publication No.: US08180465B2

    Publication date: 2012-05-15

    Application No.: US12014419

    Filing date: 2008-01-15

    IPC classification: G05B11/01 G05D3/12

    Abstract: A system facilitates managing resources (e.g., functionality, services) based at least in part upon an established context. More particularly, a context determination component can be employed to establish a context by processing sensor inputs or learning/inferring a user action/preference. Once the context is established via the context determination component, a power/mode management component can be employed to activate and/or mask resources in accordance with the established context. The power and mode management of the device can extend the life of a power source (e.g., battery) and mask functionality in accordance with a user and/or device state.

    Learning image enhancement
    66.
    Granted invention patent
    Legal status: In force

    Publication No.: US08175382B2

    Publication date: 2012-05-08

    Application No.: US11801620

    Filing date: 2007-05-10

    IPC classification: G06K9/00

    Abstract: Image enhancement techniques are described to enhance an image in accordance with a set of training images. In an implementation, an image color tone map is generated for a facial region included in an image. The image color tone map may be normalized to a color tone map for a set of training images so that the image color tone map matches the map for the training images. The normalized color tone map may be applied to the image to enhance the image in question. In further implementations, the procedure may be updated when the average color intensity in non-facial regions differs from an accumulated mean by a threshold amount.
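As a rough illustration of normalizing a facial color tone toward training statistics, the sketch below shifts and scales one color channel of the face pixels so that its mean and standard deviation match target values. This mean/standard-deviation matching is a stand-in chosen for brevity, not the tone-map procedure of the patent, and the function names, the 0-255 value range, and the target statistics are assumptions.

```python
from statistics import fmean, pstdev
from typing import List, Sequence


def match_tone(face_values: Sequence[float], target_mean: float,
               target_std: float) -> List[float]:
    """Shift and scale one channel of the face pixels to the target statistics.

    Assumption: target_mean and target_std were measured beforehand on the
    same channel of the facial regions in a set of training images.
    """
    mean = fmean(face_values)
    std = pstdev(face_values)
    scale = target_std / std if std > 0 else 1.0  # flat region: shift only
    # Clamp to the assumed 0-255 intensity range.
    return [min(255.0, max(0.0, (v - mean) * scale + target_mean))
            for v in face_values]


def needs_update(non_face_mean: float, accumulated_mean: float,
                 threshold: float) -> bool:
    """True when the non-facial average drifts from the running mean by at
    least the threshold, i.e. when the mapping should be recomputed."""
    return abs(non_face_mean - accumulated_mean) >= threshold


adjusted = match_tone([100.0, 120.0, 140.0], target_mean=150.0, target_std=10.0)
print(adjusted)
print(needs_update(90.0, 100.0, 5.0))
```

The `needs_update` check mirrors the last sentence of the abstract: the mapping is reused from frame to frame and recomputed only when the background lighting appears to have changed.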

    Text-dependent speaker verification
    67.
    Granted invention patent
    Legal status: In force

    Publication No.: US08099288B2

    Publication date: 2012-01-17

    Application No.: US11674139

    Filing date: 2007-02-12

    IPC classification: G10L11/00 G10L21/00 G10L15/00

    CPC classification: G10L17/24 G10L17/14

    Abstract: A text-dependent speaker verification technique uses a generic speaker-independent speech recognizer for robust speaker verification, and uses the acoustical model of a speaker-independent speech recognizer as a background model. Instead of using a likelihood ratio test (LRT) at the utterance level (e.g., the sentence level), which is typical of most speaker verification systems, the present text-dependent speaker verification technique uses a weighted sum of likelihood ratios at the sub-unit level (word, tri-phone, or phone) as well as at the utterance level.
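The scoring rule in this abstract, a weighted sum of likelihood ratios over sub-units plus the utterance, can be sketched as follows. The sketch assumes log-likelihoods are already available from a speaker model and a background model; how the weights are chosen is not stated in the abstract, so the weights, the threshold, and all function names here are hypothetical.

```python
from typing import Sequence


def log_likelihood_ratio(speaker_loglik: float, background_loglik: float) -> float:
    """Log of p(observation | speaker) / p(observation | background)."""
    return speaker_loglik - background_loglik


def verification_score(unit_llrs: Sequence[float], unit_weights: Sequence[float],
                       utterance_llr: float, utterance_weight: float) -> float:
    """Weighted sum of sub-unit (word, tri-phone, or phone) log-likelihood
    ratios plus the weighted utterance-level ratio."""
    if len(unit_llrs) != len(unit_weights):
        raise ValueError("one weight is required per sub-unit")
    score = utterance_weight * utterance_llr
    for llr, weight in zip(unit_llrs, unit_weights):
        score += weight * llr
    return score


def accept(score: float, threshold: float = 0.0) -> bool:
    """Accept the claimed identity when the score exceeds the threshold."""
    return score > threshold


# Hypothetical log-likelihood ratios for three sub-units of one utterance.
score = verification_score([1.0, -0.5, 2.0], [0.5, 0.25, 0.25],
                           utterance_llr=1.0, utterance_weight=1.0)
print(score)          # 1.875
print(accept(score))  # True
```

Weighting sub-units separately lets units that discriminate well between speakers count for more than units that do not, which a single utterance-level ratio cannot express.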

    HIERARCHICAL FILTERED MOTION FIELD FOR ACTION RECOGNITION
    68.
    Invention patent application
    Legal status: In force

    Publication No.: US20110311137A1

    Publication date: 2011-12-22

    Application No.: US12820143

    Filing date: 2010-06-22

    IPC classification: G06K9/34

    Abstract: Described is a hierarchical filtered motion field technology, such as for use in recognizing actions in videos with crowded backgrounds. Interest points are detected, e.g., as 2D Harris corners with recent motion, e.g., locations with high intensities in a motion history image (MHI). A global spatial motion smoothing filter is applied to the gradients of MHI to eliminate low intensity corners that are likely isolated, unreliable or noisy motions. At each remaining interest point, a local motion field filter is applied to the smoothed gradients by computing a structure proximity between sets of pixels in the local region and the interest point. The motion at a pixel/pixel set is enhanced or weakened based on its structure proximity with the interest point (nearer pixels are enhanced).
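For readers unfamiliar with the motion history image that the interest-point detection relies on, the sketch below shows the commonly used MHI update rule: pixels that changed between frames are set to a maximum value, and all others decay toward zero, so recent motion appears as high intensity. It covers only the MHI itself, not the Harris corner detection or either filter, and the parameter names and default values are assumptions.

```python
from typing import List

Image = List[List[int]]  # grayscale image as rows of 0-255 intensities


def update_mhi(mhi: Image, prev_frame: Image, frame: Image, tau: int = 255,
               diff_threshold: int = 30, decay: int = 1) -> Image:
    """Return the next motion history image.

    A pixel whose intensity changed by more than diff_threshold is treated as
    moving and set to tau; every other pixel decays by `decay`, floored at 0.
    """
    updated: Image = []
    for r, row in enumerate(frame):
        new_row = []
        for c, value in enumerate(row):
            moving = abs(value - prev_frame[r][c]) > diff_threshold
            new_row.append(tau if moving else max(0, mhi[r][c] - decay))
        updated.append(new_row)
    return updated


mhi = [[0, 10], [0, 0]]
prev_frame = [[0, 0], [0, 0]]
frame = [[100, 0], [0, 0]]
print(update_mhi(mhi, prev_frame, frame))  # [[255, 9], [0, 0]]
```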

    Multimodal authentication
    69.
    Granted invention patent
    Legal status: In force

    Publication No.: US08079079B2

    Publication date: 2011-12-13

    Application No.: US11171145

    Filing date: 2005-06-29

    Abstract: A multimodal system employs a plurality of sensing modalities that can be processed concurrently to increase confidence in connection with authentication. The multimodal system and/or set of various devices can provide several points of information entry in connection with authentication. Authentication can be improved, for example, by combining face recognition, biometrics, speech recognition, handwriting recognition, gait recognition, retina scan, thumb/hand prints, or subsets thereof. Additionally, portable multimodal devices (e.g., a smartphone) can be used as credit cards, and authentication in connection with such use can mitigate unauthorized transactions.

    Recovering parameters from a sub-optimal image
    70.
    Granted invention patent
    Legal status: In force

    Publication No.: US08009880B2

    Publication date: 2011-08-30

    Application No.: US11747695

    Filing date: 2007-05-11

    IPC classification: G06K9/00 G06K9/56 G09G5/00

    Abstract: A subregion-based image parameter recovery system and method recover image parameters from a single image containing a face taken under sub-optimal illumination conditions. The recovered image parameters (including albedo, illumination, and face geometry) can be used to generate face images under a new lighting environment. The method includes dividing the face in the image into numerous smaller regions, generating an albedo morphable model for each region, and using a Markov Random Fields (MRF)-based framework to model the spatial dependence between neighboring regions. Different types of regions are defined, including saturated, shadow, regular, and occluded regions. Each pixel in the image is classified and assigned to a region based on intensity, and then weighted based on its classification. The method decouples the texture from the geometry and illumination models, and then generates an objective function that is iteratively solved using an energy minimization technique to recover the image parameters.
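The intensity-based classification and weighting step can be pictured with a small sketch. It assumes simple fixed thresholds on a 0-255 scale and covers only the shadow, regular, and saturated classes; occluded regions are left out because the abstract does not say how they are detected. The thresholds, weights, and function names are hypothetical.

```python
from typing import Dict, List

# Hypothetical intensity thresholds (0-255 scale) and per-class weights.
SHADOW_MAX = 20
SATURATED_MIN = 235
CLASS_WEIGHTS: Dict[str, float] = {"shadow": 0.1, "regular": 1.0, "saturated": 0.1}


def classify_pixel(intensity: int) -> str:
    """Label a pixel as shadow, saturated, or regular from its intensity."""
    if intensity <= SHADOW_MAX:
        return "shadow"
    if intensity >= SATURATED_MIN:
        return "saturated"
    return "regular"


def pixel_weights(intensities: List[int]) -> List[float]:
    """Weight each pixel by its class so that unreliable pixels contribute
    less to a later fitting step."""
    return [CLASS_WEIGHTS[classify_pixel(v)] for v in intensities]


print(pixel_weights([5, 128, 250]))  # [0.1, 1.0, 0.1]
```

Down-weighting very dark and very bright pixels reflects the idea that they carry little usable information about albedo under poor lighting.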
