Calibration based beamforming, non-linear adaptive filtering, and multi-sensor headset
    71.
    Type: Granted patent
    Legal status: Active

    Publication No.: US07813923B2

    Publication date: 2010-10-12

    Application No.: US11251164

    Filing date: 2005-10-14

    IPC classification: G10L15/20 G10L21/02 H04R15/00

    Abstract: A first set of signals from an array of one or more microphones, and a second signal from a reference microphone are used to calibrate a set of filter parameters such that the filter parameters minimize a difference between the second signal and a beamformer output signal that is based on the first set of signals. Once calibrated, the filter parameters are used to form a beamformer output signal that is filtered using a non-linear adaptive filter that is adapted based on portions of a signal that do not contain speech, as determined by a speech detection sensor.
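
The calibration step in this abstract can be pictured as a least-squares fit. The sketch below is only an illustration under simplifying assumptions: each microphone gets a single scalar gain rather than a full filter, the non-linear adaptive filter stage is left out, and the names `calibrate_weights` and `beamform` are hypothetical, not taken from the patent.

```python
def calibrate_weights(mic_signals, reference):
    """Gains w minimizing sum_t (reference[t] - sum_i w[i] * mic_signals[i][t]) ** 2."""
    n = len(mic_signals)
    # Normal equations A w = b with A[i][j] = <x_i, x_j> and b[i] = <x_i, reference>.
    a = [[sum(p * q for p, q in zip(mic_signals[i], mic_signals[j])) for j in range(n)]
         for i in range(n)]
    b = [sum(p * r for p, r in zip(mic_signals[i], reference)) for i in range(n)]
    # Gauss-Jordan elimination with partial pivoting on the augmented matrix.
    # Assumes the microphone signals are linearly independent (A is invertible).
    aug = [row + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def beamform(mic_signals, weights):
    """Weighted sum of the microphone signals, sample by sample."""
    return [sum(w * x[t] for w, x in zip(weights, mic_signals))
            for t in range(len(mic_signals[0]))]


# Toy data: the reference is an exact mix of the two array microphones.
mics = [[1.0, 0.0, 2.0, -1.0], [0.0, 1.0, 1.0, 3.0]]
reference = [0.5 * a + 0.25 * b for a, b in zip(*mics)]
weights = calibrate_weights(mics, reference)
output = beamform(mics, weights)
```

With this toy data the calibrated beamformer output reproduces the reference signal, because the reference was built as an exact combination of the array signals.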

    Multi-Device Capture and Spatial Browsing of Conferences
    72.
    Type: Patent application
    Legal status: Active

    Publication No.: US20100085416A1

    Publication date: 2010-04-08

    Application No.: US12245774

    Filing date: 2008-10-06

    IPC classification: H04N7/14

    CPC classification: H04N7/157 H04N7/147

    Abstract: Multi-device capture and spatial browsing of conferences is described. In one implementation, a system detects cameras and microphones, such as the webcams on participants' notebook computers, in a conference room, group meeting, or table game, and enlists an ad-hoc array of available devices to capture each participant and the spatial relationships between participants. A video stream composited from the array is browsable by a user to navigate a 3-dimensional representation of the meeting. Each participant may be represented by a video pane, a foreground object, or a 3-D geometric model of the participant's face or body displayed in spatial relation to the other participants in a 3-dimensional arrangement analogous to the spatial arrangement of the meeting. The system may automatically re-orient the 3-dimensional representation as needed to best show the currently interesting event such as current speaker or may extend navigation controls to a user for manually viewing selected participants or nuanced interactions between participants.

    VIRTUAL SOUND SOURCE POSITIONING
    73.
    Type: Patent application
    Legal status: Active

    Publication No.: US20090310802A1

    Publication date: 2009-12-17

    Application No.: US12140283

    Filing date: 2008-06-17

    IPC classification: H04R5/02

    CPC classification: H04S7/302 H04S2400/11

    Abstract: Systems and methods for determining a virtual sound source position by determining an output for loudspeakers based on the position of the loudspeakers in relation to a listener. The output of respective loudspeakers is generated using aural cues to give the listener knowledge of the virtual position of the virtual sound source. Both a gain in intensity and a delay are simulated.
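
The abstract names two aural cues, a gain and a delay, without giving formulas. The sketch below shows one plausible way to derive them and should not be read as the patented method: it assumes an inverse-distance gain law, a delay equal to the extra travel time at roughly 343 m/s, and distances measured from the virtual source to each loudspeaker. The function name `speaker_cues` is hypothetical.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # approximate speed of sound in air (assumption)


def speaker_cues(source_xy, speaker_positions):
    """Return one (gain, delay_seconds) pair per loudspeaker."""
    distances = [math.dist(source_xy, pos) for pos in speaker_positions]
    nearest = min(distances)
    cues = []
    for d in distances:
        # Inverse-distance gain, normalized so the nearest speaker plays at 1.0.
        gain = nearest / d if d > 0 else 1.0
        # Extra travel time relative to the nearest speaker.
        delay = (d - nearest) / SPEED_OF_SOUND_M_S
        cues.append((gain, delay))
    return cues


# Two loudspeakers in front of the listener; the virtual source sits nearer the right one.
cues = speaker_cues((1.0, 0.0), [(-1.0, 1.0), (1.0, 1.0)])
```

In this example the right loudspeaker, being closer to the virtual source, plays at full gain with no delay, while the left one is attenuated and delayed.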

    Automatic detection of panoramic camera position and orientation table parameters
    74.
    Type: Granted patent
    Legal status: Active

    Publication No.: US07630571B2

    Publication date: 2009-12-08

    Application No.: US11227046

    Filing date: 2005-09-15

    IPC classification: G06K9/40 G06K9/48

    Abstract: A panoramic camera is configured to automatically determine parameters of a table upon which the camera is situated as well as positional information of the camera relative to the table. In an initialization stage, table edges are detected to create an edge map. A Hough transformation-like symmetry voting operation is performed to clean up the edge map and to determine camera offset, camera orientation and camera tilt. The table is then fit to a table model to determine table parameters. In an operational stage, table edges are detected to create an edge map and the table model is fit to the edge map. The output can then be used for further panoramic image processing such as head size normalization, zooming, compensation for camera movement, etc.
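
The symmetry voting is only named in the abstract, so the following is a much-simplified analogue rather than the patented operation. It assumes the edge map is a list of 2-D points and estimates only a center of point symmetry, which could stand in for the camera offset; orientation and tilt are not handled. The name `vote_symmetry_center` is hypothetical.

```python
from collections import Counter
from itertools import combinations


def vote_symmetry_center(edge_points):
    """Each pair of edge points votes for its midpoint; the fullest bin wins."""
    votes = Counter()
    for (x1, y1), (x2, y2) in combinations(edge_points, 2):
        # Accumulate doubled midpoints so integer inputs stay exact.
        votes[(x1 + x2, y1 + y2)] += 1
    (sx, sy), count = votes.most_common(1)[0]
    return (sx / 2, sy / 2), count


# Corners and edge midpoints of a 4 x 2 rectangle centered at (3, 2).
table_edges = [(1, 1), (5, 1), (1, 3), (5, 3), (3, 1), (3, 3), (1, 2), (5, 2)]
center, support = vote_symmetry_center(table_edges)
```

As in a Hough transform, points that agree with the dominant hypothesis pile up in one accumulator bin, while spurious edge points spread their votes thinly, which is what makes the voting usable for cleaning up an edge map.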

    Method and system for video clip compression
    75.
    Type: Granted patent
    Legal status: Active

    Publication No.: US07612832B2

    Publication date: 2009-11-03

    Application No.: US11092389

    Filing date: 2005-03-29

    IPC classification: H04N9/64 H04N7/12

    Abstract: In a method for compressing a video clip containing audio content and image content, an image and/or an audio portion of individual video frames of the video clip are analyzed. Next, frame scores are calculated for the video frames. Each frame score is based on at least one image attribute of the image of the video frame, and/or an audio attribute of the audio portion of the video frame. Next, key frames are identified that have a frame score that exceeds a threshold frame score. Finally, a compressed video clip is formed in which the images of non-key frames are removed. A system for implementing the method is also disclosed.
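
The steps in this abstract (score each frame, pick key frames above a threshold, drop the images of the rest) can be sketched as follows. The frame layout, the equal-weight scoring formula and the names `frame_score` and `compress_clip` are assumptions made for illustration; the abstract does not specify them. Keeping the audio of every frame is also an assumption, based on the abstract mentioning only images being removed.

```python
def frame_score(image_attribute, audio_attribute, image_weight=0.5, audio_weight=0.5):
    """Hypothetical score: a weighted sum of one image and one audio attribute."""
    return image_weight * image_attribute + audio_weight * audio_attribute


def compress_clip(frames, threshold):
    """frames: list of dicts with 'image', 'audio' and 'score' keys (assumed layout)."""
    compressed = []
    for frame in frames:
        is_key = frame["score"] > threshold
        compressed.append({
            "image": frame["image"] if is_key else None,  # non-key frames lose their image
            "audio": frame["audio"],                       # audio is kept for every frame
            "score": frame["score"],
        })
    return compressed


frames = [
    {"image": "img0", "audio": "aud0", "score": frame_score(0.9, 0.7)},
    {"image": "img1", "audio": "aud1", "score": frame_score(0.2, 0.1)},
    {"image": "img2", "audio": "aud2", "score": frame_score(0.4, 0.9)},
]
clip = compress_clip(frames, threshold=0.5)
```

Here the first and third frames score above the threshold and keep their images, while the second frame keeps only its audio.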

    ADAPTING A PARAMETERIZED CLASSIFIER TO AN ENVIRONMENT
    76.
    Type: Patent application
    Legal status: Under examination (published)

    Publication No.: US20090263010A1

    Publication date: 2009-10-22

    Application No.: US12105275

    Filing date: 2008-04-18

    IPC classification: G06F15/18 G06K9/62

    Abstract: A classifier is trained on a first set of examples, and the trained classifier is adapted to perform on a second set of examples. The classifier implements a parameterized labeling function. Initial training of the classifier optimizes the labeling function's parameters to minimize a cost function. The classifier and its parameters are provided to an environment in which it will operate, along with an approximation function that approximates the cost function using a compact representation of the first set of examples in place of the actual first set. A second set of examples is collected, and the parameters are modified to minimize a combined cost of labeling the first and second sets of examples. The part of the combined cost that represents the cost of the modified parameters applied to the first set is calculated using the approximation function.
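
To make the idea of a compact representation concrete, the sketch below uses a deliberately simple stand-in: a one-parameter least-squares model instead of a real classifier. For that model the cost on the first set depends only on two sums, so those sums can be shipped in place of the examples. All names (`summarize`, `fit`, `adapt`) and the choice of model are assumptions for illustration; for a general classifier the summary would only approximate the cost.

```python
def summarize(examples):
    """Compact representation of (x, y) examples: (sum of x*x, sum of x*y)."""
    return (sum(x * x for x, _ in examples), sum(x * y for x, y in examples))


def fit(sxx, sxy):
    """Minimizer of sum (w*x - y)**2, written in terms of the summary statistics."""
    return sxy / sxx


def adapt(first_summary, second_examples):
    """Minimize the combined cost without access to the original first set."""
    sxx1, sxy1 = first_summary
    sxx2, sxy2 = summarize(second_examples)
    return fit(sxx1 + sxx2, sxy1 + sxy2)


first_set = [(1.0, 2.0), (2.0, 4.0)]     # collected before deployment
second_set = [(1.0, 3.0), (3.0, 9.0)]    # collected in the operating environment
initial_w = fit(*summarize(first_set))
adapted_w = adapt(summarize(first_set), second_set)
```

The adapted parameter matches what training on both sets together would give, even though only the two-number summary of the first set was available at adaptation time.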

    Audio-visual control system
    77.
    Type: Granted patent
    Legal status: Active

    Publication No.: US07518631B2

    Publication date: 2009-04-14

    Application No.: US11168124

    Filing date: 2005-06-28

    IPC classification: H04N7/14

    CPC classification: G10L15/26

    Abstract: A visual control system controls a controlled component. In one embodiment, the visual control system controls the controlled component based on a visual location of a user. In another embodiment, input from a visual perception device is used to provide focus control for an audio input device. In additional embodiments, the visual control system stops, starts or suppresses speech recognition or other audio functions when the direction of the sound detected by the audio input device is not coming from the user's visual location.
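
The last embodiment, suppressing speech recognition when the sound does not come from where the user is seen, can be sketched as a bearing comparison. The sketch assumes both the audio input device and the visual perception device report a bearing in degrees in a shared reference frame, and the 15-degree tolerance is an arbitrary illustrative value; neither detail comes from the abstract.

```python
def angular_difference(a_deg, b_deg):
    """Smallest absolute difference between two bearings, in degrees (0 to 180)."""
    diff = abs(a_deg - b_deg) % 360.0
    return min(diff, 360.0 - diff)


def should_recognize_speech(sound_bearing_deg, user_bearing_deg, tolerance_deg=15.0):
    """True when the detected sound comes from roughly where the user is seen."""
    return angular_difference(sound_bearing_deg, user_bearing_deg) <= tolerance_deg


# The user is seen at 5 degrees; one sound arrives from 350 degrees, another from 90.
near_user = should_recognize_speech(350.0, 5.0)
off_axis = should_recognize_speech(90.0, 5.0)
```

The wrap-around handling matters: bearings of 350 and 5 degrees are only 15 degrees apart, so the first sound is accepted while the second is suppressed.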

    Head pose tracking system
    78.
    Type: Granted patent
    Legal status: Active

    Publication No.: US07515173B2

    Publication date: 2009-04-07

    Application No.: US10154892

    Filing date: 2002-05-23

    IPC classification: H04N7/14

    CPC classification: H04N7/15 H04N7/144

    Abstract: Video images representative of a conferee's head are received and evaluated with respect to a reference model to monitor a head position of the conferee. A personalized face model of the conferee is captured to track head position of the conferee. In a stereo implementation, first and second video images representative of a first conferee taken from different views are concurrently captured. A head position of the first conferee is tracked from the first and second video images. The tracking of head-position through a personalized model-based approach can be used in a number of applications such as human-computer interaction and eye-gaze correction for video conferencing.

    Multispectral digital camera employing both visible light and non-visible light sensing on a single image sensor
    79.
    Type: Granted patent
    Legal status: Active

    Publication No.: US07460160B2

    Publication date: 2008-12-02

    Application No.: US10949085

    Filing date: 2004-09-24

    IPC classification: H04N3/14 H04N5/335

    Abstract: A digital camera having a single image sensor made up of an array of filtered photosites used to capture non-visible light wavelengths in addition to the standard red/green/blue (RGB) or other visible light intensity values is presented. Essentially, this is accomplished using a separate filter disposed over each photosite that exhibits a light transmission function with regard to wavelength which passes only a prescribed range of wavelengths, with some filters passing light in the visible light spectrum and others in the non-visible light spectrum. The photosites passing non-visible light wavelengths can be configured to pass light in the infrared (IR) light spectrum, which can be limited to just the near infrared (NIR) spectrum if desired, or alternately light in the ultra-violet (UV) light spectrum.
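
One way to picture a single sensor with per-photosite filters is a repeating filter tile. The layout below (a 2 x 2 tile with one red, one green, one blue and one near-infrared photosite) is purely hypothetical, since the abstract does not say how the filters are arranged; `PATTERN` and `split_mosaic` are illustrative names.

```python
# Hypothetical 2 x 2 tile: one photosite each for red, green, blue and near infrared.
PATTERN = (("R", "G"), ("NIR", "B"))


def split_mosaic(raw):
    """Group raw photosite readings (a list of rows) by the filter covering each site."""
    channels = {"R": [], "G": [], "B": [], "NIR": []}
    for row_index, row in enumerate(raw):
        for col_index, value in enumerate(row):
            channels[PATTERN[row_index % 2][col_index % 2]].append(value)
    return channels


# A 2 x 4 patch of raw sensor readings covering two tiles side by side.
raw = [
    [10, 20, 11, 21],
    [90, 30, 91, 31],
]
channels = split_mosaic(raw)
```

Each channel ends up with one sample per tile, so a full-resolution image for any band would still need interpolation between tiles, as with an ordinary color filter array.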

    Learning image enhancement
    80.
    Type: Patent application
    Legal status: Active

    Publication No.: US20080279467A1

    Publication date: 2008-11-13

    Application No.: US11801620

    Filing date: 2007-05-10

    IPC classification: G06K9/40

    Abstract: Image enhancement techniques are described to enhance an image in accordance with a set of training images. In an implementation, an image color tone map is generated for a facial region included in an image. The image color tone map may be normalized to a color tone map for a set of training images so that the image color tone map matches the map for the training images. The normalized color tone map may be applied to the image to enhance the in-question image. In further implementations, the procedure may be updated when the average color intensity in non-facial regions differs from an accumulated mean by a threshold amount.
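
The abstract does not define the color tone map or how it is normalized. As a rough illustration only, the sketch below assumes the map is summarized by the mean and standard deviation of one color channel in the facial region, and normalizes by matching those two statistics to values taken from the training images. The function names and all numbers are hypothetical.

```python
from statistics import mean, pstdev


def normalize_tone(face_values, target_mean, target_std):
    """Shift and scale one color channel so its mean and std match the training statistics."""
    src_mean = mean(face_values)
    src_std = pstdev(face_values)
    scale = target_std / src_std if src_std > 0 else 1.0
    return [(v - src_mean) * scale + target_mean for v in face_values]


def needs_update(non_face_mean, accumulated_mean, threshold):
    """Re-run the procedure when the non-facial intensity drifts past the threshold."""
    return abs(non_face_mean - accumulated_mean) > threshold


face_channel = [100.0, 110.0, 120.0]   # one channel of the facial region
normalized = normalize_tone(face_channel, target_mean=140.0, target_std=12.0)
lighting_changed = needs_update(non_face_mean=95.0, accumulated_mean=80.0, threshold=10.0)
```

The second function mirrors the update rule in the abstract: the mapping is recomputed only when the background intensity departs from its running mean by more than the threshold, which avoids re-estimating on every frame.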
