LOCAL BI-GRAM MODEL FOR OBJECT RECOGNITION
    33.
    Invention application
    LOCAL BI-GRAM MODEL FOR OBJECT RECOGNITION (in force)

    Publication number: US20080240551A1

    Publication date: 2008-10-02

    Application number: US11694938

    Filing date: 2007-03-30

    IPC classification: G06K9/62

    CPC classification: G06K9/468 G06K9/6296

    Abstract: A local bi-gram model object recognition system and method for constructing a local bi-gram model and using the model to recognize objects in a query image. In a learning phase, a local bi-gram model is constructed that represents objects found in a set of training images. The local bi-gram model is a local spatial model that models only the relationship between neighboring features, without any knowledge of their global context. Object recognition is performed by finding a set of matching primitives in the query image. A tree structure of matching primitives is generated, and a search is performed to find a tree structure of matching primitives that obeys the local bi-gram model. The local bi-gram model can be found using unsupervised learning. The system and method can also be used, without supervision, to recognize objects undergoing non-rigid transformations, for both object instance recognition and category recognition.

    Method and apparatus for computer input using six degrees of freedom
    34.
    Granted patent
    Method and apparatus for computer input using six degrees of freedom (in force)

    Publication number: US06844871B1

    Publication date: 2005-01-18

    Application number: US09563088

    Filing date: 2000-04-28

    Abstract: A mouse is provided that uses a camera as its input sensor. A real-time vision algorithm determines the six-degree-of-freedom mouse posture, consisting of 2D motion, tilt in the forward/back and left/right axes, rotation of the mouse about its vertical axis, and some limited height sensing. Thus, a familiar 2D device can be extended for three-dimensional manipulation, while remaining suitable for standard 2D Graphical User Interface tasks. The invention includes techniques for mouse functionality, 3D manipulation, navigating large 2D spaces, and using the camera for lightweight scanning tasks.

    Stereo reconstruction from multiperspective panoramas
    35.
    Granted patent
    Stereo reconstruction from multiperspective panoramas (in force)

    Publication number: US06639596B1

    Publication date: 2003-10-28

    Application number: US09399426

    Filing date: 1999-09-20

    IPC classification: G06T15/00

    Abstract: A system and process for computing a 3D reconstruction of a scene using multiperspective panoramas. The reconstruction can be generated using a cylindrical sweeping approach or, under some conditions, traditional stereo matching algorithms. The cylindrical sweeping process involves projecting each pixel of the multiperspective panoramas onto each of a series of cylindrical surfaces of progressively increasing radii. For each pixel location on each cylindrical surface, a fitness metric is computed for all the pixels projected thereon to provide an indication of how closely a prescribed characteristic of the projected pixels matches. Then, for each respective group of corresponding pixel locations of the cylindrical surfaces, it is determined which location has a fitness metric that indicates the prescribed characteristic of the projected pixels matches more closely than the rest. For each of these winning pixel locations, its panoramic coordinates are designated as the position of the portion of the scene depicted by the pixels projected to that location. Additionally, in some cases a sufficiently horizontal epipolar geometry exists between multiperspective panoramas such that traditional stereo matching algorithms can be employed for the reconstruction. A symmetric pair of multiperspective panoramas produces the horizontal epipolar geometry. In addition, this geometry is obtained if the distance from the center of rotation to the viewpoints used to capture the images employed to construct the panorama is small in comparison to the distance from the center of rotation to the nearest scene point depicted in the images, or if an off-axis angle is kept small.
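
The winner selection in the cylindrical sweep can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes the projection onto the cylinders has already been done, and it uses variance across panoramas as the fitness metric, which is an assumption (the abstract only speaks of "a fitness metric"). The function name `sweep_winners` and the array layout are hypothetical.

```python
import numpy as np

def sweep_winners(projected, radii):
    """Pick, per pixel location, the cylinder whose projected pixels agree best.

    projected: array of shape (R, K, H, W) holding intensities from K panoramas
               projected onto R cylinders (hypothetical layout).
    radii:     sequence of R cylinder radii, in increasing order.
    Returns (radius_map, fitness) with shapes (H, W) and (R, H, W).
    """
    projected = np.asarray(projected, dtype=float)
    # Assumed fitness metric: variance across panoramas (lower means closer match).
    fitness = projected.var(axis=1)
    # Winner-take-all over the group of corresponding locations on all cylinders.
    best = fitness.argmin(axis=0)
    return np.asarray(radii, dtype=float)[best], fitness
```

The returned `radius_map` plays the role of the panoramic depth coordinate assigned to each winning location; a real system would likely aggregate the metric over a window before taking the minimum.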

    System and process for generating 3D video textures using video-based rendering techniques
    36.
    Granted patent
    System and process for generating 3D video textures using video-based rendering techniques (in force)

    Publication number: US06611268B1

    Publication date: 2003-08-26

    Application number: US09643635

    Filing date: 2000-08-22

    IPC classification: G06T15/70

    Abstract: A system and process for generating a 3D video animation of an object, referred to as a 3D Video Texture, is presented. The 3D Video Texture is constructed by first simultaneously videotaping an object from two or more cameras positioned at different locations. Video from one of the cameras is used to extract, analyze and synthesize a video sprite of the object of interest. In addition, the first contemporaneous frames captured by at least two of the cameras are used to estimate a 3D depth map of the scene. The background of the scene contained within the depth map is then masked out using a clear shot of the scene background taken before filming of the object began, leaving just the object. To generate each new frame in the 3D video animation, the extracted region making up a “frame” of the video sprite is mapped onto the previously generated 3D surface. The resulting image is rendered from a novel viewpoint, and then combined with a flat image of the background which has been warped to the correct location. In cases where it is anticipated that the subject could move frequently, the foregoing part of the procedure associated with estimating a 3D depth map of the scene and extracting the 3D surface representation of the object is performed for each subsequent set of contemporaneous frames captured by at least two of the cameras.

    System and process for viewing panoramic video
    37.
    Granted patent
    System and process for viewing panoramic video (in force)

    Publication number: US06559846B1

    Publication date: 2003-05-06

    Application number: US09611987

    Filing date: 2000-07-07

    IPC classification: G06T17/00

    Abstract: The primary components of the panoramic video viewer include a decoder module. The purpose of the decoder module is to input incoming encoded panoramic video data and to output a decoded version thereof. The incoming data may be provided over a network and originate from a server, or it may simply be read from a storage medium, such as a hard drive, CD or DVD. Once decoded, the data associated with each video frame is preferably stored in a storage module and made available to a 3D rendering module. The 3D rendering module is essentially a texture mapper that takes the frame data and maps the desired views onto a prescribed environment model. The output of the 3D rendering module is provided to a display module where the panoramic video is viewed by a user of the system. Typically, the user will be viewing just a portion of the scene depicted in the panoramic video at any one time, and will be able to control what portion is viewed. Preferably, the panoramic video viewer will allow the user to pan through the scene to the left, right, up or down. In addition, the user would preferably be able to zoom in or out within the portion of the scene being viewed. The user could also be allowed to select what video should be played, choose when to play or pause the video, and specify what temporal part of the video should be played.

    Stereo reconstruction employing a layered approach
    38.
    Granted patent
    Stereo reconstruction employing a layered approach (lapsed)

    Publication number: US06348918B1

    Publication date: 2002-02-19

    Application number: US09045519

    Filing date: 1998-03-20

    IPC classification: G06T15/00

    Abstract: A system and method for extracting structure from stereo that represents the scene as a collection of planar layers. Each layer optimally has an explicit 3D plane equation, a colored image with per-pixel opacity, and a per-pixel depth value relative to the plane. Initial estimates of the layers are recovered using techniques from parametric motion estimation. The combination of a global model (the plane) with a local correction to it (the per-pixel relative depth value) imposes enough local consistency to allow the recovery of shape in both textured and untextured regions.
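
The three components of a layer named in the abstract can be pictured as a small data structure. The sketch below is illustrative only: the class name `Layer`, its field names, and the plane parameterization `z = a*x + b*y + c` in pixel coordinates are assumptions made for the example, not the representation used in the patent.

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class Layer:
    plane: tuple        # (a, b, c), assumed form z = a*x + b*y + c (the global model)
    rgba: np.ndarray    # (H, W, 4) colored image; last channel is per-pixel opacity
    offset: np.ndarray  # (H, W) per-pixel depth relative to the plane (local correction)

    def depth(self):
        """Absolute depth per pixel: plane depth plus the per-pixel offset."""
        h, w = self.offset.shape
        ys, xs = np.mgrid[0:h, 0:w]
        a, b, c = self.plane
        return a * xs + b * ys + c + self.offset
```

Keeping the plane and the offset separate is what lets the plane constrain untextured regions while the offset captures small out-of-plane detail.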

    Stereo reconstruction employing a layered approach and layer refinement techniques
    39.
    Granted patent
    Stereo reconstruction employing a layered approach and layer refinement techniques (lapsed)

    Publication number: US06320978B1

    Publication date: 2001-11-20

    Application number: US09045503

    Filing date: 1998-03-20

    IPC classification: G06K9/00

    Abstract: A system and method for extracting structure from stereo that represents the scene as a collection of planar layers. Each layer optimally has an explicit 3D plane equation, a colored image with per-pixel opacity, and a per-pixel depth value relative to the plane. Initial estimates of the layers are made and then refined using a re-synthesis step which takes into account both occlusions and mixed pixels. Reasoning about these effects allows the recovery of depth and color information with high accuracy, even in partially occluded regions. Moreover, the combination of a global model (the plane) with a local correction to it (the per-pixel relative depth value) imposes enough local consistency to allow the recovery of shape in both textured and untextured regions.
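
One way to picture a re-synthesis step that accounts for occlusions and mixed pixels is standard back-to-front "over" compositing of the layers, followed by a comparison with the observed image. The sketch below is an assumption-laden illustration, not the patented refinement procedure: it assumes the layers are already warped into the target view, ordered back to front, and stored with non-premultiplied color. The function names are hypothetical.

```python
import numpy as np

def resynthesize(layers):
    """Composite RGBA layers back to front with the 'over' operator.

    layers: list of (H, W, 4) float arrays in [0, 1], back-most layer first,
            assumed already warped into the view being re-synthesized.
    """
    h, w, _ = layers[0].shape
    out = np.zeros((h, w, 3))
    for rgba in layers:
        alpha = rgba[..., 3:4]
        # Opaque pixels occlude what lies behind; fractional alpha models mixed pixels.
        out = alpha * rgba[..., :3] + (1.0 - alpha) * out
    return out

def resynthesis_error(layers, observed):
    """Mean squared difference between the re-synthesized and observed images."""
    return float(np.mean((resynthesize(layers) - observed) ** 2))
```

A refinement loop would adjust layer colors, opacities and depths to reduce this error across all input views.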

    Inverse texture mapping using weighted pyramid blending and view-dependent weight maps
    40.
    Granted patent
    Inverse texture mapping using weighted pyramid blending and view-dependent weight maps (in force)

    Publication number: US06271847B1

    Publication date: 2001-08-07

    Application number: US09160898

    Filing date: 1998-09-25

    IPC classification: G06T1/00

    CPC classification: G06T15/04

    Abstract: A system and method for creating weight maps capable of indicating how much each pixel in an image should contribute to a blended image. One such map is a view-dependent weight map created by inputting an image that has been characterized as a collection of regions. A 2D perspective transform is computed for each region that is to be part of the weight map. The transforms are used to warp the associated regions to prescribed coordinates to create a warped image. Once the warped image is created, a Jacobian matrix is computed for each pixel. The determinant of each Jacobian matrix is then computed to establish a weight factor for that pixel. The weight map for the inputted image is created using these computed determinants. Another advantageous weight map is a combination weight map. The process for creating this type of weight map is identical to that for the view-dependent map up to the point where the warped image has been created. After that, a first weight factor is computed for each pixel of the warped image using a first weight mapping process. At least one additional weight factor is also computed for each pixel using one or more additional weight mapping processes. The weight factors computed for each pixel are then combined to create a combined weight factor, and the weight map is formed from these factors. Preferably, one of the weight mapping processes used to create the combination weight map is the aforementioned view-dependent weight mapping process.
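
The per-pixel Jacobian determinant of a 2D perspective transform can be computed in closed form. The sketch below assumes the transform is given as a 3x3 homography `H` mapping source pixel (x, y) to warped coordinates (u, v); the function name is hypothetical, and how the determinant is turned into a final weight factor is not stated in the abstract, so the raw determinant is returned.

```python
import numpy as np

def jacobian_determinant_map(H, height, width):
    """Determinant of d(u, v)/d(x, y) at every pixel for a 3x3 homography H."""
    H = np.asarray(H, dtype=float)
    ys, xs = np.mgrid[0:height, 0:width].astype(float)
    # Homogeneous divisor and warped coordinates.
    w = H[2, 0] * xs + H[2, 1] * ys + H[2, 2]
    u = (H[0, 0] * xs + H[0, 1] * ys + H[0, 2]) / w
    v = (H[1, 0] * xs + H[1, 1] * ys + H[1, 2]) / w
    # Partial derivatives obtained from the quotient rule.
    du_dx = (H[0, 0] - u * H[2, 0]) / w
    du_dy = (H[0, 1] - u * H[2, 1]) / w
    dv_dx = (H[1, 0] - v * H[2, 0]) / w
    dv_dy = (H[1, 1] - v * H[2, 1]) / w
    return du_dx * dv_dy - du_dy * dv_dx
```

The determinant measures how much a source pixel is stretched or shrunk by the warp, which is why it is a natural ingredient for a view-dependent weight. For a homography it equals det(H) / w^3, which gives a convenient cross-check.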
