Abstract:
A system captures a first hemispherical image and a second hemispherical image, each hemispherical image including an overlap portion, the overlap portions capturing a same field of view, the two hemispherical images collectively comprising a spherical FOV and separated along a longitudinal plane. The system maps a modified first hemispherical image to a first portion of the 2D projection of a cubic image, the modified first hemispherical image including a non-overlap portion of the first hemispherical image, and maps a modified second hemispherical image to a second portion of the 2D projection of the cubic image, the modified second hemispherical image also including a non-overlap portion. The system maps the overlap portions of the first hemispherical image and the second hemispherical image to the 2D projection of the cubic image, and encodes the 2D projection of the cubic image to generate an encoded image representative of the spherical FOV.
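As a rough illustration of how a viewing direction on the sphere can be assigned to a face of a cubic image, the sketch below picks the face whose axis has the largest absolute component. The face labels, the (u, v) convention, and the assumption that the two lenses face +x and -x are all hypothetical choices made for this example and are not taken from the abstract.

```python
def direction_to_cube_face(x, y, z):
    """Map a 3D viewing direction to a cube face and in-face coordinates.

    Face labels and the (u, v) axis convention are illustrative assumptions.
    Returns (face, u, v) with u and v in the range [-1, 1].
    """
    ax, ay, az = abs(x), abs(y), abs(z)
    major = max(ax, ay, az)
    if major == 0:
        raise ValueError("direction must be non-zero")
    if major == ax:
        face = "+x" if x > 0 else "-x"
        u, v = y / ax, z / ax
    elif major == ay:
        face = "+y" if y > 0 else "-y"
        u, v = x / ay, z / ay
    else:
        face = "+z" if z > 0 else "-z"
        u, v = x / az, y / az
    return face, u, v


def hemisphere_of(x):
    """Assume the two lenses face +x and -x, split by the plane x = 0."""
    return "first" if x >= 0 else "second"
```

Under these assumptions the +x and -x faces each fall entirely within one hemisphere, while the four remaining faces straddle the plane x = 0, which is where the overlap portions would land.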
Abstract:
A pair of cameras having an overlapping field of view is aligned based on images captured by image sensors of the pair of cameras. A pixel shift is identified between the images. Based on the identified pixel shift, a calibration is applied to one or both of the pair of cameras. To determine the pixel shift, the camera applies correlation methods including edge matching. Calibrating the pair of cameras may include adjusting a read window on an image sensor. The pixel shift can also be used to determine a time lag, which can be used to synchronize subsequent image captures.
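The pixel-shift step can be sketched as a one-dimensional search: compute a crude edge profile for one row from each image, then pick the integer shift that maximizes their mean product. This is a minimal illustration and not the method of the abstract; `edge_profile`, `estimate_shift`, and `adjust_read_window` are hypothetical names, and a real system would work in two dimensions.

```python
def edge_profile(row):
    """Absolute horizontal gradient, used as a crude edge-strength signal."""
    return [abs(row[i + 1] - row[i]) for i in range(len(row) - 1)]


def estimate_shift(row_a, row_b, max_shift):
    """Return the integer shift s for which row_b[i + s] best matches row_a[i]."""
    ea, eb = edge_profile(row_a), edge_profile(row_b)
    best_shift, best_score = 0, float("-inf")
    for s in range(-max_shift, max_shift + 1):
        total, count = 0.0, 0
        for i, a in enumerate(ea):
            j = i + s
            if 0 <= j < len(eb):
                total += a * eb[j]
                count += 1
        if count:
            score = total / count  # mean product over the overlapping samples
            if score > best_score:
                best_score, best_shift = score, s
    return best_shift


def adjust_read_window(window_start, shift):
    """Move the second sensor's read window so its content lines up with the first."""
    return window_start + shift
```

For example, if the second row shows the same edges two pixels further right, `estimate_shift` returns 2, and moving the second sensor's read window start by that amount brings the rows into alignment.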
Abstract:
Disclosed are a system and method for generating a model of the geometric relationships between various audio sources recorded by a multi-camera system. A spatial audio scene module associates source signals of audio sources, extracted from recorded audio, with visual objects identified in videos recorded by one or more cameras. This association may rely on positions of the audio sources estimated from the relative signal gains and delays of the source signal received at each microphone. The estimated positions of audio sources are tracked indirectly by tracking the associated visual objects with computer vision. A virtual microphone module may receive a position for a virtual microphone and synthesize a signal corresponding to the virtual microphone position based on the estimated positions of the audio sources.
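One simple way to picture the virtual microphone is to delay and attenuate each source signal according to its distance from the requested position and then sum the results. The sketch below assumes free-field propagation, an inverse-distance gain, and whole-sample delays; these modeling choices and the name `synthesize_virtual_mic` are assumptions for illustration, not details from the abstract.

```python
import math

SPEED_OF_SOUND = 343.0  # metres per second, approximate value in air


def synthesize_virtual_mic(sources, mic_pos, sample_rate, n_samples):
    """Mix source signals as heard at mic_pos.

    sources: list of (position, samples) pairs, positions as (x, y, z) in metres.
    Each source is delayed by distance / speed of sound and scaled by 1 / distance.
    """
    out = [0.0] * n_samples
    for pos, signal in sources:
        dist = max(math.dist(pos, mic_pos), 1e-3)  # avoid division by zero
        delay = round(dist / SPEED_OF_SOUND * sample_rate)
        gain = 1.0 / dist
        for n, x in enumerate(signal):
            m = n + delay
            if m < n_samples:
                out[m] += gain * x
    return out
```

With a source 3.43 m away and a 1000 Hz sample rate, an impulse arrives ten samples late and scaled by roughly 0.29.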
Abstract:
Use of separate range tone mapping for combined images can help minimize loss of image information in scenes that have drastically different luminance values, i.e., scenes that have both bright and shadowed regions. Separate range tone mapping is particularly useful for combined images, such as those from spherical camera systems, which may have a higher probability of including luminance variability. The resulting increased bit depth of separate range tone mapping can make the transition between different images that make up a combined image more subtle. Each of a plurality of images that make up a combined image can use a different tone map that is optimized for the particular image data of the image. Multiple tone maps that are applied to overlapping regions of the plurality of images can subsequently be combined to expand the bit depth of the overlapping regions.
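To see why a per-image tone map can preserve detail that a shared one loses, the sketch below builds a linear tone map from each image's own luminance range. The linear stretch and the name `build_tone_map` are assumptions chosen for brevity; the abstract does not specify the tone curve, and the combination of tone maps in overlapping regions is not modeled here.

```python
def build_tone_map(luminances, out_max=255):
    """Return a function mapping this image's luminance range onto 0..out_max."""
    lo, hi = min(luminances), max(luminances)
    span = (hi - lo) or 1.0  # guard against a flat image

    def tone_map(value):
        return round((value - lo) / span * out_max)

    return tone_map


# Hypothetical luminance samples from a bright image and a shadowed image.
bright = [800.0, 900.0, 1000.0]
shadow = [1.0, 2.0, 5.0]

shared = build_tone_map(bright + shadow)   # one map for the combined image
separate = build_tone_map(shadow)          # a map fitted to the shadowed image

shared_shadow = [shared(v) for v in shadow]      # shadow detail is crushed
separate_shadow = [separate(v) for v in shadow]  # shadow detail spans the range
```

In this toy case the shared map sends the three shadow samples to nearly the same output code, while the separate map spreads them across the full output range.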
Abstract:
Multiple cameras are arranged in an array at a pitch, roll, and yaw that allow the cameras to have adjacent fields of view such that each camera is pointed inward relative to the array. The read window of an image sensor of each camera in a multi-camera array can be adjusted to minimize the overlap between adjacent fields of view, to maximize the correlation within the overlapping portions of the fields of view, and to correct for manufacturing and assembly tolerances. Images from cameras in a multi-camera array with adjacent fields of view can be manipulated using low-power warping and cropping techniques, and can be taped together to form a final image.
Abstract:
Apparatus and methods for the stitch zone calculation of a generated projection of a spherical image. In one embodiment, a computing device is disclosed which includes logic configured to: obtain a plurality of images; map the plurality of images onto a spherical image; re-orient the spherical image in accordance with a desired stitch line and a desired projection for the desired stitch line; and map the spherical image to the desired projection having the desired stitch line. In a variant, the desired stitch line is mapped onto an optimal stitch zone, the optimal stitch zone characterized as a set of points that defines a single line on the desired projection in which the set of points along the desired projection lie closest to the spherical image in a mean square sense.
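Re-orienting the spherical image amounts to rotating every point on the sphere before projecting it. As a minimal sketch, the function below rotates a (longitude, latitude) pair about the x axis; the choice of axis, the angle convention, and the name `reorient` are assumptions for illustration, and the optimal stitch zone computation is not shown.

```python
import math


def reorient(lon, lat, angle):
    """Rotate a point on the unit sphere about the x axis by `angle` radians.

    lon and lat are in radians; returns the rotated (lon, lat).
    """
    # Spherical to Cartesian.
    x = math.cos(lat) * math.cos(lon)
    y = math.cos(lat) * math.sin(lon)
    z = math.sin(lat)
    # Rotation about the x axis.
    y2 = y * math.cos(angle) - z * math.sin(angle)
    z2 = y * math.sin(angle) + z * math.cos(angle)
    # Clamp against rounding error before taking the arcsine.
    z2 = max(-1.0, min(1.0, z2))
    return math.atan2(y2, x), math.asin(z2)
```

With a quarter turn (`angle = math.pi / 2`), the great circle through the poles at longitudes 0 and π is carried onto the equator, so a stitch line that originally ran along that meridian circle would lie along the equator of the re-oriented sphere.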
Abstract:
Hyper-hemispherical images may be combined to generate a rectangular projection of a spherical image having an equatorial stitch line along a line of lowest distortion in the two images. First and second circular images are received representing respective hyper-hemispherical fields of view. A video processing device may project each circular image to a respective rectangular image by mapping an outer edge of the circular image to a first edge of the rectangular image and mapping a center point of the circular image to a second edge of the rectangular image. The rectangular images may be stitched together along the edges corresponding to the outer edge of the original circular image.
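The circular-to-rectangular projection can be sketched as a polar unwrap in which the first output row samples the outer edge of the circle and the last output row samples its center. The linear radius mapping, nearest-neighbor sampling, and the name `unwrap_circular` are simplifying assumptions; a real lens would need its own distortion model.

```python
import math


def unwrap_circular(image, out_h, out_w):
    """Unwrap a square image holding a circular picture into out_h x out_w.

    Row 0 samples the outer edge of the circle, row out_h - 1 samples the center.
    Requires out_h >= 2.
    """
    size = len(image)
    c = (size - 1) / 2.0  # center coordinate, also used as the circle radius
    out = []
    for row in range(out_h):
        r = c * (1.0 - row / (out_h - 1))
        line = []
        for col in range(out_w):
            theta = 2.0 * math.pi * col / out_w
            x = min(size - 1, max(0, round(c + r * math.cos(theta))))
            y = min(size - 1, max(0, round(c + r * math.sin(theta))))
            line.append(image[y][x])
        out.append(line)
    return out
```

Two images unwrapped this way each have their outer edge along one rectangle edge, so they can be joined along that edge to form the equatorial stitch line.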
Abstract:
A spherical content capture system captures spherical video and audio content. In one embodiment, captured metadata or video/audio processing is used to identify content relevant to a particular user based on time and location information. The platform can then generate an output video from one or more shared spherical content files relevant to the user. The output video may include a non-spherical reduced field of view such as those commonly associated with conventional camera systems. Particularly, relevant sub-frames having a reduced field of view may be extracted from each frame of spherical video to generate an output video that tracks a particular individual or object of interest. For each sub-frame, a corresponding portion of an audio track is generated that includes a directional audio signal having a directionality based on the selected sub-frame.
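Extracting a reduced field of view from a spherical frame can be approximated by cropping a window of columns around the direction of interest, wrapping across the left and right edges. The sketch assumes the frame is stored as an equirectangular grid of rows; a plain crop is a simplification of a true perspective reprojection, and `extract_subframe` is a hypothetical name.

```python
def extract_subframe(frame, center_col, width):
    """Crop `width` columns centered on center_col, wrapping around horizontally.

    frame is a list of rows; every row has the same number of columns.
    """
    n_cols = len(frame[0])
    start = center_col - width // 2
    return [[row[(start + k) % n_cols] for k in range(width)] for row in frame]
```

Moving `center_col` from frame to frame is what would let the output video follow a tracked individual or object.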