Abstract:
A system and method improve camera calibration. The method includes receiving a camera image, a planar template pattern, a 3D geometry of a surface on which the planar template pattern is embedded, and a set of parameter values. The method includes rendering the planar template pattern into a camera perspective based on the parameter values to generate a warped template image. The method includes generating an error image including at least one non-zero difference between the camera image and the warped template image. The method includes adjusting the parameter values to reduce an error between the camera image and the warped template image.
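The render, compare, and adjust loop described above can be illustrated with a deliberately reduced toy. This sketch assumes a 1D "image" and only two parameters (a shift and a scale) standing in for the camera parameters; the template function, the pattern-search optimizer, and all names are hypothetical choices for illustration, not the method claimed in the abstract.

```python
import math

def template(x):
    """Hypothetical 1D template pattern: one smooth bump centred at 0.5."""
    return math.exp(-((x - 0.5) ** 2) / 0.02)

def render(params, n=200):
    """Warp the template into n 'camera' pixels; params = [shift, scale]."""
    shift, scale = params
    return [template((i / (n - 1) - shift) / scale) for i in range(n)]

def error_image(camera, warped):
    """Per-pixel difference between the camera image and the warped template."""
    return [c - w for c, w in zip(camera, warped)]

def total_error(camera, params):
    """Sum of squared entries of the error image."""
    return sum(e * e for e in error_image(camera, render(params)))

def calibrate(camera, params, step=0.05, min_step=1e-4):
    """Pattern search: keep any single-parameter change that lowers the error."""
    params = list(params)
    best = total_error(camera, params)
    while step >= min_step:
        improved = False
        for k in range(len(params)):
            for sign in (1.0, -1.0):
                trial = list(params)
                trial[k] += sign * step
                if trial[1] <= 0.0:
                    continue  # the scale must stay positive
                err = total_error(camera, trial)
                if err < best:
                    params, best, improved = trial, err, True
        if not improved:
            step /= 2.0  # refine the search once no move helps
    return params, best

# Synthetic "observed" image produced with assumed true parameters.
camera_image = render([0.03, 1.1])
initial = [0.0, 1.0]
initial_error = total_error(camera_image, initial)
estimate, final_error = calibrate(camera_image, initial)
```

A real system would render the planar template through a full camera projection onto the known 3D surface and would likely use a gradient-based optimizer; the toy only shows how the error image drives the parameter update.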
Abstract:
Approaches are described for assigning roles to agents in a group of agents engaging in an activity. An assignment analysis system receives a first set of detections, where each detection in the first set of detections comprises a physical location. The assignment analysis system defines an exemplar formation comprising an arrangement of each role in a set of roles. The assignment analysis system calculates a first cost function between at least one detection in the first set of detections and at least one role in the set of roles. The assignment analysis system generates a first set of permutations based on the first cost function. The assignment analysis system assigns a first role in the set of roles to a first detection in the first set of detections based on the first set of permutations.
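As a rough illustration of the cost-and-permutation idea, the sketch below assumes the cost function is the Euclidean distance between a detection and a role's position in the exemplar formation, and it searches all permutations by brute force. The role names, coordinates, and the function `assign_roles` are hypothetical; the abstract does not specify the cost function or how permutations are generated.

```python
from itertools import permutations
from math import dist

def assign_roles(detections, formation):
    """Return ({role: detection index}, total cost) for the cheapest permutation."""
    roles = list(formation)
    best_cost, best_perm = float("inf"), None
    # Try every way of giving each role a distinct detection.
    for perm in permutations(range(len(detections)), len(roles)):
        cost = sum(dist(detections[i], formation[r]) for i, r in zip(perm, roles))
        if cost < best_cost:
            best_cost, best_perm = cost, perm
    return dict(zip(roles, best_perm)), best_cost

# Hypothetical detections (physical locations) and exemplar formation.
detections = [(9.0, 1.0), (0.5, 0.2), (5.0, 4.8)]
formation = {
    "defender": (0.0, 0.0),
    "midfielder": (5.0, 5.0),
    "forward": (10.0, 0.0),
}
assignment, cost = assign_roles(detections, formation)
```

Brute force is only practical for small groups, since the number of permutations grows factorially; a larger group would call for an assignment solver such as the Hungarian algorithm.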
Abstract:
A system and method generate a broadcast image. The method includes receiving a first image captured by a first camera having a first configuration incorporating a first center of projection. The method includes determining a first mapping from a first image plane of the first camera onto a sphere. The method includes determining a second mapping from the sphere to a second image plane of a second virtual camera having a second configuration incorporating the first center of projection of the first configuration. The method includes generating a second image based upon the first image and a concatenation of the first and second mappings.
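The two mappings can be sketched for a single pixel as follows. This is a minimal sketch assuming ideal pinhole cameras without lens distortion, described by a focal length and principal point, and a virtual camera that differs from the real one only by a pan angle about the vertical axis; all function names and numeric values are hypothetical.

```python
import math

def plane_to_sphere(u, v, f, cx, cy):
    """First mapping: pixel (u, v) of the real camera -> unit vector on the sphere."""
    x, y, z = (u - cx) / f, (v - cy) / f, 1.0
    norm = math.sqrt(x * x + y * y + z * z)
    return (x / norm, y / norm, z / norm)

def sphere_to_plane(p, f, cx, cy, yaw):
    """Second mapping: sphere point -> pixel of a virtual camera panned by yaw."""
    x, y, z = p
    c, s = math.cos(yaw), math.sin(yaw)
    # Express the ray in the virtual camera's frame (rotation about the y axis).
    xv, yv, zv = c * x - s * z, y, s * x + c * z
    if zv <= 0.0:
        return None  # the ray points away from the virtual image plane
    return (cx + f * xv / zv, cy + f * yv / zv)

def remap_pixel(u, v, cam1, cam2):
    """Concatenate both mappings; cam1 = (f, cx, cy), cam2 = (f, cx, cy, yaw)."""
    return sphere_to_plane(plane_to_sphere(u, v, *cam1), *cam2)

# Hypothetical cameras sharing one center of projection.
cam1 = (1000.0, 960.0, 540.0)
cam2 = (1500.0, 640.0, 360.0, math.radians(10.0))

# A pixel that lies 10 degrees to the right of the real camera's optical axis
# should land on the principal point of the virtual camera.
u_src = 960.0 + 1000.0 * math.tan(math.radians(10.0))
u_dst, v_dst = remap_pixel(u_src, 540.0, cam1, cam2)
```

Because both cameras share a center of projection, the remapping depends only on ray direction and not on scene depth, which is why the sphere can serve as the intermediate surface.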
Abstract:
To generate a media presentation of a live event, a user interface is coupled to at least three cameras that share substantially the same vantage point. One of the cameras (e.g., a context camera) provides a context view of the event that is displayed on a screen of the user interface. The views of the other two cameras are superimposed onto the context view to define sub-portions that are visually demarcated within the context view. Based on user interaction, the user interface can switch between the camera views and control the cameras to capture different portions of the context view. Based on the image data captured by the views of the cameras within the context view, the user interface generates a media presentation that may be broadcast to multiple viewers.
Abstract:
Techniques are disclosed for controlling robot pixels to display a visual representation of an input. For example, the input to the system could be an image of a face; the robot pixels would deploy in a physical arrangement to display a visual representation of the face and would change their physical arrangement over time to represent changing facial expressions. The robot pixels function as a display device for a given allocation of robot pixels. Techniques are also disclosed for distributed collision avoidance among multiple non-holonomic robots to guarantee smooth and collision-free motions. The collision avoidance technique works for multiple robots by decoupling path planning and coordination.
Abstract:
Techniques are disclosed for creating digital assets that can be used to personalize themed products. For example, a workflow and pipeline used to generate a 3D model from digital images of a person's face and to manufacture a personalized, physical figurine customized with the 3D model are disclosed. The 3D model of the person's face may be simplified to match a topology of a desired figurine. While the topology is deformed to match that of the figurine, the 3D model retains the geometry of the person's face. Simplifying the topology of the 3D model in this manner allows the mesh to be integrated with or attached to a mesh representing the desired figurine.
Abstract:
Speech animation may be performed using visemes with phonetic boundary context. A viseme unit may comprise an animation that simulates lip movement of an animated entity. Individual ones of the viseme units may correspond to one or more complete phonemes and phoneme context of the one or more complete phonemes. Phoneme context may include a phoneme that is adjacent to the one or more complete phonemes that correspond to a given viseme unit. Potential sets of viseme units that correspond with individual phoneme string portions may be determined. One of the potential sets of viseme units may be selected for individual ones of the phoneme string portions based on a fit metric that conveys a match between individual ones of the potential sets and the corresponding phoneme string portion.
Abstract:
There are provided systems and methods for generating visually consistent alternative audio for redubbing visual speech. A processor is configured to sample a dynamic viseme sequence corresponding to a given utterance by a speaker in a video, identify a plurality of phonemes corresponding to the dynamic viseme sequence, construct a graph of the plurality of phonemes that synchronize with a sequence of lip movements of a mouth of the speaker in the dynamic viseme sequence, and use the graph to generate an alternative phrase that substantially matches the sequence of lip movements of the mouth of the speaker in the video.
Abstract:
Methods and systems for measuring group behavior are provided. Group behavior of different groups may be measured objectively and automatically in different environments including a dark environment. A uniform visible signal comprising images of members of a group may be obtained. Facial motion and body motions of each member may be detected and analyzed from the signal. Group behavior may be measured by aggregating facial motions and body motions of all members of the group. A facial motion such as a smile may be detected by using the Fourier Lucas-Kanade (FLK) algorithm to register and track faces of each member of a group. A flow-profile for each member of the group is generated. Group behavior may be further analyzed to determine a correlation of the group behavior and the content of the stimulus. A prediction of the general public's response to the stimulus based on the analysis of the group behavior is also provided.
Abstract:
A method is disclosed for reducing distortions introduced by deformation of a surface with an existing parameterization. In one embodiment, the distortions are reduced over a user-specified convex region in texture space ensuring optimization is locally contained in areas of interest. A distortion minimization algorithm is presented that is guided by a user-supplied rigidity map of the specified region. In one embodiment, non-linear optimization is used to calculate the axis-aligned deformation of a non-uniform grid specified over the region's parameter space, so that when the space is remapped from the original to the deformed grid, the distortion of the rigid features is minimized. Since grids require minimal storage and the remapping from one grid to another entails minimal cost, grids can be precalculated for animation sequences and used for real-time texture space remapping that minimizes distortions on specified rigid features.
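The run-time remapping step, which the abstract describes as cheap, can be sketched as a piecewise-linear lookup per axis. This sketch assumes each grid is stored as sorted lists of grid-line positions along u and v and that the deformed grid has already been computed by the optimization; the function names and sample grid values are hypothetical.

```python
from bisect import bisect_right

def remap_1d(t, src_lines, dst_lines):
    """Map coordinate t from the original grid lines to the deformed grid lines."""
    if t <= src_lines[0]:
        return dst_lines[0]
    if t >= src_lines[-1]:
        return dst_lines[-1]
    i = bisect_right(src_lines, t) - 1  # index of the cell containing t
    w = (t - src_lines[i]) / (src_lines[i + 1] - src_lines[i])
    return dst_lines[i] + w * (dst_lines[i + 1] - dst_lines[i])

def remap_uv(u, v, src_u, dst_u, src_v, dst_v):
    """Axis-aligned grids let the u and v axes be remapped independently."""
    return remap_1d(u, src_u, dst_u), remap_1d(v, src_v, dst_v)

# Hypothetical grids: the deformed grid widens the two middle cells along u.
src_u = [0.0, 0.25, 0.5, 0.75, 1.0]
dst_u = [0.0, 0.10, 0.5, 0.90, 1.0]
src_v = [0.0, 0.5, 1.0]
dst_v = [0.0, 0.5, 1.0]

u_new, v_new = remap_uv(0.625, 0.3, src_u, dst_u, src_v, dst_v)
```

Each lookup needs only a binary search and one linear interpolation per axis, and a grid is stored as two short lists, which is consistent with the storage and cost claims in the abstract.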