Abstract:
Disclosed herein are methods and systems for generating a user-hair-color model. One embodiment takes the form of a process that includes obtaining video data depicting a head of a user. The process also includes determining a set of line segments of pixels of the video data, wherein each line segment in the determined set of line segments intersects an upper contour of the depicted head of the user. The process also includes grouping at least some of the pixels of at least one of the line segments in the determined set of line segments into three sets of pixels based at least in part on respective color data of the pixels. The three sets of pixels include a skin-pixel set, a hair-pixel set, and a background-pixel set. The process also includes updating a user hair-color model based at least in part on the hair-pixel set.
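The grouping step can be illustrated with a minimal sketch. The abstract does not say how pixels are grouped or how the model is stored, so everything here is an assumption: plain k-means over RGB values with k = 3, a segment that runs from inside the head outward (so the clusters are labeled skin, hair, background by their average position along the segment), and a hair-color model kept as a running mean. The names `kmeans_colors`, `group_segment`, and `update_hair_model` are hypothetical.

```python
from statistics import mean


def kmeans_colors(colors, k=3, iters=10):
    """Plain k-means over RGB tuples; returns one cluster label per pixel."""
    n = len(colors)
    # Deterministic initialization: k pixels spread evenly along the segment.
    centers = [colors[(2 * i + 1) * n // (2 * k)] for i in range(k)]
    labels = [0] * n
    for _ in range(iters):
        # Assign each pixel to the nearest center (squared RGB distance).
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in colors
        ]
        # Move each center to the mean color of its members.
        for c in range(k):
            members = [p for p, lab in zip(colors, labels) if lab == c]
            if members:
                centers[c] = tuple(mean(ch) for ch in zip(*members))
    return labels


def group_segment(colors):
    """Split one line segment into skin, hair, and background pixel sets.

    Assumption: the segment starts inside the face and ends in the
    background, so clusters are named by their mean position along it.
    """
    labels = kmeans_colors(colors)
    positions = {}
    for i, lab in enumerate(labels):
        positions.setdefault(lab, []).append(i)
    ordered = sorted(positions, key=lambda lab: mean(positions[lab]))
    names = ["skin", "hair", "background"]
    return {name: [colors[i] for i in positions[lab]]
            for name, lab in zip(names, ordered)}


def update_hair_model(model, hair_pixels):
    """Fold the hair-pixel set into a running-mean color model."""
    if not hair_pixels:
        return model
    count = model["count"]
    total = count + len(hair_pixels)
    new_mean = tuple(
        (m * count + sum(ch)) / total
        for m, ch in zip(model["mean"], zip(*hair_pixels))
    )
    return {"mean": new_mean, "count": total}


segment = [(220, 180, 160)] * 4 + [(40, 30, 20)] * 4 + [(200, 200, 255)] * 4
groups = group_segment(segment)
model = update_hair_model({"mean": (0.0, 0.0, 0.0), "count": 0}, groups["hair"])
```

A running mean is the simplest possible model; a histogram or mixture model over hair colors would plug into `update_hair_model` the same way.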
Abstract:
Disclosed herein are methods and systems for identifying background in video data using geometric primitives. One embodiment takes the form of a process that includes obtaining video data depicting at least a portion of a user. The process also includes detecting at least one geometric primitive within the video data. The at least one detected geometric primitive is a type of geometric primitive included in a set of geometric-primitive models. The process also includes identifying a respective region within the video data associated with each detected geometric primitive. The process also includes classifying each respective region as background of the video data.
Abstract:
A color image and a depth image of a live video are received. Each of the color image and the depth image is processed to identify the foreground and the background of the live video. The background of the live video is removed in order to create a foreground video that comprises the foreground of the live video. A control input may be received to control the embedding of the foreground video into a second background from a background feed. The background feed may also comprise virtual objects such that the foreground video may interact with the virtual objects.
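As a rough illustration of background removal followed by embedding, the sketch below builds a foreground mask from the depth image alone and pastes the surviving color pixels onto a second background. This is a simplification: the abstract says both the color image and the depth image are processed, and it does not give a threshold rule. The functions `foreground_mask` and `embed`, and the fixed depth cutoff, are hypothetical.

```python
def foreground_mask(depth, max_fg_depth):
    """True where a pixel has a valid depth no farther than max_fg_depth."""
    return [[d is not None and d <= max_fg_depth for d in row] for row in depth]


def embed(color, mask, background):
    """Keep foreground color pixels and take the rest from the new background."""
    return [
        [c if m else b for c, m, b in zip(crow, mrow, brow)]
        for crow, mrow, brow in zip(color, mask, background)
    ]


color = [["u1", "u2"], ["w1", "w2"]]        # live video frame (user and wall)
depth = [[900, 950], [3000, None]]          # millimetres; None = no reading
second_background = [["b1", "b2"], ["b3", "b4"]]

mask = foreground_mask(depth, max_fg_depth=1500)
composite = embed(color, mask, second_background)
```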
Abstract:
Disclosed herein are methods and systems for classifying pixels as foreground using both short-range depth data and long-range depth data. One embodiment takes the form of a process that includes obtaining video data depicting at least a portion of a user. The process also includes obtaining short-range depth data associated with the video data. The process also includes obtaining long-range depth data associated with the video data. The video data, short-range depth data, and long-range depth data may be obtained via a single 3-D video camera. The process also includes classifying pixels of the video data as foreground based at least in part on both the short-range depth data and the long-range depth data. In some embodiments, classifying pixels of the video data as foreground comprises employing an alpha mask. The alpha mask may comprise binary foreground (hard) indicators. The alpha mask may comprise foreground-likelihood (soft) indicators.
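The difference between hard and soft alpha masks can be sketched as follows. The abstract does not state how the two depth ranges are combined, so the fusion rule here (prefer the short-range reading where one exists, otherwise fall back to the long-range reading) is an assumption, as are the millimetre thresholds and the names `fuse_depth`, `hard_alpha`, and `soft_alpha`.

```python
def fuse_depth(short_mm, long_mm):
    """Assumed rule: use the short-range reading when valid, else long-range."""
    return [s if s is not None else l for s, l in zip(short_mm, long_mm)]


def hard_alpha(depth_mm, max_fg_mm=1200):
    """Binary foreground indicators: 1 for foreground, 0 otherwise."""
    return [1 if d is not None and d <= max_fg_mm else 0 for d in depth_mm]


def soft_alpha(depth_mm, near_mm=1000, far_mm=1400):
    """Foreground-likelihood indicators in [0, 1], ramping down with depth."""
    out = []
    for d in depth_mm:
        if d is None:
            out.append(0.0)
        else:
            t = (far_mm - d) / (far_mm - near_mm)
            out.append(min(1.0, max(0.0, t)))
    return out


short_range = [800, None, None, 1100]   # None = outside the sensor's range
long_range = [None, 1200, 3000, None]

fused = fuse_depth(short_range, long_range)
hard = hard_alpha(fused)
soft = soft_alpha(fused)
```

The hard mask suits a strict cut-out, while the soft mask leaves room for later blending or refinement near the depth boundary.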
Abstract:
Embodiments disclose systems and methods for transmitting user-extracted video and content more efficiently by recognizing that user-extracted video provides the potential to treat parts of a single frame of a user-extracted video differently. An alpha mask of the image part of the user-extracted video is used when encoding the image part so that it retains a higher quality upon transmission than the remainder of the user-extracted video.
Abstract:
Disclosed herein are methods and systems for presenting personas according to a common cross-client configuration. An embodiment takes the form of a method that includes extracting a persona from video frames being received from a video camera. The method also includes transmitting an outbound stream of persona data that includes the extracted persona. The method also includes receiving at least one inbound stream of persona data, where the at least one inbound stream of persona data includes one or more other personas. The method also includes presenting a full persona set of the extracted persona and the one or more other personas on a user interface according to a common cross-client persona configuration. The method also includes presenting one or more shared-content channels on the user interface according to a common cross-client shared-content-channel configuration.
Abstract:
Disclosed herein are systems and methods for extracting person image data comprising: obtaining at least one frame of pixel data and corresponding image depth data; processing the at least one frame of pixel data and the image depth data with a plurality of persona identification modules to generate a corresponding plurality of persona probability maps; combining the plurality of persona probability maps to obtain an aggregate persona probability map; and generating a persona image by extracting pixels from the at least one frame of pixel data based on the aggregate persona probability map.
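One way to picture the combination step is a weighted average of the per-module probability maps followed by a threshold. The abstract does not specify the combination rule, so the weighted average, the 0.5 threshold, and the names `combine_maps` and `extract_persona` are all assumptions made for illustration.

```python
def combine_maps(maps, weights=None):
    """Weighted average of equally sized per-pixel probability maps."""
    if weights is None:
        weights = [1.0] * len(maps)
    total = sum(weights)
    height, width = len(maps[0]), len(maps[0][0])
    return [
        [sum(w * m[y][x] for w, m in zip(weights, maps)) / total
         for x in range(width)]
        for y in range(height)
    ]


def extract_persona(frame, aggregate, threshold=0.5):
    """Keep pixels whose aggregate probability meets the threshold."""
    return [
        [px if p >= threshold else None for px, p in zip(frow, prow)]
        for frow, prow in zip(frame, aggregate)
    ]


# Two hypothetical persona identification modules scoring a 2x2 frame.
depth_based = [[1.0, 0.0], [0.5, 0.0]]
color_based = [[1.0, 1.0], [0.5, 0.0]]
frame = [["a", "b"], ["c", "d"]]

aggregate = combine_maps([depth_based, color_based])
persona = extract_persona(frame, aggregate)
```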
Abstract:
A system and method are disclosed for extracting a user persona from a video and embedding that persona into a background feed that may have other content, such as text, graphics, or additional video content. The extracted video and background feed are combined to create a composite video that comprises the display in a videoconference. Embodiments cause the user persona to be embedded at preset positions, or in preset formats, or both, depending on the configuration, position, or motion of the user's body.
Abstract:
Disclosed herein are methods and systems for assigning pixels distance-cost values using a flood fill technique. One embodiment takes the form of a process that includes obtaining video data depicting a head of a user, obtaining depth data associated with the video data, and selecting seed pixels for a flood fill at least in part by using the depth information. The process also includes performing the flood fill from the selected seed pixels. The flood fill assigns respective distance-cost values to pixels of the video data based on position-space cost values and color-space cost values. In some embodiments, the process also includes classifying pixels of the video data as foreground based at least in part on the assigned distance-cost values. In some other embodiments, the process also includes assigning pixels of the video data foreground-likelihood values based at least in part on the assigned distance-cost values.
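A flood fill that assigns distance-cost values can be sketched as a shortest-path expansion from the seed pixels. The abstract only says that costs are based on position-space and color-space cost values; the specific step cost used here (1 per pixel step plus a weighted sum of absolute RGB differences), the depth cutoff for seed selection, and the names `select_seeds`, `flood_fill_costs`, and `classify_foreground` are assumptions.

```python
import heapq


def select_seeds(depth, max_depth):
    """Pick seed pixels whose depth reading is valid and close enough."""
    return [
        (y, x)
        for y, row in enumerate(depth)
        for x, d in enumerate(row)
        if d is not None and d <= max_depth
    ]


def flood_fill_costs(image, seeds, color_weight=0.5):
    """Expand from the seeds, giving each pixel its cheapest path cost."""
    height, width = len(image), len(image[0])
    cost = [[float("inf")] * width for _ in range(height)]
    heap = []
    for y, x in seeds:
        cost[y][x] = 0.0
        heapq.heappush(heap, (0.0, y, x))
    while heap:
        c, y, x = heapq.heappop(heap)
        if c > cost[y][x]:
            continue  # stale queue entry
        for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            ny, nx = y + dy, x + dx
            if 0 <= ny < height and 0 <= nx < width:
                # Position-space cost: 1 per step. Color-space cost: RGB change.
                color_step = sum(
                    abs(a - b) for a, b in zip(image[y][x], image[ny][nx])
                )
                new_cost = c + 1.0 + color_weight * color_step
                if new_cost < cost[ny][nx]:
                    cost[ny][nx] = new_cost
                    heapq.heappush(heap, (new_cost, ny, nx))
    return cost


def classify_foreground(cost, max_cost):
    """Hard classification: foreground where the cost stays under a cap."""
    return [[c <= max_cost for c in row] for row in cost]


image = [[(10, 10, 10), (10, 10, 10), (110, 10, 10)]]
depth = [[500, None, None]]

seeds = select_seeds(depth, max_depth=1000)
costs = flood_fill_costs(image, seeds)
foreground = classify_foreground(costs, max_cost=5.0)
```

For the foreground-likelihood variant mentioned in the abstract, the same cost map could be passed through a decreasing function instead of a hard cap.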