Abstract:
Three-dimensional (3-D) spatial image data associated with at least one arm motion of an actor may be received, the arm motion being based on free-form movements of at least one hand of the actor, and the free-form movements being based on natural gesture motions of the at least one hand. A plurality of sequential 3-D spatial representations, each including 3-D spatial map data corresponding to a 3-D posture and position of the hand at sequential instances of time during the free-form movements, may be determined based on the received 3-D spatial image data. An integrated 3-D model may be generated, via a spatial object processor, based on incrementally integrating the 3-D spatial map data included in the determined sequential 3-D spatial representations and comparing a threshold time value with model time values indicating numbers of instances of time spent by the hand occupying a plurality of 3-D spatial regions during the free-form movements.
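The integration and thresholding step described in this abstract can be illustrated with a minimal sketch. It assumes the 3-D spatial regions are cubic voxels on a regular grid, that each sequential representation is a list of (x, y, z) hand points, and that a region is kept when its count meets or exceeds the threshold. The names `voxel_of`, `integrate_frames`, `voxel_size`, and `threshold` are hypothetical and do not come from the abstract.

```python
from collections import Counter


def voxel_of(point, voxel_size):
    """Map a 3-D point (x, y, z) to integer voxel coordinates (assumed grid)."""
    return tuple(int(c // voxel_size) for c in point)


def integrate_frames(frames, voxel_size, threshold):
    """Incrementally integrate per-frame hand points into a voxel model.

    frames: iterable of point lists, one list per instance of time.
    Returns the set of voxels the hand occupied in at least `threshold`
    frames (the >= comparison is an assumption of this sketch).
    """
    dwell = Counter()  # voxel -> number of time instances spent there
    for points in frames:
        # Count each voxel at most once per frame, however many points hit it.
        for voxel in {voxel_of(p, voxel_size) for p in points}:
            dwell[voxel] += 1
    return {voxel for voxel, count in dwell.items() if count >= threshold}


frames = [
    [(0.1, 0.1, 0.1), (1.5, 0.0, 0.0)],
    [(0.2, 0.2, 0.2)],
    [(0.3, 0.1, 0.4), (2.5, 0.0, 0.0)],
]
model = integrate_frames(frames, voxel_size=1.0, threshold=2)
```

In this sketch, regions the hand only passes through briefly fall below the threshold and are left out of the model, while regions where the hand lingers are retained.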
Abstract:
An architecture is provided that combines multiple depth cameras and multiple projectors to cover a specified space (e.g., a room). The cameras and projectors are calibrated, allowing the development of a multi-dimensional (e.g., 3D) model of the objects in the space, as well as the ability to project graphics in a controlled fashion on the same objects. The architecture incorporates the depth data from all depth cameras, as well as color information, into a unified multi-dimensional model in combination with calibrated projectors. To provide visual continuity when transferring objects between different locations in the space, the user's body can provide a canvas on which to project this interaction. As the user moves body parts in the space, the body parts alone, without any other object, can serve as temporary “screens” for “in-transit” data.
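One way to picture the unified model is as depth points from every calibrated camera expressed in a single shared coordinate frame. The sketch below assumes each camera's calibration is available as a 3x3 rotation matrix and a translation vector into that shared frame. The names `to_world` and `merge_clouds`, and this representation of the calibration, are hypothetical and are not stated in the abstract.

```python
def to_world(point, rotation, translation):
    """Apply a rigid transform (rotation then translation) to one 3-D point."""
    return tuple(
        sum(rotation[i][j] * point[j] for j in range(3)) + translation[i]
        for i in range(3)
    )


def merge_clouds(cameras):
    """Merge per-camera depth points into one list in the shared frame.

    cameras: list of (points, rotation, translation) tuples, where the
    rotation and translation are the assumed calibration of that camera.
    """
    merged = []
    for points, rotation, translation in cameras:
        merged.extend(to_world(p, rotation, translation) for p in points)
    return merged


identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
quarter_turn_z = [[0, -1, 0], [1, 0, 0], [0, 0, 1]]  # 90 degrees about z

unified = merge_clouds([
    ([(0, 0, 1)], identity, (1, 0, 0)),
    ([(1, 0, 0)], quarter_turn_z, (0, 0, 0)),
])
```

The same kind of transform, applied with a projector's calibration, would relate points in the shared frame to that projector, which is what lets graphics be placed on a chosen object or body part.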
Abstract:
Systems and methods related to engaging with a virtual assistant via ancillary input are provided. Ancillary input may refer to non-verbal, non-tactile input based on eye-gaze data and/or eye-gaze attributes, including but not limited to facial recognition data, motion or gesture detection, eye-contact data, head-pose or head-position data, and the like. Thus, to initiate and/or maintain interaction with a virtual assistant, a user need not articulate an attention word or words. Rather, the user may initiate and/or maintain interaction with a virtual assistant more naturally and may even include the virtual assistant in a human conversation with multiple speakers. The virtual assistant engagement system may utilize at least one machine-learning algorithm to more accurately determine whether a user desires to engage with and/or maintain interaction with a virtual assistant. Various hardware configurations associated with a virtual assistant device may allow for near-field and/or far-field engagement.
Abstract:
Aspects relate to detecting gestures that relate to a desired action, wherein the detected gestures are common across users and/or devices within a surface computing environment. Inferred intentions and goals based on context, history, affordances, and objects are employed to interpret gestures. Where there is uncertainty in intention of the gestures for a single device or across multiple devices, independent or coordinated communication of uncertainty or engagement of users through signaling and/or information gathering can occur.