Abstract:
A method and apparatus for telepresence sharing is described. The method may include providing an indication of a plurality of remote vehicles that are available for telepresence sharing to a user. The method may also include receiving a selection from the user to engage a remote vehicle from the plurality of remote vehicles in a telepresence sharing session. Furthermore, the method may also include providing a live video feed captured by the remote vehicle to a mobile device associated with the user.
Abstract:
A method and apparatus for enabling user query disambiguation based on a user context of a mobile computing device are described. According to embodiments of the invention, a first user search query, along with sensor data, is received from a mobile computing device. A recognition process is performed on the sensor data to identify at least one item. In response to determining that the at least one item is a result for the first search query, data identifying the at least one item is transmitted to the mobile computing device as a response to the first search query. In response to determining that the at least one item is not the result for the first search query, search results of a second search query are transmitted to the mobile computing device as the response to the first search query, the second search query comprising a query of the at least one item.
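The two-branch response described in this abstract can be sketched as follows. This is only an illustration: `respond_to_query`, `answers_query`, and `run_search` are hypothetical names, the recognition process is assumed to have already produced item names, and forming the second query by joining the original query text with the item names is an assumption, not something the abstract specifies.

```python
from typing import Callable, Dict, List


def respond_to_query(
    query: str,
    recognized_items: List[str],
    answers_query: Callable[[str, str], bool],
    run_search: Callable[[str], List[str]],
) -> Dict[str, object]:
    """Build the response to the first search query (illustrative only)."""
    # Branch 1: a recognized item is itself the result of the first query,
    # so data identifying the item is returned.
    direct = [item for item in recognized_items if answers_query(query, item)]
    if direct:
        return {"kind": "items", "items": direct}

    # Branch 2: no item answers the query directly, so a second query
    # about the recognized items is issued and its results are returned.
    # Joining the query with the item names is an assumption of this sketch.
    second_query = " ".join([query] + recognized_items)
    return {
        "kind": "search_results",
        "query": second_query,
        "results": run_search(second_query),
    }
```

For example, a query such as "what is this" could be answered by the recognized item itself, while "how tall is this" would fall through to the second query.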
Abstract:
According to an embodiment, a method for filtering feature point matches for visual object recognition is provided. The method includes identifying local descriptors in an image and determining a self-similarity score for each local descriptor based upon matching each local descriptor to its nearest neighbor descriptors from a descriptor dataset. The method also includes filtering feature point matches having a number of local descriptors with self-similarity scores that exceed a threshold. According to another embodiment, the filtering step may further include removing feature point matches. According to a further embodiment, a system for filtering feature point matches for visual object recognition is provided. The system includes a descriptor identifier, a self-similar descriptor analyzer and a self-similar descriptor filter.
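One possible reading of the self-similarity filter is sketched below. The abstract does not define the score, so this sketch assumes it is derived from the mean Euclidean distance to the `k` nearest descriptors in the dataset (closer neighbors give a higher score), and it drops any match whose descriptor scores above the threshold. The function names and the `(descriptor_index, reference_id)` match format are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Descriptor = Sequence[float]


def self_similarity(desc: Descriptor, dataset: List[Descriptor], k: int = 2) -> float:
    """Score in (0, 1]; higher means the descriptor has very close neighbors.

    Assumes a non-empty dataset and k >= 1.
    """
    nearest = sorted(math.dist(desc, other) for other in dataset)[:k]
    return 1.0 / (1.0 + sum(nearest) / len(nearest))


def filter_matches(
    matches: List[Tuple[int, int]],
    descriptors: List[Descriptor],
    dataset: List[Descriptor],
    threshold: float,
    k: int = 2,
) -> List[Tuple[int, int]]:
    """Keep only matches whose descriptor is not overly self-similar."""
    kept = []
    for desc_idx, ref_id in matches:
        if self_similarity(descriptors[desc_idx], dataset, k) <= threshold:
            kept.append((desc_idx, ref_id))
    return kept
```

The intuition behind this choice is that a descriptor with many near-identical neighbors (repeated texture, for instance) is less distinctive, so matches built on it are less trustworthy.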
Abstract:
A gaze tracking technique is implemented with a head-mounted gaze tracking device that communicates with a server. The server receives scene images from the head-mounted gaze tracking device, which captures external scenes viewed by a user wearing the head-mounted device. The server also receives gaze direction information from the head-mounted gaze tracking device. The gaze direction information indicates where in the external scenes the user was gazing when viewing the external scenes. An image recognition algorithm is executed on the scene images to identify items within the external scenes viewed by the user. A gazing log tracking the identified items viewed by the user is generated.
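The gazing log could be assembled roughly as sketched below. This assumes the image recognition step has already produced labeled bounding boxes for each scene image and that the gaze direction has been converted to a pixel coordinate in that image; `RecognizedItem`, `build_gaze_log`, and the frame tuple layout are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class RecognizedItem:
    label: str
    box: Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


# One frame: (timestamp, items recognized in the scene image, gaze point)
Frame = Tuple[float, List[RecognizedItem], Tuple[float, float]]


def build_gaze_log(frames: List[Frame]) -> List[Tuple[float, str]]:
    """Log every recognized item whose bounding box contains the gaze point."""
    log = []
    for timestamp, items, (gx, gy) in frames:
        for item in items:
            x0, y0, x1, y1 = item.box
            if x0 <= gx <= x1 and y0 <= gy <= y1:
                log.append((timestamp, item.label))
    return log
```

A point-in-box test is the simplest way to associate gaze with an item; a real system might instead use a tolerance radius around the gaze point.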
Abstract:
Aspects of the invention pertain to matching a selected image/photograph against a database of reference images having location information. The image of interest may include some location information itself, such as latitude/longitude coordinates and orientation. However, the location information provided by a user's device may be inaccurate or incomplete. The image of interest is provided to a front end server, which selects one or more cells to match the image against. Each cell may have multiple images and an index. One or more cell match servers compare the image against specific cells based on information provided by the front end server. An index storage server maintains index data for the cells and provides them to the cell match servers. If a match is found, the front end server identifies the correct location and orientation of the received image, and may correct errors in an estimated location of the user device.
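How the front end server might pick cells from an uncertain location estimate can be sketched as follows. The abstract does not say how cells are defined, so this sketch assumes a regular latitude/longitude grid and selects every cell overlapping the uncertainty window around the estimate; `candidate_cells`, `cell_size_deg`, and `uncertainty_deg` are hypothetical, and longitude wraparound is ignored.

```python
import math
from typing import List, Tuple


def candidate_cells(
    lat: float,
    lon: float,
    uncertainty_deg: float,
    cell_size_deg: float = 0.01,
) -> List[Tuple[int, int]]:
    """Return (row, col) indices of grid cells overlapping the uncertainty window."""
    lo_row = math.floor((lat - uncertainty_deg) / cell_size_deg)
    hi_row = math.floor((lat + uncertainty_deg) / cell_size_deg)
    lo_col = math.floor((lon - uncertainty_deg) / cell_size_deg)
    hi_col = math.floor((lon + uncertainty_deg) / cell_size_deg)
    return [
        (row, col)
        for row in range(lo_row, hi_row + 1)
        for col in range(lo_col, hi_col + 1)
    ]
```

A less certain estimate yields more candidate cells, which is one way the matching work could be spread across several cell match servers.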
Abstract:
Systems and methods for descriptor vector computation are described herein. An embodiment includes (a) identifying a plurality of regions in a digital image; (b) normalizing the regions using at least a similarity or affine transform such that the normalized regions have the same orientation and size as a pre-determined reference region; (c) generating one or more wavelets using dimensions of the reference region; (d) generating one or more dot products between each of the one or more wavelets, respectively, and the normalized regions; (e) concatenating amplitudes of the one or more dot products to generate a descriptor vector; and (f) outputting a signal corresponding to the descriptor vector.
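Steps (c) through (e) can be sketched as below. The abstract does not name a wavelet family, so simple Haar-like wavelets are used here purely as a stand-in; regions are assumed to be already normalized to square `size` x `size` grids of intensities, and all function names are hypothetical.

```python
from typing import List

Grid = List[List[float]]


def haar_like_wavelets(size: int) -> List[Grid]:
    """Two Haar-like wavelets sized to the reference region (an assumed choice)."""
    half = size // 2
    left_right = [[1.0 if c < half else -1.0 for c in range(size)] for _ in range(size)]
    top_bottom = [[1.0 if r < half else -1.0 for _ in range(size)] for r in range(size)]
    return [left_right, top_bottom]


def dot(a: Grid, b: Grid) -> float:
    """Element-wise product of two equally sized grids, summed."""
    return sum(x * y for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))


def descriptor_vector(normalized_regions: List[Grid], size: int) -> List[float]:
    """Concatenate the amplitudes of all wavelet/region dot products."""
    wavelets = haar_like_wavelets(size)
    return [abs(dot(w, region)) for region in normalized_regions for w in wavelets]
```

With two wavelets, each normalized region contributes two entries to the descriptor vector, so the vector length is twice the number of regions.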
Abstract:
In one embodiment, the present invention is a method for populating and updating a database of images of landmarks, including geo-clustering geo-tagged images according to geographic proximity to generate one or more geo-clusters, and visual-clustering the one or more geo-clusters according to image similarity to generate one or more visual clusters. In another embodiment, the present invention is a system for identifying landmarks from digital images, including the following components: a database of geo-tagged images; a landmark database; a geo-clustering module; and a visual clustering module. In other embodiments, the present invention may be a method of enhancing user queries to retrieve images of landmarks, or a method of automatically tagging a new digital image with text labels.
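The two-stage clustering can be sketched as follows. Neither clustering algorithm is specified in the abstract, so this sketch assumes grid bucketing for the geographic stage and a greedy distance threshold on feature vectors for the visual stage; the dictionary keys (`lat`, `lon`, `feat`) and the parameters `cell_deg` and `max_dist` are hypothetical.

```python
import math
from collections import defaultdict
from typing import Dict, List

Image = Dict[str, object]  # expects keys "lat", "lon", and "feat" (a feature vector)


def geo_cluster(images: List[Image], cell_deg: float = 0.01) -> List[List[Image]]:
    """Stage 1: group geo-tagged images that fall in the same grid cell."""
    clusters = defaultdict(list)
    for img in images:
        key = (math.floor(img["lat"] / cell_deg), math.floor(img["lon"] / cell_deg))
        clusters[key].append(img)
    return list(clusters.values())


def visual_cluster(cluster: List[Image], max_dist: float) -> List[List[Image]]:
    """Stage 2: greedily group images whose features are close to a group's first image."""
    groups: List[List[Image]] = []
    for img in cluster:
        for group in groups:
            if math.dist(img["feat"], group[0]["feat"]) <= max_dist:
                group.append(img)
                break
        else:
            # No existing group is similar enough, so start a new one.
            groups.append([img])
    return groups
```

Running the visual stage only within each geo-cluster keeps the number of image comparisons small, since images taken far apart are never compared.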
Abstract:
The rich media communication system of the present invention provides a user with a three-dimensional communication space or theater having rich media functions. The user may be represented in the theater as a segmented video image or as an avatar. The user is also able to communicate by presenting images, videos, audio files, or text within the theater. The system may include tools for lowering the cost of animation, improving collaboration between users, and presenting episodic content, web casts, newscasts, infotainment, advertising, music clips, video conferencing, customer support, distance learning, social spaces, and interactive game shows and content.