Abstract:
A system and method for compressing digital pen stroke data utilizing curve simplification. Digital pen stroke images (ink images) require a relatively large amount of data to preserve the ink image generated on a device. Current ink compression algorithms are lossless and have had limited success. The invention provides a lossy compression algorithm to reduce the amount of data required to store and transmit ink data. The invention utilizes a two-part algorithm to reduce and compress the number of data points representing the ink data. The invention also utilizes curve splines to reconstruct and smooth the lossy ink data image.
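The abstract does not name the curve simplification method it uses. As a minimal sketch of the point-reduction idea only, the following applies the standard Ramer-Douglas-Peucker algorithm, which is an assumption and not necessarily the patented two-part algorithm; it does not cover the spline reconstruction step. The names `simplify_stroke` and `tolerance` are hypothetical.

```python
import math


def point_segment_distance(p, a, b):
    """Return the distance from point p to the line segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        # Degenerate segment: a and b coincide.
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp to its endpoints.
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def simplify_stroke(points, tolerance):
    """Drop stroke points that lie within `tolerance` of the simplified curve.

    Assumed technique: Ramer-Douglas-Peucker (lossy).
    """
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    index, worst = 0, 0.0
    for i in range(1, len(points) - 1):
        d = point_segment_distance(points[i], first, last)
        if d > worst:
            index, worst = i, d
    if worst <= tolerance:
        # Every interior point is close enough to the chord; keep the ends only.
        return [first, last]
    # Otherwise keep the farthest point and simplify each half recursively.
    left = simplify_stroke(points[:index + 1], tolerance)
    right = simplify_stroke(points[index:], tolerance)
    return left[:-1] + right


stroke = [(0, 0), (1, 0.05), (2, -0.04), (3, 0), (4, 2), (5, 4)]
print(simplify_stroke(stroke, tolerance=0.1))  # [(0, 0), (3, 0), (5, 4)]
```

A larger `tolerance` discards more points and gives a smaller but less faithful stroke, which is the trade-off a lossy ink compressor has to tune.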
Abstract:
A mobile device includes an air conduction microphone and an alternative sensor that provides an alternative sensor signal indicative of speech. A communication interface permits the mobile device to communicate directly with other mobile devices.
Abstract:
Infrastructure for a multi-modal multilingual communications device (MMCD) is presented. A communications component is provided that includes wireless and wired IP networks (e.g., LANs, MANs, and WANs), as well as cellular and/or wired telecommunications networks for cellular communications. A management component can include software and hardware entities that facilitate the activation, authentication, accounting, and updating of the MMCD systems, as well as synchronization to other entities. Additionally, the management component can facilitate the dissemination of applications, third-party services, and subscription information. An access component (e.g., a web server and interface) facilitates access to one or more of these entities such that administrators and/or users can access aspects of setup, configuration, subscriptions, updates, etc.
Abstract:
A multi-modal device that can substantially facilitate intelligent shopping. Electronic receipts can be provided to a user wirelessly and stored/indexed on the multi-modal device. Receipts can be categorized (e.g., personal, business, client entertainment) thereby facilitating financial management and accounting. Likewise, such electronic receipts can provide for easier return/exchange of goods. The multi-modal device can also assist in tracking/managing shopping lists and business cards (e.g., provide for business card exchanges). Moreover, the multi-modal device can provide for comparison shopping, catalog shopping, locating products and obtaining more information about a product via visual or audio mechanisms.
Abstract:
A system that facilitates managing resources (e.g., functionality, services) based at least in part upon an established context. More particularly, a context determination component can be employed to establish a context by processing sensor inputs or learning/inferring a user action/preference. Once the context is established via the context determination component, a power/mode management component can be employed to activate and/or mask resources in accordance with the established context. The power and mode management of the device can extend the life of a power source (e.g., battery) and mask functionality in accordance with a user and/or device state.
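The activate/mask step described above can be illustrated with a minimal sketch. Everything in it is hypothetical: the `Context` fields, the resource names, and the masking rules are placeholders chosen for illustration, and the abstract does not specify how contexts are represented or which resources are masked.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Context:
    """Hypothetical established context (from sensors or inferred preferences)."""
    in_meeting: bool
    battery_percent: int


# Hypothetical set of device resources that can be activated or masked.
ALL_RESOURCES = frozenset({"ringer", "camera", "gps", "wifi"})


def active_resources(context):
    """Return the resources left active after masking for the given context."""
    masked = set()
    if context.in_meeting:
        # Assumed user-state rule: silence the ringer during meetings.
        masked.add("ringer")
    if context.battery_percent < 20:
        # Assumed device-state rule: mask power-hungry resources on low battery.
        masked.update({"gps", "camera"})
    return set(ALL_RESOURCES) - masked


print(sorted(active_resources(Context(in_meeting=True, battery_percent=15))))
# ['wifi']
```

Keeping the rules in one function makes it easy to see which user or device state masks which resource; a learned or inferred policy could replace the hard-coded conditions.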
Abstract:
Described herein is a technique for creating a 3D face model using images obtained from an inexpensive camera associated with a general-purpose computer. Two still images and two video sequences of the user are captured. The user is asked to identify five facial features, which are used to calculate a mask and to perform fitting operations. Based on a comparison of the still images, deformation vectors are applied to a neutral face model to create the 3D model. The video sequences are used to create a texture map. The process of creating the texture map references the previously obtained 3D model to determine poses of the sequential video images.
Abstract:
Architecture that combines capture and translation of concepts, goals, needs, locations, objects, and items (e.g., sign text) into complete conversational utterances that take a translation of the item and morph it fluidly into sets of sentences that can be echoed to a user, and that the user can select to communicate as speech (or textual utterances). A plurality of modalities that process images, audio, video, searches, and cultural context, for example, which are representative of at least context and/or content, can be employed to glean additional information regarding a communications exchange to facilitate more accurate and efficient translation. Gesture recognition can be utilized to enhance input recognition, urgency, and/or emotional interaction, for example. Speech can be used for document annotation. Moreover, translation (e.g., speech to speech, text to speech, speech to text, handwriting to speech, text or audio) can be significantly improved in combination with this architecture.
Abstract:
A method and apparatus classify a portion of an alternative sensor signal as either containing noise or not containing noise. The portions of the alternative sensor signal that are classified as containing noise are not used to estimate a portion of a clean speech signal and the channel response associated with the alternative sensor. The portions of the alternative sensor signal that are classified as not containing noise are used to estimate a portion of a clean speech signal and the channel response associated with the alternative sensor.
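The gating described above, in which only portions classified as noise-free feed the estimators, can be sketched as follows. The abstract does not say how a portion is classified, so the classifier is passed in as a parameter; `toy_contains_noise` and its `peak_limit` are hypothetical stand-ins, and the clean-speech and channel-response estimators themselves are not shown.

```python
def split_frames(signal, frame_length):
    """Cut a sample sequence into consecutive, non-overlapping frames."""
    return [
        signal[i:i + frame_length]
        for i in range(0, len(signal) - frame_length + 1, frame_length)
    ]


def frames_for_estimation(frames, contains_noise):
    """Return indices of frames classified as not containing noise.

    Only these frames would be passed on to the clean-speech and
    channel-response estimators; noisy frames are skipped.
    """
    return [i for i, frame in enumerate(frames) if not contains_noise(frame)]


def toy_contains_noise(frame, peak_limit=0.8):
    """Hypothetical classifier: flag frames whose peak amplitude is very high."""
    return max(abs(sample) for sample in frame) > peak_limit


# Toy alternative-sensor signal: the middle frame has a large spike.
signal = [0.1, -0.2, 0.1, 0.0, 0.9, -1.0, 0.2, 0.1, 0.05, 0.1, -0.1, 0.0]
frames = split_frames(signal, frame_length=4)
usable = frames_for_estimation(frames, toy_contains_noise)
print(usable)  # [0, 2]
```

Passing the classifier as a function keeps the gating logic independent of whatever noise test is actually used.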