Abstract:
A method for navigating instructional video presentations is disclosed. The method includes determining a pause mode of a video presentation, and playing the video presentation on a display device. The video presentation has one or more predetermined pause positions. The method also includes, while playing the video presentation, determining that the video presentation has reached one of the one or more pause positions. The method further includes, in accordance with a determination that the video presentation is in a first pause mode, pausing the video presentation at the one of the one or more pause positions and maintaining a display of a paused frame of the video presentation, and, in accordance with a determination that the video presentation is in a second pause mode distinct from the first pause mode, continuing to play the video presentation through the one of the one or more pause positions.
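The mode-dependent behavior described above can be sketched as a small state machine. This is a minimal illustration only: the class and field names (`PauseMode`, `Presentation`, `pause_positions`) and the use of frame counts as positions are assumptions made for the example, not details taken from the disclosure.

```python
from dataclasses import dataclass
from enum import Enum


class PauseMode(Enum):
    PAUSE_AT_POSITIONS = 1  # "first pause mode": stop at each pause position
    PLAY_THROUGH = 2        # "second pause mode": continue through pause positions


@dataclass
class Presentation:
    duration: int                  # length in frames (hypothetical unit)
    pause_positions: list[int]     # predetermined pause positions, in frames
    mode: PauseMode = PauseMode.PAUSE_AT_POSITIONS
    position: int = 0
    paused: bool = False

    def play(self) -> int:
        """Advance frame by frame until a pause applies or the video ends.

        Returns the frame at which playback stopped.
        """
        self.paused = False
        stops = set(self.pause_positions)
        while self.position < self.duration:
            self.position += 1
            reached_stop = self.position in stops
            if reached_stop and self.mode is PauseMode.PAUSE_AT_POSITIONS:
                self.paused = True  # the paused frame stays on display
                break
        return self.position
```

Calling `play()` again after a pause resumes from the paused frame, so a viewer in the first mode steps through the presentation one segment at a time, while a viewer in the second mode sees it uninterrupted.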
Abstract:
A method and system are provided for delivering targeted advertisements via multifunction document imaging devices. Imaging devices used for copying, scanning, faxing and printing documents are used to deliver advertisements, coupons, and other promotional material to users. The imaging device is capable of delivering targeted promotional material based on analysis of the content of documents passing through the device. Targeting is based on device history, user history or user demographics. Device history and user history are compiled from the contents of the documents processed at a device and by a user, respectively. Demographics are inferred from a demographics model that takes user identity or document content as input. Advertisements may be delivered via paper, the device display, and other means.
Abstract:
An audio privacy system reduces the intelligibility of speech in an audio signal while preserving prosodic information, such as pitch, relative energy and intonation, so that a listener has the ability to recognize environmental sounds but not the speech itself. An audio signal is processed to separate non-vocalic information, such as pitch and relative energy of speech, from vocalic regions, after which syllables are identified within the vocalic regions. Representations of the vocalic regions are computed to produce a vocal tract transfer function and an excitation. The vocal tract transfer function for each syllable is then replaced with the vocal tract transfer function from another prerecorded vocalic sound. In one aspect, the identity of the replacement vocalic sound is independent of the identity of the syllable being replaced. A modified audio signal is then synthesized with the original prosodic information and the modified vocal tract transfer function to produce unintelligible speech that preserves the pitch and energy of the speech as well as environmental sounds.
Abstract:
Blogs (and other information sources) are recommended to a user based on the history of the user's online activities. The system: (1) processes the user's web history, (2) identifies blog posts (and web pages) that link to pages read by the user, (3) generates multiple relevance scores for each identified post/page, and (4) produces multiple rankings of the corresponding source blogs (and web sites) by aggregating individual relevance scores (or combinations of relevance scores), according to users' preferences. The system allows the discovery of information sources that are likely to be interesting to the user and allows sources lost in the “long tail” to be seamlessly discovered.
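The four steps above can be sketched roughly as follows. The two relevance scores (`hits` and `fraction`), the weighted-sum aggregation, and the function names are all hypothetical choices made for illustration; the abstract does not specify which scores are computed or how they are combined.

```python
from collections import defaultdict


def score_post(links, history):
    """Return hypothetical relevance scores for one post."""
    hits = len(set(links) & history)
    fraction = hits / len(links) if links else 0.0
    return {"hits": float(hits), "fraction": fraction}


def rank_blogs(posts, history, weights):
    """Rank source blogs by the weighted sum of their posts' scores.

    posts   -- iterable of (blog_name, [linked URLs]) pairs
    history -- set of URLs the user has read
    weights -- mapping from score name to weight (stands in for user preference)
    """
    totals = defaultdict(float)
    for blog, links in posts:
        scores = score_post(links, history)
        if scores["hits"] == 0:
            continue  # the post links to nothing the user has read
        totals[blog] += sum(weights.get(name, 0.0) * value
                            for name, value in scores.items())
    # Highest total first; ties broken alphabetically for a stable order.
    return sorted(totals, key=lambda blog: (-totals[blog], blog))
```

Passing different `weights` yields different rankings from the same scores, which is one simple way to obtain the multiple rankings the abstract mentions.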
Abstract:
A system is provided for detecting an intersection between two or more panoramic video sequences and for detecting the orientation of the sequences forming the intersection. Video images and corresponding location data are received. If required, the images and location data are processed to ensure that the images contain location data. An intersection between two paths is then derived from the video images by deriving a rough intersection between two images, determining a neighborhood for the two images, and dividing each image in the neighborhood into strips. An identifying value is derived from each strip to create a row of strip values, which is then converted to the frequency domain. A distance measure is taken between strips in the frequency domain, and the intersection is determined from the images having the smallest distance measure between them. The orientation between the two paths may also be determined in the frequency domain, either by using the phases of signals representing the images in the Fourier domain or by performing a circular cross-correlation of two vectors representing the images.
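The frequency-domain comparison and the circular cross-correlation can be illustrated with a short pure-Python sketch. It assumes each panoramic image has already been reduced to a row of strip values, and it compares DFT magnitudes with a Euclidean distance; both the choice of magnitudes and the function names are assumptions for illustration, since the abstract does not name a specific distance measure.

```python
import cmath
import math


def dft_magnitudes(values):
    """Magnitude of each DFT coefficient of a row of strip values."""
    n = len(values)
    return [
        abs(sum(v * cmath.exp(-2j * cmath.pi * k * i / n)
                for i, v in enumerate(values)))
        for k in range(n)
    ]


def spectral_distance(a, b):
    """Euclidean distance between the DFT magnitude spectra of two rows."""
    return math.sqrt(sum((x - y) ** 2
                         for x, y in zip(dft_magnitudes(a), dft_magnitudes(b))))


def best_rotation(a, b):
    """Shift s that maximizes the circular cross-correlation
    sum(a[(i + s) % n] * b[i]) of two equal-length rows."""
    n = len(a)

    def correlation(s):
        return sum(a[(i + s) % n] * b[i] for i in range(n))

    return max(range(n), key=correlation)
```

DFT magnitudes do not change when a row is circularly shifted, so two panoramas taken at the same spot but facing different directions have a distance near zero under this measure. The shift returned by `best_rotation` then gives the relative orientation in units of strips.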
Abstract:
Systems and methods for interactive, user-driven detection, creation and completion of form fields in a digital document are provided. A document with form fields that require completion by a user is received, after which form fields are detected at the direction of the user. Once the user selects a possible form field, the system creates the appropriate fillable form field based on size, type, location, related text and other parameters of the form field and surrounding document. Additional levels of interaction include predictive text, pattern development and automatic completion of previously completed fields.
Abstract:
Described are a system and methods for embedding standard video-taking heuristics into video-recording devices to help improve the quality of video captured on consumer devices. The described approach uses a combination of audio, visual, and haptic feedback that responds to video as it is recorded. This feedback can help users compose better shots as well as help them develop an understanding of the fundamentals of good video-taking.
Abstract:
An image search and retrieval system is provided. The system identifies pictures embedded in presentation slides. The system represents each set of identical (or nearly identical) images with a unique token. For example, if a specific picture is reused in multiple presentations, it will be represented by the system using the same token. The system may compute and store various meta attributes associated with a presentation slide and the image(s) therein. After the token and meta attribute information are generated for images and/or slides, the generated data is provided to a text-based search engine. A searched image is subsequently located and retrieved by a user through a search query issued to the text-based search engine, which locates images based on the generated token and meta attribute information. At query time, the user enters search keywords describing the target image that the user desires to locate. Pursuant to the user's query, the system retrieves all matching presentation slides. Found images may be ranked using, for example, a tf*idf score.
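As a rough illustration of the ranking step, the sketch below treats each slide as a bag of terms (image tokens plus meta attribute words) and orders matching slides by a classic tf*idf score. The term names such as `img_7f`, the function name `tfidf_rank`, and the specific idf formula `log(N / df)` are assumptions for the example; the abstract only says that a tf*idf score may be used.

```python
import math
from collections import Counter


def tfidf_rank(slides, query):
    """Return ids of slides matching any query term, best score first.

    slides -- mapping from slide id to a list of terms
              (image tokens and meta attribute words)
    query  -- list of search terms entered by the user
    """
    n = len(slides)
    # Document frequency: in how many slides each term occurs.
    df = Counter()
    for terms in slides.values():
        df.update(set(terms))

    scores = {}
    for slide_id, terms in slides.items():
        tf = Counter(terms)
        matched = [t for t in query if tf[t]]
        if not matched:
            continue  # the slide contains none of the query terms
        scores[slide_id] = sum(tf[t] * math.log(n / df[t]) for t in matched)
    # Highest score first; ties broken by slide id for a stable order.
    return sorted(scores, key=lambda s: (-scores[s], s))
```

Because identical images share one token, a query on that token returns every slide that reuses the picture, regardless of which presentation it appears in.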