Abstract:
A system and method for identifying query-related keywords in documents found in a search using latent semantic analysis. The documents are represented as a document-term matrix M containing one or more document term-weight vectors d, which may be term-frequency (tf) vectors or term-frequency inverse-document-frequency (tf-idf) vectors. This matrix is subjected to a truncated singular value decomposition. The resulting transform matrix U can be used to project a query term-weight vector q into the reduced N-dimensional space, after which the projected vector is expanded back into the full vector space using the inverse of U, yielding an expanded query vector q_expanded. To perform a search, the similarity of q_expanded is measured relative to each candidate document vector in this space. Exemplary similarity functions are the dot product and cosine similarity. The keywords selected are the terms with the highest values in q_expanded that also occur in at least one document. Matching keywords from the query may be highlighted in the search results.
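The projection and expansion steps described in this abstract can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the patented implementation: M is assumed to be laid out as terms by documents, the function names (`expand_query`, `cosine_similarity`, `select_keywords`) are hypothetical, and the transpose of the truncated U stands in for the "inverse of U" because a truncated U is not square but has orthonormal columns.

```python
import numpy as np

def expand_query(M, q, n_dims):
    """Project q into the reduced space and expand it back to the full term space.

    M is assumed to be a (terms x documents) weight matrix, q a term-weight vector.
    """
    U, _, _ = np.linalg.svd(M, full_matrices=False)
    U_n = U[:, :n_dims]            # truncated transform matrix (orthonormal columns)
    q_reduced = U_n.T @ q          # projection into the n_dims-dimensional space
    return U_n @ q_reduced         # expansion back; U_n.T acts as the pseudo-inverse

def cosine_similarity(a, b):
    """Cosine similarity, returning 0.0 when either vector is all zeros."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

def select_keywords(q_expanded, M, terms, k):
    """Return up to k terms with the highest expanded weight that occur in some document."""
    in_some_doc = (M > 0).any(axis=1)
    order = np.argsort(-q_expanded)
    return [terms[i] for i in order if in_some_doc[i]][:k]

# Toy data (assumed for illustration): two "car" documents and two "flower" documents.
terms = ["car", "auto", "engine", "flower", "petal"]
M = np.array([
    [1, 0, 0, 0],   # car
    [0, 1, 0, 0],   # auto
    [1, 1, 0, 0],   # engine
    [0, 0, 1, 1],   # flower
    [0, 0, 1, 1],   # petal
], dtype=float)
q = np.array([1, 0, 0, 0, 0], dtype=float)   # the query contains only "car"

q_expanded = expand_query(M, q, n_dims=2)
scores = [cosine_similarity(q_expanded, M[:, j]) for j in range(M.shape[1])]
keywords = select_keywords(q_expanded, M, terms, k=3)
```

In this toy case the raw query shares no term with the second document ("auto", "engine"), yet the expanded query scores it highly because "auto" and "engine" gain weight through the reduced space.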
Abstract:
The invention displays video search results in a form that makes it easy for users to determine which results are truly relevant. Each story returned as a search result is visualized as a collage of keyframes from the story's shots. The selected keyframes and their sizes depend on the corresponding shots' respective relevance. Shot relevance depends on the search retrieval score of the shot and, in some embodiments, also depends on the search retrieval score of the shot's parent story. Once the collage area for each keyframe has been determined, the keyframe is scaled and/or cropped to fit into that area. In one embodiment, users can mark one or more shots as being relevant to the search. In one embodiment, a timeline is created and displayed with one or more neighbor stories that are each part of the video and that are closest in time of creation to the selected story.
Abstract:
In one aspect, the present invention is directed to a method and an apparatus for organizing digital media, particularly digital photos, using face recognition. According to a first aspect of the present invention, a computer-based method for organizing digital photos comprises: extracting objects of interest from a plurality of photographs; cropping said plurality of photographs to generate images of isolated objects of interest; applying a recognition algorithm to determine the similarity of isolated objects of interest with a reference; displaying a plurality of objects arranged as a function of the determined similarity; and receiving user input to associate said objects with a particular classification.
Abstract:
A method for generating content links between a first digital file and a second digital file by detecting a content feature of a first digital file segment of the first digital file during playback of that segment, searching an index of a plurality of content features for a plurality of segments including a second digital file segment of the second digital file, and dynamically generating a link between the first digital file segment of the first digital file and the second digital file segment of the second digital file when the content feature of the first digital file segment is related to a content feature of the second digital file segment.
Abstract:
A system helps filter and correct video captured and streamed from a mobile device. In particular, the system detects and streams content shown on screens, allowing anyone to stream screen content immediately without needing to develop hooks into external software (i.e., without installing screen-recording software on the computer). The system can use a variety of user-selectable techniques to detect the screen, and utilizes the mobile device's touchscreen to allow users to manually override detected corners. Some of these approaches could potentially also be applied to other types of content, such as identifying TV screens, appliance LCD screens, other mobile devices' screens, and multifunction devices (MFDs). For example, a remote technician could help troubleshoot a malfunctioning MFD by having the end user point a cellphone at the LCD screen of the MFD.
Abstract:
Embodiments of the present invention include a video server that can detect and track the image of a pointing indicator in an input video stream representation of a computer display. The video server checks ordered frames of the video signal and determines movements for a pointing indicator such as a mouse arrow. Certain motions by the pointing indicator, such as lingering over a button or menu item or circling a button or menu item can provoke a control action on the server.
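The "lingering" trigger mentioned in this abstract can be illustrated with a small dwell detector. This is only a sketch under assumptions: pointer positions are taken as already extracted, one (x, y) pair per ordered frame, and the function name `detect_dwell`, the rectangular region, and the frame-count threshold are all hypothetical. Detecting the pointer image in the video itself is not covered here.

```python
def detect_dwell(positions, region, min_frames):
    """Return the index of the frame at which the pointer has stayed inside
    `region` for `min_frames` consecutive frames, or None if that never happens.

    positions:  iterable of (x, y) pointer coordinates, one per ordered frame
    region:     (left, top, right, bottom) bounds of a button or menu item
    min_frames: number of consecutive in-region frames that counts as lingering
    """
    left, top, right, bottom = region
    run = 0  # length of the current streak of in-region frames
    for index, (x, y) in enumerate(positions):
        if left <= x <= right and top <= y <= bottom:
            run += 1
            if run >= min_frames:
                return index
        else:
            run = 0  # the pointer left the region, so the streak restarts
    return None

# Assumed sample track: the pointer visits the button twice.
track = [(0, 0), (12, 12), (13, 12), (50, 50), (11, 11), (12, 11), (12, 12)]
button = (10, 10, 20, 20)
trigger_frame = detect_dwell(track, button, min_frames=3)
```

A server following this pattern would run its control action once the returned index is not None. A circling gesture would need a different test, for example on the accumulated angle around the region's center.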
Abstract:
A method for navigating instructional video presentations is disclosed. The method includes determining a pause mode of a video presentation, and playing the video presentation on a display device. The video presentation has one or more predetermined pause positions. The method also includes, while playing the video presentation, determining that the video presentation has reached one of the one or more pause positions. The method further includes, in accordance with a determination that the video presentation is in a first pause mode, pausing the video presentation at the one of the one or more pause positions and maintaining a display of a paused frame of the video presentation, and, in accordance with a determination that the video presentation is in a second pause mode distinct from the first pause mode, continuing to play the video presentation through the one of the one or more pause positions.
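The two pause modes described in this abstract can be sketched as a function that decides where playback next halts. The names `PauseMode`, `PAUSE_AT_POSITIONS`, `PLAY_THROUGH`, and `next_stop` are hypothetical, and positions are assumed to be integer timestamps (for example, seconds). This illustrates the branching logic only, not the disclosed player.

```python
from enum import Enum

class PauseMode(Enum):
    PAUSE_AT_POSITIONS = 1   # first pause mode: stop at each predetermined position
    PLAY_THROUGH = 2         # second pause mode: continue past pause positions

def next_stop(position, pause_positions, mode, duration):
    """Return the position at which playback started from `position` will halt.

    In the first mode, playback halts at the nearest pause position strictly
    after the current one; in the second mode, or when no pause position
    remains, it runs to the end of the presentation.
    """
    if mode is PauseMode.PAUSE_AT_POSITIONS:
        upcoming = [p for p in sorted(pause_positions) if position < p < duration]
        if upcoming:
            return upcoming[0]
    return duration

# Assumed example: a 40-second presentation with pause positions at 10 s and 25 s.
pauses = [25, 10]
first_halt = next_stop(0, pauses, PauseMode.PAUSE_AT_POSITIONS, 40)
resumed_halt = next_stop(first_halt, pauses, PauseMode.PAUSE_AT_POSITIONS, 40)
uninterrupted = next_stop(0, pauses, PauseMode.PLAY_THROUGH, 40)
```

Using a strict comparison against the current position means that resuming from a pause position does not pause again at the same spot.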
Abstract:
A method and system for delivery of targeted advertisements via multifunction document imaging devices. Imaging devices used for copying, scanning, faxing and printing documents are used to deliver advertisements, coupons, and other promotional material to users. The imaging device is capable of delivering targeted promotional material based on analysis of the content of documents passing through the device. Targeting is based on device history, user history or user demographics. Device history and user history are compiled from the contents of the documents processed at a device and by a user, respectively. Demographics are inferred from a demographics model that takes user identity or document content as input. Advertisements may be delivered via paper, the device display, and other means.
Abstract:
An audio privacy system reduces the intelligibility of speech in an audio signal while preserving prosodic information, such as pitch, relative energy and intonation, so that a listener can recognize environmental sounds but not the speech itself. An audio signal is processed to separate non-vocalic information, such as pitch and relative energy of speech, from vocalic regions, after which syllables are identified within the vocalic regions. Representations of the vocalic regions are computed to produce a vocal tract transfer function and an excitation. The vocal tract transfer function for each syllable is then replaced with the vocal tract transfer function from another prerecorded vocalic sound. In one aspect, the identity of the replacement vocalic sound is independent of the identity of the syllable being replaced. A modified audio signal is then synthesized with the original prosodic information and the modified vocal tract transfer function to produce unintelligible speech that preserves the pitch and energy of the speech as well as environmental sounds.