Abstract:
Methods, systems, and apparatus for receiving a request that includes a user identifier of a user that submitted a search query and an entity identifier of an entity that is referenced by the search query, identifying a plurality of knowledge elements that are related to the entity, identifying, in a consumption database, one or more items that have been indicated as consumed by the user and that are associated with the entity that is referenced by the search query, assigning rank scores to the plurality of knowledge elements, based at least on identifying the one or more items, selecting one or more of the knowledge elements from among the knowledge elements based at least on the rank scores assigned to the knowledge elements, and providing, in response to the request, information associated with the entity and the one or more selected knowledge elements.
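The ranking step described above can be illustrated with a minimal sketch. The abstract does not say how rank scores are computed, so the additive boost per consumed item, the `KnowledgeElement` type, and every field name below are assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KnowledgeElement:
    # Hypothetical record: a knowledge element related to the queried entity.
    name: str
    base_score: float
    related_items: frozenset


def rank_knowledge_elements(elements, consumed_items, boost=1.0, top_k=2):
    """Return the names of the top_k elements.

    Assumption: each consumed item related to an element adds `boost`
    to that element's base score.
    """
    scored = []
    for element in elements:
        overlap = len(element.related_items & consumed_items)
        scored.append((element.base_score + boost * overlap, element.name))
    # Highest score first; ties broken alphabetically for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored[:top_k]]


elements = [
    KnowledgeElement("filmography", 0.5, frozenset({"movie_a", "movie_b"})),
    KnowledgeElement("biography", 0.9, frozenset()),
    KnowledgeElement("awards", 0.4, frozenset({"movie_a"})),
]
consumed = frozenset({"movie_a", "movie_b"})
selected = rank_knowledge_elements(elements, consumed)
```

In this toy data, the elements tied to items the user has consumed outrank the element with the highest base score.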
Abstract:
A method and system for generating a large-scale database of heterogeneous speech are provided. The method includes transcribing a plurality of multimedia signals retrieved from a large text database and a speech database; randomly selecting a plurality of speech segments from the plurality of multimedia signals, wherein each speech segment of the plurality of speech segments is of a random length; generating a plurality of signatures based on the plurality of speech segments; and populating the large-scale database with the plurality of signatures respective of the plurality of multimedia signals.
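As a rough sketch of the segment selection and signature steps, the code below cuts random-length segments from a byte sequence and derives one signature per segment. The abstract does not specify the signature function, so a SHA-256 digest is used purely as a stand-in. The function names and the length bounds are also assumptions.

```python
import hashlib
import random


def random_segments(signal, count, min_len, max_len, rng):
    """Pick `count` segments at random positions, each of a random length."""
    segments = []
    for _ in range(count):
        length = rng.randint(min_len, min(max_len, len(signal)))
        start = rng.randint(0, len(signal) - length)
        segments.append(signal[start:start + length])
    return segments


def signature(segment):
    # Stand-in signature: a cryptographic digest, not the patented method.
    return hashlib.sha256(segment).hexdigest()


signal = bytes(range(256))        # placeholder for decoded speech samples
rng = random.Random(0)            # seeded so the run is repeatable
segments = random_segments(signal, count=5, min_len=16, max_len=64, rng=rng)
database = {signature(s): s for s in segments}
```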
Abstract:
Confidential information included in image and voice data is filtered in an apparatus that includes an extraction unit for extracting a character string from an image frame, and a conversion unit for converting audio data to a character string. The apparatus also includes a determination unit for determining, in response to contents of a database, whether at least one of the image frame and the audio data includes confidential information. The apparatus also includes a masking unit for concealing contents of the image frame by masking the image frame in response to determining that the image frame includes confidential information, and for making the audio data inaudible by masking the audio data in response to determining that the audio data includes confidential information. The apparatus further includes a playback unit for playing back the image frame and the audio data.
Abstract:
A computer-implemented method performed in connection with a computerized system incorporating a processing unit and a memory, the computer-implemented method involving: using the processing unit to generate a multi-modal language model for co-occurrence of spoken words and displayed text in a plurality of videos; selecting at least a portion of a first video; extracting a plurality of spoken words from the selected portion of the first video; extracting a first displayed text from the selected portion of the first video; and using the processing unit and the generated multi-modal language model to rank the extracted plurality of spoken words based on probability of occurrence conditioned on the extracted first displayed text.
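One simple way to picture the conditional ranking is a co-occurrence count model. The sketch below is not the method claimed in the abstract. It assumes that spoken words and displayed text have already been extracted as word lists, and it estimates the probability of a spoken word given a displayed word from raw counts. All names are hypothetical.

```python
from collections import Counter, defaultdict


def build_model(videos):
    """Count, per displayed word, how often each spoken word co-occurs."""
    counts = defaultdict(Counter)
    for spoken_words, displayed_words in videos:
        for text in set(displayed_words):
            counts[text].update(set(spoken_words))
    return counts


def rank_spoken_words(model, spoken_words, displayed_text):
    """Order spoken words by estimated P(spoken word | displayed text)."""
    text_counts = model.get(displayed_text, Counter())
    total = sum(text_counts.values())

    def probability(word):
        return text_counts[word] / total if total else 0.0

    return sorted(set(spoken_words), key=lambda w: (-probability(w), w))


videos = [
    (["neural", "network", "training"], ["network"]),
    (["neural", "network"], ["network"]),
    (["cooking", "recipe"], ["recipe"]),
]
model = build_model(videos)
ranked = rank_spoken_words(model, ["training", "neural", "cooking"], "network")
```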
Abstract:
A method for clustering a set of web search results is disclosed. A first signature, based at least in part on an analysis of multimedia content associated with a first web search result, is compared with a second signature, based at least in part on an analysis of multimedia content associated with a second web search result. The first web search result is clustered with the second web search result based at least in part on the comparison of the first signature with the second signature.
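The comparison and clustering steps can be sketched as follows. The abstract does not define the signature or the comparison, so this example assumes that each signature is a set of feature labels, uses Jaccard similarity, and applies a greedy threshold rule. All three choices are illustrative.

```python
def jaccard(a, b):
    """Jaccard similarity between two feature sets."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def cluster_results(signatures, threshold=0.5):
    """Greedily group results whose signature matches a cluster's first member."""
    clusters = []
    for result_id, sig in signatures.items():
        for cluster in clusters:
            representative = signatures[cluster[0]]
            if jaccard(sig, representative) >= threshold:
                cluster.append(result_id)
                break
        else:
            # No existing cluster is similar enough; start a new one.
            clusters.append([result_id])
    return clusters


signatures = {
    "result_1": {"f1", "f2", "f3"},
    "result_2": {"f1", "f2", "f4"},
    "result_3": {"f9"},
}
clusters = cluster_results(signatures)
```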
Abstract:
Methods, systems, and apparatus for receiving a natural language query of a user, and environmental data, identifying a media item based on the environmental data, determining an entity type based on the natural language query, selecting an entity associated with the media item that matches the entity type, selecting, from a media consumption database that identifies media items that have been indicated as consumed by the user, one or more media items that have been indicated as consumed by the user and that are associated with the selected entity, and providing a response to the query based on selecting the one or more media items that have been indicated as consumed by the user and that are associated with the selected entity.
Abstract:
Methods and apparatus for detection and identification of duplicate or near-duplicate videos using a perceptual video signature are disclosed. The disclosed apparatus and methods (i) extract perceptual video features, (ii) identify unique and distinguishing perceptual features to generate a perceptual video signature, (iii) compute a perceptual video similarity measure based on the video edit distance, and (iv) search and detect duplicate and near-duplicate videos. A complete framework to detect unauthorized copying of videos on the Internet using the disclosed perceptual video signature is disclosed.
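To make step (iii) concrete, the sketch below computes a similarity score from an edit distance over two sequences of per-frame feature labels. It assumes the standard Levenshtein distance and a normalization by the longer sequence. The abstract does not define its "video edit distance", so this is an illustration and not the disclosed measure.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences (unit costs)."""
    previous = list(range(len(b) + 1))
    for i, item_a in enumerate(a, start=1):
        current = [i]
        for j, item_b in enumerate(b, start=1):
            cost = 0 if item_a == item_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def similarity(a, b):
    """Map edit distance to a score in [0, 1]; 1.0 means identical."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(a, b) / longest


# Hypothetical per-frame feature labels for three videos.
original = ["A", "B", "C", "D", "E"]
near_copy = ["A", "B", "X", "D", "E"]
unrelated = ["Q", "R", "S"]
```

A near-duplicate detector along these lines would flag pairs whose similarity exceeds a chosen threshold.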
Abstract:
A computing system selects a portion of data of an unknown work and detects each event in the portion of data of the unknown work. An event is a perceptual occurrence in a work successively positioned in time. The system determines an event metric between each successive event in the portion of data in the unknown work and generates a list of event metrics between the events for the unknown work. The system compares the list of event metrics for the unknown work to a list of event metrics for a known work and determines the unknown work is a copy of the known work responsive to a match between the list of event metrics of the unknown work and the list of event metrics for the known work.
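The comparison of event-metric lists can be sketched under a few assumptions: events are represented by their timestamps in seconds, the event metric is the time gap between successive events, and a match means that the unknown list lines up with a contiguous run of the known list within a tolerance. None of these specifics come from the abstract.

```python
def event_metrics(event_times):
    """Time gaps between each pair of successive events."""
    return [later - earlier
            for earlier, later in zip(event_times, event_times[1:])]


def is_copy(unknown_times, known_times, tolerance=0.05):
    """True if the unknown gaps match a contiguous run of the known gaps."""
    unknown = event_metrics(unknown_times)
    known = event_metrics(known_times)
    if not unknown or len(unknown) > len(known):
        return False
    for start in range(len(known) - len(unknown) + 1):
        window = known[start:start + len(unknown)]
        if all(abs(u - k) <= tolerance for u, k in zip(unknown, window)):
            return True
    return False


known_work = [0.0, 1.2, 2.0, 3.5, 5.0, 5.4]      # gaps: 1.2, 0.8, 1.5, 1.5, 0.4
unknown_portion = [10.0, 10.8, 12.3, 13.8]       # gaps: 0.8, 1.5, 1.5
different_work = [0.0, 0.3, 0.6, 0.9]            # gaps: 0.3, 0.3, 0.3
```

Because only the gaps are compared, the unknown portion matches even though its absolute timestamps are offset from those of the known work.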
Abstract:
Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generating search queries in response to obtaining audio samples on a client device. In one aspect, a method includes the actions of i) receiving audio data from a client device, ii) identifying specific content from captured media based on the received audio data, wherein the identified specific content is associated with the received audio data and the captured media includes at least one of audio media or audio-video media, iii) obtaining additional metadata associated with the identified content, iv) generating a search query based at least in part on the obtained additional metadata, and v) returning one or more search results to the client device, the one or more search results responsive to the search query and associated with the received audio data.
Abstract:
A content playing apparatus and method are provided, the content playing apparatus including: a receiver which receives content including video and audio; a storage unit which stores the received content; a processor which processes the stored content for playback; an output unit which outputs the video and the audio of the played content; and a controller which generates an index of the video based on properties of the audio, and plays a part of the video corresponding to the properties of the audio by referring to the index.