Abstract:
A person can use a portable electronic device to electronically purchase or otherwise request a product, service or other deliverable related to audio programming to which the person is listening at the time they initiate the request. The request is fulfilled by a service that analyzes the audio content to identify the deliverable the person desires.
Abstract:
The present disclosure provides an intelligent interface interaction control method, apparatus, and system, and a storage medium. The method comprises: receiving speech information input by a user and obtaining a speech recognition result; determining scenario elements associated with the speech recognition result; generating an entry corresponding to each scenario element and sending the speech recognition result and the entries to a cloud server; receiving, from the cloud server, the entry that the server selected from the received entries as the best match for the speech recognition result; and performing the interface operation corresponding to the best-matched entry. The solution of the present disclosure can improve the flexibility and accuracy of speech control.
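The selection step on the server side could be sketched as follows. This is only an illustration, not the disclosed matching method: the abstract does not say how the best match is chosen, so the sketch assumes plain string similarity from Python's `difflib`, and the names `Entry` and `select_best_entry` are hypothetical.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass(frozen=True)
class Entry:
    """Hypothetical entry generated on the client for one scenario element."""
    text: str        # label of the scenario element, e.g. a button caption
    operation: str   # interface operation to perform if this entry is chosen


def select_best_entry(recognized: str, entries: list[Entry]) -> Entry:
    """Return the entry whose text is most similar to the recognition result.

    String similarity is an assumption made for illustration only.
    """
    if not entries:
        raise ValueError("no entries to match against")
    return max(
        entries,
        key=lambda e: SequenceMatcher(None, recognized.lower(), e.text.lower()).ratio(),
    )


# Entries the client might generate for the elements currently on screen.
entries = [
    Entry("previous", "tap:prev_button"),
    Entry("next song", "tap:next_button"),
    Entry("settings", "open:settings_page"),
]
best = select_best_entry("play next song", entries)
print(best.operation)
```

Because the candidate entries come from the current interface, the matcher only has to rank a handful of strings rather than interpret open-ended speech.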
Abstract:
Methods, systems, and computer programs are presented for managing audio files of a user to reduce latencies in play start times on local devices. The audio files are stored on cloud storage managed by a server. One method includes processing a plurality of audio files associated with a user, where the processing is configured to create audio snippet files from each of the plurality of audio files. The audio snippet files represent a beginning part of each of the plurality of audio files. The method also includes transmitting the audio snippet files to a client device and detecting a request from the client to begin playing a first audio file from the plurality of audio files of the user, the first audio file being stored on the cloud storage managed by the server.
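A minimal sketch of the snippet idea follows, under stated assumptions. The abstract stops at detecting the play request, so streaming the remainder after the local snippet is an assumption here, as are the names `make_snippets`, `play`, and `fetch_from`. Cutting by byte count is also a simplification; a real system would more likely cut by playback time or frame boundaries.

```python
from typing import Callable, Iterator


def make_snippets(audio_files: dict[str, bytes], snippet_bytes: int) -> dict[str, bytes]:
    """Server side: keep only the beginning part of each audio file."""
    return {name: data[:snippet_bytes] for name, data in audio_files.items()}


def play(
    name: str,
    local_snippets: dict[str, bytes],
    fetch_from: Callable[[str, int], bytes],
) -> Iterator[bytes]:
    """Client side: yield the local snippet first, then the rest from the server.

    fetch_from(name, offset) is a hypothetical callback that returns the
    file's bytes starting at the given offset.
    """
    snippet = local_snippets.get(name, b"")
    if snippet:
        yield snippet  # playback can start here without a network round trip
    remainder = fetch_from(name, len(snippet))
    if remainder:
        yield remainder


# A dict stands in for the cloud storage managed by the server.
cloud = {"song.mp3": b"ABCDEFGHIJ"}
snippets = make_snippets(cloud, snippet_bytes=4)
chunks = list(play("song.mp3", snippets, lambda n, offset: cloud[n][offset:]))
```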
Abstract:
Systems, methods, and articles of manufacture are provided for modeling a joint language-visual space. A textual query to be evaluated relative to a video library is received from a requesting entity. The video library contains a plurality of instances of video content. One or more instances of video content from the video library that correspond to the textual query are determined by analyzing the textual query using a data model that includes a soft-attention neural network module that is jointly trained with a language Long Short-Term Memory (LSTM) neural network module and a video LSTM neural network module. At least an indication of the one or more instances of video content is returned to the requesting entity.
Abstract:
Methods and systems are provided for managing speech of a speech system. In one embodiment, a method includes: receiving, by a processor, relative information comprising graph data from at least one relative data datasource; processing, by a processor, the graph data of the relative information to determine at least one of an association and a relationship associated with an element defined in the speech system; and storing, by a processor, the at least one of association and relationship as relative slot data for use by at least one of a speech recognition method and a dialog management method.
Abstract:
Disclosed is an interaction providing method implemented by a computer, for deleting a query input into a user terminal. The interaction providing method includes: receiving the query from the user terminal; reading the query in response to the receiving of the query input; providing the query and a query result concerning the query to the user terminal in response to the reading of the query such that the query and the query result are output; receiving an input of a swipe command over the query output to the user terminal; and deleting the query and the query result in response to the receiving of the input of the swipe command over the query.
Abstract:
A musician discovery system is provided. The musician discovery system includes a first interface for displaying a plurality of musicians organized according to a musical characteristic. The system includes a second interface for presenting multimedia information about a first musician from the plurality of musicians displayed on the first interface. The system includes means for comparing a second plurality of musicians with the first musician using the multimedia information presented on the second interface about the first musician. Furthermore, the system includes a third interface for recommending a second musician from the second plurality of musicians based on the comparing means.
Abstract:
Implementations of the present disclosure include actions of: providing first text for display on a computing device of a user, the first text being provided from a first speech recognition engine based on first speech received from the computing device and being displayed as a search query; receiving a speech correction indication from the computing device, the speech correction indication indicating a portion of the first text that is to be corrected; receiving second speech from the computing device; receiving second text from a second speech recognition engine based on the second speech, the second speech recognition engine being different from the first speech recognition engine; replacing the portion of the first text with the second text to provide a combined text; and providing the combined text for display on the computing device as a revised search query.
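The replacement step can be illustrated with a short sketch. It assumes the speech correction indication arrives as a character range within the first text; that representation and the name `apply_correction` are hypothetical, and both speech recognition engines are out of scope here.

```python
def apply_correction(first_text: str, start: int, end: int, second_text: str) -> str:
    """Replace first_text[start:end] with second_text and return the combined text."""
    if not 0 <= start <= end <= len(first_text):
        raise ValueError("correction range lies outside the first text")
    return first_text[:start] + second_text + first_text[end:]


# The user marks "austin" as the misrecognized portion and speaks it again.
query = "weather in austin"
start = query.index("austin")
revised = apply_correction(query, start, start + len("austin"), "boston")
print(revised)
```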
Abstract:
A composite signal having frequencies within a sonic first frequency bandwidth may be received from a communication medium at a receiver. The composite signal may include an audio base signal and at least one code signal. The code signal may be encoded with a code, may have a duration shorter than a duration of the base signal, and may have a second frequency bandwidth within the first frequency bandwidth. The composite signal may be output on a speaker, the speaker converting the composite signal into sound. While outputting the composite signal, a signal processing device may detect the output sound corresponding to the code signal. The code may be determined from the detected output sound corresponding to the code signal. Data associated with the code may be retrieved from a data storage device. The retrieved data may be displayed on a display device.
Abstract:
When executed, a computer program product generates a graphical user interface that renders results that are responsive to a search query of a rich media file. The graphical user interface includes a chronological representation of the rich media file, one or more occurrence markers along the chronological representation corresponding to actual occurrences of a desired term at an indicated chronological location in the rich media file, and an execution icon configured to launch a rich media application that renders a relevant portion that is responsive to the search query.
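One way to place the occurrence markers is sketched below. It assumes the rich media file comes with a time-stamped transcript as a list of `(seconds, word)` pairs; that input format and the name `occurrence_markers` are hypothetical, and matching is limited to single whole words.

```python
def occurrence_markers(
    transcript: list[tuple[float, str]], term: str, duration: float
) -> list[float]:
    """Return each occurrence of term as a fraction (0.0 to 1.0) of the timeline."""
    if duration <= 0:
        raise ValueError("duration must be positive")
    wanted = term.lower()
    return [seconds / duration for seconds, word in transcript if word.lower() == wanted]


# Hypothetical transcript of a two-minute recording.
transcript = [(0.0, "welcome"), (30.0, "budget"), (45.0, "report"), (90.0, "Budget")]
markers = occurrence_markers(transcript, "budget", duration=120.0)
print(markers)
```

A user interface could multiply each fraction by the pixel width of the chronological representation to position a marker.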