Abstract:
Systems and processes are disclosed for handling a multi-part voice command for a virtual assistant. Speech input can be received from a user that includes multiple actionable commands within a single utterance. A text string can be generated from the speech input using a speech transcription process. The text string can be parsed into multiple candidate substrings based on domain keywords, imperative verbs, predetermined substring lengths, or the like. For each candidate substring, a probability can be determined indicating whether the candidate substring corresponds to an actionable command. Such probabilities can be determined based on semantic coherence, similarity to user request templates, querying services to determine manageability, or the like. If the probabilities exceed a threshold, the user intent of each substring can be determined, processes associated with the user intents can be executed, and an acknowledgment can be provided to the user.
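The parse-score-threshold flow described above can be illustrated with a minimal sketch. It is not the disclosed implementation: the verb list, the connector list, the scoring heuristic, and the 0.75 threshold are all assumptions chosen for illustration, and the speech transcription step is assumed to have already produced the text string.

```python
import re

# Hypothetical imperative verbs used as split points (an assumption).
IMPERATIVE_VERBS = {"call", "navigate", "play", "remind", "send", "text"}
# Hypothetical connectors trimmed from the end of a candidate substring.
CONNECTORS = {"and", "then", "also"}


def split_candidates(text):
    """Split a transcribed utterance into candidate substrings at imperative verbs."""
    words = re.findall(r"[a-z']+", text.lower())
    candidates, current = [], []
    for word in words:
        # Start a new candidate whenever an imperative verb follows other words.
        if word in IMPERATIVE_VERBS and current:
            candidates.append(current)
            current = []
        current.append(word)
    candidates.append(current)

    cleaned = []
    for cand in candidates:
        # Drop trailing connectors such as "and" or "then".
        while cand and cand[-1] in CONNECTORS:
            cand.pop()
        if cand:
            cleaned.append(" ".join(cand))
    return cleaned


def score(candidate):
    """Toy stand-in for the probability that a candidate is an actionable command."""
    words = candidate.split()
    value = 0.0
    if words and words[0] in IMPERATIVE_VERBS:
        value += 0.5  # starts with a known imperative verb
    if len(words) >= 2:
        value += 0.5  # the verb has at least one argument
    return value


def actionable_commands(text, threshold=0.75):
    """Return the candidate substrings only if every score exceeds the threshold."""
    candidates = split_candidates(text)
    if candidates and all(score(c) > threshold for c in candidates):
        return candidates
    return []


print(actionable_commands("Navigate to the airport and then play some jazz"))
```

A real system would replace `score` with the signals named in the abstract, such as semantic coherence or similarity to user request templates. A plain keyword split like this one also missplits utterances in which a verb from the list appears as a noun.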
Abstract:
An electronic device with one or more processors and memory includes a procedure for enabling conversation persistence across two or more instances of a digital assistant. In some embodiments, the device displays a first dialogue in a first instance of a digital assistant user interface. In response to a request to display a user interface different from the digital assistant user interface, the device displays the user interface different from the digital assistant user interface. In response to a request to invoke the digital assistant, the device displays a second instance of the digital assistant user interface, including displaying a second dialogue in the second instance of the digital assistant user interface, where the first dialogue remains available for display in the second instance of the digital assistant user interface.
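One way to picture conversation persistence is a dialogue store that outlives any single instance of the assistant user interface. The sketch below is only an illustration under that assumption; the class names `ConversationStore` and `AssistantUI` are hypothetical and do not come from the disclosure.

```python
class ConversationStore:
    """Holds dialogue turns independently of any assistant UI instance."""

    def __init__(self):
        self._turns = []

    def append(self, speaker, text):
        self._turns.append((speaker, text))

    def history(self):
        # Return a copy so UI instances cannot mutate the stored dialogue.
        return list(self._turns)


class AssistantUI:
    """One instance of the digital assistant user interface (hypothetical)."""

    def __init__(self, store):
        self._store = store

    def add_turn(self, speaker, text):
        self._store.append(speaker, text)

    def visible_dialogue(self):
        # Earlier dialogue remains available because it lives in the shared store.
        return self._store.history()


store = ConversationStore()

first_instance = AssistantUI(store)
first_instance.add_turn("user", "What's the weather today?")
del first_instance  # the user leaves the assistant for a different user interface

second_instance = AssistantUI(store)
second_instance.add_turn("user", "And tomorrow?")
print(second_instance.visible_dialogue())
```

Keeping the dialogue outside the UI object is the design choice that lets the second instance show the first dialogue. The abstract does not say how the device stores it.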
Abstract:
The method includes automatically, without user input and without regard to whether a digital assistant application has been separately invoked by a user, determining that the electronic device is in a vehicle. In some implementations, determining that the electronic device is in a vehicle comprises detecting that the electronic device is in communication with the vehicle (e.g., via wired or wireless communication techniques and/or protocols). The method also includes, responsive to the determining, invoking a listening mode of a virtual assistant implemented by the electronic device. In some implementations, the method also includes limiting the ability of a user to view visual output presented by the electronic device, provide typed input to the electronic device, and the like.
Abstract:
Systems and processes are disclosed for controlling television user interactions using a virtual assistant. A virtual assistant can interact with a television set-top box to control content shown on a television. Speech input for the virtual assistant can be received from a device with a microphone. User intent can be determined from the speech input, and the virtual assistant can execute tasks according to the user's intent, including causing playback of media on the television. Virtual assistant interactions can be shown on the television in interfaces that expand or contract to occupy a minimal amount of space while conveying desired information. Multiple devices associated with multiple displays can be used to determine user intent from speech input as well as to convey information to users. In some examples, virtual assistant query suggestions can be provided to the user based on media content shown on a display.
Abstract:
An electronic device receives a first input that corresponds to a request to open a respective application, and in response to receiving the first input, in accordance with a determination that the device is being operated in a limited-distraction context, provides a limited-distraction user interface that includes providing for display fewer selectable user interface objects than are displayed in a non-limited user interface for the respective application, and in accordance with a determination that the device is not being operated in a limited-distraction context, provides a non-limited user interface for the respective application.
Abstract:
Systems and processes for operating an automated assistant are disclosed. In one example process, an electronic device provides an audio output via a speaker of the electronic device. While providing the audio output, the electronic device receives, via a microphone of the electronic device, a natural language speech input. The electronic device derives a representation of user intent based on the natural language speech input and the audio output, identifies a task based on the derived user intent, and performs the identified task.
Abstract:
Systems and processes for operating a digital assistant are provided. In one example, a method includes receiving a first speech input from a user. The method further includes identifying context information and determining a user intent based on the first speech input and the context information. The method further includes determining whether the user intent is to perform a task using a searching process or an object managing process. The searching process is configured to search data, and the object managing process is configured to manage objects. The method further includes, in accordance with a determination that the user intent is to perform the task using the searching process, performing the task using the searching process; and in accordance with the determination that the user intent is to perform the task using the object managing process, performing the task using the object managing process.
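The branch between the searching process and the object managing process can be sketched as a small dispatcher. This is an illustrative simplification, not the disclosed method: the verb sets, the `selected_object` context key, and the string results are all hypothetical, and intent determination is collapsed into keyword matching.

```python
from enum import Enum


class Process(Enum):
    SEARCH = "searching"
    MANAGE = "object managing"


# Hypothetical keyword sets standing in for real intent determination.
SEARCH_VERBS = {"find", "search", "look"}
MANAGE_VERBS = {"copy", "move", "delete", "rename"}


def choose_process(speech_text, context):
    """Pick a process from the speech input, falling back on context information."""
    words = set(speech_text.lower().split())
    if words & MANAGE_VERBS:
        return Process.MANAGE
    if words & SEARCH_VERBS:
        return Process.SEARCH
    # Assumed fallback: a selected object suggests the user wants to manage it.
    return Process.MANAGE if context.get("selected_object") else Process.SEARCH


def perform_task(speech_text, context):
    """Dispatch the task to the chosen process (both are stubbed as strings here)."""
    process = choose_process(speech_text, context)
    if process is Process.SEARCH:
        return f"searching: {speech_text}"
    return f"managing: {speech_text}"


print(perform_task("find my tax documents", {}))
print(perform_task("move this to the projects folder", {"selected_object": "report.pdf"}))
```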
Abstract:
A method of operating a digital assistant to provide emergency call functionality is provided. In some embodiments, the method is performed at a device including one or more processors and memory storing instructions for execution by the one or more processors. The method includes receiving a speech input from a user, determining whether the speech input expresses a user request for making an emergency call, and determining a local emergency dispatcher telephone number based on a geographic location of the device. The method also includes, in response to determining or obtaining a determination that the speech input expresses a user request for making an emergency call, calling the local emergency dispatcher telephone number using the emergency call functionality.
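The steps above (detect an emergency request, look up the local dispatcher number from the device's location, place the call) can be sketched as follows. The phrase list, the country-code lookup, and the `dial` callback are assumptions made for illustration. The abstract does not specify how location maps to a number, and a real device would rely on carrier or platform data rather than a hard-coded table.

```python
# Hypothetical phrases that signal an emergency call request.
EMERGENCY_PHRASES = ("call an ambulance", "call the police", "emergency")

# Small illustrative table keyed by ISO country code; not an exhaustive list.
EMERGENCY_NUMBERS = {"US": "911", "GB": "999", "AU": "000", "FR": "112"}


def is_emergency_request(speech_text):
    """Return True if the transcribed speech contains a known emergency phrase."""
    text = speech_text.lower()
    return any(phrase in text for phrase in EMERGENCY_PHRASES)


def handle_speech(speech_text, country_code, dial):
    """Dial the local emergency number if the speech expresses an emergency request.

    `dial` is a caller-supplied function standing in for the device's
    emergency call functionality. Returns the dialed number, or None if
    nothing was dialed.
    """
    if not is_emergency_request(speech_text):
        return None
    number = EMERGENCY_NUMBERS.get(country_code)
    if number is not None:
        dial(number)
    return number


dialed = []
handle_speech("Please call an ambulance", "GB", dialed.append)
print(dialed)
```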
Abstract:
A user interface for a system such as a virtual assistant is automatically adapted for hands-free use. A hands-free context is detected via automatic or manual means, and the system adapts various stages of a complex interactive system to modify the user experience to reflect the particular limitations of such a context. The system of the present invention thus allows for a single implementation of a complex system such as a virtual assistant to dynamically offer user interface elements and alter user interface behavior to allow hands-free use without compromising the user experience of the same system for hands-on use.