Multi-modal inputs for voice commands

    Publication No.: US11462215B2

    Publication date: 2022-10-04

    Application No.: US16232855

    Filing date: 2018-12-26

    Applicant: Apple Inc.

    Abstract: Systems and processes for operating an intelligent automated assistant are provided. In one example process, a first input including activation of an affordance is received. A domain associated with the affordance is determined. A second input including user speech is received, where a user intent is determined based on the domain and the user speech. A determination is made whether the user intent includes a command associated with the affordance. In accordance with a determination that the user intent includes a command associated with the affordance, a task in furtherance of the command is performed.
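
The flow in this abstract (affordance, then domain, then speech, then intent, then command check, then task) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the patented implementation: the affordance names, the domain table, the command vocabulary, and the keyword-matching "intent" step are all hypothetical stand-ins for whatever the actual system uses.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical mapping from on-screen affordances to assistant domains.
AFFORDANCE_DOMAINS = {"play_button": "media", "send_button": "messaging"}

# Hypothetical command vocabulary for each domain.
DOMAIN_COMMANDS = {
    "media": {"play", "pause", "skip"},
    "messaging": {"send", "reply"},
}


@dataclass
class Intent:
    domain: str
    command: Optional[str]  # None when no command of the domain was recognized


def determine_intent(domain: str, speech: str) -> Intent:
    """Toy intent step: first spoken word that is a command in the domain."""
    for word in speech.lower().split():
        if word in DOMAIN_COMMANDS.get(domain, set()):
            return Intent(domain, word)
    return Intent(domain, None)


def handle_multimodal(affordance: str, speech: str) -> Optional[str]:
    """Return a task label, or None when no task should be performed."""
    domain = AFFORDANCE_DOMAINS.get(affordance)  # first input: the affordance
    if domain is None:
        return None
    intent = determine_intent(domain, speech)    # second input: user speech
    if intent.command is None:                   # no command tied to the affordance
        return None
    return f"{intent.domain}:{intent.command}"   # task in furtherance of the command
```

In this sketch, activating `play_button` and saying "Pause this song" yields a media task, while the same affordance combined with "send it" yields no task because "send" is not a command in the media domain.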

    Multimodality in digital assistant systems

    Publication No.: US11348573B2

    Publication date: 2022-05-31

    Application No.: US16441461

    Filing date: 2019-06-14

    Applicant: Apple Inc.

    IPC classification: G10L15/22 G10L15/18 G10L15/30

    Abstract: Systems and processes for operating an intelligent automated assistant are provided. An example process for determining user intent includes receiving a natural language input and detecting an event. The process further includes determining, at a first time, based on the natural language input, a first value for a first node of a parsing structure; and determining, at a second time, based on the detected event, a second value for a second node of the parsing structure. The process further includes, in accordance with a determination that the first time and the second time are within a predetermined time: determining, using the parsing structure, the first value, and the second value, a user intent associated with the natural language input; initiating a task based on the determined intent; and providing an output indicative of the task.
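
The timing condition in this abstract can be illustrated with a small sketch. It assumes a parsing structure with just two nodes, hypothetically named `action` (filled from speech) and `target` (filled from a detected event such as a tap), and an assumed window of 2 seconds; the abstract specifies neither the node names nor the threshold.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple

# Assumed threshold; the abstract does not give a value.
PREDETERMINED_WINDOW_S = 2.0


@dataclass
class ParsingStructure:
    # node name -> (value, time in seconds at which the value was determined)
    nodes: Dict[str, Tuple[str, float]] = field(default_factory=dict)

    def set_node(self, name: str, value: str, at: float) -> None:
        self.nodes[name] = (value, at)


def resolve_intent(structure: ParsingStructure) -> Optional[str]:
    """Combine both node values only if they were determined close in time."""
    action = structure.nodes.get("action")  # hypothetical node, from speech
    target = structure.nodes.get("target")  # hypothetical node, from the event
    if action is None or target is None:
        return None
    (action_value, first_time), (target_value, second_time) = action, target
    if abs(first_time - second_time) > PREDETERMINED_WINDOW_S:
        return None  # inputs too far apart to be treated as one request
    return f"{action_value}({target_value})"
```

With "share" set at 10.0 s and "photo_42" set at 11.2 s, the two values are merged into one intent; if the second value arrives at 15.0 s instead, no intent is produced.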

    Reducing the need for manual start/end-pointing and trigger phrases

    Publication No.: US11133008B2

    Publication date: 2021-09-28

    Application No.: US16800456

    Filing date: 2020-02-25

    Applicant: Apple Inc.

    Abstract: Systems and processes for selectively processing and responding to a spoken user input are provided. In one example, audio input containing a spoken user input can be received at a user device. The spoken user input can be identified from the audio input by identifying start and end-points of the spoken user input. It can be determined whether or not the spoken user input was intended for a virtual assistant based on contextual information. The determination can be made using a rule-based system or a probabilistic system. If it is determined that the spoken user input was intended for the virtual assistant, the spoken user input can be processed and an appropriate response can be generated. If it is instead determined that the spoken user input was not intended for the virtual assistant, the spoken user input can be ignored and/or no response can be generated.
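
To make the contrast between the rule-based and the probabilistic determination concrete, here is a minimal sketch. The contextual signals, the 5-second cutoff, the weights, and the 0.5 threshold are all invented for illustration; the abstract does not say which contextual information or scoring is used.

```python
from dataclasses import dataclass


@dataclass
class Context:
    # Hypothetical contextual signals; the abstract does not enumerate them.
    seconds_since_last_response: float
    user_facing_device: bool
    other_speakers_present: bool


def intended_rule_based(ctx: Context) -> bool:
    """Rule-based variant: every hard condition must hold."""
    return ctx.user_facing_device and ctx.seconds_since_last_response < 5.0


def likelihood(ctx: Context) -> float:
    """Probabilistic variant: add up assumed weights for each signal."""
    score = 0.0
    if ctx.seconds_since_last_response < 5.0:
        score += 0.4
    if ctx.user_facing_device:
        score += 0.4
    if not ctx.other_speakers_present:
        score += 0.2
    return score


def intended_probabilistic(ctx: Context, threshold: float = 0.5) -> bool:
    return likelihood(ctx) >= threshold


def respond(ctx: Context, utterance: str, probabilistic: bool = True) -> str:
    """Process the utterance if it seems addressed to the assistant, else ignore it."""
    decide = intended_probabilistic if probabilistic else intended_rule_based
    return f"processing: {utterance}" if decide(ctx) else ""
```

The two variants can disagree: a user who faces the device alone but speaks 30 seconds after the last response fails the hard rule, yet still clears the assumed probabilistic threshold.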

    Device voice control
    Document type: Granted invention patent

    Publication No.: US11127397B2

    Publication date: 2021-09-21

    Application No.: US16139648

    Filing date: 2018-09-24

    Applicant: Apple Inc.

    IPC classification: G10L15/22 G06F3/16 G10L15/18

    Abstract: Systems and processes for device voice control are provided. An example process includes, at an electronic device, receiving a spoken user input and interpreting the spoken user input to derive a representation of user intent. The process further includes determining whether a task may be identified based on the representation of user intent. In accordance with a determination that a task may be identified based on the representation of user intent, the task is performed, and in accordance with a determination that a task may not be identified based on the representation of user intent, the spoken user input is disambiguated.
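
The perform-or-disambiguate branch described here can be sketched as below. The task table and the word-subset matching are hypothetical, and treating "exactly one matching task" as the condition for "a task may be identified" is an assumption of this sketch, not something the abstract states.

```python
from typing import List, Set, Tuple

# Hypothetical table of device tasks keyed by a canonical phrase.
TASKS = {
    "turn on the flashlight": "flashlight.on",
    "turn on the bluetooth": "bluetooth.on",
    "turn off the flashlight": "flashlight.off",
}


def candidate_tasks(intent_words: Set[str]) -> List[str]:
    """Tasks whose canonical phrase contains every word of the intent."""
    return sorted(
        task
        for phrase, task in TASKS.items()
        if intent_words <= set(phrase.split())
    )


def handle_voice_input(spoken: str) -> Tuple[str, List[str]]:
    """Return ("perform", [task]) or ("disambiguate", candidates)."""
    words = set(spoken.lower().split())  # toy representation of user intent
    candidates = candidate_tasks(words)
    if len(candidates) == 1:             # assumed meaning of "task identified"
        return ("perform", candidates)
    return ("disambiguate", candidates)  # ambiguous or unknown: ask the user
```

Here "turn on the flashlight" maps to a single task and is performed, whereas "turn on" matches two tasks and is routed to disambiguation along with the candidate list.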

    Reducing the need for manual start/end-pointing and trigger phrases

    Publication No.: US10373617B2

    Publication date: 2019-08-06

    Application No.: US15656793

    Filing date: 2017-07-21

    Applicant: Apple Inc.

    Abstract: Systems and processes for selectively processing and responding to a spoken user input are provided. In one example, audio input containing a spoken user input can be received at a user device. The spoken user input can be identified from the audio input by identifying start and end-points of the spoken user input. It can be determined whether or not the spoken user input was intended for a virtual assistant based on contextual information. The determination can be made using a rule-based system or a probabilistic system. If it is determined that the spoken user input was intended for the virtual assistant, the spoken user input can be processed and an appropriate response can be generated. If it is instead determined that the spoken user input was not intended for the virtual assistant, the spoken user input can be ignored and/or no response can be generated.