SYSTEMS AND METHODS FOR GENERATING SYNTHESIZED SPEECH RESPONSES TO VOICE INPUTS

    Publication No.: US20240153483A1

    Publication Date: 2024-05-09

    Application No.: US18387211

    Filing Date: 2023-11-06

    Applicant: ROVI GUIDES, INC.

    IPC Classification: G10L13/033 G10L25/63

    CPC Classification: G10L13/0335 G10L25/63

    Abstract: The system provides a synthesized speech response to a voice input based on the prosodic character of the voice input. The system receives the voice input and calculates at least one prosodic metric of the voice input. The at least one prosodic metric can be associated with a word, a phrase, a grouping thereof, or the entire voice input. The system also determines a response to the voice input, which may include the sequence of words that form the response. The system generates the synthesized speech response by determining prosodic characteristics based on the response and on the prosodic character of the voice input. The system outputs the synthesized speech response, which provides an answer to the voice input that is more natural, more relevant, or both. The prosodic character of the voice input and/or the response may include, for example, pitch, note, duration, prominence, timbre, rate, and rhythm.
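
    The flow in this abstract (measure prosodic metrics of the voice input, then let them shape the prosody of the response) can be illustrated with a minimal Python sketch. It is not the patented method: the `WordProsody` type, the two metrics chosen, the default values, and the linear blending rule are all assumptions made for illustration.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class WordProsody:
    """Hypothetical per-word measurements taken from the voice input."""
    word: str
    pitch_hz: float    # mean fundamental frequency over the word
    duration_s: float  # spoken duration of the word


def prosodic_metrics(words):
    """Compute two utterance-level metrics from per-word measurements."""
    total_duration = sum(w.duration_s for w in words)
    return {
        "mean_pitch_hz": mean(w.pitch_hz for w in words),
        "rate_wps": len(words) / total_duration,  # words per second
    }


def response_prosody(metrics, default_pitch_hz=180.0,
                     default_rate_wps=2.5, weight=0.5):
    """Blend assumed default synthesis settings toward the measured prosody.

    weight=0 ignores the speaker; weight=1 copies the speaker's metrics.
    """
    return {
        "pitch_hz": (1 - weight) * default_pitch_hz
                    + weight * metrics["mean_pitch_hz"],
        "rate_wps": (1 - weight) * default_rate_wps
                    + weight * metrics["rate_wps"],
    }


# Made-up measurements for the voice input "what time is it".
voice_input = [
    WordProsody("what", 220.0, 0.25),
    WordProsody("time", 200.0, 0.25),
    WordProsody("is", 210.0, 0.25),
    WordProsody("it", 250.0, 0.25),
]
metrics = prosodic_metrics(voice_input)
settings = response_prosody(metrics)
```

    In this sketch a fast, high-pitched question pulls the response settings upward from the defaults. A real system would pass such settings to a speech synthesizer together with the response text.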

    NON-TRANSITORY COMPUTER-READABLE RECORDING MEDIUM, SOUND PROCESSING METHOD, AND SOUND PROCESSING SYSTEM

    Publication No.: US20240135916A1

    Publication Date: 2024-04-25

    Application No.: US18483570

    Filing Date: 2023-10-10

    Inventor: Makoto TACHIBANA

    IPC Classification: G10L13/033 G10L13/047

    CPC Classification: G10L13/0335 G10L13/047

    Abstract: A non-transitory computer-readable recording medium stores a program that, when executed by a computer system, causes the computer system to perform a method including altering a first portion of first time-series data in accordance with an instruction from a user. The first time-series data indicates a time series of a sound characteristic corresponding to a first pronunciation style of a target sound to be synthesized. The method also includes generating second time-series data when a second pronunciation style different from the first pronunciation style is specified for the target sound. In the second time-series data, the first portion indicates the sound characteristic with the alteration made in accordance with the instruction from the user, and a second portion other than the first portion indicates the sound characteristic corresponding to the second pronunciation style.
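
    The behavior described here (a user edit to one portion of a characteristic curve survives a change of pronunciation style, while the rest of the curve follows the new style) can be sketched as follows. This is an illustrative assumption, not the claimed implementation: `generate_for_style`, the style names, and the constant pitch values are hypothetical stand-ins for a real synthesis model.

```python
def generate_for_style(style, length):
    """Hypothetical generator: returns a flat pitch curve per style."""
    base_pitch = {"soft": 100.0, "power": 120.0}[style]
    return [base_pitch] * length


def apply_user_edit(series, edited_range, new_value):
    """Alter the first portion [start, end) per the user's instruction."""
    start, end = edited_range
    edited = list(series)
    edited[start:end] = [new_value] * (end - start)
    return edited


def regenerate_with_edit(first_series, edited_range, second_style):
    """Build second time-series data: new style outside the edit,
    the user's altered values inside it."""
    start, end = edited_range
    second_series = generate_for_style(second_style, len(first_series))
    second_series[start:end] = first_series[start:end]
    return second_series


# First style, then a user edit on frames 2..3, then a style switch.
first = generate_for_style("soft", 6)
edited_range = (2, 4)
first_edited = apply_user_edit(first, edited_range, 110.0)
second = regenerate_with_edit(first_edited, edited_range, "power")
```

    Here `second` keeps the user's value of 110.0 in frames 2 and 3 and takes the "power" style value everywhere else.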

    Digital assistant voice input integration

    Publication No.: US11915696B2

    Publication Date: 2024-02-27

    Application No.: US17379777

    Filing Date: 2021-07-19

    Abstract: A digital assistant supported on devices such as smartphones, tablets, personal computers, and game consoles includes an extensibility client that exposes an interface and service that enable third-party applications to be integrated with the digital assistant so that the application user experiences are rendered using the native voice of the digital assistant. Specific voice inputs associated with a given application may be registered by developers using a manifest that is loaded when the application is launched on the device, so that voice inputs from the device user can be mapped by the digital assistant extensibility client to the appropriate application as input events for consumption. In typical implementations, the manifest is arranged as a declarative document that streamlines application development and provides a seamless user experience by enabling customization of third-party applications to integrate the digital assistant's voice and behaviors within the user experience of the application's domain.
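
    The manifest-based mapping described in this abstract can be pictured with a short Python sketch. The manifest layout, the `ExtensibilityClient` class and its methods, the application identifier, and the exact-phrase matching rule are all hypothetical and chosen only to illustrate the idea of routing registered voice inputs to an application as input events.

```python
# Hypothetical declarative manifest an application might supply at launch.
MOVIE_APP_MANIFEST = {
    "app_id": "example.movies",
    "voice_commands": [
        {"phrase": "play trailer", "event": "PLAY_TRAILER"},
        {"phrase": "show reviews", "event": "SHOW_REVIEWS"},
    ],
}


class ExtensibilityClient:
    """Illustrative router from registered phrases to application events."""

    def __init__(self):
        self._routes = {}  # normalized phrase -> (app_id, event)

    def load_manifest(self, manifest):
        """Register every voice command declared in the manifest."""
        for command in manifest["voice_commands"]:
            key = self._normalize(command["phrase"])
            self._routes[key] = (manifest["app_id"], command["event"])

    def dispatch(self, utterance):
        """Return (app_id, event) for a registered phrase, else None."""
        return self._routes.get(self._normalize(utterance))

    @staticmethod
    def _normalize(text):
        # Collapse case and whitespace so matching is less brittle.
        return " ".join(text.lower().split())


client = ExtensibilityClient()
client.load_manifest(MOVIE_APP_MANIFEST)
routed = client.dispatch("  Play   Trailer ")
unrouted = client.dispatch("order pizza")
```

    An unregistered utterance returns `None`, which a real assistant would presumably handle itself rather than forward to the application.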