Sample-efficient adaptive text-to-speech

    Publication number: US11355097B2

    Publication date: 2022-06-07

    Application number: US17061437

    Filing date: 2020-10-01

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating an adaptive audio-generation model. One of the methods includes generating an adaptive audio-generation model including learning a plurality of embedding vectors and parameter values of a neural network using training data comprising first text and audio data representing a plurality of different individual speakers speaking portions of the first text, wherein the plurality of embedding vectors represent respective voice characteristics of the plurality of different individual speakers. The adaptive audio-generation model is adapted for a new individual speaker using adaptation data comprising second text and audio data representing the new individual speaker speaking portions of the second text, the new individual speaker being different from each of the plurality of individual speakers, wherein adapting the audio-generation model includes learning a new embedding vector for the new individual speaker.
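
    The adaptation step in the abstract can be illustrated with a toy sketch. It assumes, beyond what the abstract states, that the shared parameters stay frozen during adaptation and that only a new embedding is fitted by gradient descent. The linear "model", the names `predict` and `adapt_new_speaker`, and all numbers are hypothetical stand-ins, not the patented network.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def predict(shared, embedding, text_feature):
    # `shared` = (w_text, readout) stands in for trained network parameters.
    w_text, readout = shared
    return w_text * text_feature + dot(readout, embedding)


def adapt_new_speaker(shared, adaptation_data, dim=2, steps=200, lr=0.1):
    """Fit only a new speaker embedding; `shared` is never modified (assumption)."""
    _, readout = shared
    embedding = [0.0] * dim
    n = len(adaptation_data)
    for _ in range(steps):
        grad = [0.0] * dim
        for text_feature, target in adaptation_data:
            err = predict(shared, embedding, text_feature) - target
            for i in range(dim):
                # Gradient of the mean squared error w.r.t. embedding[i].
                grad[i] += 2.0 * err * readout[i] / n
        embedding = [e - lr * g for e, g in zip(embedding, grad)]
    return embedding


# Toy "new speaker": targets differ from the shared model by an offset of 0.7.
shared = (2.0, [1.0, -0.5])
adaptation_data = [(x, 2.0 * x + 0.7) for x in (0.0, 1.0, 2.0)]
new_embedding = adapt_new_speaker(shared, adaptation_data)
```

    Because only the embedding is updated, a handful of adaptation pairs is enough in this toy setting, which is the sample-efficiency idea the title points at.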

    Black-box optimization using neural networks

    Publication number: US11354594B2

    Publication date: 2022-06-07

    Application number: US16601505

    Filing date: 2019-10-14

    Abstract: Methods and systems for determining an optimized setting for one or more process parameters of a machine learning training process are described. One of the methods includes processing a current network input using a recurrent neural network in accordance with first values of the network parameters to obtain a current network output, obtaining a measure of the performance of the machine learning training process with an updated setting defined by the current network output, and generating a new network input that includes (i) the updated setting defined by the current network output and (ii) the measure of the performance of the training process with the updated setting defined by the current network output.
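
    The feedback loop described in the abstract can be sketched as follows. This is a minimal illustration only: the one-unit recurrent cell, its untrained weights, and the `performance` function are hypothetical stand-ins for the recurrent neural network and the machine learning training process.

```python
import math


def rnn_step(params, hidden, net_input):
    """One step of a tiny recurrent cell (hypothetical, untrained weights)."""
    setting, perf = net_input
    new_hidden = math.tanh(
        params["w_h"] * hidden + params["w_s"] * setting + params["w_p"] * perf
    )
    # The network output defines the updated setting to try next.
    return new_hidden, params["w_out"] * new_hidden


def performance(setting):
    # Stand-in for running the training process and measuring its quality.
    return -((setting - 0.3) ** 2)


def run_loop(params, steps=5):
    hidden = 0.0
    net_input = (0.0, performance(0.0))
    history = []
    for _ in range(steps):
        hidden, setting = rnn_step(params, hidden, net_input)
        perf = performance(setting)
        history.append((setting, perf))
        # New network input: (i) the updated setting, (ii) its measured performance.
        net_input = (setting, perf)
    return history


params = {"w_h": 0.5, "w_s": 0.8, "w_p": -1.0, "w_out": 1.0}
history = run_loop(params)
```

    With arbitrary weights the loop does not optimize anything; the sketch shows only how each output and its measured performance become the next input.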

    NEURAL PROGRAMMING

    Patent type: Invention application (under examination, published)

    Publication number: US20200327413A1

    Publication date: 2020-10-15

    Application number: US16859811

    Filing date: 2020-04-27

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for neural programming. One of the methods includes processing a current neural network input using a core recurrent neural network to generate a neural network output; determining, from the neural network output, whether or not to end a currently invoked program and to return to a calling program from the set of programs; determining, from the neural network output, a next program to be called; determining, from the neural network output, contents of arguments to the next program to be called; receiving a representation of a current state of the environment; and generating a next neural network input from an embedding for the next program to be called and the representation of the current state of the environment.
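
    The control loop in this abstract can be sketched with the core recurrent neural network replaced by a hand-written stub. Everything named here (`stub_core`, `PROGRAM_EMBEDDINGS`, the counter environment) is a hypothetical illustration; only the shape of the loop follows the abstract.

```python
from dataclasses import dataclass


@dataclass
class CoreOutput:
    end_program: bool   # whether to end the current program and return to the caller
    next_program: str   # next program to be called
    args: tuple         # contents of the arguments to that program


# Hypothetical program embeddings.
PROGRAM_EMBEDDINGS = {"INC": (1.0, 0.0), "NOOP": (0.0, 1.0)}


def stub_core(net_input):
    """Hand-written stand-in for the core recurrent neural network."""
    _embedding, (counter, target) = net_input
    if counter >= target:
        return CoreOutput(True, "NOOP", ())
    return CoreOutput(False, "INC", (1,))


def run_program(start_program, state, max_steps=20):
    net_input = (PROGRAM_EMBEDDINGS[start_program], state)
    trace = []
    for _ in range(max_steps):
        out = stub_core(net_input)
        if out.end_program:
            break
        if out.next_program == "INC":
            # Calling the program changes the environment state.
            state = (state[0] + out.args[0], state[1])
        trace.append((out.next_program, out.args))
        # Next input: embedding of the called program plus the current state.
        net_input = (PROGRAM_EMBEDDINGS[out.next_program], state)
    return state, trace


final_state, trace = run_program("INC", (0, 3))
```

    A real system would learn the core and the embeddings jointly and keep a call stack for nested programs; the stub keeps a single level to show the input/output cycle.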

    Neural programming

    Patent type: Invention grant

    Publication number: US12260334B2

    Publication date: 2025-03-25

    Application number: US18497924

    Filing date: 2023-10-30

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for neural programming. One of the methods includes processing a current neural network input using a core recurrent neural network to generate a neural network output; determining, from the neural network output, whether or not to end a currently invoked program and to return to a calling program from the set of programs; determining, from the neural network output, a next program to be called; determining, from the neural network output, contents of arguments to the next program to be called; receiving a representation of a current state of the environment; and generating a next neural network input from an embedding for the next program to be called and the representation of the current state of the environment.

    PROGRAMMABLE REINFORCEMENT LEARNING SYSTEMS

    Publication number: US20240394504A1

    Publication date: 2024-11-28

    Application number: US18637279

    Filing date: 2024-04-16

    Abstract: A reinforcement learning system is proposed comprising a plurality of property detector neural networks. Each property detector neural network is arranged to receive data representing an object within an environment, and to generate property data associated with a property of the object. A processor is arranged to receive an instruction indicating a task associated with an object having an associated property, and process the output of the plurality of property detector neural networks based upon the instruction to generate a relevance data item. The relevance data item indicates objects within the environment associated with the task. The processor also generates a plurality of weights based upon the relevance data item, and, based on the weights, generates modified data representing the plurality of objects within the environment. A neural network is arranged to receive the modified data and to output an action associated with the task.
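
    The data flow in this abstract (property detectors, relevance, weights, modified object data) can be sketched as below. The detectors are plain functions standing in for neural networks, and combining properties by multiplication and normalizing the result into weights are assumptions made for illustration; the abstract does not specify either.

```python
# Hypothetical property detectors: each maps an object to a score in [0, 1].
def detect_red(obj):
    return obj["redness"]


def detect_cube(obj):
    return obj["cubeness"]


DETECTORS = {"red": detect_red, "cube": detect_cube}


def relevance(objects, instruction):
    """Score each object against every property named in the instruction."""
    scores = []
    for obj in objects:
        score = 1.0
        for prop in instruction:
            score *= DETECTORS[prop](obj)  # conjunction as a product (assumption)
        scores.append(score)
    return scores


def weights_from_relevance(scores):
    total = sum(scores)
    return [s / total if total > 0 else 0.0 for s in scores]


def modify(objects, weights):
    """Scale each object's features by its weight before the policy sees them."""
    return [[w * f for f in obj["features"]] for obj, w in zip(objects, weights)]


objects = [
    {"redness": 1.0, "cubeness": 1.0, "features": [2.0, 4.0]},
    {"redness": 1.0, "cubeness": 0.0, "features": [1.0, 1.0]},
]
scores = relevance(objects, ("red", "cube"))
weights = weights_from_relevance(scores)
modified = modify(objects, weights)
```

    The modified data would then be passed to the action-selecting neural network, which is omitted here.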

    ADAPTIVE VISUAL SPEECH RECOGNITION

    Patent type: Invention publication

    Publication number: US20240265911A1

    Publication date: 2024-08-08

    Application number: US18571553

    Filing date: 2022-06-15

    CPC classification number: G10L15/063 G10L25/30

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for processing video data using an adaptive visual speech recognition model. One of the methods includes receiving a video that includes a plurality of video frames that depict a first speaker; obtaining a first embedding characterizing the first speaker; and processing a first input comprising (i) the video and (ii) the first embedding using a visual speech recognition neural network having a plurality of parameters, wherein the visual speech recognition neural network is configured to process the video and the first embedding in accordance with trained values of the parameters to generate a speech recognition output that defines a sequence of one or more words being spoken by the first speaker in the video.

    Neural programming

    Patent type: Invention grant

    Publication number: US11803746B2

    Publication date: 2023-10-31

    Application number: US16859811

    Filing date: 2020-04-27

    CPC classification number: G06N3/08 G06N3/044 G06N20/00

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for neural programming. One of the methods includes processing a current neural network input using a core recurrent neural network to generate a neural network output; determining, from the neural network output, whether or not to end a currently invoked program and to return to a calling program from the set of programs; determining, from the neural network output, a next program to be called; determining, from the neural network output, contents of arguments to the next program to be called; receiving a representation of a current state of the environment; and generating a next neural network input from an embedding for the next program to be called and the representation of the current state of the environment.

    VISUAL SPEECH RECOGNITION BY PHONEME PREDICTION

    Publication number: US20210110831A1

    Publication date: 2021-04-15

    Application number: US17043846

    Filing date: 2019-05-20

    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for performing visual speech recognition. In one aspect, a method comprises receiving a video comprising a plurality of video frames, wherein each video frame depicts a pair of lips; processing the video using a visual speech recognition neural network to generate, for each output position in an output sequence, a respective output score for each token in a vocabulary of possible tokens, wherein the visual speech recognition neural network comprises one or more volumetric convolutional neural network layers and one or more time-aggregation neural network layers; wherein the vocabulary of possible tokens comprises a plurality of phonemes; and determining a sequence of words expressed by the pair of lips depicted in the video using the output scores.
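
    The last step of this abstract, turning per-position phoneme scores into words, can be sketched as below. The abstract does not say how the word sequence is determined from the scores; greedy selection followed by a pronunciation-lexicon lookup is an assumption, and `VOCAB`, `LEXICON`, and the `<sp>` separator token are hypothetical.

```python
# Hypothetical phoneme vocabulary; "<sp>" marks a word boundary.
VOCAB = ["HH", "AY", "B", "<sp>"]

# Hypothetical pronunciation lexicon mapping phoneme sequences to words.
LEXICON = {("HH", "AY"): "hi", ("B", "AY"): "bye"}


def greedy_phonemes(scores):
    """Pick the highest-scoring token at each output position."""
    return [VOCAB[max(range(len(row)), key=row.__getitem__)] for row in scores]


def phonemes_to_words(phonemes):
    words, current = [], []
    for p in phonemes + ["<sp>"]:  # trailing separator flushes the last word
        if p == "<sp>":
            if current:
                words.append(LEXICON.get(tuple(current), "<unk>"))
                current = []
        else:
            current.append(p)
    return words


# One row of scores per output position, one column per token in VOCAB.
scores = [
    [0.90, 0.05, 0.03, 0.02],
    [0.10, 0.80, 0.05, 0.05],
    [0.00, 0.00, 0.10, 0.90],
    [0.10, 0.10, 0.70, 0.10],
    [0.10, 0.70, 0.10, 0.10],
]
phonemes = greedy_phonemes(scores)
words = phonemes_to_words(phonemes)
```

    In practice the scores would come from the visual speech recognition neural network, and a decoder with a language model would typically replace the greedy lookup.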
