IMAGE IN-PAINTING FOR IRREGULAR HOLES USING PARTIAL CONVOLUTIONS

    Publication number: US20190295228A1

    Publication date: 2019-09-26

    Application number: US16360895

    Application date: 2019-03-21

    Abstract: A neural network architecture is disclosed for performing image in-painting using partial convolution operations. The neural network processes an image and a corresponding mask that identifies holes in the image utilizing partial convolution operations, where the mask is used by the partial convolution operation to zero out coefficients of the convolution kernel corresponding to invalid pixel data for the holes. The mask is updated after each partial convolution operation is performed in an encoder section of the neural network. In one embodiment, the neural network is implemented using an encoder-decoder framework with skip links to forward representations of the features at different sections of the encoder to corresponding sections of the decoder.
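The masking and mask-update steps described in this abstract can be illustrated with a minimal single-channel sketch in NumPy. This is not the patented implementation: the function name `partial_conv2d`, the stride-1 unpadded layout, and the rescaling by the fraction of valid pixels are assumptions made for illustration (the rescaling follows common published formulations of partial convolution and is not stated in the abstract).

```python
import numpy as np

def partial_conv2d(image, mask, kernel, bias=0.0):
    """Single-channel partial convolution, stride 1, no padding (illustrative).

    image, mask: 2-D arrays of equal shape; mask is 1 for valid pixels, 0 for holes.
    Returns (output, updated_mask).
    """
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    output = np.zeros((out_h, out_w))
    new_mask = np.zeros((out_h, out_w))
    window_size = kh * kw

    for i in range(out_h):
        for j in range(out_w):
            m = mask[i:i + kh, j:j + kw]
            valid = m.sum()
            if valid > 0:
                x = image[i:i + kh, j:j + kw]
                # Multiplying by the mask zeroes the kernel coefficients
                # that line up with hole pixels.
                response = np.sum(kernel * m * x)
                # Assumed rescaling: compensate for the missing pixels.
                output[i, j] = response * (window_size / valid) + bias
                # Mask update: the output location becomes valid if the
                # window saw at least one valid input pixel.
                new_mask[i, j] = 1.0
            # Otherwise the output and the updated mask both stay 0.
    return output, new_mask
```

Applying the function repeatedly, passing each updated mask to the next layer, shrinks the hole region step by step, which matches the abstract's statement that the mask is updated after each partial convolution in the encoder.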

    SYNTHESIZING SPEECH IN MULTIPLE LANGUAGES IN CONVERSATIONAL AI SYSTEMS AND APPLICATIONS

    Publication number: US20250118286A1

    Publication date: 2025-04-10

    Application number: US18483342

    Application date: 2023-10-09

    Abstract: In various examples, synthesizing speech in multiple languages in conversational AI systems and applications is described herein. Systems and methods are disclosed that use one or more models to synthesize speech from a first language spoken by a speaker to a second, target language selected by the speaker. In some examples, to perform the translation, the model(s) may disentangle one or more attributes associated with speech from speakers, such as speakers' identities, speakers' accents, and text associated with the speech. Additionally, the model(s) may allow for fine-grained control of additional attributes associated with output speech, such as one or more frequencies, one or more energies, and one or more phoneme durations. Furthermore, the model(s) may be configured to use the accent associated with the target language when generating text, such as when aligning text encodings with one or more phonemes.

    DIALOGUE SYSTEMS USING KNOWLEDGE BASES AND LANGUAGE MODELS FOR AUTOMOTIVE SYSTEMS AND APPLICATIONS

    Publication number: US20240095460A1

    Publication date: 2024-03-21

    Application number: US17947491

    Application date: 2022-09-19

    CPC classification number: G06F40/35

    Abstract: In various examples, systems and methods that use dialogue systems associated with various machine systems and applications are described. For instance, the systems and methods may receive text data representing speech, such as a question associated with a vehicle or other machine type. The systems and methods then use a retrieval system(s) to retrieve a question/answer pair(s) associated with the text data and/or contextual information associated with the text data. In some examples, the contextual information is associated with a knowledge base associated with or corresponding to the vehicle. The systems and methods then generate a prompt using the text data, the question/answer pair(s), and/or the contextual information. Additionally, the systems and methods determine, using a language model(s) and based at least on the prompt, an output associated with the text data. For instance, the output may include information that answers the question associated with the vehicle.
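The retrieve-then-prompt flow in this abstract can be sketched roughly as follows. Everything here is hypothetical and chosen only for illustration: the helper names, the word-overlap ranking (a stand-in for whatever retrieval system the application actually uses), and the prompt layout. The language model call itself is left out.

```python
import re

def _words(text):
    """Lowercase word set, punctuation ignored."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve_qa_pairs(question, qa_pairs, top_k=1):
    """Rank stored (question, answer) pairs by word overlap with the query.

    Word overlap is a toy stand-in for a real retrieval system.
    """
    query_words = _words(question)

    def overlap(pair):
        return len(query_words & _words(pair[0]))

    return sorted(qa_pairs, key=overlap, reverse=True)[:top_k]

def build_prompt(question, qa_pairs, context):
    """Combine context, retrieved Q/A pairs, and the new question into a prompt."""
    lines = ["Context:", context, "", "Examples:"]
    for q, a in qa_pairs:
        lines.append(f"Q: {q}")
        lines.append(f"A: {a}")
    lines += ["", f"Q: {question}", "A:"]
    return "\n".join(lines)
```

The resulting string would then be passed to a language model, whose completion after the final `A:` serves as the answer to the vehicle-related question.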
