MACHINE-LEARNING-BASED SPEECH PRODUCTION CORRECTION

    Publication number: US20240304200A1

    Publication date: 2024-09-12

    Application number: US18276171

    Application date: 2022-02-08

    CPC classification number: G10L21/007 G10L15/04

    Abstract: A system and method of speech modification may include: receiving a recorded speech, comprising one or more phonemes uttered by a speaker; segmenting the recorded speech into one or more phoneme segments (PS), each representing an uttered phoneme; selecting a phoneme segment (PSk) of the one or more phoneme segments (PS); extracting a portion of the recorded speech, said portion corresponding to a first timeframe (T̃) that comprises the selected phoneme segment; receiving a representation of a phoneme of interest (P*); and applying a machine learning (ML) model on (a) the extracted portion of the recorded speech and (b) the representation of the phoneme of interest (P*), to generate a modified version of the extracted portion of recorded speech, wherein the phoneme of interest (P*) substitutes the selected phoneme segment (PSk).
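
The steps named in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. Every name in it (PhonemeSegment, timeframe_around, substitute_phoneme, the margin parameter) is hypothetical. The sketch assumes the segmentation is already available as start/end times, that the timeframe is the selected segment widened by a fixed margin, and that the modified portion is spliced back into the recording; the abstract specifies none of these details. The ML model is represented only by a caller-supplied function.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


@dataclass
class PhonemeSegment:
    """One phoneme segment (PS): a label and its time span in seconds."""
    label: str
    start: float
    end: float


def timeframe_around(segments: Sequence[PhonemeSegment], k: int,
                     margin: float, total_duration: float) -> Tuple[float, float]:
    """Return a timeframe that contains segment k, clamped to the recording.

    Assumption: the timeframe is the segment widened by `margin` seconds
    on each side. The abstract only requires that it comprise the segment.
    """
    seg = segments[k]
    return max(0.0, seg.start - margin), min(total_duration, seg.end + margin)


def substitute_phoneme(samples: Sequence[float], sample_rate: int,
                       segments: Sequence[PhonemeSegment], k: int,
                       target_repr: object,
                       model: Callable[[List[float], object], Sequence[float]],
                       margin: float = 0.1) -> List[float]:
    """Extract the portion around segment k, run the model, splice the result.

    `model` stands in for the ML model: it receives the extracted portion
    and the representation of the phoneme of interest, and returns the
    modified portion. Splicing back is an assumption of this sketch.
    """
    total_duration = len(samples) / sample_rate
    t0, t1 = timeframe_around(segments, k, margin, total_duration)
    i0, i1 = int(round(t0 * sample_rate)), int(round(t1 * sample_rate))
    portion = list(samples[i0:i1])
    modified = list(model(portion, target_repr))
    return list(samples[:i0]) + modified + list(samples[i1:])
```

A caller would supply its own segmentation and model; for example, substitute_phoneme(samples, 16000, segments, k=1, target_repr="r", model=my_model), where my_model is any function with the signature shown above.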
