SPEECH SYNTHESIS METHOD AND APPARATUS AND COMPUTER READABLE STORAGE MEDIUM USING THE SAME

    Publication number: US20210193113A1

    Publication date: 2021-06-24

    Application number: US17115729

    Filing date: 2020-12-08

    Abstract: The present disclosure provides a speech synthesis method as well as an apparatus and a computer readable storage medium using the same. The method includes: obtaining a to-be-synthesized text, and extracting to-be-processed Mel spectrum features of the to-be-synthesized text through a preset speech feature extraction algorithm; inputting the to-be-processed Mel spectrum features into a preset ResUnet network model to obtain first intermediate features; performing an average pooling and a first down sampling on the to-be-processed Mel spectrum features to obtain second intermediate features; taking the second intermediate features and the first intermediate features output by the ResUnet network model as an input to perform a deconvolution and a first up sampling so as to obtain target Mel spectrum features corresponding to the to-be-processed Mel spectrum features; and converting the target Mel spectrum features into a target speech corresponding to the to-be-synthesized text.
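    The pooling step named in the abstract can be illustrated with a minimal sketch. It assumes the Mel spectrum features are a list of frames (each a list of Mel-bin values) and that average pooling and down sampling are combined into one strided average along the time axis. The function name `avg_pool_downsample` and the `factor` parameter are hypothetical and do not come from the patent.

```python
def avg_pool_downsample(mel, factor=2):
    """Average each run of `factor` consecutive frames into one frame.

    Assumption: pooling runs along the time axis only; a trailing
    partial window is dropped.
    """
    if factor < 1:
        raise ValueError("factor must be >= 1")
    pooled = []
    for start in range(0, len(mel) - factor + 1, factor):
        window = mel[start:start + factor]
        # zip(*window) groups values of the same Mel bin across frames.
        pooled.append([sum(bin_vals) / factor for bin_vals in zip(*window)])
    return pooled


if __name__ == "__main__":
    frames = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
    print(avg_pool_downsample(frames))
```

    With four two-bin frames and `factor=2`, the sketch returns two frames, halving the time resolution while keeping the number of Mel bins unchanged.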

    METHOD AND DEVICE FOR SYNTHESIZING TALKING HEAD VIDEO AND COMPUTER-READABLE STORAGE MEDIUM

    Publication number: US20240428493A1

    Publication date: 2024-12-26

    Application number: US18736552

    Filing date: 2024-06-07

    Abstract: A method for synthesizing a talking head video includes: obtaining speech data to be synthesized and observation data, wherein the observation data is data obtained through observation other than the speech data; performing feature extraction on the speech data to obtain speech features corresponding to the speech data, and performing feature extraction on the observation data to obtain non-speech features corresponding to the observation data; performing temporal modeling on the speech features and first non-speech features to obtain low-dimensional representations, wherein the first non-speech features are non-speech features that are sensitive to temporal changes; and performing video synthesis based on the low-dimensional representations and second non-speech features, wherein the second non-speech features are non-speech features insensitive to temporal changes.

    Context-based multi-turn dialogue method and storage medium

    Publication number: US11941366B2

    Publication date: 2024-03-26

    Application number: US17102395

    Filing date: 2020-11-23

    CPC classification number: G06F40/35 G06F40/284 G06N3/049

    Abstract: The present disclosure discloses a context-based multi-turn dialogue method. The method includes: obtaining to-be-matched historical dialogue information; performing a word feature extraction based on the to-be-matched historical dialogue information to obtain a historical dialogue word embedding; obtaining candidate answer information; performing the word feature extraction based on the candidate answer information to obtain a candidate answer word embedding; obtaining a historical dialogue partial matching vector and a candidate answer partial matching vector by performing partial semantic relationship matching based on the historical dialogue word embedding and the candidate answer word embedding; obtaining a candidate answer matching probability by performing a matching probability calculation based on the historical dialogue partial matching vector and the candidate answer partial matching vector; and determining matched answer information based on the candidate answer information and the candidate answer matching probability.

    Computer-implemented method for text conversion, computer device, and non-transitory computer readable storage medium

    Publication number: US11645474B2

    Publication date: 2023-05-09

    Application number: US17133673

    Filing date: 2020-12-24

    CPC classification number: G06F40/40 G10L13/08

    Abstract: A computer-implemented method for text conversion, a computer device, and a non-transitory computer readable storage medium are provided. The method includes: obtaining a text to be converted; performing a non-standard word recognition on the text to be converted, to determine whether the text to be converted includes a non-standard word; recognizing the non-standard word in the text to be converted by using an eXtreme Gradient Boosting model in response to the text to be converted including the non-standard word; and obtaining a target converted text corresponding to the text to be converted, according to a recognition result outputted by the eXtreme Gradient Boosting model. The method has a faster recognition speed and a higher recognition accuracy compared with the deep learning model.
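    The two-stage flow in the abstract (first check whether the text contains a non-standard word, then run the classifier only if it does) might be sketched as follows. The regular expression used as the detection rule and the `expand` callable are assumptions made for illustration. `expand` stands in for the eXtreme Gradient Boosting model plus the conversion step, neither of which is specified in the abstract.

```python
import re

# Assumption: a token containing a digit or one of these symbols counts
# as a non-standard word. The patent does not define the detection rule.
NSW_PATTERN = re.compile(r"\S*[\d%$/:]\S*")


def convert(text, expand):
    """Return `text` with each non-standard word replaced by expand(word).

    `expand` is a hypothetical callable standing in for the classifier
    and the conversion driven by its recognition result.
    """
    if not NSW_PATTERN.search(text):
        # No non-standard word found: skip the model entirely.
        return text
    return NSW_PATTERN.sub(lambda m: expand(m.group(0)), text)


if __name__ == "__main__":
    print(convert("the fee is 5 dollars", lambda word: "five"))
```

    The early return reflects the "in response to" condition in the abstract: text without any non-standard word never reaches the model.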

    COMPUTER-IMPLEMENTED METHOD FOR TEXT CONVERSION, COMPUTER DEVICE, AND NON-TRANSITORY COMPUTER READABLE STORAGE MEDIUM

    Publication number: US20210200962A1

    Publication date: 2021-07-01

    Application number: US17133673

    Filing date: 2020-12-24

    Abstract: A computer-implemented method for text conversion, a computer device, and a non-transitory computer readable storage medium are provided. The method includes: obtaining a text to be converted; performing a non-standard word recognition on the text to be converted, to determine whether the text to be converted includes a non-standard word; recognizing the non-standard word in the text to be converted by using an eXtreme Gradient Boosting model in response to the text to be converted including the non-standard word; and obtaining a target converted text corresponding to the text to be converted, according to a recognition result outputted by the eXtreme Gradient Boosting model. The method has a faster recognition speed and a higher recognition accuracy compared with the deep learning model.
