SPEECH WAKE-UP METHOD, ELECTRONIC DEVICE, AND STORAGE MEDIUM

    Publication number: US20240420684A1

    Publication date: 2024-12-19

    Application number: US18706313

    Application date: 2023-01-17

    Abstract: A speech wake-up method, an electronic device, and a storage medium are provided. The method includes: performing a word recognition on a speech to be recognized to obtain a wake-up word recognition result (S210); performing a syllable recognition on the speech to be recognized to obtain a wake-up syllable recognition result, in response to determining that the wake-up word recognition result represents that the speech to be recognized contains a predetermined wake-up word (S220); and determining that the speech to be recognized is a correct wake-up speech, in response to determining that the wake-up syllable recognition result represents that the speech to be recognized contains a predetermined syllable (S230).
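
    The two-stage check in steps S210 to S230 can be sketched as follows. This is a minimal illustration, not the claimed implementation: the recognizers, the dictionary representation of the speech, and the names `WAKE_WORD` and `WAKE_SYLLABLES` are all hypothetical stand-ins for the models the abstract leaves unspecified.

```python
from typing import Callable

WAKE_WORD = "hello-device"       # hypothetical predetermined wake-up word
WAKE_SYLLABLES = ("hel", "lo")   # hypothetical predetermined syllables


def toy_word_recognizer(speech: dict) -> bool:
    """Stand-in for word recognition (S210)."""
    return WAKE_WORD in speech["words"]


def toy_syllable_recognizer(speech: dict) -> bool:
    """Stand-in for syllable recognition (S220)."""
    return all(s in speech["syllables"] for s in WAKE_SYLLABLES)


def is_correct_wake_up(
    speech: dict,
    word_hit: Callable[[dict], bool],
    syllable_hit: Callable[[dict], bool],
) -> bool:
    # S210: word-level recognition runs first.
    if not word_hit(speech):
        return False
    # S220: syllable recognition runs only after the word stage fires.
    # S230: the speech counts as a correct wake-up only if it also passes.
    return syllable_hit(speech)
```

    Running the syllable stage only after a word-level hit means the second check is never invoked for speech the first stage already rejects.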

    AUDIO RECOGNITION METHOD, METHOD OF TRAINING AUDIO RECOGNITION MODEL, AND ELECTRONIC DEVICE

    Publication number: US20230410794A1

    Publication date: 2023-12-21

    Application number: US18237976

    Application date: 2023-08-25

    CPC classification number: G10L15/063 G10L15/26 G10L15/02

    Abstract: An audio recognition method, a method of training an audio recognition model, and an electronic device are provided, which relate to the fields of artificial intelligence, speech recognition, deep learning and natural language processing. The audio recognition method includes: truncating an audio feature of target audio data to obtain at least one first audio sequence feature corresponding to a predetermined duration; obtaining, according to peak information of the audio feature, peak sub-information corresponding to the first audio sequence feature; performing at least one decoding operation on the first audio sequence feature to obtain a recognition result for the first audio sequence feature, the number of times the decoding operation is performed being identical to the number of peaks corresponding to the first audio sequence feature; and obtaining target text data for the target audio data according to the recognition result for the at least one first audio sequence feature.
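
    A rough sketch of the truncate-then-decode loop is given below. It assumes that peak information is a list of frame indices and that a chunk is a fixed number of frames; the abstract states neither. `toy_decode_step` is a hypothetical placeholder for the real decoder, which the abstract does not describe.

```python
def split_with_peaks(num_frames, peak_frames, chunk_len):
    """Truncate into fixed-length chunks and attach each chunk's local peaks."""
    chunks = []
    for start in range(0, num_frames, chunk_len):
        end = min(start + chunk_len, num_frames)
        # Peak sub-information: peaks inside this chunk, re-indexed to the chunk.
        local_peaks = [p - start for p in peak_frames if start <= p < end]
        chunks.append((start, end, local_peaks))
    return chunks


def toy_decode_step(chunk, peak):
    """Hypothetical decoder: emits the symbol stored at the peak frame."""
    return chunk[peak]


def recognize(features, peak_frames, chunk_len, decode_step=toy_decode_step):
    pieces = []
    for start, end, local_peaks in split_with_peaks(len(features), peak_frames, chunk_len):
        chunk = features[start:end]
        # One decoding operation per peak in the chunk.
        for peak in local_peaks:
            pieces.append(decode_step(chunk, peak))
    # Target text is assembled from the per-chunk recognition results.
    return "".join(pieces)
```

    With eight toy frames `list("abcdefgh")`, peaks at frames 1, 2 and 6, and a chunk length of 4, the first chunk is decoded twice and the second once.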

    METHOD OF PROCESSING SPEECH INFORMATION, METHOD OF TRAINING MODEL, AND WAKE-UP METHOD

    Publication number: US20230360638A1

    Publication date: 2023-11-09

    Application number: US18221593

    Application date: 2023-07-13

    CPC classification number: G10L15/02 G10L15/14 G10L2015/027

    Abstract: A method of processing speech information, a method of training a speech model, a speech wake-up method, an electronic device, and a storage medium are provided, which relate to the field of artificial intelligence technology, in particular to the fields of human-computer interaction, deep learning and intelligent speech technologies. A specific implementation solution includes: performing syllable recognition on speech information to obtain a posterior probability sequence for the speech information, where the speech information includes a speech frame sequence, the posterior probability sequence corresponds to the speech frame sequence, and each posterior probability in the posterior probability sequence represents a similarity between a syllable in a speech frame matched with the posterior probability and a predetermined syllable; and determining a target peak speech frame from the speech frame sequence based on the posterior probability sequence.
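
    One plausible way to pick peak frames from a posterior probability sequence is sketched below. The abstract does not say how the target peak speech frame is chosen, so the local-maximum rule and the `threshold` parameter are assumptions made only for illustration.

```python
def peak_frames(posteriors, threshold=0.5):
    """Return indices of frames whose posterior is a local maximum above threshold."""
    peaks = []
    for i, p in enumerate(posteriors):
        left = posteriors[i - 1] if i > 0 else float("-inf")
        right = posteriors[i + 1] if i + 1 < len(posteriors) else float("-inf")
        # Strictly greater than the left neighbour and at least the right one,
        # so a flat plateau yields a single peak at its first frame.
        if p >= threshold and p > left and p >= right:
            peaks.append(i)
    return peaks
```

    For the sequence `[0.1, 0.2, 0.9, 0.3, 0.1, 0.7, 0.7, 0.2]`, this rule selects frames 2 and 5.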

    METHOD AND APPARATUS FOR TRAINING VOICE WAKE-UP MODEL, METHOD AND APPARATUS FOR VOICE WAKE-UP, DEVICE, AND STORAGE MEDIUM

    Publication number: US20230317060A1

    Publication date: 2023-10-05

    Application number: US18328135

    Application date: 2023-06-02

    CPC classification number: G10L15/063 G10L15/02

    Abstract: The present disclosure provides a method and an apparatus for training a voice wake-up model, a method and an apparatus for voice wake-up, a device and a storage medium, which relate to the field of artificial intelligence and particularly to the fields of deep learning and voice technology. A specific implementation includes: acquiring created voice recognition training data and voice wake-up training data; first training a base model on the voice recognition training data to obtain a model parameter of the base model when a model loss function converges; then updating, based on a model configuration instruction, a configuration parameter of a decoding module in the base model to obtain a first model; and finally training the first model on the voice wake-up training data to obtain a trained voice wake-up model when the model loss function converges.
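
    The order of the three stages (pre-train, reconfigure the decoder, fine-tune) can be illustrated with a deliberately tiny stand-in. Everything here is hypothetical: `ToyModel` has a single weight, `decoder_outputs` plays the role of the decoding module's configuration parameter, and the squared-error loss and convergence test are chosen only to keep the sketch runnable.

```python
from dataclasses import dataclass, replace


@dataclass
class ToyModel:
    weight: float = 0.0
    decoder_outputs: int = 5000  # hypothetical decoder configuration parameter


def train_until_converged(model, data, lr=0.1, tol=1e-6, max_steps=10_000):
    """Gradient descent on mean squared error until the loss stops changing."""
    previous_loss = float("inf")
    for _ in range(max_steps):
        loss = sum((model.weight - x) ** 2 for x in data) / len(data)
        if abs(previous_loss - loss) < tol:
            break
        gradient = sum(2 * (model.weight - x) for x in data) / len(data)
        model.weight -= lr * gradient
        previous_loss = loss
    return model


def build_wake_up_model(recognition_data, wake_up_data, decoder_outputs):
    # Stage 1: train the base model on recognition data until convergence.
    base = train_until_converged(ToyModel(), recognition_data)
    # Stage 2: keep the learned parameter, change only the decoder configuration.
    first = replace(base, decoder_outputs=decoder_outputs)
    # Stage 3: continue training on wake-up data until convergence.
    return train_until_converged(first, wake_up_data)
```

    The point of the sketch is that stage 3 starts from the parameters learned in stage 1 rather than from scratch, while the decoder configuration is swapped in between.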

    METHOD OF TRAINING SPEECH SYNTHESIS MODEL AND METHOD OF SYNTHESIZING SPEECH

    Publication number: US20230178067A1

    Publication date: 2023-06-08

    Application number: US18074023

    Application date: 2022-12-02

    CPC classification number: G10L13/047 G10L25/30

    Abstract: A method of training a speech synthesis model, a method of synthesizing a speech, a device and a storage medium are provided, which relate to the field of artificial intelligence technology, in particular to the field of speech synthesis technology. The specific implementation scheme includes: processing training data by using the speech synthesis model, so as to determine a content encoding sequence, a style encoding sequence, a timbre encoding vector, a noise environment vector and a target Mel spectrum sequence corresponding to the training data; determining a total loss value according to the content encoding sequence, the style encoding sequence, the timbre encoding vector, the noise environment vector and the target Mel spectrum sequence; and adjusting a parameter of the speech synthesis model according to the total loss value.

    METHOD AND APPARATUS FOR COMPRESSING NEURAL NETWORK MODEL

    Publication number: US20230177326A1

    Publication date: 2023-06-08

    Application number: US17968688

    Application date: 2022-10-18

    CPC classification number: G06N3/08 G06N3/0454

    Abstract: A technical solution for compressing a neural network model is disclosed, which relates to the field of artificial intelligence technologies, such as deep learning and cloud service technologies. The method for compressing a neural network model includes: acquiring a to-be-compressed neural network model; determining a first bit width, a second bit width and a target thinning rate corresponding to the to-be-compressed neural network model; obtaining a target value according to the first bit width, the second bit width and the target thinning rate; and compressing the to-be-compressed neural network model using the target value, the first bit width and the second bit width to obtain a compression result of the to-be-compressed neural network model.
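
    The abstract does not define the target value or how the two bit widths interact, so the sketch below does not reproduce the claimed method. It only illustrates two generic ingredients under stated assumptions: reading "thinning rate" as a sparsity rate (magnitude pruning) and reading a bit width as the precision of uniform symmetric quantization. Both function names are hypothetical.

```python
def prune_by_magnitude(weights, thinning_rate):
    """Zero out the smallest-magnitude fraction of the weights."""
    k = int(len(weights) * thinning_rate)
    if k == 0:
        return list(weights)
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    dropped = set(order[:k])
    return [0.0 if i in dropped else w for i, w in enumerate(weights)]


def quantize_symmetric(weights, bit_width):
    """Map floats to signed integers of the given bit width, plus a scale."""
    qmax = 2 ** (bit_width - 1) - 1
    max_abs = max((abs(w) for w in weights), default=0.0)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    quantized = [max(-qmax, min(qmax, round(w / scale))) for w in weights]
    return quantized, scale


# Usage: thin half of the weights, then store the rest as 8-bit integers.
sparse = prune_by_magnitude([0.1, -0.25, 0.05, 1.0], thinning_rate=0.5)
codes, scale = quantize_symmetric(sparse, bit_width=8)
```

    Multiplying each integer code by `scale` recovers an approximation of the pruned weights.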
