Patent search ap:("BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.") AND inv:"Lei JIA"

11.

Invention application
SPEECH WAKE-UP METHOD, ELECTRONIC DEVICE, AND STORAGE MEDIUM (In force)

Publication (announcement) No.: US20240420684A1

Publication (announcement) date: 2024-12-19

Application No.: US18706313

Application date: 2023-01-17

Applicant: Beijing Baidu Netcom Science Technology Co., Ltd.

Inventor: Saisai ZOU, Lei JIA, Haifeng WANG

IPC: G10L15/16, G10L13/02, G10L15/02, G10L15/08

Abstract: A speech wake-up method, an electronic device, and a storage medium are provided. The method includes: performing a word recognition on a speech to be recognized to obtain a wake-up word recognition result (S210); performing a syllable recognition on the speech to be recognized to obtain a wake-up syllable recognition result, in response to determining that the wake-up word recognition result represents that the speech to be recognized contains a predetermined wake-up word (S220); and determining that the speech to be recognized is a correct wake-up speech, in response to determining that the wake-up syllable recognition result represents that the speech to be recognized contains a predetermined syllable (S230).
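The gating order of steps S210 to S230 can be sketched as follows. This is a minimal illustration, not the patented implementation: `recognize_word`, `recognize_syllable`, and the two stub recognizers are hypothetical stand-ins for the models the abstract leaves unspecified.

```python
from typing import Callable, Sequence

Recognizer = Callable[[Sequence[float]], bool]


def is_correct_wake_up(
    speech: Sequence[float],
    recognize_word: Recognizer,
    recognize_syllable: Recognizer,
) -> bool:
    """Two-stage check: word recognition gates syllable recognition."""
    # S210: word-level recognition on the speech to be recognized.
    if not recognize_word(speech):
        return False
    # S220: syllable-level recognition runs only when the wake-up word was found.
    contains_syllable = recognize_syllable(speech)
    # S230: the speech is a correct wake-up speech if the syllable is present.
    return contains_syllable


# Hypothetical stub recognizers (assumptions, used only to exercise the flow).
def word_stub(speech: Sequence[float]) -> bool:
    return sum(speech) > 1.0


def syllable_stub(speech: Sequence[float]) -> bool:
    return max(speech) > 0.8
```

With these stubs, `is_correct_wake_up([0.9, 0.5], word_stub, syllable_stub)` passes both stages, while an input that fails the word stage never reaches the syllable stage.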

12.

Invention publication
AUDIO RECOGNITION METHOD, METHOD OF TRAINING AUDIO RECOGNITION MODEL, AND ELECTRONIC DEVICE (Under examination, published)

Publication (announcement) No.: US20230410794A1

Publication (announcement) date: 2023-12-21

Application No.: US18237976

Application date: 2023-08-25

Applicant: Beijing Baidu Netcom Science Technology Co., Ltd.

Inventor: Xiaoyin FU, Mingshun YANG, Qiguang ZANG, Zhijie CHEN, Yangkai XU, Guibin WANG, Lei JIA

IPC: G10L15/06, G10L15/02, G10L15/26

CPC classification number: G10L15/063, G10L15/26, G10L15/02

Abstract: An audio recognition method, a method of training an audio recognition model, and an electronic device are provided, which relate to the fields of artificial intelligence, speech recognition, deep learning and natural language processing technologies. The audio recognition method includes: truncating an audio feature of target audio data to obtain at least one first audio sequence feature corresponding to a predetermined duration; obtaining, according to peak information of the audio feature, peak sub-information corresponding to the first audio sequence feature; performing at least one decoding operation on the first audio sequence feature to obtain a recognition result for the first audio sequence feature, the number of times the decoding operation is performed being identical to the number of peaks corresponding to the first audio sequence feature; and obtaining target text data for the target audio data according to the recognition result for the at least one first audio sequence feature.
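The chunked decoding described in this abstract can be sketched roughly as below. The sketch assumes that peak information is a list of frame indices and that `decode_step` is a hypothetical callable producing one token per peak; neither detail is stated in the abstract.

```python
from typing import Callable, List, Sequence


def truncate(features: Sequence[float], chunk_len: int) -> List[List[float]]:
    """Split the audio feature into chunks of a predetermined length."""
    return [list(features[i:i + chunk_len])
            for i in range(0, len(features), chunk_len)]


def peaks_per_chunk(peak_frames: Sequence[int], num_frames: int,
                    chunk_len: int) -> List[List[int]]:
    """Map global peak frame indices to chunk-local peak sub-information."""
    num_chunks = (num_frames + chunk_len - 1) // chunk_len
    sub_info: List[List[int]] = [[] for _ in range(num_chunks)]
    for frame in peak_frames:
        sub_info[frame // chunk_len].append(frame % chunk_len)
    return sub_info


def recognize(features: Sequence[float], peak_frames: Sequence[int],
              chunk_len: int,
              decode_step: Callable[[List[float], int], str]) -> List[List[str]]:
    """Run one decoding operation per peak in each chunk."""
    chunks = truncate(features, chunk_len)
    sub_info = peaks_per_chunk(peak_frames, len(features), chunk_len)
    # The number of decode_step calls for a chunk equals its number of peaks.
    return [[decode_step(chunk, peak) for peak in peaks]
            for chunk, peaks in zip(chunks, sub_info)]
```

For ten frames, a chunk length of 4, and peaks at frames 1, 5, 6 and 9, the three chunks receive one, two and one decoding operations respectively; the per-chunk results would then be concatenated into the target text.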

13.

Invention publication
METHOD OF PROCESSING SPEECH INFORMATION, METHOD OF TRAINING MODEL, AND WAKE-UP METHOD (Under examination, published)

Publication (announcement) No.: US20230360638A1

Publication (announcement) date: 2023-11-09

Application No.: US18221593

Application date: 2023-07-13

Applicant: Beijing Baidu Netcom Science Technology Co., Ltd.

Inventor: Saisai ZOU, Lei JIA, Haifeng WANG

IPC: G10L15/02, G10L15/14

CPC classification number: G10L15/02, G10L15/14, G10L2015/027

Abstract: A method of processing speech information, a method of training a speech model, a speech wake-up method, an electronic device, and a storage medium are provided, which relate to the field of artificial intelligence technology, in particular to the fields of human-computer interaction, deep learning and intelligent speech technologies. A specific implementation solution includes: performing syllable recognition on speech information to obtain a posterior probability sequence for the speech information, where the speech information includes a speech frame sequence, the posterior probability sequence corresponds to the speech frame sequence, and each posterior probability in the posterior probability sequence represents a similarity between a syllable in a speech frame matched with the posterior probability and a predetermined syllable; and determining a target peak speech frame from the speech frame sequence based on the posterior probability sequence.
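One simple way to pick a peak frame from a per-frame posterior sequence is sketched below. The abstract does not say how the target peak is selected, so taking the arg-max and applying a `threshold` are assumptions made for illustration.

```python
from typing import Optional, Sequence


def target_peak_frame(posteriors: Sequence[float],
                      threshold: float = 0.5) -> Optional[int]:
    """Return the index of the frame with the highest posterior, or None.

    posteriors[i] is the similarity between the syllable in frame i and the
    predetermined syllable. The threshold is a hypothetical parameter.
    """
    if not posteriors:
        return None
    # Index of the largest posterior; ties resolve to the earliest frame.
    best = max(range(len(posteriors)), key=lambda i: posteriors[i])
    return best if posteriors[best] >= threshold else None
```

For the sequence `[0.1, 0.3, 0.9, 0.4]` this selects frame 2; a sequence whose maximum stays below the threshold yields no peak frame.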

14.

Invention publication
METHOD AND APPARATUS FOR TRAINING VOICE WAKE-UP MODEL, METHOD AND APPARATUS FOR VOICE WAKE-UP, DEVICE, AND STORAGE MEDIUM (Under examination, published)

Publication (announcement) No.: US20230317060A1

Publication (announcement) date: 2023-10-05

Application No.: US18328135

Application date: 2023-06-02

Applicant: BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.

Inventor: Saisai ZOU, Li CHEN, Ruoxi ZHANG, Lei JIA, Haifeng WANG

IPC: G10L15/06, G10L15/02

CPC classification number: G10L15/063, G10L15/02

Abstract: The present disclosure provides a method and an apparatus for training a voice wake-up model, a method and an apparatus for voice wake-up, a device and a storage medium, which relate to the field of artificial intelligence and particularly to the fields of deep learning and voice technology. A specific implementation includes: acquiring voice recognition training data and voice wake-up training data that have been created; first training a base model on the voice recognition training data to obtain a model parameter of the base model when a model loss function converges; then updating, based on a model configuration instruction, a configuration parameter of a decoding module in the base model to obtain a first model; and finally training the first model on the voice wake-up training data to obtain a trained voice wake-up model when the model loss function converges.

15.

Invention publication
METHOD OF TRAINING SPEECH SYNTHESIS MODEL AND METHOD OF SYNTHESIZING SPEECH (Under examination, published)

Publication (announcement) No.: US20230178067A1

Publication (announcement) date: 2023-06-08

Application No.: US18074023

Application date: 2022-12-02

Applicant: BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.

Inventor: Wenfu WANG, Tao SUN, Xilei WANG, Lei JIA

IPC: G10L13/047, G10L25/30

CPC classification number: G10L13/047, G10L25/30

Abstract: A method of training a speech synthesis model, a method of synthesizing a speech, a device and a storage medium are provided, which relate to the field of artificial intelligence technology, in particular to the field of speech synthesis technology. The specific implementation scheme includes: processing training data by using the speech synthesis model, so as to determine a content encoding sequence, a style encoding sequence, a timbre encoding vector, a noise environment vector and a target Mel spectrum sequence corresponding to the training data; determining a total loss value according to the content encoding sequence, the style encoding sequence, the timbre encoding vector, the noise environment vector and the target Mel spectrum sequence; and adjusting a parameter of the speech synthesis model according to the total loss value.

16.

Invention publication
METHOD AND APPARATUS FOR COMPRESSING NEURAL NETWORK MODEL (Under examination, published)

Publication (announcement) No.: US20230177326A1

Publication (announcement) date: 2023-06-08

Application No.: US17968688

Application date: 2022-10-18

Applicant: BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.

Inventor: Guibin WANG, Shijun CONG, Hao DONG, Lei JIA

IPC: G06N3/08, G06N3/04

CPC classification number: G06N3/08, G06N3/0454

Abstract: A technical solution for compressing a neural network model is disclosed, which relates to the field of artificial intelligence technologies, such as deep learning and cloud service technologies. The method for compressing a neural network model includes: acquiring a to-be-compressed neural network model; determining a first bit width, a second bit width and a target thinning rate corresponding to the to-be-compressed neural network model; obtaining a target value according to the first bit width, the second bit width and the target thinning rate; and compressing the to-be-compressed neural network model using the target value, the first bit width and the second bit width to obtain a compression result of the to-be-compressed neural network model.

17.

Invention application
NEURAL NETWORK PROCESSING UNIT, NEURAL NETWORK PROCESSING METHOD AND DEVICE (In force)

Publication (announcement) No.: US20220292337A1

Publication (announcement) date: 2022-09-15

Application No.: US17832303

Application date: 2022-06-03

Applicant: BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.

Inventor: Chao TIAN, Lei JIA, Xiaoping YAN, Junhui WEN, Guanglai DENG, Qiang LI

IPC: G06N3/063, G06F9/30, G06F9/50, G06K9/62

Abstract: A neural network processing method, a neural network processing unit (NPU) and a processing device are provided. The method includes: obtaining, by a quantizing unit in the NPU, float-type input data, quantizing the float-type input data to obtain quantized input data, and providing the quantized input data to an operation unit; performing, by the operation unit of the NPU, a matrix-vector operation and/or a convolution operation on the quantized input data to obtain an operation result of the quantized input data; and performing, by the quantizing unit, inverse quantization on the operation result output by the operation unit to obtain an inverse quantization result.
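The quantize, operate, dequantize flow in this abstract can be mimicked in software as follows. This is only a sketch: symmetric int8 quantization with per-tensor scales, and weights that are already quantized with their own scale `w_scale`, are assumptions not taken from the abstract.

```python
from typing import List, Sequence


def quantize(values: Sequence[float], scale: float) -> List[int]:
    """Quantizing unit: map float inputs to int8 (assumed symmetric scheme)."""
    return [max(-128, min(127, round(v / scale))) for v in values]


def matvec_int(weights_q: Sequence[Sequence[int]],
               x_q: Sequence[int]) -> List[int]:
    """Operation unit: integer matrix-vector product on quantized data."""
    return [sum(w * x for w, x in zip(row, x_q)) for row in weights_q]


def dequantize(acc: Sequence[int], x_scale: float,
               w_scale: float) -> List[float]:
    """Quantizing unit: inverse quantization of the integer result."""
    return [a * x_scale * w_scale for a in acc]


def npu_forward(x: Sequence[float], weights_q: Sequence[Sequence[int]],
                x_scale: float, w_scale: float) -> List[float]:
    """Chain the three steps in the order the abstract lists them."""
    x_q = quantize(x, x_scale)
    acc = matvec_int(weights_q, x_q)
    return dequantize(acc, x_scale, w_scale)
```

With `x = [0.5, -1.0]`, `x_scale = 0.5`, `weights_q = [[2, 1], [0, 3]]` and `w_scale = 0.25`, the result matches the float product of the dequantized weights `[[0.5, 0.25], [0.0, 0.75]]` with `x`, because every value here is exactly representable at these scales.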

18.

Invention application
METHOD OF RECOGNIZING SPEECH OFFLINE, ELECTRONIC DEVICE, AND STORAGE MEDIUM (In force)

Publication (announcement) No.: US20220108684A1

Publication (announcement) date: 2022-04-07

Application No.: US17644749

Application date: 2021-12-16

Applicant: Beijing Baidu Netcom Science Technology Co., Ltd.

Inventor: Xiaoyin FU, Mingxin LIANG, Zhijie CHEN, Qiguang ZANG, Zhengxiang JIANG, Liao ZHANG, Qi ZHANG, Lei JIA

IPC: G10L15/02, G10L15/16, G10L19/032

Abstract: The present disclosure provides a method of recognizing speech offline, an electronic device, and a storage medium, relating to the field of artificial intelligence, such as speech recognition, natural language processing, and deep learning. The method may include: decoding speech data to be recognized into a syllable recognition result; and transforming the syllable recognition result into a corresponding text as a speech recognition result of the speech data.
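The second stage, turning a syllable recognition result into text, can be illustrated with a toy lookup. The abstract does not specify the transformation, so the greedy longest-match over a hypothetical `lexicon` of pinyin syllable tuples below is a stand-in, not the disclosed method.

```python
from typing import Dict, List, Sequence, Tuple


def syllables_to_text(syllables: Sequence[str],
                      lexicon: Dict[Tuple[str, ...], str]) -> str:
    """Greedy longest-match conversion of a syllable sequence into text."""
    max_len = max((len(key) for key in lexicon), default=1)
    pieces: List[str] = []
    i = 0
    while i < len(syllables):
        # Try the longest candidate span first, then shrink.
        for n in range(min(max_len, len(syllables) - i), 0, -1):
            key = tuple(syllables[i:i + n])
            if key in lexicon:
                pieces.append(lexicon[key])
                i += n
                break
        else:
            # No lexicon entry starts here: emit a placeholder, skip one syllable.
            pieces.append("<unk>")
            i += 1
    return "".join(pieces)


# Hypothetical toy lexicon, for illustration only.
TOY_LEXICON = {("ni", "hao"): "你好", ("bai", "du"): "百度", ("ni",): "你"}
```

Here `syllables_to_text(["ni", "hao", "bai", "du"], TOY_LEXICON)` yields "你好百度", and an unknown syllable is rendered as `<unk>`.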
