Patent search ap:("Beijing Baidu Netcom Science Technology Co., Ltd.") AND inv:"Xiaoyin Fu"

1.

Granted invention patent
Method of recognizing speech offline, electronic device, and storage medium [In force]

Publication (announcement) No.: US12183323B2

Publication (announcement) date: 2024-12-31

Application No.: US17644749

Application date: 2021-12-16

Applicant: Beijing Baidu Netcom Science Technology Co., Ltd.

Inventor: Xiaoyin Fu, Mingxin Liang, Zhijie Chen, Qiguang Zang, Zhengxiang Jiang, Liao Zhang, Qi Zhang, Lei Jia

IPC: G10L15/02, G10L15/16, G10L19/032

Abstract: The present disclosure provides a method of recognizing speech offline, electronic device, and a storage medium, relating to a field of artificial intelligence such as speech recognition, natural language processing, and deep learning. The method may include: decoding speech data to be recognized into a syllable recognition result; transforming the syllable recognition result into a corresponding text as a speech recognition result of the speech data.

2.

Granted invention patent
Method for training speech recognition model, device and storage medium [In force]

Publication (announcement) No.: US12033616B2

Publication (announcement) date: 2024-07-09

Application No.: US17571805

Application date: 2022-01-10

Applicant: BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.

Inventor: Junyao Shao, Xiaoyin Fu, Qiguang Zang, Zhijie Chen, Mingxin Liang, Huanxin Zheng, Sheng Qian

IPC: G10L15/06, G10L15/16, G10L15/183, G10L15/28

CPC classification number: G10L15/063, G10L15/16, G10L15/183, G10L15/28

Abstract: A method for training a speech recognition model, a device and a storage medium, which relate to the field of computer technologies, and particularly to the fields of speech recognition technologies, deep learning technologies, or the like, are disclosed. The method for training a speech recognition model includes: obtaining a fusion probability of each of at least one candidate text corresponding to a speech based on an acoustic decoding model and a language model; selecting a preset number of one or more candidate texts based on the fusion probability of each of the at least one candidate text, and determining a predicted text based on the preset number of one or more candidate texts; and obtaining a loss function based on the predicted text and a standard text corresponding to the speech, and training the speech recognition model based on the loss function.
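The candidate-selection step in this abstract can be illustrated with a minimal sketch. It assumes the fusion probability is a log-linear combination of an acoustic probability and a language-model probability; the abstract does not state the actual formula, and the names `fuse`, `select_candidates`, and `lm_weight` are hypothetical. The loss computation is left out because the abstract does not specify its form.

```python
import math


def fuse(acoustic_prob, lm_prob, lm_weight=0.3):
    """Assumed fusion rule: log p_acoustic + lm_weight * log p_lm."""
    return math.log(acoustic_prob) + lm_weight * math.log(lm_prob)


def select_candidates(candidates, k, lm_weight=0.3):
    """Return the k candidate texts with the highest fused score.

    candidates: list of (text, acoustic_prob, lm_prob) tuples (hypothetical layout).
    """
    scored = [(text, fuse(pa, pl, lm_weight)) for text, pa, pl in candidates]
    scored.sort(key=lambda item: item[1], reverse=True)
    return [text for text, _ in scored[:k]]


if __name__ == "__main__":
    hyps = [("a", 0.5, 0.1), ("b", 0.4, 0.5), ("c", 0.1, 0.4)]
    print(select_candidates(hyps, k=2))
```

With `lm_weight=0` the ranking falls back to the acoustic probability alone, which makes the effect of the language model easy to inspect.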

3.

Invention application
METHOD FOR TRAINING DATA PROCESSING MODEL, ELECTRONIC DEVICE AND STORAGE MEDIUM [In force]

Publication (announcement) No.: US20220207427A1

Publication (announcement) date: 2022-06-30

Application No.: US17655253

Application date: 2022-03-17

Applicant: BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.

Inventor: Yangkai Xu, Guibin Wang, Xiaoyin Fu, Zhijie Chen, Mingshun Yang, Shijun Cong, Ming Jia, Lei Jia

IPC: G06N20/00

Abstract: A method for training a data processing model includes: acquiring sample data; acquiring an initial data processing model, the initial data processing model including a plurality of forward nodes for outputting a plurality of intermediate results corresponding to the sample data; determining a plurality of time-dependent features corresponding to the plurality of forward nodes; acquiring a data processing model to be trained by processing the initial data processing model based on the plurality of time-dependent features; and training the data processing model to be trained using the sample data and the plurality of intermediate results.

4.

Granted invention patent
Speech recognition and codec method and apparatus, electronic device and storage medium [In force]

Publication (announcement) No.: US12183324B2

Publication (announcement) date: 2024-12-31

Application No.: US17738651

Application date: 2022-05-06

Applicant: BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.

Inventor: Xiaoyin Fu, Zhijie Chen, Mingxin Liang, Mingshun Yang, Lei Jia, Haifeng Wang

IPC: G10L15/02, G06F16/683, G10L15/187, G10L15/26

Abstract: The present disclosure provides speech recognition and codec methods and apparatuses, an electronic device and a storage medium, and relates to the field of artificial intelligence such as intelligent speech, deep learning and natural language processing. The speech recognition method may include: acquiring an audio feature of to-be-recognized speech; encoding the audio feature to obtain an encoding feature; truncating the encoding feature to obtain continuous N feature fragments, N being a positive integer greater than one; and acquiring, for any one of the feature segments, corresponding historical feature abstraction information, encoding the feature segment in combination with the historical feature abstraction information, and decoding an encoding result to obtain a recognition result corresponding to the feature segment, wherein the historical feature abstraction information is information obtained by feature abstraction of recognized historical feature fragments.

5.

Granted invention patent
Method for training a linguistic model and electronic device [In force]

Publication (announcement) No.: US11900918B2

Publication (announcement) date: 2024-02-13

Application No.: US17451380

Application date: 2021-10-19

Applicant: BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.

Inventor: Liao Zhang, Zhengxiang Jiang, Xiaoyin Fu

IPC: G10L15/06, G06F40/253, G06F40/30

CPC classification number: G10L15/063, G06F40/253, G06F40/30

Abstract: The present disclosure provides a method for training a linguistic model, related to fields of speech, natural language processing, deep learning technologies. A method includes: obtaining grammars corresponding to a plurality of sample texts and a slot value of a slot in each grammar by using semantic analysis; generating a grammar graph corresponding to each grammar based on the corresponding grammar and the slot value of the slot in the corresponding grammar; obtaining a weight of each grammar, a weight of each slot, and a weight of each slot value in each grammar graph based on the sample texts; determining at least one grammar frequency of each order based on the weight of each grammar, the weight of each slot, and the weight of each slot value in each grammar graph; and training the linguistic model based on the at least one grammar frequency of each order.

6.

Invention application
METHOD FOR TRAINING A LINGUISTIC MODEL AND ELECTRONIC DEVICE [In force]

Publication (announcement) No.: US20220036880A1

Publication (announcement) date: 2022-02-03

Application No.: US17451380

Application date: 2021-10-19

Applicant: Beijing Baidu Netcom Science Technology Co., Ltd.

Inventor: Liao Zhang, Zhengxiang Jiang, Xiaoyin Fu

IPC: G10L15/06, G06F40/30, G06F40/253

Abstract: The present disclosure provides a method for training a linguistic model, related to fields of speech, natural language processing, deep learning technologies. A method includes: obtaining grammars corresponding to a plurality of sample texts and a slot value of a slot in each grammar by using semantic analysis; generating a grammar graph corresponding to each grammar based on the corresponding grammar and the slot value of the slot in the corresponding grammar; obtaining a weight of each grammar, a weight of each slot, and a weight of each slot value in each grammar graph based on the sample texts; determining at least one grammar frequency of each order based on the weight of each grammar, the weight of each slot, and the weight of each slot value in each grammar graph; and training the linguistic model based on the at least one grammar frequency of each order.

7.

Granted invention patent
Speech recognition method and apparatus [In force]

Publication (announcement) No.: US12067977B2

Publication (announcement) date: 2024-08-20

Application No.: US17684681

Application date: 2022-03-02

Applicant: BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.

Inventor: Liao Zhang, Yinlou Zhao, Zhengxiang Jiang, Xiaoyin Fu, Wei Wei

IPC: G10L15/183, G06N5/048

CPC classification number: G10L15/183, G06N5/048

Abstract: The present disclosure discloses a speech recognition method and apparatus, and relates to the field of speech and deep learning technologies. A specific implementation scheme involves: acquiring candidate recognition results with first N recognition scores outputted by a speech recognition model for to-be-recognized speech, N being a positive integer greater than 1; scoring the N candidate recognition results based on pronunciation similarities between candidate recognition results and pre-collected popular entities, to obtain similarity scores of the candidate recognition results; and integrating the recognition scores and the similarity scores of the candidate recognition results to determine a recognition result corresponding to the to-be-recognized speech from the N candidate recognition results. The present disclosure can improve recognition accuracy.
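The rescoring idea in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the patented method: pronunciation similarity is approximated by character-level string similarity over romanized pronunciations (Python's `difflib.SequenceMatcher`), and the two scores are integrated by a weighted sum. The function names, the `weight` parameter, and the tuple layout are hypothetical.

```python
from difflib import SequenceMatcher


def similarity_score(candidate_pron, entity_prons):
    """Highest string similarity between a candidate's pronunciation and any
    popular-entity pronunciation (a stand-in for pronunciation similarity)."""
    return max(
        (SequenceMatcher(None, candidate_pron, e).ratio() for e in entity_prons),
        default=0.0,
    )


def rescore(candidates, entity_prons, weight=0.5):
    """Pick one of the N candidates by recognition score plus weighted similarity.

    candidates: list of (text, pronunciation, recognition_score) tuples.
    """
    best = max(
        candidates,
        key=lambda c: c[2] + weight * similarity_score(c[1], entity_prons),
    )
    return best[0]


if __name__ == "__main__":
    n_best = [
        ("candidate A", "liu de hua", 0.60),
        ("candidate B", "niu de hua", 0.62),
    ]
    print(rescore(n_best, ["liu de hua"]))
```

In this toy input, candidate B has the higher recognition score, but candidate A matches a popular entity exactly and wins after integration. With an empty entity list the similarity term is zero and the original ranking is kept.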

8.

Granted invention patent
Method and apparatus for mining feature information, and electronic device [In force]

Publication (announcement) No.: US12067970B2

Publication (announcement) date: 2024-08-20

Application No.: US17500188

Application date: 2021-10-13

Applicant: BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.

Inventor: Jiaxiang Ge, Zhen Wu, Maoren Zhou, Qiguang Zang, Ming Wen, Xiaoyin Fu

IPC: G10L15/02, G10L15/06, G10L15/20, G10L15/22

CPC classification number: G10L15/02, G10L15/063, G10L15/20, G10L15/22

Abstract: A method for mining feature information, an apparatus for mining feature information and an electronic device are disclosed. The method includes: determining a usage scenario of a target device; obtaining raw audio data including real scenario data, speech synthesis data, recorded audio data and other media data; generating target audio data of the usage scenario by simulating the usage scenario based on the raw audio data; and obtaining feature information of the usage scenario by performing feature extraction on the target audio data.
