Patent search ap:("BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.") AND inv:"Mingshun YANG"

1.

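Invention application
SPEECH RECOGNITION AND CODEC METHOD AND APPARATUS, ELECTRONIC DEVICE AND STORAGE MEDIUM (In force)

Publication (announcement) No.: US20230090590A1

Publication (announcement) date: 2023-03-23

Application No.: US17738651

Application date: 2022-05-06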

Applicant: BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.

Inventor: Xiaoyin FU, Zhijie CHEN, Mingxin LIANG, Mingshun YANG, Lei JIA, Haifeng WANG

IPC: G10L15/02, G10L15/26, G10L15/187, G06F16/683

Abstract: The present disclosure provides speech recognition and codec methods and apparatuses, an electronic device and a storage medium, and relates to the field of artificial intelligence such as intelligent speech, deep learning and natural language processing. The speech recognition method may include: acquiring an audio feature of to-be-recognized speech; encoding the audio feature to obtain an encoding feature; truncating the encoding feature to obtain continuous N feature fragments, N being a positive integer greater than one; and acquiring, for any one of the feature segments, corresponding historical feature abstraction information, encoding the feature segment in combination with the historical feature abstraction information, and decoding an encoding result to obtain a recognition result corresponding to the feature segment, wherein the historical feature abstraction information is information obtained by feature abstraction of recognized historical feature fragments.

2.

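Invention publication
AUDIO RECOGNITION METHOD, METHOD OF TRAINING AUDIO RECOGNITION MODEL, AND ELECTRONIC DEVICE (Under examination, published)

Publication (announcement) No.: US20230410794A1

Publication (announcement) date: 2023-12-21

Application No.: US18237976

Application date: 2023-08-25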

Applicant: Beijing Baidu Netcom Science Technology Co., Ltd.

Inventor: Xiaoyin FU, Mingshun YANG, Qiguang ZANG, Zhijie CHEN, Yangkai XU, Guibin WANG, Lei JIA

IPC: G10L15/06, G10L15/02, G10L15/26

CPC classification number: G10L15/063, G10L15/26, G10L15/02

Abstract: An audio recognition method, a method of training an audio recognition model, and an electronic device are provided, which relate to fields of artificial intelligence, speech recognition, deep learning and natural language processing technologies. The audio recognition method includes: truncating an audio feature of target audio data to obtain at least one first audio sequence feature corresponding to a predetermined duration; obtaining, according to a peak information of the audio feature, a peak sub-information corresponding to the first audio sequence feature; performing at least one decoding operation on the first audio sequence feature to obtain a recognition result for the first audio sequence feature, a number of times the decoding operation is performed being identical to a number of peaks corresponding to the first audio sequence feature; obtaining target text data for the target audio data according to the recognition result for the at least one first audio sequence feature.
