Patent search ipc:"G10L15/20"

1.

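Invention application
Speech recognition method and system for power grid dispatching (Under examination - Published)

Publication (announcement) No.: WO2023036017A1

Publication (announcement) date: 2023-03-16

Application No.: PCT/CN2022/115883

Application date: 2022-08-30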

Applicant: 广西电网有限责任公司贺州供电局

Inventor： 莫梓樱 , 朱明增 , 覃秋勤 , 吕鸣 , 刘小兰 , 陈极万 , 韩竞 , 李和峰 , 蒋志儒 , 覃景涛 , 黄金 , 卢迎 , 韦晓明 , 李梅 , 周素君 , 梁维 , 罗晨怡 , 梁豪 , 奉华

IPC: G10L15/16 , G10L15/02 , G10L15/26 , G10L25/24 , G10L15/05 , G10L15/06 , G10L15/20 , G10L21/0208 , G06N3/08 , G06Q50/06

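Abstract: A speech recognition method and system for power grid dispatching. The method includes: acquiring an original speech signal in power grid dispatching; performing noise-reduction preprocessing on the original speech signal; performing a fast Fourier transform (FFT) on the noise-reduced original speech signal; using Mel-frequency cepstral coefficients (MFCC) to extract features from the FFT-transformed original speech signal; combining a deep learning neural network (DNN) with a long short-term memory (LSTM) neural network to obtain a combined DNN-LSTM algorithm, and using that algorithm to train an acoustic model on the feature-extracted original speech signal; and using a decoder to find the best text output based on the acoustic model output, a speech model, and a dictionary.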
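
The MFCC step named in this abstract relies on the mel frequency scale. The following is a minimal sketch, assuming the commonly used conversion mel = 2595 * log10(1 + f / 700); the abstract does not say which variant the application uses. The function names, the 26-filter count, and the 0 to 4000 Hz range are illustrative assumptions, not details from the application.

```python
import math


def hz_to_mel(hz):
    """Convert a frequency in Hz to mels (common 2595 * log10 variant)."""
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_center_frequencies(n_filters, f_min, f_max):
    """Center frequencies (Hz) of n_filters triangular filters
    spaced evenly on the mel scale between f_min and f_max."""
    lo, hi = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (hi - lo) / (n_filters + 1)
    return [mel_to_hz(lo + step * (i + 1)) for i in range(n_filters)]


# Hypothetical settings: 26 filters over 0-4000 Hz (8 kHz sampled speech).
centers = mel_center_frequencies(26, 0.0, 4000.0)
```

The centers are dense at low frequencies and sparse at high ones, which is why a mel filter bank is applied to the FFT power spectrum before the cepstral coefficients are computed.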

2.

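Invention application
NOISE REMOVAL ON AN ELECTRONIC DEVICE (Under examination - Published)

Publication (announcement) No.: WO2023277886A1

Publication (announcement) date: 2023-01-05

Application No.: PCT/US2021/039662

Application date: 2021-06-29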

Applicant: HEWLETT-PACKARD DEVELOPMENT COMPANY, L.P.

Inventor： DA FONTE LOPES DA SILVA, Andre , OZAKI, Carol Tatsuko

IPC: G10L15/20 , G10L21/0232 , G10L25/84

Abstract: In some examples, a method includes initiating an audio session. In some examples, the method includes receiving a classification for the audio session. In some examples, the method includes automatically enabling inbound noise removal when the classification is voice.

3.

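Invention application
MULTI-ENCODER END-TO-END AUTOMATIC SPEECH RECOGNITION (ASR) FOR JOINT MODELING OF MULTIPLE INPUT DEVICES (Under examination - Published)

Publication (announcement) No.: WO2022271746A1

Publication (announcement) date: 2022-12-29

Application No.: PCT/US2022/034407

Application date: 2022-06-21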

Applicant: NUANCE COMMUNICATIONS, INC.

Inventor： WENINGER, Felix , GAUDESI, Marco , LEIBOLD, Ralf , ZHAN, Puming

IPC: G10L15/34 , G10L15/26 , G10L15/20 , G10L15/22 , G10L15/04 , G10L19/02 , G10L21/0208 , G10L25/24

Abstract: An end-to-end automatic speech recognition (ASR) system includes: a first encoder configured for close-talk input captured by a close-talk input mechanism; a second encoder configured for far-talk input captured by a far-talk input mechanism; and an encoder selection layer configured to select at least one of the first and second encoders for use in producing ASR output. The selection is made based on at least one of short-time Fourier transform (STFT), Mel-frequency Cepstral Coefficient (MFCC) and filter bank derived from at least one of the close-talk input and far-talk input. If signals from both the close-talk input mechanism and far-talk input mechanism are present for a speech segment, the encoder selection layer dynamically selects between the close-talk encoder and far-talk encoder to select the encoder that better recognizes the speech segment. An encoder-decoder model is used to produce ASR output.

4.

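Invention application
Speech interaction method, system, electronic device, and storage medium (Under examination - Published)

Publication (announcement) No.: WO2022267405A1

Publication (announcement) date: 2022-12-29

Application No.: PCT/CN2021/140759

Application date: 2021-12-23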

Applicant: 达闼机器人股份有限公司

Inventor： 李翠姣

IPC: G10L15/20 , G10L15/02 , G10L15/06 , G10L15/16 , G10L15/18 , G10L15/063 , G10L15/1822 , G10L15/183 , G10L15/22 , G10L15/26 , G10L2015/223 , G10L2015/225

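Abstract: The embodiments of the present application relate to the technical field of speech interaction and provide a speech interaction method, system, electronic device, and storage medium. The speech interaction method includes: acquiring text information obtained by automatic speech recognition (ASR) processing of a speech signal, where the speech signal is a sound signal captured from the environment; performing feature extraction on the text information to obtain a feature vector of the text information; inputting the feature vector into a trained meaningless-text recognition model and determining, from the output of that model, whether the text information is meaningless text, where meaningless text is text that does not conform to conventional ways of expression; and, if the text information is not meaningless text, responding to the text information after a trained response judgment model detects that a response is required.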

5.

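Invention application
Speech recognition device (Under examination - Published)

Publication (announcement) No.: WO2022254912A1

Publication (announcement) date: 2022-12-08

Application No.: PCT/JP2022/014683

Application date: 2022-03-25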

Applicant: 株式会社ＮＴＴドコモ

Inventor： 中島　悠輔 , 加藤　拓 , 片山　太一 , 菊入　圭

IPC: G10L15/10 , G10L15/20 , G10L15/22 , G10L15/32

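Abstract: The aim is to provide a speech recognition device that does not output recognition results for non-verbal sounds. The speech recognition device 100 comprises: a speech acquisition unit 101 (sound information acquisition unit) that acquires a speech waveform signal (sound information) in predetermined units obtained by voice activity detection; and a sound information processing unit that outputs a result based on speech recognition processing when the speech waveform signal, which is the sound information, is speech, and that performs processing so as not to output a result based on speech recognition processing when the sound information is non-verbal. In the present disclosure, the sound information processing unit is composed of a speech recognition unit 102, a non-verbal speech recognition unit 103, a score determination unit 104, and a result output unit 105.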
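
One way a score determination unit could gate the output is sketched below. This is only an illustration under stated assumptions: the abstract does not give the decision rule, so the comparison of two scores with a margin, the `Hypothesis` type, and the `select_output` function are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Hypothesis:
    text: str
    score: float  # assumed convention: higher means more confident


def select_output(speech: Hypothesis, nonverbal: Hypothesis,
                  margin: float = 0.0) -> Optional[str]:
    """Return the recognized text only when the speech hypothesis
    beats the non-verbal hypothesis by more than `margin`;
    otherwise suppress the result by returning None."""
    if speech.score - nonverbal.score > margin:
        return speech.text
    return None


# A cough scored higher by the non-verbal recognizer yields no output.
suppressed = select_output(Hypothesis("uh", 0.3), Hypothesis("<cough>", 0.8))
spoken = select_output(Hypothesis("hello", 0.9), Hypothesis("<cough>", 0.2))
```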

6.

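Invention application
A NEURAL-NETWORK-BASED APPROACH FOR SPEECH DENOISING STATEMENT REGARDING FEDERALLY SPONSORED RESEARCH (Under examination - Published)

Publication (announcement) No.: WO2022107393A1

Publication (announcement) date: 2022-05-27

Application No.: PCT/JP2021/027243

Application date: 2021-07-20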

Applicant: THE TRUSTEES OF COLUMBIA UNIVERSITY IN THE CITY OF NEW YORK , SOFTBANK CORP.

Inventor： ZHENG Changxi , XU Ruilin , WU Rundi , VONDRICK Carl , ISHIWAKA Yuko

IPC: G10L21/0208 , G10L21/0216 , G10L21/0232 , G10L21/0308 , G10L15/04 , G10L15/16 , G10L15/20

Abstract: Disclosed are methods, systems, devices, and other implementations, including a method that includes receiving an audio signal representation, detecting in the received audio signal representation, using a first learning model, one or more silent intervals with reduced foreground sound levels, determining based on the detected one or more silent intervals an estimated full noise profile corresponding to the audio signal representation, and generating with a second learning model, based on the received audio signal representation and on the determined estimated full noise profile, a resultant audio signal representation with a reduced noise level.
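
The detect-silence, estimate-noise, then denoise flow in this abstract can be illustrated with a deliberately simplified sketch. It is not the disclosed method: the application uses two learning models, whereas this sketch substitutes a plain energy threshold for silence detection and a mean-energy subtraction for denoising. All function names, the frame length, and the 0.1 threshold ratio are assumptions made for illustration.

```python
def frame_energies(samples, frame_len):
    """Mean squared amplitude of each non-overlapping frame."""
    return [sum(x * x for x in samples[i:i + frame_len]) / frame_len
            for i in range(0, len(samples) - frame_len + 1, frame_len)]


def silent_frames(energies, ratio=0.1):
    """Indices of frames whose energy is below ratio * peak energy
    (a stand-in for the learned silent-interval detector)."""
    threshold = ratio * max(energies)
    return [i for i, e in enumerate(energies) if e < threshold]


def noise_floor(energies, silent):
    """Average energy over the silent frames, used as the noise estimate."""
    return sum(energies[i] for i in silent) / len(silent) if silent else 0.0


def denoised_energies(energies, floor):
    """Subtract the noise estimate from every frame, clamping at zero."""
    return [max(e - floor, 0.0) for e in energies]


# Toy signal: quiet noise, a loud burst, then quiet noise again.
signal = [0.01, -0.01] * 4 + [1.0, -1.0] * 4 + [0.01, -0.01] * 4
energies = frame_energies(signal, frame_len=4)
silent = silent_frames(energies)
floor = noise_floor(energies, silent)
cleaned = denoised_energies(energies, floor)
```

The point of the sketch is the data flow: the noise estimate comes only from frames judged silent and is then applied to the whole signal, including the frames that contain foreground sound.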

7.

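Invention application
Aircraft ground guidance system and method based on semantic recognition of controller instructions (Under examination - Published)

Publication (announcement) No.: WO2021249285A1

Publication (announcement) date: 2021-12-16

Application No.: PCT/CN2021/098174

Application date: 2021-06-03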

Applicant: 中国民航大学

Inventor： 诸葛晶昌

IPC: G10L15/22 , G10L15/20 , G10L15/26 , G10L15/16 , G10L15/06 , G10L15/063

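Abstract: An aircraft ground guidance system and method based on semantic recognition of controller instructions. The system includes a semantic recognition module, a path generation and GIS mapping module, and an aircraft guidance terminal module. It can improve the safety of aircraft ground operations; it requires no manually operated aircraft guide vehicle, which can reduce construction, retrofit, maintenance, and operating costs; and it suits airport control requirements, forming a highly reliable, low-failure, economical and practical airport control decision support system and a ground guidance system for aircraft within the airfield area, thereby improving the safety of aircraft ground operations.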

8.

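Invention application
Streaming end-to-end speech recognition method and apparatus, and electronic device (Under examination - Published)

Publication (announcement) No.: WO2021218843A1

Publication (announcement) date: 2021-11-04

Application No.: PCT/CN2021/089556

Application date: 2021-04-25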

Applicant: 阿里巴巴集团控股有限公司

Inventor： 张仕良 , 高志付

IPC: G10L15/20

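Abstract: A streaming end-to-end speech recognition method, apparatus, and electronic device. The method includes: extracting speech acoustic features frame by frame from a received speech stream and encoding them (S301); dividing the encoded frames into blocks and predicting the number of activation points within each block that require encoded output (S302); and determining, from the prediction result, the positions of the activation points that require decoded output, so that the decoder decodes at those positions and outputs recognition results (S303). The method can improve the robustness of a streaming end-to-end speech recognition system to noise, and thereby improve system performance and accuracy.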
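
Steps S302 and S303 can be pictured with the toy sketch below. It rests on assumptions that the abstract does not state: each encoded frame is assumed to carry a scalar activation weight, the per-block count is taken as the rounded sum of those weights (the application predicts it with a model), and the positions are taken as the highest-weight frames in the block. The function names and the block size of 5 are hypothetical.

```python
def chunk_indices(n_frames, chunk_size):
    """Split frame indices 0..n_frames-1 into consecutive blocks."""
    return [range(start, min(start + chunk_size, n_frames))
            for start in range(0, n_frames, chunk_size)]


def activation_points(weights, chunk_size):
    """For each block, estimate how many activation points it holds
    (rounded sum of weights, an assumed stand-in for the predictor),
    then pick that many highest-weight frames as their positions."""
    points = []
    for chunk in chunk_indices(len(weights), chunk_size):
        count = int(sum(weights[i] for i in chunk) + 0.5)
        top = sorted(chunk, key=lambda i: weights[i], reverse=True)[:count]
        points.extend(sorted(top))
    return points


# Ten frames, two blocks of five; the decoder would fire at these frames.
weights = [0.1, 0.9, 0.1, 0.0, 0.0, 0.1, 0.8, 0.2, 0.9, 0.1]
points = activation_points(weights, chunk_size=5)
```

Because each block is processed as soon as its frames are encoded, the decoder can emit output without waiting for the end of the utterance, which is what makes the scheme streaming.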

9.

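Invention application
VERBAL INTERFACE SYSTEMS AND METHODS FOR VERBAL CONTROL OF DIGITAL DEVICES (Under examination - Published)

Publication (announcement) No.: WO2021216679A1

Publication (announcement) date: 2021-10-28

Application No.: PCT/US2021/028358

Application date: 2021-04-21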

Applicant: SAINT LOUIS UNIVERSITY

Inventor： BUCHOLZ, Richard D.

IPC: G06F3/16 , G10L15/08 , G10L15/20 , G10L15/22 , G10L15/26

Abstract: Provided herein are systems and methods to verbally control a host computer for data entry into the computer, such as for electronic medical record data entry. The verbal interface system may be operable to receive, via a connection interface, at least one video capture of the host computer display; analyze the at least one video capture to determine a plurality of data entry fields displayed on the host computer display; receive, via a microphone, a verbal command correlating to at least one of the plurality of data entry fields; and autonomously control the mouse and/or the keyboard to perform mouse and/or text input into the host computer.

10.

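Invention application
PRIVACY ENHANCEMENT APPARATUSES FOR USE WITH VOICE-ACTIVATED DEVICES AND ASSISTANTS (Under examination - Published)

Publication (announcement) No.: WO2021211878A1

Publication (announcement) date: 2021-10-21

Application No.: PCT/US2021/027532

Application date: 2021-04-15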

Applicant: SCHWARTZ, Bernard J.

Inventor： SCHWARTZ, Bernard J.

IPC: G10K11/175 , G10L17/22 , G10L15/20 , G10L17/20 , G10L25/78 , G06F3/16

Abstract: Devices for preventing unintended conversation from being recorded by a voice activated assistant device/application (VAD) are disclosed. The device is contoured to fit over a functional surface of a VAD that typically includes a plurality of microphones and control buttons. The device covers the microphones and uses its own microphones to monitor for an authorization input signal. In an embodiment, the device uses speakers aligned with and opposing each VAD microphone. The device emits interfering audible signals during this mode of operation. Once the device senses an authorization input, the device decouples its speakers from the interfering audible signal and instead allows the device microphones to pass through to the VAD. During this mode, the VAD is in normal operation.
