Patent search: ipc:"G10L25/93", page 11

101.

Granted invention patent
Self-attention-based speech quality measuring method and system for real-time air traffic control (in force)

Publication No.: US12051440B1

Publication date: 2024-07-30

Application No.: US18591497

Filing date: 2024-02-29

Applicant: CIVIL AVIATION FLIGHT UNIVERSITY OF CHINA, Weijun Pan

Inventors: Weijun Pan, Yidi Wang, Qinghai Zuo, Xuan Wang, Rundong Wang, Tian Luan, Jian Zhang, Zixuan Wang, Peiyuan Jiang, Qianlan Jiang

IPC classification: G10L25/60, G08G5/00, G10L21/0388, G10L25/18, G10L25/21, G10L25/30, G10L25/93

CPC classification: G10L25/60, G08G5/0095, G10L21/0388, G10L25/18, G10L25/21, G10L25/30, G10L25/93, G10L2025/937

Abstract: Disclosed are a self-attention-based speech quality measuring method and system for real-time air traffic control, including the following steps: acquiring real-time air traffic control speech data and generating speech information frames; detecting the speech information frames, discarding unvoiced information frames of the speech information frames, generating a voiced long speech information frame; performing mel spectrogram conversion, attention extraction and feature fusion on the long speech information frame to obtain a predicted MOS value.

102.

Granted invention patent
Acoustic analysis of crowd sounds (in force)

Publication No.: US11996121B2

Publication date: 2024-05-28

Application No.: US17644363

Filing date: 2021-12-15

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Rachel Ostrand, Vagner Figueredo de Santana, Alecio Pedro Delazari Binotto

IPC classification: G10L25/78, G06N20/00, G10L25/21, G10L25/51, G10L25/93

CPC classification: G10L25/78, G06N20/00, G10L25/21, G10L25/51, G10L25/93, G10L2025/937

Abstract: A method, computer system, and a computer program product for detecting face mask usage based on a crowd sound is provided. The present invention may include capturing an audio stream including a crowd voice data. The present invention may also include analyzing the crowd voice data using a machine learning model to determine an amount of people wearing masks. The present invention may further include in response to determining that the amount of people wearing masks does not meet a compliance threshold, displaying a content to promote face mask usage.

103.

Granted invention patent
Providing translated subtitle for video content (in force)

Publication No.: US11947924B2

Publication date: 2024-04-02

Application No.: US18369742

Filing date: 2023-09-18

Applicant: VoyagerX, Inc.

Inventors: Hyeonsoo Oh, Sedong Nam

IPC classification: G10L15/00, G06F40/47, G10L15/05, G10L15/22, G10L15/26, G10L25/57, G10L25/93, G11B27/34, H04N21/488, H04N21/8547

CPC classification: G06F40/47, G10L15/05, G10L15/22, G10L15/26, G10L25/57, G10L25/93, G11B27/34, H04N21/4884, H04N21/8547

Abstract: The present disclosure relates to systems and methods for providing subtitles for a video. The video's audio is transcribed to obtain caption text for the video. A first machine-trained model identifies sentences in the caption text. A second model identifies intra-sentence breaks within the sentences identified using the first machine-trained model. Based on the identified sentences and intra-sentence breaks, one or more words in the caption text are grouped into a clip caption to be displayed for a corresponding clip of the video.

104.

Published invention application
INFORMATION PROCESSING APPARATUS, NON-TRANSITORY COMPUTER READABLE MEDIUM, AND INFORMATION PROCESSING METHOD (under examination, published)

Publication No.: US20240105214A1

Publication date: 2024-03-28

Application No.: US18158773

Filing date: 2023-01-24

Applicant: FUJIFILM BUSINESS INNOVATION CORP.

Inventors: Tsutomu UDAKA, Minoru Akiyama

IPC classification: G10L25/93, G10L25/18, G10L25/51

CPC classification: G10L25/93, G10L25/18, G10L25/51

Abstract: An information processing apparatus includes a processor configured to: acquire first data indicative of a temporal change of an intensity of sound emitted by an apparatus; generate second data by extracting, from the first data, a maximum value in each section of a time width corresponding to temporal resolution at which human voice is unrecognizable and discarding values other than the maximum value; and transmit the second data to an external apparatus.
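
As a rough illustration of the reduction described in this abstract (not the applicant's implementation), the sketch below keeps the maximum of each fixed-length section of an intensity sequence and discards the other values. The function name `max_per_window`, the sample values, and the section length are hypothetical. The abstract does not state how the time width is chosen, so it is simply a parameter here.

```python
def max_per_window(intensity, window):
    """Return the maximum of each consecutive block of `window` samples.

    `intensity` is the "first data" (sound intensity over time); the returned
    list plays the role of the "second data". A final partial block is kept.
    """
    if window <= 0:
        raise ValueError("window must be a positive number of samples")
    return [
        max(intensity[start:start + window])
        for start in range(0, len(intensity), window)
    ]


# Hypothetical intensity samples, reduced with a section length of 3 samples.
first_data = [0.1, 0.4, 0.2, 0.9, 0.3, 0.5, 0.7]
second_data = max_per_window(first_data, 3)
print(second_data)
```

A longer section discards more temporal detail, which is the property the abstract relies on to make speech unrecognizable in the transmitted data.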

105.

Published invention application
CHRONIC PULMONARY DISEASE PREDICTION FROM AUDIO INPUT BASED ON SHORT-WINDED BREATH DETERMINATION USING ARTIFICIAL INTELLIGENCE (under examination, published)

Publication No.: US20240062902A1

Publication date: 2024-02-22

Application No.: US18364078

Filing date: 2023-08-02

Applicant: SONY GROUP CORPORATION

Inventors: SACHIN KUMAR AGRAWAL, AYUSH KUMAR GHADIYA

IPC classification: G16H50/20, G10L25/66, G10L25/93

CPC classification: G16H50/20, G10L25/66, G10L25/93

Abstract: An electronic device and method for chronic pulmonary disease prediction from audio input based on short-winded breath determination using artificial intelligence is disclosed. The electronic device receives an audio input associated with a user. The electronic device applies an Artificial Intelligence (AI) model to detect a short-winded breath duration that corresponds to a time duration between an end of a first spoken word and a start of a second spoken word succeeding the first spoken word. The electronic device detects a speaking pattern. The electronic device applies a Recurrent neural network (RNN) model to reconstruct a set of short-winded breath audio samples. The electronic device generates an audio sample dataset and a set of audio features. The electronic device applies a modular neural network model on the generated audio sample dataset and on the generated set of audio features to determine a set of chronic obstructive pulmonary disease (COPD) metrics.

106.

Published invention application
SPEECH-ANALYSIS BASED AUTOMATED PHYSIOLOGICAL AND PATHOLOGICAL ASSESSMENT (under examination, published)

Publication No.: US20240057936A1

Publication date: 2024-02-22

Application No.: US18271416

Filing date: 2022-01-12

Applicant: Hoffmann-La Roche Inc., Universitätsspital Basel

Inventors: Martin Christian STRAHM, Yan-Ping ZHANG, Qian ZHOU

IPC classification: A61B5/00, G10L25/18, G10L25/21, G10L25/66, G10L25/93, G10L25/90, G10L15/05

CPC classification: A61B5/4803, G10L25/18, G10L25/21, G10L25/66, G10L25/93, G10L25/90, G10L15/05, G10L2025/937

Abstract: Methods of assessing the pathological and/or physiological state of a subject, methods of monitoring a subject with heart failure or a subject that has been diagnosed as having or being at risk of having a condition associated with dyspnea and/or fatigue, and methods of diagnosing a subject as having decompensated heart failure are provided. The methods comprise obtaining a voice recording from a word-reading test from the subject, wherein the voice recording is from a word-reading test comprising reading a sequence of words drawn from a set of n words and analysing the voice recording, or a portion thereof. The analysing can comprise identifying a plurality of segments of the voice recording that correspond to single words or syllables; determining the value of one or more metrics selected from the breathing %, unvoicing/voicing ratio, voice pitch and correct word rate at least in part based on the identified segments; and comparing the value of the one or more metrics with one or more respective reference values. Related systems and products are also described.

107.

Granted invention patent
Voice/non-voice determination device, voice/non-voice determination model parameter learning device, voice/non-voice determination method, voice/non-voice determination model parameter learning method, and program (in force)

Publication No.: US11894017B2

Publication date: 2024-02-06

Application No.: US17628467

Filing date: 2019-07-25

Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION

Inventors: Ryo Masumura, Takanobu Oba, Kiyoaki Matsui

IPC classification: G10L25/93, G10L25/78, G10L15/00, G10L15/02, G10L21/0208, G06N20/20, G06N3/044, G06N3/09, G10L17/00, G10L25/84

CPC classification: G10L25/93, G06N20/20, G10L15/00, G10L15/02, G10L21/0208, G10L25/78, G06N3/044, G06N3/09, G10L17/00, G10L25/84, G10L2015/025

Abstract: A voice/non-voice determination device robust with respect to an acoustic signal in a high-noise environment is provided. The voice/non-voice determination device includes an acoustic scene classification unit including a first model which receives input of an acoustic signal and outputs acoustic scene information which is information regarding a scene where the acoustic signal is collected, a speech enhancement unit including a second model which receives input of the acoustic signal and outputs speech enhancement information which is information regarding the acoustic signal after enhancement, and a voice/non-voice determination unit including a third model which receives input of the acoustic signal, the acoustic scene information and the speech enhancement information and outputs a voice/non-voice label which is information regarding a label of either a speech section or a non-speech section.

108.

Granted invention patent
Concept for encoding an audio signal and decoding an audio signal using speech related spectral shaping information (in force)

Publication No.: US11881228B2

Publication date: 2024-01-23

Application No.: US17121179

Filing date: 2020-12-14

Applicant: Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V.

Inventors: Guillaume Fuchs, Markus Multrus, Emmanuel Ravelli, Markus Schnell

IPC classification: G10L19/12, G10L19/06, G10L19/07, G10L19/083, G10L19/20, G10L19/032, G10L25/93, G10L19/00

CPC classification: G10L19/12, G10L19/032, G10L19/06, G10L19/07, G10L19/083, G10L19/20, G10L25/93, G10L2019/0016

Abstract: According to an aspect of the present invention an encoder for encoding an audio signal has an analyzer configured for deriving prediction coefficients and a residual signal from a frame of the audio signal. The encoder has a formant information calculator configured for calculating a speech related spectral shaping information from the prediction coefficients, a gain parameter calculator configured for calculating a gain parameter from an unvoiced residual signal and the spectral shaping information and a bitstream former configured for forming an output signal based on an information related to a voiced signal frame, the gain parameter or a quantized gain parameter and the prediction coefficients.

109.

Granted invention patent
Health monitoring system and appliance (in force)

Publication No.: US11881221B2

Publication date: 2024-01-23

Application No.: US17810032

Filing date: 2022-06-30

Applicant: The Notebook, LLC

Inventors: Karen Elaine Khaleghi

IPC classification: G10L15/22, G10L25/90, G10L25/24, G10L25/78, G10L15/18, G10L25/66, G10L25/93, A61B5/00, B60K28/06, A61B5/16, G06V40/16

CPC classification: G10L15/22, A61B5/165, A61B5/4803, B60K28/06, G06V40/165, G06V40/167, G06V40/171, G06V40/174, G10L15/1815, G10L15/1822, G10L25/24, G10L25/66, G10L25/78, G10L25/90, G10L25/93, G10L2015/223, G10L2015/227

Abstract: Systems and methods are disclosed. A digitized human vocal expression of a user and digital images are received over a network from a remote device. The digitized human vocal expression is processed to determine characteristics of the human vocal expression, including: pitch, volume, rapidity, a magnitude spectrum identify, and/or pauses in speech. Digital images are received and processed to detect characteristics of the user face, including detecting if any of the following is present: a sagging lip, a crooked smile, uneven eyebrows, and/or facial droop. Using the human vocal expression characteristics and face characteristics, a determination is made as to what action is to be taken. A cepstrum pitch may be determined using an inverse Fourier transform of a logarithm of a spectrum of a human vocal expression signal. The volume may be determined using peak heights in a power spectrum of the human vocal expression.
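
The cepstrum pitch step mentioned in this abstract (an inverse Fourier transform of the logarithm of a spectrum) can be sketched with the textbook real cepstrum. This is an illustrative sketch, not the patent's implementation. The plain O(n²) DFT, the function names, the 110 to 400 Hz search band, the `eps` floor, and the synthetic 200 Hz test tone are all assumptions chosen to keep the example self-contained.

```python
import cmath
import math


def dft(values, inverse=False):
    """Plain O(n^2) discrete Fourier transform (inverse when requested)."""
    n = len(values)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = 0j
        for t in range(n):
            acc += values[t] * cmath.exp(sign * 2j * math.pi * k * t / n)
        out.append(acc / n if inverse else acc)
    return out


def cepstral_pitch(signal, sample_rate, fmin=110.0, fmax=400.0, eps=1e-6):
    """Estimate pitch in Hz from the peak of the real cepstrum."""
    # Log magnitude spectrum; eps avoids log(0) at empty bins.
    log_mag = [math.log(abs(v) + eps) for v in dft(signal)]
    # Real cepstrum: inverse transform of the log magnitude spectrum.
    cepstrum = [v.real for v in dft(log_mag, inverse=True)]
    # Search only quefrencies (lags in samples) inside the assumed pitch band.
    lo = int(sample_rate / fmax)
    hi = min(int(sample_rate / fmin), len(signal) // 2)
    best = max(range(lo, hi + 1), key=lambda q: cepstrum[q])
    return sample_rate / best


# Synthetic voiced-like tone: 10 harmonics of 200 Hz sampled at 8 kHz.
rate = 8000
tone = [
    sum(math.cos(2 * math.pi * m * 200 * t / rate) / m for m in range(1, 11))
    for t in range(400)
]
pitch = cepstral_pitch(tone, rate)
print(round(pitch, 1))
```

The search band is kept under two octaves wide because a strictly periodic signal produces cepstral peaks at every multiple of the pitch period; a wider band would need an extra rule to pick the fundamental.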

110.

Granted invention patent
Information processing system, information processing apparatus, control method for information processing apparatus, and program (in force)

Publication No.: US11880633B2

Publication date: 2024-01-23

Application No.: US17602870

Filing date: 2020-04-14

Applicant: Sony Interactive Entertainment Inc.

Inventors: Toru Ogiso

IPC classification: G06F3/16, G10L25/93

CPC classification: G06F3/165, G10L25/93, G10L2025/935

Abstract: An information processing apparatus is connected to a peripheral apparatus that includes sound inputting means for outputting a sound signal representative of sound of surroundings. The information processing apparatus performs control such that, in a case where sound input is required in processing of an application determined in advance, in a state in which a sound signal accepted from the peripheral apparatus is cut off, the sound signal accepted from the peripheral apparatus is accepted and the sound signal is used only in the processing of the application determined in advance.
