Patent search: ipc:G10L25/45, page 9

81.

Invention application
METHOD AND SYSTEM FOR SPEECH SEPARATION (In force)

Publication No.: US20220172735A1

Publication date: 2022-06-02

Application No.: US17436050

Filing date: 2019-03-07

Applicant: HARMAN INTERNATIONAL INDUSTRIES, INCORPORATED

Inventors: Xiangru BI, Qingshan ZHANG

IPC classification: G10L21/0272, G10L25/45, G10L25/87, G10L25/90

Abstract: The present disclosure is directed to a speech separation method and system using a sliding window. The method comprises: acquiring at least one speech from at least one user by at least one microphone and storing the at least one speech as a speech signal in a sound recording module; extracting the speech signal from the sound recording module and processing the extracted speech signal through a sliding window; and transmitting the processed speech signal to a Degenerate Unmixing Estimation Technique (DUET) module for speech separation.
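The sliding-window step named in the abstract above can be illustrated with a minimal sketch. The function name `sliding_windows` and the parameters `window_len` and `hop_len` are assumptions made for illustration; the abstract does not state the window size, the overlap, or how the DUET module consumes each window, so the separation stage itself is not shown.

```python
from typing import Iterator, List, Sequence


def sliding_windows(signal: Sequence[float], window_len: int,
                    hop_len: int) -> Iterator[List[float]]:
    """Yield fixed-length windows of `signal`, advancing by `hop_len` samples.

    Hypothetical helper: window_len and hop_len are illustrative parameters,
    not values taken from the patent. A trailing partial window is dropped.
    """
    if window_len <= 0 or hop_len <= 0:
        raise ValueError("window_len and hop_len must be positive")
    last_start = len(signal) - window_len
    for start in range(0, last_start + 1, hop_len):
        yield list(signal[start:start + window_len])


if __name__ == "__main__":
    recorded = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5]
    for window in sliding_windows(recorded, window_len=4, hop_len=2):
        # Each window would be handed to a separation stage (DUET in the abstract).
        print(window)
```

Choosing `hop_len` smaller than `window_len` makes consecutive windows overlap, which is one common way to apply a sliding window to a recorded signal.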

82.

Invention application
METHOD FOR CHANGING SPEED AND PITCH OF SPEECH AND SPEECH SYNTHESIS SYSTEM (In force)

Publication No.: US20220165250A1

Publication date: 2022-05-26

Application No.: US17380426

Filing date: 2021-07-20

Applicant: Xinapse Co., Ltd.

Inventors: Jinbeom Kang, Dong Won Joo, Yongwook Nam, Seung Jae Lee

IPC classification: G10L13/033, G10L21/013, G10L21/14, G10L21/043, G10L25/45

Abstract: This application relates to a method of synthesizing speech whose speed and pitch are changed. In one aspect, the method includes generating a spectrogram by performing a short-time Fourier transform on a first speech signal based on a first hop length and a first window length, and generating, from the spectrogram, speech signals of sections having a second window length at intervals of a second hop length. The ratio between the first hop length and the second hop length may be set equal to the playback rate, and the ratio between the first window length and the second window length may be set equal to the pitch change rate, thereby generating a second speech signal whose speed and pitch are changed.
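The two ratios stated in the abstract above reduce to simple arithmetic, sketched below. The function name `second_stage_lengths` is hypothetical, and rounding to whole samples is an assumption; the abstract only states that first hop / second hop equals the playback rate and first window / second window equals the pitch change rate.

```python
from typing import Tuple


def second_stage_lengths(first_hop: int, first_window: int,
                         playback_rate: float,
                         pitch_rate: float) -> Tuple[int, int]:
    """Derive the second hop and window lengths from the stated ratios.

    first_hop / second_hop       == playback_rate
    first_window / second_window == pitch_rate
    Rounding to an integer sample count is an assumption of this sketch.
    """
    if playback_rate <= 0 or pitch_rate <= 0:
        raise ValueError("rates must be positive")
    second_hop = max(1, round(first_hop / playback_rate))
    second_window = max(1, round(first_window / pitch_rate))
    return second_hop, second_window


if __name__ == "__main__":
    # Playback rate 2.0 with unchanged pitch: the hop halves, the window stays.
    print(second_stage_lengths(256, 1024, playback_rate=2.0, pitch_rate=1.0))
```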

83.

Granted invention patent
Automated clinical documentation system and method (In force)

Publication No.: US11295272B2

Publication date: 2022-04-05

Application No.: US16271329

Filing date: 2019-02-08

Applicant: Nuance Communications, Inc.

Inventors: Daniel Paulino Almendro Barreda, Dushyant Sharma, Joel Praveen Pinto, Uwe Helmut Jost, Patrick A. Naylor

IPC classification: G06Q10/10, G10L25/51, G06F3/16, G16H15/00, G16H10/20, G16H10/60, G06F40/117, G10L25/45, G06T7/20, H04R1/40, H04R3/00, G10L15/22, G10L15/30, G16H10/40, G16H50/70, G10L15/26, G06F40/30

Abstract: A method, computer program product, and computing system for obtaining encounter information of a patient encounter, wherein the encounter information includes machine vision encounter information; and processing the encounter information to generate an encounter transcript.

84.

Granted invention patent
Methods and vehicles for capturing emotion of a human driver and customizing vehicle response (In force)

Publication No.: US11270699B2

Publication date: 2022-03-08

Application No.: US16732069

Filing date: 2019-12-31

Applicant: Emerging Automotive, LLC

Inventors: Angel A. Penilla, Albert S. Penilla

IPC classification: G10L15/22, B60R16/037, G01C21/36, G06K9/00, G10L15/06, G10L17/04, G10L17/06, G10L15/25, G10L21/0208, G10L25/45, G10L25/57, H04L67/306, H04L67/1097, G10L25/63, H04L67/12, G10L15/30, G10L15/00, G10L15/02, G10L25/84, H04L67/10, G06F3/01, G10L25/90

Abstract: Methods and systems for determining an emotion of a human driver of a vehicle and using the emotion for generating a vehicle response are provided. One example method includes capturing, by a camera of the vehicle, a face of the human driver. The capturing is configured to capture a plurality of images over a period of time, and the plurality of images are analyzed to identify a facial expression and changes in the facial expression of the human driver over the period of time. The method further includes capturing, by a microphone of the vehicle, voice input of the human driver. The voice input is captured over the period of time. The voice input is analyzed to identify a voice profile and changes in the voice profile of the human driver over the period of time. The method processes, by a processor of the vehicle, a combination of the facial expression and the voice profile captured during the period of time to predict the emotion of the human driver. The method generates the vehicle response that is responsive to the emotion of the human driver. The vehicle response is configured to make at least one adjustment to a setting of the vehicle. The adjustment is selected based on the emotion of the human driver. The vehicle response can be used to make the driver more calm and/or assist in reducing distracted driving. The prediction of the emotion may be additionally increased by capturing and analyzing touch and/or gesture characteristics of the human driver when interfacing with a graphical user interface or surfaces of the vehicle or systems of the vehicle.

85.

Granted invention patent
Robust detection of impulsive acoustic event onsets in an audio stream (In force)

Publication No.: US11133023B1

Publication date: 2021-09-28

Application No.: US17197539

Filing date: 2021-03-10

Applicant: V5 Systems, Inc.

Inventor: Will Hedgecock

IPC classification: G10L25/51, G10L25/18, G10L25/45

Abstract: This disclosure sets forth a system for detecting and determining the onset times of one or more impulsive acoustic events across multiple channels of audio. Audio is segmented into chunks of predefined length and then processed for detecting acoustic onsets, including cross-validating and refining the estimated acoustic onsets to the level of an audio sample. The output of the system is a list of channel-specific timestamped indices corresponding to the audio samples of the onsets of each impulsive acoustic event in the current segment of audio.
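The chunked, per-channel output described in the abstract above can be illustrated with a deliberately simple stand-in. The sketch below uses a plain amplitude-threshold crossing, which is not the detection, cross-validation, or refinement method of the patent; the names `detect_onsets`, `chunk_len`, and `threshold` are assumptions chosen for illustration.

```python
from typing import Dict, List, Sequence, Tuple


def detect_onsets(channels: Dict[str, Sequence[float]], chunk_len: int,
                  threshold: float) -> List[Tuple[str, int]]:
    """Return (channel, sample_index) pairs where a loud event begins.

    Stand-in detector: an onset is the first sample whose magnitude reaches
    `threshold` after a quieter sample. Audio is walked chunk by chunk, and
    the loud/quiet state is carried across chunk boundaries so that an event
    spanning two chunks is reported once.
    """
    if chunk_len <= 0:
        raise ValueError("chunk_len must be positive")
    onsets: List[Tuple[str, int]] = []
    for name, samples in channels.items():
        above = False  # state carried from one chunk to the next
        for start in range(0, len(samples), chunk_len):
            chunk = samples[start:start + chunk_len]
            for offset, sample in enumerate(chunk):
                loud = abs(sample) >= threshold
                if loud and not above:
                    onsets.append((name, start + offset))
                above = loud
    return onsets


if __name__ == "__main__":
    audio = {
        "ch0": [0, 0, 1, 1, 0, 0, 1, 0],
        "ch1": [0, 1, 0, 0, 0, 0, 0, 0],
    }
    print(detect_onsets(audio, chunk_len=4, threshold=0.5))
```

The returned indices are sample positions; converting them to timestamps would require a sample rate, which the abstract does not give.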

86.

Invention application
Estimating Lung Volume by Speech Analysis (In force)

Publication No.: US20210056983A1

Publication date: 2021-02-25

Application No.: US17074653

Filing date: 2020-10-20

Applicant: Cordio Medical Ltd.

Inventor: Ilan D. Shallom

IPC classification: G10L25/66, G06F17/18, G10L25/45

Abstract: Described embodiments include an apparatus that includes a network interface and a processor. The processor is configured to receive, via the network interface, a speech signal that represents speech uttered by a subject, the speech including one or more speech segments, divide the speech signal into multiple frames, such that one or more sequences of the frames represent the speech segments, respectively, compute respective estimated total volumes of air exhaled by the subject while the speech segments were uttered, by, for each of the sequences, computing respective estimated flow rates of air exhaled by the subject during the frames belonging to the sequence and, based on the estimated flow rates, computing a respective one of the estimated total volumes of air, and, in response to the estimated total volumes of air, generate an alert. Other embodiments are also described.
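The step from per-frame flow rates to a total volume per speech segment, as described in the abstract above, can be sketched as a sum of flow rate times frame duration. This is an assumption about how the volumes are accumulated; the abstract does not state how flow rates are estimated or what triggers the alert, so `min_volume_l` and both function names below are hypothetical.

```python
from typing import Sequence


def total_exhaled_volume(flow_rates_l_per_s: Sequence[float],
                         frame_duration_s: float) -> float:
    """Sum flow rate x frame duration over the frames of one speech segment.

    Assumes equal-length frames and flow rates in litres per second.
    """
    return sum(rate * frame_duration_s for rate in flow_rates_l_per_s)


def should_alert(segment_volumes_l: Sequence[float],
                 min_volume_l: float) -> bool:
    """Hypothetical alert rule: any segment volume below a minimum."""
    return any(volume < min_volume_l for volume in segment_volumes_l)


if __name__ == "__main__":
    # Two speech segments, each given as a sequence of per-frame flow rates.
    segments = [[1.0, 2.0], [0.2, 0.4]]
    volumes = [total_exhaled_volume(rates, frame_duration_s=0.5)
               for rates in segments]
    print(volumes, should_alert(volumes, min_volume_l=0.5))
```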

87.

Granted invention patent
Audio coder window sizes and time-frequency transformations (In force)

Publication No.: US10818305B2

Publication date: 2020-10-27

Application No.: US15967119

Filing date: 2018-04-30

Applicant: DTS, Inc.

Inventors: Michael M. Goodwin, Antonius Kalker, Albert Chau

IPC classification: G10L19/022, G10L19/008, G10L19/26, G10L25/18, G10L25/45, G10L19/02, G10L19/22

Abstract: A method of encoding an audio signal is provided comprising: applying multiple different time-frequency transformations to an audio signal frame; computing measures of coding efficiency across multiple frequency bands for multiple time-frequency resolutions; selecting a combination of time-frequency resolutions to represent the frame at each of the multiple frequency bands based at least in part upon the computed measures of coding efficiency; determining a window size and a corresponding transform size; determining a modification transformation; windowing the frame using the determined window size; transforming the windowed frame using the determined transform size; and modifying a time-frequency resolution within a frequency band of the transform of the windowed frame using the determined modification transformation.

88.

Granted invention patent
Automated clinical documentation system and method (In force)

Publication No.: US10809970B2

Publication date: 2020-10-20

Application No.: US16271616

Filing date: 2019-02-08

Applicant: Nuance Communications, Inc.

Inventors: Daniel Paulino Almendro Barreda, Dushyant Sharma, Joel Praveen Pinto, Uwe Helmut Jost, Patrick A. Naylor

IPC classification: G06F3/16, G16H10/60, G16H15/00, H04R3/00, H04R1/40, G10L25/51, G06Q10/10, G10L15/26, G10L25/45, G16H10/20, G06T7/20, G10L15/22, G10L15/30, G16H10/40, G16H50/70, G06F40/20, G06F40/30, G06F40/117

Abstract: A method, computer program product, and computing system for obtaining encounter information of a patient encounter, wherein the encounter information includes machine vision encounter information; processing the machine vision encounter information to identify one or more humanoid shapes; and steering one or more audio recording beams toward the one or more humanoid shapes to capture audio encounter information.

89.

Granted invention patent
System and method for neural network based speaker classification (In force)

Publication No.: US10755718B2

Publication date: 2020-08-25

Application No.: US15835318

Filing date: 2017-12-07

Applicant: INTERACTIVE INTELLIGENCE GROUP, INC.

Inventors: Zhenhao Ge, Ananth N. Iyer, Srinath Cheluvaraja, Ram Sundaram, Aravind Ganapathiraju

IPC classification: G10L17/00, G10L17/04, G10L25/45, G10L17/18, G10L25/93

Abstract: A method for classifying speakers includes: receiving, by a speaker recognition system including a processor and memory, input audio including speech from a speaker; extracting, by the speaker recognition system, a plurality of speech frames containing voiced speech from the input audio; computing, by the speaker recognition system, a plurality of features for each of the speech frames of the input audio; computing, by the speaker recognition system, a plurality of recognition scores for the plurality of features; computing, by the speaker recognition system, a speaker classification result in accordance with the recognition scores; and outputting, by the speaker recognition system, the speaker classification result.

90.

Granted invention patent
Method and apparatus for controlling audio frame loss concealment (In force)

Publication No.: US10559314B2

Publication date: 2020-02-11

Application No.: US16407307

Filing date: 2019-05-09

Applicant: Telefonaktiebolaget L M Ericsson (publ)

Inventors: Stefan Bruhn, Jonas Svedberg

IPC classification: G10L19/005, G10L19/00, G10L19/02, G10L19/025, G10L25/45

Abstract: In accordance with an example embodiment of the present invention, disclosed is a method and an apparatus thereof for controlling a concealment method for a lost audio frame of a received audio signal. A method for a decoder of concealing a lost audio frame comprises detecting in a property of the previously received and reconstructed audio signal, or in a statistical property of observed frame losses, a condition for which the substitution of a lost frame provides relatively reduced quality. In case such a condition is detected, the concealment method is modified by selectively adjusting a phase or a spectrum magnitude of a substitution frame spectrum.
