-
Publication No.: US12119015B2
Publication Date: 2024-10-15
Application No.: US17649362
Filing Date: 2022-01-30
Inventors: Jinbo Zheng, Fengyun Liao, Xin Qi
IPC Classification: G10L21/0216, G10L25/18
CPC Classification: G10L21/0216, G10L25/18
Abstract: The present disclosure provides systems and methods for processing a signal. The system for processing a signal may include at least one microphone and at least one vibration sensor. The at least one microphone may be configured to collect a sound signal, and the sound signal may include at least one of user voice and environmental noise. The at least one vibration sensor may be configured to collect a vibration signal, and the vibration signal may include at least one of the user voice and the environmental noise. The system for processing a signal may also comprise a processor. The processor may be configured to determine a relationship between a noise component in the sound signal and a noise component in the vibration signal, and obtain a target vibration signal by performing, based at least on the relationship, noise reduction processing on the vibration signal.
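Illustrative sketch (not the patented method): the abstract does not say what form the "relationship" between the two noise components takes. The Python below assumes the simplest possible case: a single scalar gain `h`, fitted by least squares on a noise-only segment, which is then used to subtract the scaled microphone signal from the vibration signal. The function names, the scalar model, and the noise-only calibration segment are all assumptions made for illustration.

```python
def estimate_noise_gain(mic_noise, vib_noise):
    """Least-squares scalar h minimizing sum((vib - h * mic) ** 2).

    Assumption: both inputs come from a segment with no user voice,
    so they contain only environmental noise.
    """
    num = sum(m * v for m, v in zip(mic_noise, vib_noise))
    den = sum(m * m for m in mic_noise)
    return num / den if den else 0.0


def denoise_vibration(mic, vib, h):
    """Subtract the noise predicted from the microphone signal (h * mic)
    from the vibration signal, sample by sample."""
    return [v - h * m for m, v in zip(mic, vib)]


# Noise-only calibration segment: the vibration sensor picks up
# a quarter of the noise amplitude seen by the microphone.
mic_noise = [1.0, -2.0, 3.0]
vib_noise = [0.25, -0.5, 0.75]
h = estimate_noise_gain(mic_noise, vib_noise)

# Later segment: same noise, plus a constant "voice" of 1.0 on the vibration sensor only.
vib = [n + 1.0 for n in vib_noise]
target = denoise_vibration(mic_noise, vib, h)
```

In this toy case the microphone carries no voice, so the subtraction recovers the voice exactly. In practice the microphone also contains voice, so a scalar subtraction would distort it slightly; a frequency-dependent relationship would be a natural refinement.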
-
Publication No.: US12119014B2
Publication Date: 2024-10-15
Application No.: US17644108
Filing Date: 2021-12-14
Applicant: Google LLC
Inventors: Arun Narayanan, Tom O'malley, Quan Wang, Alex Park, James Walker, Nathan David Howard, Yanzhang He, Chung-Cheng Chiu
IPC Classification: G10L21/0216, G06N3/04, G10L15/06, G10L21/0208, H04R3/04
CPC Classification: G10L21/0216, G06N3/04, G10L15/063, H04R3/04, G10L2021/02082
Abstract: A method for automatic speech recognition using joint acoustic echo cancellation, speech enhancement, and voice separation includes receiving, at a contextual frontend processing model, input speech features corresponding to a target utterance. The method also includes receiving, at the contextual frontend processing model, at least one of a reference audio signal, a contextual noise signal including noise prior to the target utterance, or a speaker embedding including voice characteristics of a target speaker that spoke the target utterance. The method further includes processing, using the contextual frontend processing model, the input speech features and the at least one of the reference audio signal, the contextual noise signal, or the speaker embedding vector to generate enhanced speech features.
-
Publication No.: US20240312473A1
Publication Date: 2024-09-19
Application No.: US18675981
Filing Date: 2024-05-28
Inventors: Craig FRASER, Daniel DAVIES, John HORSTMANN, Lars CHRISTENSEN
IPC Classification: G10L21/0216, H04R1/10
CPC Classification: G10L21/0216, H04R1/1083, G10K2210/1081, H04R2460/01
Abstract: A method of real-time noise reduction including generating spectral data using temporally localized spectral representations of a received audio signal, determining detection of voice by comparing first and second filtered data, and generating noise-reduced audio output by attenuating noise based on the determined detection of voice. The first and second filtered data are formed by attenuating temporal variations of the spectral data based on, respectively, a first timescale and a second timescale. A noise reduction system, comprising processing circuitry configured to execute a method of real-time noise reduction to generate an output that is transmitted via an output port of the noise reduction system. A noise-reduction microphone comprising a housing having a transducer coupled to a processor therein to execute a method of real-time noise reduction, and an output port. A non-transitory computer-readable medium having instructions to cause a processor to perform a method of real-time noise reduction.
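Illustrative sketch (not the patented method): the two-timescale comparison described in the abstract can be pictured with a fast and a slow exponential smoother running over the same spectral frames. The Python below assumes the frames are already available as lists of per-band magnitudes. The exponential smoothing, the `ratio` threshold, the fixed `noise_gain`, and all parameter values are assumptions chosen for illustration, since the abstract specifies none of them.

```python
def smooth(frames, alpha):
    """Exponentially smooth each frequency band over time.

    A smaller alpha averages over a longer timescale.
    """
    state = list(frames[0])
    out = []
    for frame in frames:
        state = [(1 - alpha) * s + alpha * x for s, x in zip(state, frame)]
        out.append(state)
    return out


def reduce_noise(frames, fast_alpha=0.6, slow_alpha=0.05, ratio=1.5, noise_gain=0.1):
    """Flag a frame as voice when the fast-smoothed energy exceeds the
    slow-smoothed energy by `ratio`; attenuate all other frames."""
    fast = smooth(frames, fast_alpha)   # first (short) timescale
    slow = smooth(frames, slow_alpha)   # second (long) timescale
    output, voiced = [], []
    for frame, f, s in zip(frames, fast, slow):
        is_voice = sum(f) > ratio * sum(s)
        voiced.append(is_voice)
        gain = 1.0 if is_voice else noise_gain
        output.append([gain * x for x in frame])
    return output, voiced


# 20 frames of steady background followed by 5 louder frames (a voice onset).
frames = [[1.0, 1.0]] * 20 + [[10.0, 10.0]] * 5
output, voiced = reduce_noise(frames)
```

With these inputs the steady frames are attenuated and the onset frames pass through unchanged. If the loud segment lasted long enough, the slow smoother would catch up and the simple ratio test would stop firing, which is one reason a real detector would need additional logic.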
-
Publication No.: US12089032B1
Publication Date: 2024-09-10
Application No.: US17308778
Filing Date: 2021-05-05
Applicant: Apple Inc.
IPC Classification: G01H7/00, G10L21/0216, H04R3/00, H04S7/00
CPC Classification: H04S7/305, G01H7/00, G10L21/0216, H04R3/005, G10L2021/02166, H04R2203/12
Abstract: Acoustic pickup beams (sound beams) can be formed in a physical environment from a plurality of microphone signals. Each of the sound beams can measure acoustic energy in a direction of the respective sound beam. Directional decay of the acoustic energy measured through each of the sound beams is determined. Room surface acoustic properties of the physical environment are determined based on mapping the directional decay of the acoustic energy to the physical environment. Other aspects are described and claimed.
-
Publication No.: US20240290339A1
Publication Date: 2024-08-29
Application No.: US18586187
Filing Date: 2024-02-23
IPC Classification: G10L21/0216, G10L25/30
CPC Classification: G10L21/0216, G10L25/30
Abstract: The present subject matter provides a method for de-noising an audio visual speech. The method includes modeling a noise in the audio visual speech using a noisy speech from audio data associated with the audio visual speech to generate a reconstructed noise signal. The method includes estimating the reconstructed noise signal in the audio visual speech using an audio signal and a plurality of visual frames. The method includes partitioning the reconstructed noise signal into a plurality of windows and calculating an energy associated with each window. The method includes estimating a noise strength in each window by performing a softmax operation to obtain one or more refined audio features. The method includes fusing the one or more refined audio features and one or more visual features using the noise strength to generate an output that is passed through a decoder to obtain a de-noised audio visual speech.
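Illustrative sketch (not the patented method): the windowing and softmax steps in the abstract can be pictured as follows. The Python below assumes non-overlapping windows, energy defined as the sum of squared samples, and a softmax taken across the window energies; the abstract states none of these details, and the function names are hypothetical.

```python
import math


def window_energies(signal, window_size):
    """Split the signal into non-overlapping windows and return the
    energy (sum of squared samples) of each. The last window may be shorter."""
    return [
        sum(x * x for x in signal[i:i + window_size])
        for i in range(0, len(signal), window_size)
    ]


def softmax(values):
    """Numerically stable softmax: subtract the maximum before exponentiating."""
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def noise_strength(noise, window_size):
    """Relative noise strength per window, normalized to sum to 1."""
    return softmax(window_energies(noise, window_size))


# A reconstructed noise signal that is silent in the first window and loud in the second.
noise = [0.0] * 4 + [1.0] * 4
energies = window_energies(noise, 4)
strengths = noise_strength(noise, 4)
```

The resulting weights sum to 1 and are larger for the noisier window, so they could serve as per-window weights when fusing audio and visual features.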
-
Publication No.: US12027171B2
Publication Date: 2024-07-02
Application No.: US17402991
Filing Date: 2021-08-16
Applicant: 105 Publishing LLC
IPC Classification: G10L15/26, G06F16/635, G06F16/68, G06F16/683, G06V30/32, G10L21/0216, G06V30/10
CPC Classification: G10L15/26, G06F16/635, G06F16/685, G06F16/686, G06V30/32, G10L21/0216, G06V30/10
Abstract: As an example, a server may receive, from a computing device, a submission created by an author. The submission includes book data associated with a book and author data associated with the author. The author data includes incarceration data indicating whether the author was incarcerated. The server may determine, based on the author data and the book data, that the submission is publishable. The server may create, based on the book data, a printable book, an e-book, and an audio book and make one or more of the printable book, the e-book, and the audio book available for acquisition.
-
Publication No.: US20240205631A1
Publication Date: 2024-06-20
Application No.: US18590112
Filing Date: 2024-02-28
IPC Classification: H04S7/00, G10L19/008, G10L21/0216
CPC Classification: H04S7/303, G10L19/008, G10L21/0216, G10L2021/02166, H04S2400/11
Abstract: An apparatus configured to: identify a POI in an audio scene, wherein one or more input audio signals represent the audio scene and at least one further input audio signal represents at least part of the audio scene, wherein the POI comprises a portion of the audio scene to be replaced during rendering using reconstructed audio parameters; generate, from the at least one further input audio signal, one or more complementary audio parameters that represent the POI in the audio scene; process the one or more input audio signals to obtain input audio parameters so as to enable replacement of the POI using the one or more complementary audio parameters; and combine the one or more complementary audio parameters with the input audio parameters, to create reconstructed audio parameters, so as to replace the POI in the audio scene at least partially using the one or more complementary audio parameters.
-
Publication No.: US20240144949A1
Publication Date: 2024-05-02
Application No.: US18493555
Filing Date: 2023-10-24
Inventors: Xiao Yang, Ahmed Kamal Atwa Mohamed, Charles Ye, Nikita Bhalla, Shashank Jain, Mahek Parvez Hooda, Gagan Aneja, Stanislav Peshterliev, Pranab Mohanty, Gerald Eugene McAlister, Gautam Venkatesan, Ju Lin, Ruiming Xie, Niko Moritz, Frank Torsten Bernd Seide
IPC Classification: G10L21/0216, G06F40/58, G10L17/02, G10L17/04, G10L17/14, H04R3/00, H04R5/027, H04S3/00, H04S7/00
CPC Classification: G10L21/0216, G06F40/58, G10L17/02, G10L17/04, G10L17/14, H04R3/005, H04R5/027, H04S3/008, H04S7/302, G10L2021/02087, H04R2499/15, H04S2400/01, H04S2400/11, H04S2400/15
Abstract: In one embodiment, an AR/VR system includes a social-networking application installed on the AR/VR system, which allows a user to access an online social network, including communicating with the user's social connections and interacting with content objects on the online social network. The AR/VR system also includes an AR/VR application, which allows the user to interact with an AR/VR platform by providing user input to the AR/VR application via various modalities. Based on the user input, the AR/VR platform generates responses and sends the generated responses to the AR/VR application, which then presents the responses to the user at the AR/VR system via various modalities.
-
Publication No.: US11961532B2
Publication Date: 2024-04-16
Application No.: US18106251
Filing Date: 2023-02-06
Applicant: Google LLC
Inventors: Steve Rui, Govind Kannan, Trausti Thormundsson
IPC Classification: G10L21/0216, G10L21/0232, G10L25/84, H04R1/10, H04R1/40, H04R3/00
CPC Classification: G10L21/0216, G10L25/84, H04R1/1083, G10L2021/02166, H04R2460/13
Abstract: Systems and methods for enhancing a headset user's own voice include at least two outside microphones, an inside microphone, audio input components operable to receive and process the microphone signals, a voice activity detector operable to detect speech presence and absence in the received and/or processed signals, and a cross-over module configured to generate an enhanced voice signal. The audio processing components include a low frequency branch comprising low pass filter banks, a low frequency spatial filter, a low frequency spectral filter and an equalizer, and a high frequency branch comprising high pass filter banks, a high frequency spatial filter, and a high frequency spectral filter.
-
Publication No.: US20240119935A1
Publication Date: 2024-04-11
Application No.: US18539838
Filing Date: 2023-12-14
Inventors: Timothy J. Receveur, Dan R. Tallent, Richard J. Schuman, Eric D. Agdeppa, John S. Schroder, Catherine Infantolino
CPC Classification: G10L15/22, G10L17/22, G10L21/0216, G16H40/67, H04R3/005, H04R5/02, G10L2015/223, G10L2021/02166, H04R2203/12
Abstract: Systems for voice control of medical devices in a healthcare facility are disclosed herein. The systems employ continuous speech processing software, voice recognition software, natural language processing software, and other software to permit voice control of the medical devices. Systems are also provided for distinguishing which medical device from among multiple medical devices in a patient room is the particular medical device to be controlled by voice input from a caregiver or a patient.