-
Publication No.: US11114107B2
Publication date: 2021-09-07
Application No.: US16593830
Filing date: 2019-10-04
IPC classification: G10L19/00, G10L21/00, G10L19/20, G10L19/008, G10L25/18, H04S3/00, G10L19/02, G10L19/16
Abstract: A method for decoding an encoded audio bitstream in an audio processing system is disclosed. The method includes extracting from the encoded audio bitstream a first waveform-coded signal comprising spectral coefficients corresponding to frequencies up to a first cross-over frequency for a time frame and performing parametric decoding at a second cross-over frequency for the time frame to generate a reconstructed signal. The second cross-over frequency is above the first cross-over frequency and the parametric decoding uses reconstruction parameters derived from the encoded audio bitstream to generate the reconstructed signal. The method also includes extracting from the encoded audio bitstream a second waveform-coded signal comprising spectral coefficients corresponding to a subset of frequencies above the first cross-over frequency for the time frame and interleaving the second waveform-coded signal with the reconstructed signal to produce an interleaved signal for the time frame.
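The interleaving step in this abstract can be pictured with a minimal sketch. It is an illustration under stated assumptions, not the patented decoder: the function name `interleave`, the per-bin list layout, and the choice to let waveform-coded bins replace (rather than add to) the reconstructed bins are all hypothetical, and the second cross-over frequency is not modelled.

```python
def interleave(low_band, reconstructed, high_band_subset):
    """Combine one time frame of spectral coefficients, bin by bin.

    low_band:         waveform-coded coefficients for bins below the first
                      cross-over frequency (list, one value per bin)
    reconstructed:    parametrically reconstructed coefficients for every bin
    high_band_subset: waveform-coded coefficients for a subset of bins above
                      the first cross-over (dict: bin index -> coefficient)

    Assumption: a waveform-coded bin replaces the reconstructed bin.
    """
    first_crossover = len(low_band)
    out = []
    for k, recon in enumerate(reconstructed):
        if k < first_crossover:
            out.append(low_band[k])                      # first waveform-coded signal
        else:
            out.append(high_band_subset.get(k, recon))   # interleaved high band
    return out


low = [0.9, 0.7, 0.5]
recon = [0.0, 0.0, 0.0, 0.2, 0.2, 0.1, 0.1]
high = {4: 0.6}
frame = interleave(low, recon, high)
print(frame)
```

Only bin 4 of the high band carries a waveform-coded value here, so every other bin above the first cross-over keeps its reconstructed coefficient.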
-
Publication No.: US20210272576A1
Publication date: 2021-09-02
Application No.: US17255191
Filing date: 2019-06-20
Applicant: Sony Corporation
Inventor(s): Mitsuyuki Hatanaka, Toru Chinen, Minoru Tsuji, Hiroyuki Honma, Yuki Yamamoto
IPC classification: G10L19/035, G10L21/00
Abstract: The present technology relates to an information processing device and method, and a program capable of reducing a code amount.
The information processing device includes: an acquisition unit that acquires space information regarding a position and a size of a child space within a parent space and position information in the child space indicating a position of an object within the child space, the child space being included in the parent space, and the object being included in the child space; and a calculation unit that calculates position information in the parent space indicating a position of the object within the parent space on the basis of the space information and the position information in the child space. The present technology can be applied to a signal processing device.
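The calculation described in this abstract (parent-space position from space information plus child-space position) can be sketched as follows. This is a minimal illustration, assuming an axis-aligned child space and child-space coordinates normalized to [0, 1] on each axis; the names `ChildSpace`, `origin`, `size`, and `to_parent` are hypothetical and do not come from the patent.

```python
from dataclasses import dataclass
from typing import Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class ChildSpace:
    # Space information: position and size of the child space in the parent.
    origin: Vec3  # hypothetical: corner of the child space, in parent coordinates
    size: Vec3    # hypothetical: extent of the child space along each axis


def to_parent(space: ChildSpace, pos_in_child: Vec3) -> Vec3:
    """Map a normalized child-space position to parent-space coordinates."""
    return tuple(
        o + s * p for o, s, p in zip(space.origin, space.size, pos_in_child)
    )


space = ChildSpace(origin=(10.0, 0.0, -5.0), size=(4.0, 2.0, 8.0))
print(to_parent(space, (0.5, 0.25, 1.0)))
```

Under these assumptions, only the small normalized offset has to be stored per object, while the origin and size are stored once per child space.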
-
Publication No.: US11051115B2
Publication date: 2021-06-29
Application No.: US17003039
Filing date: 2020-08-26
Applicant: Olga Sheymov
Inventor(s): Victor Sheymov
IPC classification: G10L21/00, H04R25/00, G10L21/003
Abstract: A method and system for improving the quality of audio communications as perceived by humans include audio signal spectrum frequency shift for enhancement of speech recognition by human customers, including mitigation of common age-related hearing loss at high audio frequencies.
-
Publication No.: US11031028B2
Publication date: 2021-06-08
Application No.: US16326956
Filing date: 2017-06-01
Applicant: Sony Corporation
Inventor(s): Keiichi Osako, Yuhki Mitsufuji, Kohei Asada
IPC classification: G10L21/00, G10L21/028, G10L25/30, G06N20/00, G06F17/16, G06N3/08, G10L21/0308
Abstract: [Object] To provide a sound source separation technology capable of improving the separation performance.
[Solution] An information processing apparatus including: an acquisition section configured to acquire an observation signal obtained by observing a sound; and a sound source separation section configured to separate the observation signal acquired by the acquisition section into a plurality of separated signals corresponding to a plurality of assumed sound sources by applying a non-linear function to a matrix product of an input vector and a coefficient vector corresponding to each of the plurality of sound sources.
-
Publication No.: US11031026B2
Publication date: 2021-06-08
Application No.: US16219620
Filing date: 2018-12-13
Inventor(s): Andrew Kostic, Eddie Choy, Dinesh Ramakrishnan
IPC classification: G10L21/00, G10L25/84, G10L25/78, G10L21/0208, G10L21/028, G10L21/0364, G10L21/0216
Abstract: Methods, systems, computer-readable media, and apparatuses for acoustic echo cancellation during playback of encoded audio are presented. In some embodiments, a decoder is arranged to decode an encoded media signal to produce an echo reference signal, and an echo canceler is arranged to perform an acoustic echo cancellation operation, based on the echo reference signal, on an input voice signal to produce an echo-cancelled voice signal. The echo canceler may be configured to reduce, relative to an energy of a voice component of the input voice signal, an energy of a signal component of the input voice signal that is based on audio content from the encoded media signal.
-
Publication No.: US11027962B2
Publication date: 2021-06-08
Application No.: US16603333
Filing date: 2017-12-01
Applicant: LG ELECTRONICS INC.
Inventor(s): Gyuhyeok Jeong
Abstract: The present invention relates to a beverage supply apparatus comprising a sensing unit, a microphone, an artificial intelligence unit and a control unit. The control unit: activates the microphone when an object is detected; recognizes first sound data of a user when the first sound data is sensed via the activated microphone; acquires information associated with the user on the basis of the first sound data; stores the acquired information associated with the user; and determines a menu corresponding to the object on the basis of the information associated with the user and the first sound data.
-
Publication No.: US11024324B2
Publication date: 2021-06-01
Application No.: US16628679
Filing date: 2018-08-22
Inventor(s): Yuanxun Kang
IPC classification: G10L21/00, G10L15/22, G10L21/0216, G06N3/08, G10L25/18, G10L25/30, G10L25/45, G10L25/60
Abstract: Disclosed herein is a method for RNN-based noise reduction in a real-time conference, comprising: framing and windowing a speech signal to obtain a logarithmic spectrum of the speech signal, feeding the logarithmic spectrum into an RNN model to determine a noise reduction suppression coefficient, and then obtaining the denoised speech signal by applying the noise reduction suppression coefficient to the logarithmic spectrum of the original signal, thereby making the RNN noise reduction method usable in real-time conferences. In the present disclosure, only the logarithmic spectrum of the current frame needs to be input to the RNN model for estimation. The RNN model of the present disclosure places few requirements on its input and needs no heavy preprocessing of the received speech signal, which in turn reduces the computation burden, increases response speed, and enhances real-time performance.
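The per-frame pipeline in this abstract (window a frame, take its logarithmic spectrum, obtain a suppression coefficient, apply it) can be sketched roughly as below. This is a sketch under assumptions, not the patented method: `placeholder_model` is a hypothetical stand-in for the trained RNN, the Hann window and log-power spectrum are assumed choices, and the coefficient is assumed to be a per-bin amplitude gain, so applying it in the log-power domain means adding `2 * log(gain)`.

```python
import cmath
import math


def log_spectrum(frame):
    """Hann-window one frame and return its log-power spectrum (naive DFT)."""
    n = len(frame)
    windowed = [
        s * (0.5 - 0.5 * math.cos(2 * math.pi * i / n)) for i, s in enumerate(frame)
    ]
    spec = []
    for k in range(n // 2 + 1):
        acc = sum(
            windowed[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)
        )
        spec.append(math.log(abs(acc) ** 2 + 1e-12))  # small floor avoids log(0)
    return spec


def placeholder_model(log_spec):
    """Hypothetical stand-in for the RNN: one gain in (0, 1] per bin.

    Like the model in the abstract, it sees only the current frame.
    """
    peak = max(log_spec)
    return [1.0 if v > peak - 6.0 else 0.1 for v in log_spec]


def apply_suppression(log_spec, gains):
    """Scale each bin's amplitude by its gain, expressed in the log-power domain."""
    return [v + 2.0 * math.log(g) for v, g in zip(log_spec, gains)]


# One 32-sample frame holding a pure tone at DFT bin 4.
frame = [math.sin(2 * math.pi * 4 * t / 32) for t in range(32)]
spec = log_spectrum(frame)
gains = placeholder_model(spec)
denoised = apply_suppression(spec, gains)
```

A real system would replace the naive DFT with an FFT and resynthesize audio from the modified spectrum and the original phase; both steps are omitted here.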
-
Publication No.: US11024306B2
Publication date: 2021-06-01
Application No.: US16131453
Filing date: 2018-09-14
Applicant: Google LLC
Inventor(s): Gaurav Bhaya, Ulas Kirazci, Bradley Abrams, Adam Coimbra, Ilya Firman, Carey Radebaugh
Abstract: The present disclosure is generally directed to the generation of voice-activated data flows in an interconnected network. The voice-activated data flows can include input audio signals that include a request and are detected at a client device. The client device can transmit the input audio signal to a data processing system, where the input audio signal can be parsed and passed to the data processing system of a service provider to fulfill the request in the input audio signal. The present solution is configured to conserve network resources by reducing the number of network transmissions needed to fulfill a request.
-
Publication No.: US10991377B2
Publication date: 2021-04-27
Application No.: US16411311
Filing date: 2019-05-14
IPC classification: G10L21/00, G10L21/0208, G10L15/20, G06F17/14, G10L25/84, G10L21/028
Abstract: A mechanism to adjust far-end signal loudness based on environmental noise levels and device speaker characteristics has a noise-level analyzer that receives feedback from an intelligent speaker-boosting logic circuit that provides a signal to a class-D amplifier to drive the speaker. The noise-level analyzer analyzes near-end environmental noise levels and far-end speech input signal levels across critical frequency bands. The noise-level analyzer performs a masking analysis of the far-end and near-end signals, and guides the speaker-boosting logic circuit to apply determined signal boosting levels over selective bands. The speaker-boosting logic circuit monitors system activity along with the selective band boosting guidance from the noise-level analyzer. Using device speaker information and the speaker excursion pattern, the speaker-boosting logic circuit adjusts far-end speech signal loudness without over-excursion of the speaker and damage to the speaker hardware.
-
Publication No.: US10943593B2
Publication date: 2021-03-09
Application No.: US16542440
Filing date: 2019-08-16
IPC classification: G10L21/00, G10L19/00, G10L21/04, G10L19/087, G10L25/72, G10L19/24, G10L21/038
Abstract: A method and device are provided for determining an optimized scale factor to be applied to an excitation signal or a filter during a process for frequency band extension of an audio frequency signal. The band extension process includes decoding or extracting, in a first frequency band, an excitation signal and parameters of the first frequency band including coefficients of a linear prediction filter, generating an excitation signal extending over at least one second frequency band, and filtering using a linear prediction filter for the second frequency band. The determination method includes determining an additional linear prediction filter, of a lower order than that of the linear prediction filter of the first frequency band, the coefficients of the additional filter being obtained from the parameters decoded or extracted from the first frequency band, and calculating the optimized scale factor as a function of at least the coefficients of the additional filter.
-