-
Publication No.: WO2021196905A1
Publication date: 2021-10-07
Application No.: PCT/CN2021/076465
Application date: 2021-02-10
Applicant: 腾讯科技(深圳)有限公司
IPC: G10L21/0208 , G10L21/02 , G10L25/12 , G10L25/18 , G10L2021/02082 , G10L21/0324
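Abstract: A speech signal dereverberation method, processing apparatus, computer device, and readable storage medium. The method includes: obtaining an original speech signal (S502); extracting magnitude spectrum features and phase spectrum features of the current frame of the original speech signal (S504); extracting a subband magnitude spectrum from the magnitude spectrum features, inputting the subband magnitude spectrum into a first reverberation predictor, and outputting a reverberation strength indicator for the current frame (S506); using a second reverberation predictor to determine a clean speech subband spectrum of the current frame from the subband magnitude spectrum and the reverberation strength indicator (S508); and performing signal conversion on the clean speech subband spectrum and the phase spectrum features to obtain a dereverberated clean speech signal (S510).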
-
Publication No.: WO2023273747A1
Publication date: 2023-01-05
Application No.: PCT/CN2022/095732
Application date: 2022-05-27
Applicant: 青岛海尔科技有限公司 , 海尔智家股份有限公司
Inventor: 郝斌
IPC: G10L21/0208 , G10L15/22 , G10L25/54 , G10L2015/223 , G10L2021/02082
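Abstract: A wake-up method and apparatus for a smart device, a storage medium, and an electronic apparatus. The method includes: obtaining, from multiple smart devices, the smart devices that are allowed to be woken by a wake-up signal as candidate devices; when there are multiple candidate devices, determining a target wake-up angle and a target wake-up energy for each candidate device; and determining a target device from the candidate devices according to the target wake-up angles and target wake-up energies, where the target device responds to the wake-up signal. This addresses problems in the related art such as low accuracy in determining which smart device responds to a wake-up instruction.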
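The abstract does not say how the wake-up angle and wake-up energy are combined, so the following is only a minimal sketch under stated assumptions: each candidate is scored by a weighted sum of a normalized angle term and a normalized energy term, and the highest score wins. The names `Candidate`, `pick_target`, and `angle_weight`, and the scoring rule itself, are hypothetical and not taken from the application.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    wake_angle_deg: float  # assumed: angle between the talker direction and the device's front
    wake_energy: float     # assumed: energy of the wake-up signal as received by the device


def pick_target(candidates, angle_weight=0.5):
    """Pick one device among several candidates (hypothetical scoring rule)."""
    if not candidates:
        raise ValueError("no candidate devices")
    # Normalize energies by the largest one; fall back to 1.0 if all are zero.
    max_energy = max(c.wake_energy for c in candidates) or 1.0

    def score(c):
        # Smaller angle is better: 0 degrees maps to 1.0, 180 degrees maps to 0.0.
        angle_term = 1.0 - min(abs(c.wake_angle_deg), 180.0) / 180.0
        energy_term = c.wake_energy / max_energy
        return angle_weight * angle_term + (1.0 - angle_weight) * energy_term

    return max(candidates, key=score)
```

With a single candidate the function simply returns it, which matches the abstract's condition that the comparison only matters when several candidates exist.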
-
Publication No.: WO2021252116A1
Publication date: 2021-12-16
Application No.: PCT/US2021/031461
Application date: 2021-05-09
Applicant: FACEBOOK TECHNOLOGIES, LLC
Inventor: BINGHAM, Joshua Warren , ZENG, Yuhuan , IVANOV, Plamen Alexandrov , BACK, Tyler , EVANS, Christopher , NILSSON, Jens , ASFAW, Michael
IPC: H04M3/56 , H04M9/08 , H04R3/00 , G10K11/17854 , G10L2021/02082 , G10L2021/02166 , G10L21/0208 , H04M2203/509 , H04M3/568 , H04M9/082 , H04R1/403 , H04R1/406 , H04R2201/401 , H04R2499/15 , H04S7/303
Abstract: An electronic device includes a microphone array to capture audio input data, a speaker array to render audio output data for playback, one or more sensors to detect an orientation of the microphone array, acoustic echo cancellation logic, and an interface. The acoustic echo cancellation logic applies acoustic echo cancellation to the audio input data to form echo-cancelled audio input data based on the orientation of the microphone array. The interface transmits the echo-cancelled audio input data over a communications channel as part of an audiovisual communication system.
-
Publication No.: WO2023059402A1
Publication date: 2023-04-13
Application No.: PCT/US2022/040979
Application date: 2022-08-22
Applicant: MICROSOFT TECHNOLOGY LICENSING, LLC.
Inventor: ESKIMEZ, Sefik Emre , YOSHIOKA, Takuya , WANG, Huaming , TAHERIAN, Hassan , CHEN, Zhuo , HUANG, Xuedong
IPC: G10L21/0208 , G10L21/0272 , G10L2021/02082 , G10L2021/02087
Abstract: Examples of array geometry agnostic multi-channel personalized speech enhancement (PSE) extract speaker embeddings, which represent acoustic characteristics of one or more target speakers, from target speaker enrollment data. Spatial features (e.g., inter-channel phase difference) are extracted from input audio captured by a microphone array. The input audio includes a mixture of speech data of the target speaker(s) and one or more interfering speaker(s). The input audio, the extracted speaker embeddings, and the extracted spatial features are provided to a trained geometry-agnostic PSE model. Output data is produced, which comprises estimated clean speech data of the target speaker(s) that has a reduction (or elimination) of speech data of the interfering speaker(s), without the trained PSE model requiring geometry information for the microphone array.
-
Publication No.: WO2021252039A1
Publication date: 2021-12-16
Application No.: PCT/US2021/022008
Application date: 2021-03-11
Applicant: GOOGLE LLC
Inventor: WANG, Quan
IPC: G10L21/0208 , G10L25/30 , G10L13/02 , G10L21/0216 , G10L21/0264 , G10L13/00 , G10L15/063 , G10L2021/02082 , G10L25/93
Abstract: A method (400) includes receiving an overlapped audio signal (202) that includes audio spoken by a speaker (10) that overlaps a segment (156) of synthesized playback audio (154). The method also includes encoding a sequence of characters that correspond to the synthesized playback audio into a text embedding representation (212). For each character in the sequence of characters, the method also includes generating a respective cancelation probability (222) using the text embedding representation. The cancelation probability indicates a likelihood that the corresponding character is associated with the segment of the synthesized playback audio overlapped by the audio spoken by the speaker in the overlapped audio signal.
-
Publication No.: WO2021244826A1
Publication date: 2021-12-09
Application No.: PCT/EP2021/062372
Application date: 2021-05-10
Applicant: RENAULT S.A.S
Inventor: MARTIN, Hervé , MENDES-CARVALHO, Jose
IPC: G10L15/08 , G10L15/25 , G10L25/78 , H03G7/00 , H04S7/00 , G06F3/01 , G10L21/0208 , G06K9/00 , G06F3/012 , G06V40/20 , G10L2021/02082 , H03G3/3005 , H03G3/32 , H03G3/342 , H04S7/305
Abstract: The invention relates to a method for controlling the sound volume generated by a loudspeaker (HP) in a cabin, characterized in that it comprises: a first step of acquiring the sound (Sb) in the cabin; a second step of filtering, by cancellation, the sound (Sg) generated by the loudspeaker (HP) out of the sound (Sb) acquired in the first step; a third step of classifying the cabin sound situation (Css) from the sound (Sn) filtered in the second step; a fourth step of determining a visual communication intensity (Icv) in the cabin; and a fifth step of controlling the sound volume generated by the loudspeaker (HP) as a function of the sound situation (Css) classified in the third step and the visual communication intensity (Icv) determined in the fourth step.
-
Publication No.: WO2021190274A1
Publication date: 2021-09-30
Application No.: PCT/CN2021/079181
Application date: 2021-03-05
Applicant: 紫光展锐(重庆)科技有限公司
Inventor: 叶顺舟
IPC: H04M7/00 , G10L21/0208 , G10L2021/02082 , H04M7/006
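Abstract: A method and apparatus for determining an echo sound field state, a storage medium, and a terminal. The method includes: obtaining a signal to be determined; determining a far-end signal X_n(k), a near-end signal D_n(k), and filter coefficients W_n(k) of the signal to be determined; determining a filter update degree Cef_update at least from the far-end signal X_n(k), the near-end signal D_n(k), and the filter coefficients W_n(k); and determining whether the echo sound field state of the signal to be determined is an echo path change state, at least according to whether the filter update degree Cef_update is greater than a preset update degree threshold Thrd_update. The invention can effectively improve the accuracy of judging the echo path change state, makes it possible to use more parameters to judge more echo sound field states, achieves multi-feature detection more effectively, and improves the completeness of echo sound field state judgment.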
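The abstract names the inputs of the filter update degree (far-end signal, near-end signal, filter coefficients) but not its formula. As a rough illustration only, the sketch below assumes a one-tap-per-bin frequency-domain filter and defines the update degree (written `Cef_update` here) as the energy of one NLMS-style coefficient step relative to the current filter energy. That definition, the function names, and the default values of `mu` and `threshold` are assumptions, not the application's method.

```python
def filter_update_degree(X, D, W, mu=0.5, eps=1e-12):
    """Hypothetical Cef_update for one frame n.

    X, D, W are per-bin sequences of complex values for frame n:
    far-end spectrum, near-end spectrum, and filter coefficients.
    """
    step_energy = 0.0
    filter_energy = 0.0
    for x, d, w in zip(X, D, W):
        e = d - w * x  # residual after subtracting the estimated echo
        dw = mu * x.conjugate() * e / (abs(x) ** 2 + eps)  # NLMS-style step for this bin
        step_energy += abs(dw) ** 2
        filter_energy += abs(w) ** 2
    return step_energy / (filter_energy + eps)


def is_echo_path_change(X, D, W, threshold=0.1):
    """Flag an echo path change when the update degree exceeds the threshold (Thrd_update)."""
    return filter_update_degree(X, D, W) > threshold
```

When the filter already models the echo path, the residual and therefore the step are close to zero; a sudden path change makes the step large relative to the filter, which is the intuition behind comparing against a threshold. A practical detector would likely smooth this value over frames and combine it with other features, as the abstract suggests.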
-
Publication No.: WO2021119214A2
Publication date: 2021-06-17
Application No.: PCT/US2020/064135
Application date: 2020-12-09
Applicant: DOLBY LABORATORIES LICENSING CORPORATION
Inventor: PORT, Timothy Alan , TEMPLETON, Daniel Steven , HAYS, Jack Gregory
IPC: H04S7/00 , G10L21/0208 , H03G3/32 , G10L21/0216 , H04R3/04 , H04R3/02 , H04R1/10 , H03G5/02 , H03G5/16 , H03G9/00 , H03G9/02 , H04M9/08 , H04R3/00 , G10L2021/02082 , H03G9/005 , H03G9/025 , H04M9/082 , H04R1/1083 , H04R2227/001 , H04R2430/01 , H04R2430/03 , H04R3/007 , H04S2400/13 , H04S7/30 , H04S7/301 , H04S7/302 , H04S7/305
Abstract: Some implementations involve receiving a content stream that includes audio data, determining a content type corresponding to the content stream and determining, based at least in part on the content type, a noise compensation method. Some examples involve performing the noise compensation method on the audio data to produce noise-compensated audio data, rendering the noise-compensated audio data for reproduction via a set of audio reproduction transducers of the audio environment, to produce rendered audio signals, and providing the rendered audio signals to at least some audio reproduction transducers of the audio environment.
-
Publication No.: WO2020123835A1
Publication date: 2020-06-18
Application No.: PCT/US2019/066025
Application date: 2019-12-12
Applicant: QUALCOMM INCORPORATED
Inventor: KOSTIC, Andrew , CHOY, Eddie , RAMAKRISHNAN, Dinesh
IPC: G10L21/0208 , H04B3/23 , G10L2021/02082 , G10L2021/02166 , G10L21/028 , G10L21/0364 , H04M9/082
Abstract: Methods, systems, computer-readable media, and apparatuses for acoustic echo cancellation during playback of encoded audio are presented. In some embodiments, a decoder is arranged to decode an encoded media signal to produce an echo reference signal, and an echo canceller is arranged to perform an acoustic echo cancellation operation, based on the echo reference signal, on an input voice signal to produce an echo-cancelled voice signal. The echo canceller may be configured to reduce, relative to an energy of a voice component of the input voice signal, an energy of a signal component of the input voice signal that is based on audio content from the encoded media signal.
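The abstract does not specify which adaptive algorithm the echo canceller uses. As a minimal sketch of the general idea of cancelling echo with a reference signal, the code below uses a standard normalized LMS (NLMS) adaptive filter, chosen here purely for illustration. It assumes `ref` is the already-decoded playback signal (the echo reference) and `mic` is the microphone signal, both as lists of float samples; the function name and parameter defaults are hypothetical.

```python
def nlms_echo_cancel(mic, ref, taps=8, mu=0.5, eps=1e-8):
    """Return the echo-cancelled signal using a time-domain NLMS filter."""
    w = [0.0] * taps    # adaptive estimate of the echo path
    buf = [0.0] * taps  # most recent reference samples, newest first
    out = []
    for d, x in zip(mic, ref):
        buf = [x] + buf[:-1]
        y = sum(wi * bi for wi, bi in zip(w, buf))  # estimated echo
        e = d - y                                   # echo-cancelled sample
        norm = sum(b * b for b in buf) + eps        # reference energy for normalization
        w = [wi + mu * e * bi / norm for wi, bi in zip(w, buf)]
        out.append(e)
    return out
```

When the microphone signal also contains near-end speech, the output `e` retains that speech while the component correlated with `ref` is reduced. Real systems usually add double-talk handling so that near-end speech does not disturb the adaptation; that is omitted here.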
-
Publication No.: WO2023059655A1
Publication date: 2023-04-13
Application No.: PCT/US2022/045694
Application date: 2022-10-04
Applicant: SHURE ACQUISITION HOLDINGS, INC.
Inventor: CANFIELD, Gregory H.
IPC: H04R1/40 , H04R3/00 , H04M9/08 , H04R3/02 , H04R27/00 , G10L21/0208 , G06F3/16 , G10L2021/02082 , G10L2021/02166 , G10L21/0316 , G10L25/18 , H04M2203/509 , H04M3/568 , H04M9/082 , H04R1/406 , H04R2227/001 , H04R2227/007 , H04R2227/009 , H04R2420/01 , H04R3/005 , H04R3/04 , H04R5/04
Abstract: Systems and methods are disclosed for networked audio automixing using array microphones and an aggregator unit that participate in making a common gating decision to determine which channels to gate on and off. Through the use of such a network of array microphones having the capability to generate submix audio signals and reduced bandwidth metrics, as well as AEC processing capability, array microphone lobe selection can be enhanced while maximizing signal-to-noise ratio, increasing intelligibility, and increasing user satisfaction.