-
Publication No.: WO2023024501A1
Publication date: 2023-03-02
Application No.: PCT/CN2022/082305
Filing date: 2022-03-22
Applicant: 北京百度网讯科技有限公司
IPC: G10H1/02 , G10L21/0272 , G10L21/0308 , G10L13/047 , G10L25/24 , G10L25/30 , G10L25/09 , G10L25/93
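Abstract: An audio data processing method (100), relating to the technical field of speech synthesis, comprising: decomposing original audio data to obtain vocal audio data and background audio data (S110); applying electronic-sound processing to the vocal audio data to obtain electronic-sound vocal data (S120); and synthesizing the electronic-sound vocal data and the background audio data to obtain target audio data (S130). Also disclosed are an audio data processing apparatus (500), an electronic device (600), and a storage medium (608).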
-
Publication No.: WO2022034139A1
Publication date: 2022-02-17
Application No.: PCT/EP2021/072384
Filing date: 2021-08-11
Applicant: DOLBY INTERNATIONAL AB
Inventor: YEH, Chunghsin , CENGARLE, Giulio , DE BURGH, Mark David
IPC: G10L15/04 , G10L21/0264 , G10L21/034 , G10L25/93 , G10L21/0308 , G10L25/09 , G10L25/21 , G10L25/24 , G10L25/84 , G10L21/0316
Abstract: Described is a method of performing automatic audio enhancement on an input audio signal including at least one speech-articulation noise event. The method comprises: segmenting the input audio signal into a number of audio frames; obtaining at least one feature parameter from the audio frames; and determining, based at least in part on the obtained feature parameter, a respective type of the speech-articulation noise event and a respective time-frequency range associated with the speech-articulation noise event within the input audio signal.
-
Publication No.: WO2021234873A1
Publication date: 2021-11-25
Application No.: PCT/JP2020/019997
Filing date: 2020-05-20
Applicant: 日本電信電話株式会社
IPC: G10L21/0308
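Abstract: A sound source separation model training device comprising: a training data acquisition unit that acquires a spectrogram of a mixed signal in which multiple sounds are mixed, together with dominant-source information indicating, for each time-frequency point of the spectrogram, whether a target sound source is dominant; a weight estimation unit that estimates weights used to estimate a convolution with templates, each template being information representing one or more values relating to the spectrogram at the time-frequency points belonging to one section of the spectrogram partitioned along the time axis; a dominant-source information estimation unit that obtains an estimate of the dominant-source information based on the convolution; and a loss acquisition unit that obtains the difference between the estimate and the dominant-source information, wherein the weight estimation unit trains a machine learning model that estimates the weights so as to reduce the difference.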
-
Publication No.: WO2021193093A1
Publication date: 2021-09-30
Application No.: PCT/JP2021/009764
Filing date: 2021-03-11
Applicant: ソニーグループ株式会社
Inventor: 廣江 厚夫
IPC: G10L21/0308 , G10L25/30 , G10L25/78
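Abstract: A signal processing device having: a reference signal generation unit that receives a mixed sound signal, recorded by microphones placed at different positions and containing a target sound mixed with sounds other than the target sound, and generates a reference signal corresponding to the target sound based on the mixed sound signal; and a sound source extraction unit that extracts, from the mixed sound signal, a signal that is similar to the reference signal and in which the target sound is further emphasized.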
-
Publication No.: WO2021058856A1
Publication date: 2021-04-01
Application No.: PCT/FI2020/050592
Filing date: 2020-09-16
Applicant: NOKIA TECHNOLOGIES OY
Inventor: LAITINEN, Mikko-Ville , RÄMÖ, Anssi
IPC: G10L19/008 , G10L19/20 , G10L21/028 , G10L21/0308 , G10L25/78 , H04S7/00
Abstract: An apparatus comprising means for: receiving multi-channel audio signals (110); identifying (132) at least one audio signal to separate from the multi-channel audio signals (110); separating (133), based on the identified at least one audio signal, the multiple audio signals into at least a first sub-set (111) of audio signals and a second sub-set (112) of audio signals, wherein the first sub-set (111) comprises the identified at least one audio signal and the second sub-set (112) comprises the remaining audio signals of the received multi-channel audio signals (110); analyzing (152) the remaining audio signals of the second sub-set (112) of audio signals to determine one or more transport audio signals (151) and metadata (153); and encoding (140, 154) the at least one audio signal, transport audio signal (151) and metadata (153).
-
Publication No.: WO2020084787A1
Publication date: 2020-04-30
Application No.: PCT/JP2018/039997
Filing date: 2018-10-26
Applicant: NEC CORPORATION
Inventor: NARISETTY Chaitanya Prasad , KOMATSU Tatsuya , KONDO Reishi
IPC: G10L21/0308
Abstract: A purpose of the present disclosure is to provide a source separation method, a non-transitory computer readable medium, and a source separation apparatus. The source separation apparatus (100) includes an input means (101) for inputting mixture data obtained by mixing a plurality of data; and a matrix decomposition means (102) for separating the input mixture data by estimating a mixing/unmixing matrix (1021), a basis matrix for each source (10221), an activations matrix for each source (10222) and a reliability vector for each source (10223), and a means for unmixing of input mixture data (103) using the estimated matrices from the matrix decomposition means to estimate the sources.
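As a rough illustration of the basis-matrix and activations-matrix idea mentioned in this abstract, the sketch below factorizes a non-negative matrix V (for example a magnitude spectrogram) into W (bases) and H (activations). It uses plain non-negative matrix factorization with the standard multiplicative updates, which is a simpler, swapped-in technique: it does not estimate the mixing/unmixing matrix or the reliability vectors the abstract describes. The function name `nmf`, its parameters, and the synthetic data are assumptions made for this example.

```python
import numpy as np

def nmf(V, rank, n_iter=200, eps=1e-9, seed=0):
    """Factorize non-negative V (freq x time) as W @ H using
    multiplicative updates that reduce the squared reconstruction error."""
    rng = np.random.default_rng(seed)
    W = rng.random((V.shape[0], rank)) + eps   # basis matrix (hypothetical init)
    H = rng.random((rank, V.shape[1])) + eps   # activations matrix
    for _ in range(n_iter):
        # Update activations, then bases; eps avoids division by zero.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Synthetic "spectrogram" built from two non-negative components.
rng = np.random.default_rng(1)
V = rng.random((20, 2)) @ rng.random((2, 30))

W, H = nmf(V, rank=2)
err = np.linalg.norm(V - W @ H)  # reconstruction error after fitting
```

Each column of W can be read as a spectral pattern and each row of H as how strongly that pattern is active over time; grouping components per source is left out of this sketch.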
-
Publication No.: WO2018174310A1
Publication date: 2018-09-27
Application No.: PCT/KR2017/003055
Filing date: 2017-03-22
Applicant: 삼성전자 주식회사
IPC: G10L15/20 , G10L21/0208 , G10L21/0308 , G10L21/0364 , G10L25/18 , G10L15/04
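Abstract: A voice signal processing method according to an embodiment of the present invention, for solving the above technical problem, comprises: acquiring a near-end noise signal and a near-end voice signal using at least one microphone; acquiring a far-end voice signal according to an incoming call; determining a noise control parameter and a voice signal modification parameter based on at least one of information on the near-end voice signal, information on the near-end noise signal, and information on the far-end voice signal; generating an anti-phase signal of the near-end noise signal based on the noise control parameter; modifying the far-end voice signal, based on the voice signal modification parameter, the near-end noise signal, the anti-phase signal, and an error signal, so that the intelligibility of the far-end voice signal is improved; and outputting the anti-phase signal and the modified far-end voice signal.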
-
Publication No.: WO2018125308A1
Publication date: 2018-07-05
Application No.: PCT/US2017/049926
Filing date: 2017-09-01
Applicant: GOOGLE LLC
Inventor: KLEIJN, Willem Bastiaan , LIM, Sze Chie
IPC: G10L21/028 , G10L21/0308
Abstract: A method includes: receiving time instants of audio signals generated by a set of microphones at a location; determining a distortion measure between frequency components of at least some of the received audio signals; determining a similarity measure for the frequency components using the determined distortion measure; and processing the audio signals based on the determined similarity measure.
-
Publication No.: WO2017061023A1
Publication date: 2017-04-13
Application No.: PCT/JP2015/078708
Filing date: 2015-10-09
Applicant: 株式会社日立製作所
IPC: G10L21/0308 , G10L21/0272 , H04R3/00
CPC classification number: G10L21/0272 , G10L21/0308 , H04R3/00
Abstract: 複数のデバイスが非同期で収録した音を入力とする場合であっても、各音源の音を分離する音声信号処理装置および方法を提供することにある。 複数のデバイスごとに、異なる周波数の参照信号を出力するよう指示し、前記指示に応じて、前記複数のデバイスのスピーカから出力された各参照信号を受信し、前記複数のデバイスのスピーカから出力された各参照信号が、前記複数のデバイスのマイクに入力された音声信号を受信し、前記受信した前記スピーカから出力された各参照信号と、前記受信した音声信号とから、前記デバイスごとの時間シフト量を算出し、前記算出された時間シフト量に基づいて、前記複数のデバイスのマイクに入力された複数の音声信号を分離する。
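Abstract translation: Provided are an audio signal processing device and method that separate the sound of each sound source even when the input consists of sounds recorded asynchronously by multiple devices. Each of the devices is instructed to output a reference signal of a different frequency; in response to the instruction, the reference signals output from the speakers of the devices are received; the audio signals in which those reference signals were captured by the microphones of the devices are received; a time shift amount is calculated for each device from the received reference signals and the received audio signals; and the audio signals input to the microphones of the devices are separated based on the calculated time shift amounts.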
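The per-device time shift described in this abstract can be pictured with a minimal sketch: given a known reference signal and a recording that contains it, the lag of the cross-correlation peak gives the offset in samples. The abstract does not state how the shift is computed, so plain cross-correlation is an assumption here, as are the names `estimate_shift` and `align` and the noise-like reference used in place of per-device tones.

```python
import numpy as np

def estimate_shift(reference, recorded):
    """Return the lag (in samples) at which `reference` best matches `recorded`."""
    corr = np.correlate(recorded, reference, mode="full")
    # Index 0 of `corr` corresponds to lag -(len(reference) - 1).
    return int(np.argmax(corr)) - (len(reference) - 1)

def align(recorded, shift):
    """Remove a non-negative shift so the reference starts at index 0."""
    return recorded[shift:]

# Hypothetical data: a device recording in which the reference starts 37 samples late.
rng = np.random.default_rng(0)
reference = rng.standard_normal(64)
recorded = np.zeros(200)
recorded[37:37 + 64] += reference

shift = estimate_shift(reference, recorded)
aligned = align(recorded, shift)
```

Once every device's recording has been shifted onto a common timeline in this way, the recordings can be handed to a multichannel separation method that assumes synchronized inputs.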
-
Publication No.: WO2017005978A1
Publication date: 2017-01-12
Application No.: PCT/FI2016/050494
Filing date: 2016-07-05
Applicant: NOKIA TECHNOLOGIES OY
Inventor: LAITINEN, Mikko-Ville , TAMMI, Mikko , VILERMO, Miikka
IPC: H04R1/40 , H04R3/00 , H04R5/027 , H04R23/02 , G01S3/808 , G10L19/008 , G10L21/0308 , G06F3/16
CPC classification number: H04R5/027 , H04R1/406 , H04R3/005 , H04S7/30 , H04S2400/15 , H04S2420/01
Abstract: Apparatus comprising: an audio capture application configured to determine separate microphones from a plurality of microphones and identify a sound source direction of at least one audio source within an audio scene by analysing respective two or more audio signals from the separate microphones, wherein the audio capture application is further configured to adaptively select, from the plurality of microphones, two or more respective audio signals based on the determined direction and furthermore configured to select, from the two or more respective audio signals, a reference audio signal also based on the determined direction; and a signal generator configured to generate a mid signal representing the at least one audio source based on a combination of the selected two or more respective audio signals and with reference to the reference audio signal.
-